
This is a simple circuit to calculate a dot product, the sum of w_i*x_i over all i, where i can be anything up to about 40. The instruction pins take the values insn=0, insn=1 and insn=2.
It has been designed as a coprocessor. Data is loaded first by setting load=1 and then supplying an index and a data value for each element of the dot product. Each set is a (w, x) pair. It's a 4-bit system that runs when run=1 and needs at least 16 clock cycles to produce the answer. The answer is a 12-bit value.
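As a rough software reference for checking results, the calculation can be sketched as below. This is only a model, not the hardware: the function name `dot_product_4bit` is hypothetical, and it assumes that w and x are unsigned 4-bit values and that the result wraps to 12 bits, neither of which is stated above.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical software reference model (not the circuit itself).
// Assumption: w and x are unsigned 4-bit values (0..15).
// Assumption: the 12-bit answer wraps on overflow.
std::uint16_t dot_product_4bit(
    const std::vector<std::pair<std::uint8_t, std::uint8_t>>& pairs) {
    std::uint32_t acc = 0;
    for (const auto& p : pairs) {
        const std::uint32_t w = p.first & 0xF;   // keep only the low 4 bits
        const std::uint32_t x = p.second & 0xF;  // keep only the low 4 bits
        acc += w * x;
    }
    return static_cast<std::uint16_t>(acc & 0xFFF);  // truncate to 12 bits
}
```

Under these assumptions, 16 pairs at the maximum value (15*15 each) sum to 3600, which still fits in 12 bits; longer vectors of large values would wrap in this model.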
I've tested this using a Verilator simulation, included below. I like the C++ workbench for this. The testing has mainly been for numerical stability.
I intend for this to be driven by the RP2040 and to work as a "coprocessor" for vector calculations.

| # | Input | Output | Bidirectional |
|---|---|---|---|
| 0 | index[0] | out[0] | out[8] |
| 1 | index[1] | out[1] | out[9] |
| 2 | index[2] | out[2] | out[10] |
| 3 | index[3] | out[3] | out[11] |
| 4 | data[0] | out[4] | instruction[0] |
| 5 | data[1] | out[5] | instruction[1] |
| 6 | data[2] | out[6] | load |
| 7 | data[3] | out[7] | run |
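
For a host such as the RP2040, the pinout above suggests how the three 8-bit ports could be packed and unpacked. The helpers below are a minimal sketch: the names `pack_input`, `pack_control` and `unpack_result` are hypothetical, and the bit positions simply follow the row numbers in the table (row 0 taken as bit 0).

```cpp
#include <cstdint>

// Input port: index[3:0] on bits 3..0, data[3:0] on bits 7..4.
std::uint8_t pack_input(std::uint8_t index, std::uint8_t data) {
    return static_cast<std::uint8_t>((index & 0xF) | ((data & 0xF) << 4));
}

// Bidirectional port, host-driven half: instruction[1:0] on bits 5..4,
// load on bit 6, run on bit 7. Bits 3..0 carry out[11:8] from the chip.
std::uint8_t pack_control(std::uint8_t instruction, bool load, bool run) {
    return static_cast<std::uint8_t>(((instruction & 0x3) << 4) |
                                     (load ? 0x40 : 0x00) |
                                     (run ? 0x80 : 0x00));
}

// Rebuild the 12-bit answer: out[7:0] from the output port,
// out[11:8] from the low nibble of the bidirectional port.
std::uint16_t unpack_result(std::uint8_t out_byte, std::uint8_t bidir_byte) {
    return static_cast<std::uint16_t>(out_byte | ((bidir_byte & 0xF) << 8));
}
```

For example, `pack_input(3, 0xA)` places index 3 and data 10 on the input port, and `unpack_result` masks off the control bits so only out[11:8] contribute to the result.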