
Credits: We gratefully acknowledge the COE in Integrated Circuits and Systems (ICAS) and the Department of ECE. Our special thanks to Dr. K S Geetha (Vice Principal) and Dr. K N Subramanya (Principal) for their constant support and encouragement to tape out on Tiny Tapeout 8.
The tt_um_mac module is a Multiply-Accumulate (MAC) unit designed for high-performance digital signal processing and embedded system applications. This module integrates a Dadda multiplier and a Kogge-Stone adder to achieve efficient and fast computations. The MAC unit performs a sequence of multiplication and accumulation operations, which are essential in various digital signal processing tasks, such as filtering and convolution.
The Dadda multiplier is a high-speed multiplier built around column compression. It reduces the partial-product matrix through a sequence of reduction stages, using half adders and full adders, until only two rows remain; a final adder then sums those two rows to give the product. In this design, a 4x4 Dadda multiplier is used to compute the 8-bit product of the two 4-bit operands, A and B.
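The reduction described above can be sketched as a small bit-level model. This is a Python illustration, not the project's Verilog: the function names (`half_adder`, `full_adder`, `dadda_multiply_4x4`) are hypothetical, and the final two-row addition uses Python's `+` as a stand-in for whatever carry-propagate adder the hardware uses. For a 4x4 multiplier the Dadda target heights are 3 and then 2.

```python
def half_adder(x, y):
    """Return (sum, carry) for two input bits."""
    return x ^ y, x & y


def full_adder(x, y, z):
    """Return (sum, carry) for three input bits."""
    return x ^ y ^ z, (x & y) | (x & z) | (y & z)


def dadda_multiply_4x4(a, b):
    """Multiply two 4-bit values using Dadda-style column reduction."""
    a &= 0xF
    b &= 0xF
    # cols[k] holds every bit of weight 2**k (one spare column for safety).
    cols = [[] for _ in range(9)]
    for i in range(4):
        for j in range(4):
            cols[i + j].append((a >> i) & (b >> j) & 1)

    # Reduce every column to at most 3 bits, then to at most 2 bits.
    for target in (3, 2):
        for k in range(8):
            while len(cols[k]) > target:
                if len(cols[k]) == target + 1:
                    # One bit too many: a half adder removes exactly one.
                    s, c = half_adder(cols[k].pop(0), cols[k].pop(0))
                else:
                    # Two or more too many: a full adder removes two.
                    s, c = full_adder(cols[k].pop(0), cols[k].pop(0), cols[k].pop(0))
                cols[k].append(s)      # sum stays in this column
                cols[k + 1].append(c)  # carry moves to the next column

    # Two rows remain; add them (stand-in for the final adder).
    row0 = sum(col[0] << k for k, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(cols) if len(col) > 1)
    return (row0 + row1) & 0xFF


print(dadda_multiply_4x4(3, 2))
```

Because every half or full adder preserves the weighted sum of the bits it consumes, the model can be checked exhaustively against `a * b` for all 256 operand pairs.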
Pipeline registers are implemented to enhance the performance of the MAC unit by storing intermediate results at each stage of the operation. This design uses two pipeline registers: Prod_stage, which holds the multiplier output, and Sum_stage, which holds the adder output.
The Kogge-Stone adder is a parallel-prefix form of a carry-lookahead adder, known for its high speed and efficiency in handling large bit-width additions. It adds the product to the current accumulator value (Acc), and the resulting sum is stored in the Sum_stage register.
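To make the parallel-prefix idea concrete, here is a minimal Python sketch of a Kogge-Stone adder operating on whole bit vectors. The function name `kogge_stone_add`, the default 8-bit width, and the carry-in of zero are assumptions chosen to match the 8-bit product and output described here; the actual Verilog may be structured differently.

```python
def kogge_stone_add(x, y, width=8):
    """Add two unsigned values with a Kogge-Stone prefix network.

    Returns (sum, carry_out), with the sum truncated to `width` bits.
    """
    mask = (1 << width) - 1
    x &= mask
    y &= mask

    g = x & y        # per-bit generate
    p = x ^ y        # per-bit propagate
    p_bit = p        # keep the original propagate bits for the final XOR

    # Each prefix level combines position i with position i - dist,
    # doubling the span covered by every (g, p) pair.
    dist = 1
    while dist < width:
        g = (g | (p & (g << dist))) & mask
        p = (p & (p << dist)) & mask
        dist *= 2

    # After the last level, g[i] is the carry out of bits i..0,
    # so the carry into bit i is g[i-1].
    carries = (g << 1) & mask
    total = (p_bit ^ carries) & mask
    carry_out = (g >> (width - 1)) & 1
    return total, carry_out


print(kogge_stone_add(25, 14))
```

For an 8-bit operand the loop runs three prefix levels (distances 1, 2, 4), which is the logarithmic depth that gives this adder its speed.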
The accumulator (Acc) is a key component that stores the ongoing sum of the products. It is updated with the result from the Kogge-Stone adder on each clock cycle, allowing the MAC unit to perform repeated accumulation operations.
When the reset signal (rst_n) is asserted low, the pipeline registers (Prod_stage, Sum_stage) and the accumulator (Acc) are cleared, resetting the MAC unit to its initial state.
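The interaction between the pipeline registers, the accumulator, and reset can be sketched behaviourally. This Python model rests on several assumptions that the text does not confirm: that Prod_stage captures A x B on each clock, that Sum_stage and Acc both capture Prod_stage + Acc on the following clock, that the accumulator is 8 bits wide and wraps on overflow, and that reset is sampled on the clock. The class name `MacModel` is hypothetical, and plain `*` and `+` stand in for the Dadda multiplier and Kogge-Stone adder.

```python
class MacModel:
    """Behavioural sketch of a two-register MAC pipeline (assumed ordering)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Models rst_n driven low: all state is cleared.
        self.prod_stage = 0
        self.sum_stage = 0
        self.acc = 0

    def clock(self, a, b, rst_n=1):
        """Advance one clock edge and return the accumulator value."""
        if not rst_n:
            self.reset()
            return self.acc
        # Compute next-state values from the current registers first,
        # the way non-blocking assignments would in an HDL.
        next_prod = (a & 0xF) * (b & 0xF)
        next_sum = (self.prod_stage + self.acc) & 0xFF  # assumed 8-bit wrap
        self.prod_stage = next_prod
        self.sum_stage = next_sum
        self.acc = next_sum
        return self.acc


mac = MacModel()
for a, b in [(3, 2), (0, 0), (0, 0)]:
    print(mac.clock(a, b))
```

Under these assumptions a product reaches the accumulator one clock after its operands are presented, so the first call returns the cleared value and the second returns the first product.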
To verify the functionality of the tt_um_mac module, a testbench (tt_um_mac_tb) has been provided. The testbench simulates different input scenarios and observes the output behavior of the tt_um_mac module to ensure that it works correctly.
Below is a summary of the test cases used in the tt_um_mac_tb testbench, along with their expected results.
| Time (ns) | ui_in (Input A) | uio_in (Input B) | Operation | Expected uo_out (Output) |
|---|---|---|---|---|
| 0-10 | 00000000 (0) | 00000000 (0) | Reset | 00000000 (0) |
| 10-30 | 00000011 (3) | 00000010 (2) | Multiply, Accumulate | 00000110 (6) |
| 30-50 | 00000001 (1) | 00000100 (4) | Multiply, Accumulate | 00001010 (10) |
| 50-70 | 00000101 (5) | 00000011 (3) | Multiply, Accumulate | 00011001 (25) |
| 70-90 | 00000111 (7) | 00000010 (2) | Multiply, Accumulate | 00100111 (39) |
| 90-110 | 00000000 (0) | 00000000 (0) | No Operation (Idle) | 00100111 (39) |
| 110-130 | 00000001 (1) | 00000001 (1) | Multiply, Accumulate | 00101000 (40) |
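The expected outputs in the table follow from a running sum of products starting at zero after reset. As a cross-check, a small Python golden model can recompute them. The names `VECTORS` and `expected_outputs` are hypothetical, the model ignores pipeline latency, and it assumes one accumulation per table row with 8-bit wraparound.

```python
# (A, B) operand pairs from the table rows after reset, in order.
VECTORS = [(3, 2), (1, 4), (5, 3), (7, 2), (0, 0), (1, 1)]


def expected_outputs(vectors):
    """Return the accumulator value after each (A, B) pair is applied."""
    acc = 0  # state immediately after reset
    results = []
    for a, b in vectors:
        # Only the low 4 bits of each input are used as operands.
        acc = (acc + (a & 0xF) * (b & 0xF)) & 0xFF  # assumed 8-bit wrap
        results.append(acc)
    return results


for (a, b), out in zip(VECTORS, expected_outputs(VECTORS)):
    print(f"A={a:08b} B={b:08b} -> uo_out={out:08b} ({out})")
```

The idle row contributes 0 x 0 = 0, which is why the output holds at 39 before the final step raises it to 40.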
During the simulation, you can monitor the console or waveform outputs for detailed step-by-step results. The testbench uses $monitor to display real-time updates of the inputs and the resulting output.
```verilog
initial begin
    $monitor("Time=%0d | ui_in=%b, uio_in=%b | uo_out=%b", $time, ui_in, uio_in, uo_out);
end
```
This will provide you with a detailed trace of how the tt_um_mac module processes the inputs to generate the expected outputs.
| # | Input | Output | Bidirectional |
|---|---|---|---|
| 0 | ui_in[0] | uo_out[0] | uio_in[0] |
| 1 | ui_in[1] | uo_out[1] | uio_in[1] |
| 2 | ui_in[2] | uo_out[2] | uio_in[2] |
| 3 | ui_in[3] | uo_out[3] | uio_in[3] |
| 4 | | uo_out[4] | |
| 5 | | uo_out[5] | |
| 6 | | uo_out[6] | |
| 7 | | uo_out[7] | |