
Note: after the messy tapeout on iHP26a, this design is being ported to the CMOS5L PDK.
IMPORTANT: This custom circuit and protocol is neither compliant nor compatible with any 802.3 standard, and is not even remotely related to one. It's all explained and detailed on Hackaday at https://hackaday.io/project/198914
The miniMAC is a (currently partial) Media Access Controller for a simplified data link over twisted pairs. It provides error detection and scrambling of 16-bit data words, which are combined with a 17th bit for data/control framing (C/D). The 18-bit result is suitable for sending to a (custom) PHY (see https://hackaday.io/project/203186) for serialisation and line coding. This unit chains two sophisticated circuits, the gPEAC and the Hammer18.


Conveniently, the same sea-of-XOR is used for both encoding and decoding, and the decoding side is "recursive", so it amplifies any transmission error at the receiving end. The sorted avalanche for a single bitflip is: 7 8 8 8 8 9 9 9 9 11 12 13 14 15 15 15 15 15 (total=200). The 64 XOR2 gates have a propagation delay of 10 gates, yet the effective latency in the system is just one XOR in the critical datapath.

These very different types of circuits are complementary: together they provide very strong scrambling, eliminate problems inherent in classical LFSRs, and detect errors very early. With an equivalent of 56 bits of state and uncrashable mathematics, the system remains fast and compact, and it is tailored for safety and early retransmission, which saves bandwidth and latency and reduces buffer sizes (and cost).
An external circuit is required to implement the higher-level protocol, buffering and retransmission logic.
The gPEAC requires two cycles, i.e. two passes through the adder: first to compute the sums, then to adjust the modulus. The Hammer18 circuit, on the other hand, requires only one XOR depth, but at a different place depending on the direction:
For encoding, the input data goes through gPEAC, then Hammer is applied at the end of the last cycle.
For decoding, the scrambled data goes through Hammer at the start of the first cycle of gPEAC descrambling.
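The "two passes through the adder" idea can be illustrated with a generic modular addition: one pass computes the plain sum, a second pass subtracts the modulus and keeps whichever result is in range. This is only a minimal sketch of that general technique, not the actual gPEAC update rule; the function name `two_pass_mod_add` and the modulus used in the example are hypothetical.

```python
def two_pass_mod_add(a, b, modulus):
    """Modular addition done as two adder passes (illustrative only).

    Assumption: both operands are already reduced, i.e. 0 <= a, b < modulus.
    This is NOT the real gPEAC datapath, just the add-then-adjust pattern.
    """
    if not (0 <= a < modulus and 0 <= b < modulus):
        raise ValueError("operands must already be reduced modulo 'modulus'")
    total = a + b               # pass 1: plain sum, may exceed the modulus
    adjusted = total - modulus  # pass 2: the same adder subtracts the modulus
    # Keep the adjusted value only if the subtraction did not underflow.
    return adjusted if adjusted >= 0 else total


# Hypothetical small modulus, chosen only to keep the example readable.
print(two_pass_mod_add(5, 9, 11))
print(two_pass_mod_add(3, 4, 11))
```

In hardware, reusing one adder for both passes trades a cycle of latency for area, which matches the two-cycle cost described above.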
Due to pin constraints, the 18-bit data words are transmitted in two cycles as 9-bit half-words. Counting input and output (2 cycles each), the overall latency is 5 cycles, following a sequence that is started internally when data is first input with Den=1. Even at the low default clock speed of 50 MHz, one word every two cycles still gives a bandwidth of 25M words/s × 18 bits = 450 Mbps: fast enough to oversaturate a Cat5 twisted pair.
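The half-word transfer and the bandwidth figure can be checked with a few lines of Python. This is a sketch under stated assumptions: the text does not say which half-word is sent first, so the `low_first` flag is hypothetical, as are all the function names.

```python
WORD_BITS = 18
HALF_BITS = 9
HALF_MASK = (1 << HALF_BITS) - 1


def split_word(word, low_first=True):
    """Split an 18-bit word into two 9-bit half-words.

    Assumption: the transfer order (low half first) is a guess; flip
    'low_first' if the real interface sends the high half first.
    """
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 18 bits")
    low = word & HALF_MASK
    high = word >> HALF_BITS
    return (low, high) if low_first else (high, low)


def join_halves(first, second, low_first=True):
    """Rebuild the 18-bit word from two half-words received in order."""
    low, high = (first, second) if low_first else (second, first)
    return (high << HALF_BITS) | low


def throughput_bps(clock_hz, cycles_per_word=2, bits_per_word=WORD_BITS):
    """Raw bit rate when one word is transferred every 'cycles_per_word' cycles."""
    return clock_hz // cycles_per_word * bits_per_word


print(throughput_bps(50_000_000))  # 450000000, i.e. 450 Mbps at 50 MHz
```

A round trip such as `join_halves(*split_word(w))` returns `w` for any 18-bit value, whichever order is chosen, as long as both sides agree on it.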
This tile contains four main pipelined units, sequenced by a shift register:
The encode and decode units can be tested separately or together in the "loopback" mode.

First let's examine the pinout. The inputs:
The outputs:

Notes :
Custom boards or adapters will be made. I will try to get a pair of chips to connect together, such that I can verify a whole transmission chain.
This started as a VHDL to Verilog+IHP PDK port but it will likely grow into a more fully featured project.
I'll try to get 2 boards to test both coder and decoder in a chain, to simulate noisy communications.
In parallel, I am trying to design a decent PHY, a much more difficult endeavour.
Logically, the MAC must be completed with an FSM, a buffer and a host interface, likely in the next tapeout so I can play with memory blocks.
| # | Input | Output | Bidirectional |
|---|---|---|---|
| 0 | DI0 | DO0 | DO8 |
| 1 | DI1 | DO1 | QEN |
| 2 | DI2 | DO2 | CLK_out |
| 3 | DI3 | DO3 | Zero |
| 4 | DI4 | DO4 | Enc |
| 5 | DI5 | DO5 | Dec |
| 6 | DI6 | DO6 | DEN |
| 7 | DI7 | DO7 | DI8 |