
Nibble is a minimalistic 4-bit accumulator CPU with a Harvard architecture, designed to fit in a single tile on the TinyTapeout Sky130 process. It executes a simple 16-instruction ISA over a 2-cycle fetch-execute pipeline.
The CPU consists of:
The CPU decodes 8-bit instructions with a 4-bit opcode [7:4] and 4-bit immediate/address [3:0]:
| Opcode | Instruction | Operation | C | Z |
|---|---|---|---|---|
| 0x0 | NOP | No operation | - | - |
| 0x1 | LDI imm | A ← imm | - | ✓ |
| 0x2 | ADD imm | A ← A + imm | ✓ | ✓ |
| 0x3 | SUB imm | A ← A − imm | ✓ | ✓ |
| 0x4 | AND imm | A ← A & imm | - | ✓ |
| 0x5 | OR imm | A ← A \| imm | - | ✓ |
| 0x6 | XOR imm | A ← A ^ imm | - | ✓ |
| 0x7 | NOT | A ← ~A | - | ✓ |
| 0x8 | SHL | {C,A} ← {A[3],A<<1} | ✓ | ✓ |
| 0x9 | SHR | {A,C} ← {A>>1,A[0]} | ✓ | ✓ |
| 0xA | JMP addr | PC ← addr | - | - |
| 0xB | JZ addr | if Z: PC ← addr | - | - |
| 0xC | JC addr | if C: PC ← addr | - | - |
| 0xD | JNZ addr | if ¬Z: PC ← addr | - | - |
| 0xE | IN | A ← port_in | - | ✓ |
| 0xF | HLT | Halt execution | - | - |
Legend: ✓ = flag updated, - = flag unchanged
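
As a quick illustration of the encoding, here is a minimal Python sketch that splits an instruction byte into its opcode nibble [7:4] and operand nibble [3:0], using the mnemonics from the table. The names `MNEMONICS`, `HAS_OPERAND`, and `disassemble` are hypothetical helpers, not part of the project.

```python
# Mnemonics indexed by opcode (instruction bits [7:4]), taken from the ISA table.
MNEMONICS = [
    "NOP", "LDI", "ADD", "SUB", "AND", "OR", "XOR", "NOT",
    "SHL", "SHR", "JMP", "JZ", "JC", "JNZ", "IN", "HLT",
]

# Instructions whose low nibble is meaningful (immediate value or jump address).
HAS_OPERAND = {"LDI", "ADD", "SUB", "AND", "OR", "XOR", "JMP", "JZ", "JC", "JNZ"}


def disassemble(byte: int) -> str:
    """Split an 8-bit instruction into opcode [7:4] and operand [3:0]."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("instruction must fit in 8 bits")
    mnemonic = MNEMONICS[byte >> 4]
    operand = byte & 0xF
    return f"{mnemonic} {operand}" if mnemonic in HAS_OPERAND else mnemonic


if __name__ == "__main__":
    for word in (0x10, 0x21, 0x5F, 0xB0, 0xF0):
        print(f"0x{word:02X} -> {disassemble(word)}")
```

For instructions without an operand (such as NOT or HLT), this sketch simply ignores the low nibble.
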
The CPU operates in a 2-cycle fetch-execute pipeline:
FETCH Phase: The current PC value is output on uio[3:0]. External memory (RP2040 or EEPROM) responds with the instruction on ui_in[7:0], which is latched into the Instruction Register (IR).
EXECUTE Phase: The latched instruction is decoded and executed. The ALU computes results, flags are updated, and the PC is incremented (or modified by branch operations).
Each instruction takes exactly 2 clock cycles. The phase output (uo[7]) toggles between 0 (FETCH) and 1 (EXECUTE) so that the external memory controller can respond synchronously.
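
The two phases can be modeled in software. The following Python sketch is a behavioral model, not the project's RTL. The class name `Nibble`, the `run` helper, and the countdown program are hypothetical. Three behaviors are assumptions because the text above does not specify them: C acts as a borrow flag for SUB, the PC wraps modulo 16, and a halted CPU holds its state.

```python
from dataclasses import dataclass

FETCH, EXECUTE = 0, 1
Z_UPDATING = set(range(0x1, 0xA)) | {0xE}  # opcodes marked in the Z column


@dataclass
class Nibble:
    a: int = 0           # 4-bit accumulator
    c: bool = False      # carry flag
    z: bool = False      # zero flag
    pc: int = 0          # 4-bit program counter
    ir: int = 0          # instruction register
    phase: int = FETCH
    halted: bool = False

    def clock(self, ui_in: int, port_in: int = 0) -> None:
        """Advance the model by one clock cycle."""
        if self.halted:
            return  # assumption: a halted CPU ignores further clocks
        if self.phase == FETCH:
            self.ir = ui_in & 0xFF  # latch the byte presented by external memory
            self.phase = EXECUTE
            return
        op, imm = self.ir >> 4, self.ir & 0xF
        a, c = self.a, self.c
        next_pc = (self.pc + 1) & 0xF  # assumption: PC wraps modulo 16
        if op == 0x1:
            a = imm
        elif op == 0x2:
            c, a = a + imm > 0xF, (a + imm) & 0xF
        elif op == 0x3:
            c, a = imm > a, (a - imm) & 0xF  # assumption: C means borrow
        elif op == 0x4:
            a &= imm
        elif op == 0x5:
            a |= imm
        elif op == 0x6:
            a ^= imm
        elif op == 0x7:
            a = ~a & 0xF
        elif op == 0x8:
            c, a = bool(a & 0x8), (a << 1) & 0xF
        elif op == 0x9:
            c, a = bool(a & 0x1), a >> 1
        elif op in (0xA, 0xB, 0xC, 0xD):
            taken = {0xA: True, 0xB: self.z, 0xC: self.c, 0xD: not self.z}[op]
            if taken:
                next_pc = imm
        elif op == 0xE:
            a = port_in & 0xF
        elif op == 0xF:
            self.halted = True
            next_pc = self.pc  # assumption: PC holds on HLT
        if op in Z_UPDATING:
            self.z = a == 0
        self.a, self.c, self.pc = a, c, next_pc
        self.phase = FETCH


def run(rom: list[int], max_cycles: int = 64) -> tuple[Nibble, int]:
    """Clock the model from a 16-entry ROM; return final state and cycle count."""
    cpu, cycles = Nibble(), 0
    while not cpu.halted and cycles < max_cycles:
        cpu.clock(rom[cpu.pc])  # memory answers with the byte at the current PC
        cycles += 1
    return cpu, cycles


if __name__ == "__main__":
    # Hypothetical countdown: LDI 3; SUB 1; JNZ 1; HLT
    cpu, cycles = run([0x13, 0x31, 0xD1, 0xF0] + [0x00] * 12)
    print(cpu.a, cpu.z, cycles)
```

The countdown executes eight instructions before halting, so the model reports 16 clock cycles, matching the two-cycles-per-instruction rule.
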
Inputs (ui[7:0]):

- ui[3:0] - Instruction bits 0–3
- ui[7:4] - Instruction bits 4–7 (opcode)

Outputs (uo[7:0]):

- uo[3:0] - Accumulator value (connect to LEDs!)
- uo[4] - Carry flag
- uo[5] - Zero flag
- uo[6] - Halted flag
- uo[7] - Phase (0=FETCH, 1=EXECUTE)

Bidirectional (uio[7:0]):

- uio[3:0] - Program Counter output (address bus for external ROM)
- uio[7:4] - General-purpose input port (for IN instruction)

1. Read uio[3:0] (PC output) to determine which instruction address is being requested.
2. When uo[7] (phase) is 0 (FETCH), drive the corresponding instruction onto ui[7:0] within one clock cycle.
3. Connect LEDs to uo[3:0] to visualize the accumulator value. Optionally monitor uo[7:4] for flags and status.

Load the following 16 instructions into external memory:
0x0 (PC=0): 0x10 LDI 0 // A ← 0
0x1 (PC=1): 0x21 ADD 1 // A ← A + 1
0x2 (PC=2): 0x5F OR 15 // A ← A | 0xF (sets A to 0xF)
0x3 (PC=3): 0xB0 JZ 0 // if Z: jump to 0 (never from OR)
...
0xF (PC=15): 0xF0 HLT // Halt
After reset, the instructions shown set the accumulator to 0, then 1, then 15, with the LEDs reflecting each value.
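
The bytes in the listing follow from the opcode table: each instruction is the opcode shifted into the high nibble, ORed with the operand. A minimal Python sketch of that encoding is shown below. `OPCODES` and `assemble` are hypothetical helpers, and the sketch encodes only the lines actually shown above, since the remaining addresses are elided.

```python
# Opcode values follow the order of the ISA table (NOP = 0x0 ... HLT = 0xF).
OPCODES = {
    name: code
    for code, name in enumerate(
        "NOP LDI ADD SUB AND OR XOR NOT SHL SHR JMP JZ JC JNZ IN HLT".split()
    )
}


def assemble(line: str) -> int:
    """Encode one 'MNEMONIC [operand]' line as (opcode << 4) | operand."""
    parts = line.split("//")[0].split()  # drop any trailing comment
    if not parts:
        raise ValueError("empty instruction")
    mnemonic = parts[0].upper()
    if mnemonic not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    operand = int(parts[1], 0) if len(parts) > 1 else 0  # accepts 15 or 0xF
    if not 0 <= operand <= 0xF:
        raise ValueError("operand must fit in 4 bits")
    return (OPCODES[mnemonic] << 4) | operand


if __name__ == "__main__":
    for text in ("LDI 0", "ADD 1", "OR 15", "JZ 0", "HLT"):
        print(f"{text:<6} -> 0x{assemble(text):02X}")
```
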
The TinyTapeout Sky130 demo board integrates an RP2040 that can act as the program memory controller:

- The RP2040 reads uio[3:0] using GPIO pins to determine the current PC.
- It drives the corresponding instruction onto ui[7:0].
- uo[7] (phase) toggles, allowing the RP2040 to synchronize instruction delivery.

Load your program into the RP2040 firmware, and the CPU will execute it automatically with LEDs showing the accumulator state.
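
The controller logic described above can be sketched as a polling loop. This is an illustrative Python model rather than actual RP2040 firmware: the GPIO mapping is not given here, so pin access is abstracted behind three hypothetical callables (`read_phase`, `read_pc`, `write_instruction`), and the ROM contents in the demo are placeholders.

```python
from typing import Callable, Sequence


def serve_instructions(
    rom: Sequence[int],
    read_phase: Callable[[], int],
    read_pc: Callable[[], int],
    write_instruction: Callable[[int], None],
    polls: int,
) -> None:
    """Poll the CPU pins and answer each FETCH phase with rom[PC]."""
    for _ in range(polls):
        if read_phase() == 0:                  # uo[7] low means FETCH
            pc = read_pc() & 0xF               # uio[3:0] carries the PC
            write_instruction(rom[pc] & 0xFF)  # drive the byte onto ui[7:0]


if __name__ == "__main__":
    # Placeholder ROM: entry i holds LDI i, purely for demonstration.
    rom = [0x10 | i for i in range(16)]
    phases = iter([0, 1, 0, 1])  # fake phase pin: FETCH, EXECUTE, FETCH, EXECUTE
    pcs = iter([0, 1])           # fake PC bus, sampled only during FETCH
    driven: list[int] = []
    serve_instructions(rom, lambda: next(phases), lambda: next(pcs),
                       driven.append, polls=4)
    print([hex(b) for b in driven])
```

In real firmware the three callables would wrap GPIO reads and writes, and the loop would run continuously instead of for a fixed number of polls.
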
Program Memory: 16×8-bit instruction ROM (can be implemented with the RP2040 or an EEPROM)
Clock Source: Any clock generator capable of 1 MHz (can vary up to a few MHz with timing analysis)
LEDs (optional): 4 LEDs connected to uo[3:0] to visualize the accumulator. Additional LEDs can monitor uo[4:6] for carry, zero, and halted flags.
Input Port (optional): 4-bit input or button circuit connected to uio[7:4] for the IN instruction.
| # | Input | Output | Bidirectional |
|---|---|---|---|
| 0 | instruction bit 0 | accumulator bit 0 | program counter bit 0 (output) |
| 1 | instruction bit 1 | accumulator bit 1 | program counter bit 1 (output) |
| 2 | instruction bit 2 | accumulator bit 2 | program counter bit 2 (output) |
| 3 | instruction bit 3 | accumulator bit 3 | program counter bit 3 (output) |
| 4 | instruction bit 4 (opcode LSB) | carry flag | input port bit 0 (input) |
| 5 | instruction bit 5 | zero flag | input port bit 1 (input) |
| 6 | instruction bit 6 | halted | input port bit 2 (input) |
| 7 | instruction bit 7 (opcode MSB) | phase (0 = fetch, 1 = execute) | input port bit 3 (input) |
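
When monitoring the output pins from a host or logic analyzer, the uo[7:0] byte can be unpacked according to the table above. The Python sketch below shows one way to do that. `Status` and `decode_uo` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class Status(NamedTuple):
    accumulator: int  # uo[3:0]
    carry: bool       # uo[4]
    zero: bool        # uo[5]
    halted: bool      # uo[6]
    phase: int        # uo[7]: 0 = fetch, 1 = execute


def decode_uo(uo: int) -> Status:
    """Unpack the 8-bit output port into its named fields."""
    return Status(
        accumulator=uo & 0xF,
        carry=bool(uo & 0x10),
        zero=bool(uo & 0x20),
        halted=bool(uo & 0x40),
        phase=(uo >> 7) & 1,
    )


if __name__ == "__main__":
    print(decode_uo(0x4F))  # accumulator 15 with the halted bit set
```
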