295 Tiny Shader

295 : Tiny Shader

  • Author: Leo Moser
  • Description: With Tiny Shader you can write a small program to create different images and even animations.
  • GitHub repository
  • GDS submitted
  • HDL project
  • Extra docs
  • Clock: 25175000 Hz

How it works

Modern GPUs use fragment shaders to determine the final color for each pixel. Thousands of shading units run in parallel to speed up this process and ensure that a high FPS ratio can be achieved.

Tiny Shader mimics such a shading unit and executes a shader with 10 instructions for each pixel. No framebuffer is used, the color values are generated on the fly. Tiny Shader also offers an SPI interface via which a new shader can be loaded. The final result can be viewed via the VGA output at 640x480 @ 60 Hz, although at an internal resolution of 64x48 pixel.


These images and many more can be generated with Tiny Shader. Note, that shaders can even be animated by acessing the user or time register.

test2.png test4.png test5.png test7.png

The shader for the last image is shown here:

# Shader to display a rainbow colored sine wave

# Clear R3

# Get the sine value for x and add the user value

# Set default color to R0

# Get the sine value for R0

# Get y coord

# If the sine value is greater
# or equal y, set color to black


Tiny Shader has four (mostly) general purpose registers, REG0 to REG3. REG0 is special in a way as it is the target or destination register for some instructions. All registers are 6 bit wide.


The shader has four sources to get input from:

  • X - X position of the current pixel
  • Y - Y position of the current pixel
  • TIME - Increments at 7.5 Hz, before it overflow it counts down again.
  • USER - Register that can be set via the SPI interface.

The goal of the shader is to determine the final output color:

  • RGB - The output color for the current pixel. Channel R, G and B can be set individually. If not set, the color of the previous pixel is used.
Sine Look Up Table

Tiny Shader contains a LUT with 16 6-bit sine values for a quarter of a sine wave. When accesing the LUT, the entries are automatically mirrored to form one half of a sine wave with a total of 32 6-bit values.


The following instructions are supported by Tiny Shader. A program consists of 10 instructions and is executed for each pixel individually. The actual resolution is therefore one tenth of the VGA resolution (64x48 pixel).

Instruction Operation Description
SETRGB RA RGB <= RA Set the output color to the value of the specified register.
SETR RA R <= RA[1:0] Set the red channel of the output color to the lower two bits of the specified register.
SETG RA G <= RA[1:0] Set the green channel of the output color to the lower two bits of the specified register.
SETB RA B <= RA[1:0] Set the blue channel of the output color to the lower two bits of the specified register.
Instruction Operation Description
GETX RA RA <= X Set the specified register to the x position of the current pixel.
GETY RA RA <= Y Set the specified register to the y position of the current pixel.
GETTIME RA RA <= TIME Set the specified register to the current time value, increases with each frame.
GETUSER RA RA <= USER Set the specified register to the user value, can be set via the SPI interface.
Instruction Operation Description
IFEQ RA TAKE <= RA == R0 Execute the next instruction if RA equals R0.
IFNE RA TAKE <= RA != R0 Execute the next instruction if RA does not equal R0.
IFGE RA TAKE <= RA >= R0 Execute the next instruction if RA is greater then or equal R0.
IFLT RA TAKE <= RA < R0 Execute the next instruction if RA is less than R0.
Instruction Operation Description
DOUBLE RA RA <= RA * 2 Double the value of RA.
HALF RA RA <= RA / 2 Half the value of RA.
ADD RA RB RA <= RA + RB Add RA and RB, result written into RA.
Instruction Operation Description
CLEAR RA RA <= 0 Clear RA by writing 0.
LDI IMMEDIATE RA <= IMMEDIATE Load an immediate value into RA.
Instruction Operation Description
SINE RA RA <= SINE[R0[4:0]] Get the sine value for R0 and write into RA. The sine value LUT has 32 entries.
Instruction Operation Description
AND RA RB RA <= RA & RB Boolean AND of RA and RB, result written into RA.
NOT RA RB RA <= ~RB Invert all bits of RB, result written into RA.
XOR RA RB RA <= RA ^ RB XOR of RA and RB, result written into RA.
Instruction Operation Description
MOV RA RB RA <= RB Move value of RB into RA.
Instruction Operation Description
SHIFTL RA RB RA <= RA « RB Shift RA with RB to the left, result written into RA.
SHIFTR RA RB RA <= RA » RB Shift RA with RB to the right, result written into RA.
Instruction Operation Description
NOP R0 <= R0 & R0 No operation.

How to test

First set the clock to 25.175 MHz and reset the design. For a simple test, simply connect a Tiny VGA to the output Pmod. A shader is loaded by default and an image should be displayed via VGA.

For advanced features, connect an SPI controller to the bidir pmod. If ui[0], the mode signal, is set to 0, you can write to the user register via SPI. Note that only the last 6 bit are used.

If the mode signal is 1, all bytes transmitted via SPI are shifted into the shader memory. This way you can load a new shader program. Have fun!

External hardware

  • Tiny VGA or similar VGA Pmod
  • Optional: SPI controller to write the user register and new shaders


# Input Output Bidirectional
0 mode R[1] CS
1 debug_i[0] G[1] MOSI
2 debug_i[1] B[1] MISO
3 vsync SCK
4 R[0] next_vertical
5 G[0] next_frame
6 B[0] debug_o[0]
7 hsync debug_o[1]

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (TinyTapeout 06 Factory Test) tt_um_analog_factory_test (TT06 Analog Factory Test) tt_um_analog_factory_test (TT06 Analog Factory Test) tt_um_urish_charge_pump (Dickson Charge Pump) tt_um_psychogenic_wowa (WoWA) tt_um_oscillating_bones (Oscillating Bones) tt_um_kevinwguan (Crossbar Array) tt_um_coloquinte_moosic (Moosic logic-locked design) tt_um_alexsegura_pong (Pong) tt_um_iron_violet_simon (Iron Violet) tt_um_tomkeddie_a (VGA Experiments in Tennis) tt_um_MichaelBell_tinyQV (TinyQV Risc-V SoC) tt_um_andychip1_sn74169 (sn74169) tt_um_mattvenn_r2r_dac (Analog 8bit R2R DAC) tt_um_thorkn_audiochip_v2 (AudioChip_V2) tt_um_faramire_gate_guesser (Gate Guesser) tt_um_urish_simon (Simon Says game) tt_um_TT06_SAR_wulffern (TT06 8-bit SAR ADC) tt_um_soundgen (soundgen) tt_um_ledcontroller_Gatsch (ledcontroller) tt_um_digitaler_filter_rathmayr (Digitaler Filter) tt_um_histefan_top (Snake Game) tt_um_mayrmichael_wave_generator (Wave Generator) tt_um_advanced_counter (jku-tt06-advanced-counter) tt_um_FanCTRL_DomnikBrandstetter (PI-Based Fan Controller) tt_um_ps2_morse_encoder_top (PS/2 Keyboard to Morse Code Encoder) tt_um_calculator_muehlbb (16-bit calculator) tt_um_hpretl_tt06_tempsens (Temperature Sensor NG) tt_um_haeuslermarkus_fir_filter (FIR Filter with adaptable coefficients) tt_um_mattvenn_rgb_mixer (RGB Mixer demo) tt_um_analog_loopback (Analog loopback) tt_um_entwurf_integrierter_schaltungen_hadner (Projekt KEIS Hadner Thomas) tt_um_seven_segment_fun1 (7-segment-FUN) tt_um_moving_average_master (Moving average filter) tt_um_rgbled_decoder (SPI to RGBLED Decoder/Driver) tt_um_4bit_cpu_with_fsm (4-Bit CPU mit FSM) tt_um_flappy_bird (Flappy Bird) tt_um_drops (drops) tt_um_enieman (UART-Programmable RISC-V 32I Core) tt_um_gabejessil_timer (2 Player Game) tt_um_wokwi_384804985843168257 (playwithnumbers) tt_um_wokwi_384711264596377601 (luckyCube) tt_um_hpretl_tt06_tdc (Synthesized Time-to-Digital Converter (TDC)) tt_um_wokwi_384437973887503361 (Asynchronous Down Counter) tt_um_spi_pwm_djuara (spi_pwm) tt_um_SteffenReith_PiMACTop (PiMAC) tt_um_mattvenn_relax_osc (Relaxation oscillator) tt_um_jv_sigdel (1st passive Sigma Delta ADC) tt_um_wokwi_392873974467527681 (PILIPINAS) tt_um_scorbetta_goa (GOA - grogu on ASIC) tt_um_sanojn_ttrpg_dice (TTRPG Dice + simple I2C peripheral) tt_um_urish_dffram (DFFRAM Example (128 bytes)) tt_um_lucaz97_monobit (Monobit Test) tt_um_noritsuna_i4004 (i4004 for TinyTapeout) tt_um_hpretl_tt06_tdc_v2 (Synthesized Time-to-Digital Converter (TDC) v2) tt_um_vaf_555_timer (A 555-Timer Clone for Tiny Tapeout 6) tt_um_obriensp_be8 (8-bit CPU with Debugger (Lite)) tt_um_toivoh_retro_console (Retro Console) tt_um_mattvenn_inverter (Double Inverter) tt_um_SteffenReith_ASGTop (ASG) tt_um_lucaz97_rng_tests (rng Test) tt_um_dieroller_nathangross1 (Die Roller) tt_um_kwilke_cdc_fifo (Clock Domain Crossing FIFO) tt_um_spiff42_exp_led_pwm (LED PWM controller) tt_um_devinatkin_fastreadout (Fast Readout Image Sensor Prototype) tt_um_ja1tye_tiny_cpu (Tiny 8-bit CPU) tt_um_7seg_animated (Animated 7-segment character display) tt_um_neurocore (Neurocore) tt_um_zhwa_rgb_mixer (RGB Mixer) tt_um_wokwi_394704587372210177 (Cambio de giro de motor CD) tt_um_ian_keypad_controller (Keypad controller) tt_um_urish_spell (SPELL) tt_um_vks_pll (PLL blocks) tt_um_fountaincoder_top (multimac) tt_um_dsatizabal_opamp (Simple FET OpAmp with Sky130.) tt_um_obriensp_be8_nomacro (8-bit CPU with Debugger) tt_um_LFSR_shivam (10-bit Linear feedback shift register) tt_um_shivam (Pulse Width Modulation) tt_um_algofoogle_tt06_grab_bag (TT06 Grab Bag) tt_um_meiniKi_tt06_fazyrv_exotiny (FazyRV-ExoTiny) tt_um_wokwi_394888799427677185 (4-bit stochastic multiplier traditional) tt_um_QIF_8bit (8 Bit Digital QIF) tt_um_MATTHIAS_M_PAL_TOP_WRAPPER (easy PAL) tt_um_andrewtron3000 (Rule 30 Engine!) tt_um_tommythorn_4b_cpu_v2 (Silly 4b CPU v2) tt_um_aerox2_jrb8_computer (The James Retro Byte 8 computer) tt_um_wokwi_394898807123828737 (4-bit Stochastic Multiplier Compact with Stochastic Resonator) tt_um_argunda_tiny_opamp (Tiny Opamp) tt_um_fdc_chip (Frequency to digital converters (asynchronous and synchronous)) tt_um_8bit_cpu (8-Bit CPU In a Week) tt_um_mitssdd (co processor for precision farming) tt_um_fstolzcode (Tiny Zuse) tt_um_liu3hao_rv32e_min_mcu (tt06-RV32E_MinMCU) tt_um_kianV_rv32ima_uLinux_SoC (KianV uLinux SoC) tt_um_wokwi_395444977868278785 (*NOT WORKING* HP 5082-7500 Decoder) tt_um_wokwi_394618582085551105 (Keypad Decoder) tt_um_wokwi_395054820631340033 (Workshop Hackaday Juli) tt_um_wokwi_395055035944909825 (Some_LEDs) tt_um_wokwi_395055351144787969 (Hack a day Tiny Tapeout project) tt_um_wokwi_395054823569451009 (First TT Project) tt_um_wokwi_395054823837887489 (Dice) tt_um_wokwi_395055341723330561 (Workshop_chip) tt_um_jduchniewicz_prng (8-bit PRNG) tt_um_wokwi_395054564978002945 (Bestagon LED matrix driver) tt_um_wokwi_395054466384583681 (1-Bit ALU 2) tt_um_wokwi_395058308283408385 (test for tiny tapeout hackaday) tt_um_s1pu11i_simple_nco (Simple NCO) tt_um_wokwi_395055359324730369 (Tiny_Tapeout_6_Frank) tt_um_disp1 (Display test 1) tt_um_pckys_game (PCKY´s Successive Approximation Game) tt_um_tiny_shader_mole99 (Tiny Shader) tt_um_wokwi_393815624518031361 (My Chip) tt_um_minibyte (Minibyte CPU) tt_um_emilian_rf_playground (IDAC8 based on divide current by 2) tt_um_triple_watchdog (Triple Watchdog) tt_um_wokwi_395142547244224513 (EFAB Demo 2) tt_um_chisel_hello_schoeberl (Chisel Hello World) tt_um_aiju_8080 (8080 CPU) tt_um_wokwi_395134712676183041 (Inverters) tt_um_nubcore_default_tape (DEFAULT) tt_um_wuehr1999_servotester (Servotester) tt_um_wokwi_395055722430895105 (Servo Signal Tester) tt_um_exai_izhikevich_neuron (Izhikevich Neuron) tt_um_lisa (LISA 8-Bit Microcontroller) tt_um_wokwi_394707429798790145 (32-Bit Galois Linear Feedback Shift Register) tt_um_CKPope_top (X/Y Controller) tt_um_MNSLab_BLDC (Universal Motor and Actuator Controller) tt_um_couchand_dual_deque (Dual Deque) tt_um_JamesTimothyMeech_inverter (Programmable Thing) tt_um_signed_unsigned_4x4_bit_multiplier (Signed Unsigned multiplyer) tt_um_lipsi_schoeberl (Lipsi: Probably the Smallest Processor in the World) tt_um_i_tree_batzolislefteris (Anomaly Detection using Isolation trees) tt_um_wokwi_394830069681034241 (Cyclic Redundancy Check 8 bit) tt_um_rng_3_lucaz97 (RNG3) tt_um_wokwi_395263962779770881 (Bivium-B Non-Linear Feedback Shift Register) tt_um_dvxf_dj8 (DJ8 8-bit CPU) tt_um_silicon_tinytapeout_lm07 (Digital Temperature Monitor) tt_um_htfab_flash_adc (Flash ADC) tt_um_chisel_pong (Chisel Pong) tt_um_wokwi_395414987024660481 (HELP for tinyTapeout) tt_um_jorgenkraghjakobsen_toi2s (SPDIF to I2S decoder) tt_um_cmerrill_pdm (Parallel / SPI modulation tester) tt_um_csit_luks (CSIT-Luks) tt_um_wokwi_395357890431011841 (Trivium Non-Linear Feedback Shift Register) tt_um_drburke3_top (SADdiff_v1) tt_um_cejmu_riscv (TinyRV1 CPU) tt_um_rejunity_current_cmp (Analog Current Comparator) tt_um_loco_choco (BF Processor) tt_um_qubitbytes_alive (It's Alive) tt_um_wokwi_395061443288867841 (BCD to single 7 segment display Converter) tt_um_SJ (SiliconJackets_Systolic_Array) tt_um_ejfogleman_smsdac (8-bit DEM R2R DAC) tt_um_wokwi_395055455727667201 (Hardware Trojan Part II) tt_um_ericsmi_weste_problem_4_11 (Measurement of CMOS VLSI Design Problem 4.11) tt_um_wokwi_395034561853515777 (2 bit Binary Calculator) tt_um_mw73_pmic (Power Management IC) tt_um_Counter_1_shivam (8-bit Binary Counter) tt_um_wokwi_395054508867644417 (SynchMux) tt_um_otp_encryptor (TT06 OTP Encryptor) tt_um_wokwi_395514572866576385 (Parity Generator) tt_um_ADPCM_COMPRESSOR (ADPCM Encoder Audio Compressor) tt_um_3515_sequenceDetector (Sequence detector using 7-segment) tt_um_faramire_stopwatch (Simple Stopwatch) tt_um_ks_pyamnihc (Karplus-Strong String Synthesis) tt_um_dlmiles_muldiv8 (MULDIV unit (8-bit signed/unsigned)) tt_um_dlmiles_muldiv8_sky130faha (MULDIV unit (8-bit signed/unsigned) with sky130 HA/FA cells) tt_um_tommythorn_ncl_lfsr (NCL LFSR) tt_um_lk_ans_top (ANS Encoder/Decoder) tt_um_MichaelBell_latch_mem (Latch RAM (64 bytes)) tt_um_wokwi_395179352683141121 (Combination Lock) tt_um_Uart_Transciver (UART Transceiver) tt_um_dgkaminski (4-Bit ALU) tt_um_DigitalClockTop (TDM Digital Clock) tt_um_wokwi_394640918790880257 (IFSC Keypad Locker) tt_um_wokwi_395355133883896833 (BIT COMPARATOR) tt_um_alu (SumLatchUART_System) tt_um_alfiero88_VCII (VCII) tt_um_ALU (3-bit ALU) tt_um_topTDC (Convertidor de Tiempo a Digital (TDC)) tt_um_UABCReloj (24 H Clock) tt_um_CDMA_Santiago (CDMA_2024) tt_um_dr_skyler_clock (Clock) tt_um_motor (motor a pasos) tt_um_mult_2b (mult_2b) tt_um_CodHex7seg (Decodificador binario a display 7 segmentos hexadecimal) tt_um_S2P (Serial to Parallel Register) tt_um_PWM (PWM) tt_um_ss_register (serie_serie_register) tt_um_stepper (Stepper) tt_um_g3f (Generador digital trifásico) tt_um_ALU_DECODERS (ALU with a Gray and Octal decoders) tt_um_ram (4 bit RAM) tt_um_sap_1 (SAP-1 Computer) tt_um_guitar_pedal (Integrated Distorion Pedal) tt_um_mbalestrini_usb_cdc_devices (Two ports USB CDC device) tt_um_adammaj (Tiny ALU) tt_um_wokwi_395567106413190145 (4-Bit Full Adder and Subtractor with Hardware Trojan) tt_um_gak25_8bit_cpu_ext (Most minimal extension of friend's 'CPU In a Week' in a day) tt_um_hsc_tdc (UCSC HW Systems Collective, TDC) tt_um_BoothMulti_hhrb98 (UACJ-MIE-Booth 4) tt_um_dlmiles_poc_fskmodem_hdlctrx (FSK Modem +HDLC +UART (PoC)) tt_um_simplez_rcoeurjoly (tt6-simplez) tt_um_nurirfansyah_alits01 (Analog Test Circuit ITS: VCO) tt_um_ppca (drEEm tEEm PPCA) tt_um_wokwi_395522292785089537 (Displays CIt) tt_um_fpu (Dgrid_FPU) tt_um_duk_lif (Leaky Integrate and fire neuron(LIF)) tt_um_bomba1 (Latin_bomba) tt_um_chatgpt_rsnn_paolaunisa (ChatGPT designed Recurrent Spiking Neural Network) tt_um_bit_ctrl (Bit Control) tt_um_array_multiplier_hhrb98 (Array Multiplier) tt_um_wallace_hhrb98 (UACJ-Wallace multiplier) tt_um_I2C_to_SPI (TinyTapeout SPI Master) tt_um_rng (Random number generator) tt_um_wokwi_395599496098067457 (EVEN AND ODD COUNTERS) tt_um_8bitALU (8bit ALU) tt_um_aleena (Analog Sigmoid) tt_um_rejunity_1_58bit (Ternary 1.58-bit x 8-bit matrix multiplier) tt_um_rejunity_fp4_mul_i8 (FP4 x 8-bit matrix multiplier) tt_um_PWM_Controller (PWM Controller) tt_um_couchand_cora16 (CORA-16) tt_um_frq_divider (clk frequency divider controled by rom) tt_um_wokwi_390913889347409921 (Notre Dame Dorms LED) tt_um_timer_counter_UGM (4-Digit Scanning Digital Timer Counter) tt_um_koconnor_kstep (kstep) tt_um_lancemitrex (DIP Switch to HEX 7-segment Display) tt_um_PWM_Sine_UART (PWM_Sinewave_UART) tt_um_nicklausthompson_twi_monitor (TWI Monitor) tt_um_wokwi_395615790979120129 (Cambio de giro de motor CD) tt_um_ancho (Circuito PWM con ciclo de trabajo configurable) tt_um_wokwi_395618714068432897 (32b Fibonacci Original) tt_um_voting_thingey (Voting thingey) tt_um_hsc_tdc_buf (UCSC HW Systems Collective, TDC - BUF2x1) tt_um_hsc_tdc_mux (UCSC HW Systems Collective, TDC - MUX2x1) tt_um_petersn_micro1 (14 Hour Simple Computer) tt_um_sanojn_tlv2556_interface (UART interface to ADC TLV2556 (VHDL Test)) tt_um_gray_sobel (Gray scale and Sobel filter) tt_um_wokwi_395614106833794049 (Universal gates) Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available