846 Rounding error

846 : Rounding error

Design render
  • Author: Edwin Török
  • Description: Competition entry
  • GitHub repository
  • Clock: 25250000 Hz

How it works

Idea

This started out as an attempt to implement a ray tracer in 2 TT tiles. However, there isn't enough room for a proper one, precision has to be limited, which leads to unavoidable rounding errors.

So embrace rounding errors, and make them the primary feature!

The end result doesn't resemble a 3D scene, or a sphere, or in fact not even a properly rounded circle, but it has rounding errors! And that is the goal of this project now!

HardCaml

The RTL was written using HardCaml, an OCaml DSL that emits Verilog. For convenience the generated Verilog is committed into the source tree, so no additional tools are needed.

I used registers with asynchronous reset, in theory it should be better for an area constrained design.

VGA signal generation

ModeLine

VGA signal timing is described in "3. DMT Video Timing Parameter Definitions" in "VESA Display Monitor Timing Standard Version 1.0, Rev. 13", and is implemented in src/generator/modeline.ml. Examples on how to implement them on an FPGA are available in several places.

The code supports several resolutions, however to conserve area for the demo I've chosen only [email protected], which has negative hsync/vsync polarities. This resolution would need a 25.175 MHz pixel clock, however that can't be produced exactly by the TT08 board, it can only approximate it using a PWM. Therefore, the design is configured to run at the nearest frequency that can be exactly generated: 25.25 MHz, which should be within the 0.5% acceptable by the standard. The ModeLine implemented is: ModeLine "640x480_59.94" 25.175 640 656 752 800 480 490 492 525 -hsync -vsync. (This has 59.94 refresh rate and not 60Hz due to the standard preferring NTSC and its 1.001 adjustment).

The design itself runs off the VGA pixel clock, as I didn't want to deal with potential clock domain crossing issues.

Counters

There are 2 counters: one for H, and one for V synchronization pulses. When the H counter overflows it enables and increments the V counter for 1 cycle. This is implemented in generator/vga.ml, together with waveform expectation tests.

Both H and V counters start out in the visible area for convenience (we can directly use these counters as x/y coordinates, without needing to perform arithmetic in the circuit), then blank the colour signals for the duration of the front porch, synchronization signal and back porch. Although the monitor would recognize the hsync+vsync low as the start of a frame, this is equivalent, but offset by a few clocks.

R, G, B colours

The demo supports 2-bit colours, and as usual these would be sRGB colours, not a linear scale. So we define an internal table indexed by 3 bits representing a linear RGB value, mapping to the sRGB bits.

A register is used for the output, both to avoid logic glitches becoming visible to the monitor, and to provide a reg to reg path that OpenSTA can use to compute setup/hold times.

Generating the colours

When test mode is used (pin ui[0] set to 1) the design outputs vertical colour bars with a white-black-white border. This doesn't have rounding errors, everything is sharp.

In normal mode (pin ui[0] is 0) the "rounding error graphics" is rendered, see below.

Ray marching

For an explanation of how ray marching works, see this ray marching tutorial. The "scene" is represented using signed distance functions. The "eye" Z coordinate is animated between 3.5 and 4.5 in 256 steps, where each frame is one step.

CORDIC

Fixed point arithmetic with 9 bits of precision is used in the HDL, with the exponent tracked by the generator code to reduce register width (though this is not as good as tracking it in hardware, but that'd require more area). Vector normalization is implemented using the CORDIC implementation provided by HardCaml, configured to use 10 bits, and a limited number of iterations (4) to fit into the desired area. This works by rotating the vector until its angle is 0, and then rotating a second unit vector to match the rotation of the original. Or equivalently transform the original from rectangular to polar coordinates, overwrite the length with 1, and convert back from polar to rectangular. CORDIC is defined for 2D in the library, and I define a 3D wrapper based on rectangular to spheric coordinate conversions, although there would be ways to directly compute a 3D version of CORDIC, that is not implemented here.

This is implemented in src/vecmath.

GLSL ES "emulation"

The low level operations are wrapped by a higher level embedded DSL that allows writing code quite similar to GLSL ES, with a very small number of operators: arithmetic (+, -, *, /), comparison (==, <>), abs, min, max, clamp, length, distance, dot, normalize, reflect.

Unfortunately the full renderer didn't fit into 2 tiles, so had to comment out quite a lot of the "GLSL" code (only 1 step of ray marching, no clamping, very simple gradient approximation), what is remaining does not resemble a sphere, or in fact it doesn't even look 3D.

OpenLane configuration

The target density had to be increased to 98% to fit, and the setup slack margin setting had to be increased, see config.json. There are max slew and max fanout violations at 100C and 1.6V, but that shouldn't prevent the design from working at 25C and 1.8V.

The design was simulated using both tt-vgaviz and vgasim, although had to adjust the modeline for vgasim to recognize the standard one. A simple cocotb test which checked vsync/hsync generation was added post submission.

Lack of audio

Audio is enabled, but is only a very simple test signal based on hsync and vsync.

Simulating

There is a src/sim/vgasim.ml, which generates a demo.v compatible with vgasim, this uses a different resolution though. vgasim has to be called with -g 640x480, and videomode.h needs to be edited to use 480 490 492 525 (don't know why it wants 521, that doesn't seem to be the standard timing).

Alternatively the cocotb test in test/ can be run with make -B WAVES=1, and then tt-vgaviz can be used: tt-vgaviz tb.vcd (actually in FST format).

How to test

Configuration

  • Provide a 25.25 MHz clock on the clk pin (RP2040 should be able to provide this with no jitter). Or if you can try 25.175 MHz instead, but this will have some jitter. YMMV.

  • Power the design with at least 1.8V

Main demo

  • Set pin ui[0] to 0 to run the default demo.

  • Reset the design

  • You should see circles moving slowly and large rounding errors:

circles

Test mode

  • Set pin ui[0] to 1 to show a test image with color bars.

  • Reset the design again if desired

  • You should see:

color bars.

The "audio" out is connected, but is not expected to result in anything audible.

External hardware

Connect according to the Demoscene rules

IO

#InputOutputBidirectional
0test mode (0=no, 1=yes)r1
1g1
2b1
3vsync
4r0
5g0
6b0
7hsyncPWM output

Chip location

Controller Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux Analog Mux Mux Mux Mux Mux Mux Mux Mux Mux Mux tt_um_chip_rom (Chip ROM) tt_um_factory_test (TinyTapeout 8 Factory Test) tt_um_oscillating_bones (Oscillating Bones) tt_um_urish_charge_pump (Dickson Charge Pump) tt_um_bgr_agolmanesh (Bandgap Reference) tt_um_tnt_diff_rx (TT08 Differential Receiver test) tt_um_rejunity_vga_logo (VGA Tiny Logo (1 tile)) tt_um_tommythorn_maxbw (Asynchronous Multiplier) tt_um_mattvenn_r2r_dac_3v3 (Analog 8 bit 3.3v R2R DAC) tt_um_urish_simon (Simon Says memory game) tt_um_rebeccargb_universal_decoder (Universal Binary to Segment Decoder) tt_um_mattvenn_rgb_mixer (RGB Mixer demo5) tt_um_rebeccargb_hardware_utf8 (Hardware UTF Encoder/Decoder) tt_um_find_the_damn_issue (Find The Damn Issue) tt_um_brandonramos_VGA_Pong_with_NES_Controllers (VGA Pong with NES Controllers) tt_um_kb2ghz_xalu (4-bit minicomputer ALU) tt_um_rebeccargb_intercal_alu (INTERCAL ALU) tt_um_a1k0n_demo (Demo by a1k0n) tt_um_rburt16_bias_generator (Bias Generator) tt_um_zec_square1 ("SQUARE-1": VGA/audio demo) tt_um_jmack2201 (Sprite Bouncer with Looping Background Options) tt_um_ran_DanielZhu (Dice) tt_um_gfg_development_tinymandelbrot (TinyMandelbrot) tt_um_LnL_SoC (Lab and Lectures SoC) tt_um_htfab_pi_snake (Pi Snake) tt_um_tt08_aicd_playground (AICD Playground) tt_um_toivoh_demo (Sequential Shadows [TT08 demo competition]) tt_um_quarren42_demoscene_top (asic design is my passion) tt_um_crispy_vga (Crispy VGA) tt_um_MichaelBell_canon (TT08 Pachelbel's Canon demo) tt_um_shuangyu_top (Calculator) tt_um_wokwi_407306064811090945 (DDR throughput and flop aperature test) tt_um_08_sws (Sine Wave Synthesizer) tt_um_favoritohjs_scroller (VGA Scroller) tt_um_tt08_wirecube (Wirecube) tt_um_vga_glyph_mode (Glyph Mode) tt_um_a1k0n_vgadonut (VGA donut) tt_um_roy1707018 (RO) tt_um_analog_factory_test (TT08 Analog Factory Test) tt_um_sign_addsub (CMOS design of 4-bit Signed Adder Subtractor) tt_um_tinytapeout_logo_screensaver (VGA Screensaver with Tiny Tapeout Logo) tt_um_patater_demokit (Patater Demo Kit Waggling Rainbow on a Chip) tt_um_algofoogle_tt08_vga_fun (TT08 VGA FUN!) tt_um_simon_cipher (simon_cipher) tt_um_thexeno_rgbw_controller (Color Controller) tt_um_demosiine_sda (DemoSiine) tt_um_bytex64_munch (Munch) tt_um_alexjaeger_ringoscillator (5MHz Ring Oscillator) tt_um_cfib_demo (cfib Demoscene Entry) tt_um_wokwi_407852791999030273 (Simple 8 Bit ALU) tt_um_Richard28277 (4-bit ALU) tt_um_betz_morse_keyer (Morse Code Keyer) tt_um_nvious_graphics (nVious Graphics) tt_um_tiny_pll (Tiny PLL) tt_um_ezchips_calc (8-Bit Calculator) tt_um_hack_cpu (HACK CPU) tt_um_noritsuna_Vctrl_LC_oscillator (Voltage Controlled LC-Oscillator) tt_um_ring_divider (Divided Ring Oscillator) tt_um_2048_vga_game (2048 sliding tile puzzle game (VGA)) tt_um_morningjava_r2r_from_matt (Bucket Brigade) tt_um_ephrenm_tsal (TSAL_TT) tt_um_kapilan_alarm (Alarm Clock) tt_um_stochastic_addmultiply_CL123abc (Stochastic Multiplier, Adder and Self-Multiplier) tt_um_wokwi_407760296956596225 (tt08-octal-alu) tt_um_dlfloatmac (DL float MAC) tt_um_wakki_0123_Raw_Transistors (Raw_Transistors) tt_um_faramire_rotary_ring_wrapper (Rotary Encoder WS2812B Control) tt_um_devstdin_LDO_OSC (LDO BG IREF OSC) tt_um_frequency_counter (Frequency Counter SSD1306 OLED) tt_um_rom_test (TT08 SKY130 ROM 'YOLO' Test) tt_um_i2c_peripheral_stevej (i2c peripherals: leading zero count and fnv-1a hash) tt_um_yuri_panchul_schoolriscv_cpu_with_fibonacci_program (schoolRISCV CPU with Fibonacci program) tt_um_yuri_panchul_adder_with_flow_control (Adder with Flow Control) tt_um_brailliance (Brailliance) tt_um_nyan (nyan) tt_um_MichaelBell_mandelbrot (VGA Mandelbrot) tt_um_ssp_opamp (2-stage Opamp Designs) tt_um_fountaincoder_top_ad (pulse_add) tt_um_edwintorok (Rounding error) tt_um_mac (MAC) tt_um_dpmu (DPMU) tt_um_JAC_EE_segdecode (7 Segment Decode) tt_um_wokwi_408118380088342529 (Traffic-light-sequence) tt_um_shiftreg_test (TT08 SKY130 Shift Register 'YOLO' Test) tt_um_yuri_panchul_sea_battle_vga_game (Sea Battle) tt_um_benpayne_ps2_decoder (PS2 Decoder) tt_um_meriac_play_tune (Super Mario Tune on A Piezo Speaker) tt_um_comm_ic_bhavuk (Comm_IC) tt_um_daosvik_aesinvsbox (AES Inverse S-box) tt_um_wokwi_408216451206371329 (Logic Test) tt_um_micro_tiles_container (Micro tile container) tt_um_cattuto_sr_latch (TT08 - experiments with latch-based shift registers) tt_um_rejunity_vga_test01 (VGA Drop (audio/visual demo)) tt_um_silice (Warp) tt_um_wokwi_408231820749720577 (Abacus Lock) tt_um_jayjaywong12 (mulmul) tt_um_emmyxu_obstacle_detection (Obstacle Detection) tt_um_neural_navigators (Neural Net ASIC) tt_um_a1k0n_nyancat (VGA Nyan Cat) tt_um_rebeccargb_styler (Styler) tt_um_resfuzzy (resfuzzy) tt_um_cejmu (CEJMU Beers and Adders) tt_um_16_mic_beamformer_arghunter (16 Mic Beamformer) tt_um_pdm_pitch_filter_arghunter (PDM Pitch Filter) tt_um_pdm_correlator_arghunter (PDM Correlator) tt_um_ddc_arghunter (DDC) tt_um_i2s_to_pwm_arghunter (I2S to PWM ) tt_um_georgboecherer_vco (Analog Voltage Controlled Oscillator) tt_um_supermic_arghunter (Supermic ) tt_um_dmtd_arghunter (DMTD ) tt_um_htfab_bouncy_capsule (Bouncy Capsule) tt_um_samuelm_pwm_generator (PWM generator) tt_um_mattvenn_analog_ring_osc (Ring Oscillators) tt_um_toivoh_demo_deluxe (Sequential Shadows Deluxe [TT08 demo competition]) tt_um_vga_clock (VGA clock) tt_um_z2a_rgb_mixer (RGB Mixer demo) tt_um_faramire_stopwatch (Simple Stopwatch) tt_um_micro_tiles_container_group2 (Micro tile container (group 2)) tt_um_johshoff_metaballs (Metaballs) tt_um_top (Flame demo) tt_um_NicklausThompson_SkyKing (SkyKing Demo) tt_um_Electom_cla_4bits (4-bit CLA) tt_um_vga_cbtest (Generate VGA output for Color Blindness Test) tt_um_zoom_zoom (Zoom Zoom) tt_um_dpmunit (DPM_Unit) tt_um_clock_divider_arghunter (Clock Divider ) tt_um_dlmiles_poc_fskmodem_hdlctrx (FSK Modem +HDLC +UART (PoC)) tt_um_emilian_muxpga (TinyFPGA resubmit for TT08) tt_um_pyamnihc_dummy_counter (Dummy Counter) tt_um_whynot (Why not?) tt_um_sudana_ota5t_1 (5-T OTA) tt_um_dlmiles_tt08_poc_uart (UART) tt_um_dendraws_donut (donut) tt_um_wokwi_408237988946759681 (Counter) tt_um_tmkong_rgb_mixer (RGB Mixer) Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available Available