Below are some designs solving select Advent of Code (AoC) puzzles.
Each puzzle follows a common base structure illustrated below:
```mermaid
flowchart
    mk["Makefile"]
    txt["Puzzle Input File"]
    tcl["Vivado TCL Script"]
    ans["(stdout) Puzzle Answer"]
    subgraph shell["Shell Module (shell.sv)"]
        tap["BSCANE2"]
        subgraph usr["User Logic (user_logic.sv)"]
            tap-dec["TAP Decoder (tap_decoder.sv)"]
            log["Puzzle Solving Logic"]
            tap-enc["TAP Encoder (tap_encoder.sv)"]
        end
    end
    mk -- Invokes Vivado --> tcl
    txt -- File Contents --> tcl
    tcl <-- JTAG Interface --> tap
    tcl -- Result --> ans
    tap --JTAG-TAP--> tap-dec
    tap-dec -- Puzzle Input Bytes --> log
    log -- Computed Result --> tap-enc
    tap-enc --JTAG-TAP--> tap
```
These implementations share the following common features:
- All the puzzle FPGA firmwares only use the JTAG interface. The `vivado.tcl` TCL script uses the `scan_dr_hw_jtag` command to transfer the puzzle input contents via a `BSCANE2` primitive instantiated in the firmware, in the `shell.sv` top-level module
- Puzzle contents are NOT embedded in the firmware bitstream
- No UART interfaces are used
- No IOs nor external clocks are used (see `constraints.xdc`), thus all the puzzle firmwares are board-agnostic
- Source code in plain/vanilla SystemVerilog
- The puzzle firmwares can be built and run on any board featuring a Xilinx 7-series FPGA, assuming the device density is enough to fit the design
- All the puzzles are solved on-board in a fraction of a second. Simulating most of them takes a few seconds with Verilator (compilation usually takes somewhat longer than the simulation itself)
The default target device is a Xilinx Zynq 7020. Different devices or families may have different JTAG chains (with or without ARM DAP and PS TAP, single die or multi-SLR), resulting in different instruction register (IR) and data register (DR) lengths. Both are defined in the `vivado.tcl` script:

```tcl
set zynq7_ir_length 10; # must match FPGA device family / SLR count
set zynq7_ir_user4 0x3e3; # same thing
set zynq7_dr_length_byte 9; # zynq7: extra bit for ARM DAP bypass reg
```

Due to the use of a `BSCANE2` primitive, small changes may be required to port the design to UltraScale devices (if I recall correctly, these devices use a different `BSCANE2` clock constraint: `INTERNAL_TCK`). For Versal families, porting may be more involved, as `BSCANE2` primitives are superseded by the CIPS component.
Porting to other vendors should also be straightforward, as the design is written in SystemVerilog: the only changes required are instantiating the vendor's JTAG TAP controller primitive and updating the three variables mentioned above.
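
To illustrate why the per-byte DR length is 9 rather than 8 on the Zynq 7020, here is a minimal Python model of a JTAG data path made of one bypass flip-flop in series with an 8-bit user register. It is only a sketch: the function name `load_user_dr` is hypothetical, and placing the bypass stage on the TDI side of the user register is an assumption. The actual chain order and the packing done in `vivado.tcl` may differ.

```python
def load_user_dr(byte, shifts, bypass_bits=1, dr_bits=8):
    """Shift `byte` LSB-first through bypass stage(s) followed by a user DR.

    Returns the value left in the user DR after `shifts` TCK cycles.
    Assumption: the bypass register sits between TDI and the user DR.
    """
    # chain[0] is the flip-flop nearest TDI, chain[-1] the one nearest TDO.
    chain = [0] * (bypass_bits + dr_bits)
    for i in range(shifts):
        bit_in = (byte >> i) & 1       # zero padding once the byte is exhausted
        chain = [bit_in] + chain[:-1]  # every bit moves one stage toward TDO
    user_dr = chain[bypass_bits:]      # ignore what is left in the bypass stage(s)
    # The flip-flop nearest TDO holds the least significant bit.
    return sum(bit << pos for pos, bit in enumerate(reversed(user_dr)))


# 9 shifts align the byte with the user register, 8 shifts leave it off by one.
aligned = load_user_dr(0xA5, shifts=9)
misaligned = load_user_dr(0xA5, shifts=8)
print(hex(aligned), hex(misaligned))
```

Under these assumptions, a device without the extra bypass stage would use `bypass_bits=0` and 8 shifts per byte, which is the kind of change the DR length variable is meant to capture.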
- Text editors: Zed, VSCodium, PyCharm
- FPGA tools: Vivado 2025.2, Yosys online schematic viewer
- Simulators: Icarus Verilog 12, Verilator 5, Vivado 2025.2 (xsim)
- VCD viewers: Surfer, Vaporview, GTKWave
Each puzzle directory includes a Makefile supporting the following make targets.

```
make isim [INPUT_FILE=input.txt]
```

- `INPUT_FILE`: puzzle contents input file, default is `input.txt`

```
make sim [INPUT_FILE=input.txt]
```

- `INPUT_FILE`: puzzle contents input file, default is `input.txt`

> [!NOTE]
> The Vivado xsim simulation target is only available on select puzzles (the ones in the `15/` directory).

```
make xsim [INPUT_FILE=input.txt]
```

- `INPUT_FILE`: puzzle contents input file, default is `input.txt`

```
make synth [VVD_MODE=batch] [PART=xc7z020clg484-1] [VVD_TASK=all] [INPUT_FILE=input.txt]
```

- `VVD_MODE`: Vivado invocation mode, default is `batch`
- `PART`: FPGA targeted part, default is `xc7z020clg484-1`
- `VVD_TASK`: tasks executed in the `vivado.tcl` script, default is `all`
    - `all`: all tasks below except `lint`
    - `build`: synthesis, pnr and bitstream generation
    - `program`: configures FPGA with current bitstream
    - `run`: `program`, load puzzle contents into the FPGA and readback results
    - `lint`: run the Vivado linting tool
- `INPUT_FILE`: puzzle contents input file, default is `input.txt`
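
As a usage illustration of the variables listed above, the following Python sketch assembles a `make synth` command line from the documented defaults. The helper `synth_command` is hypothetical and not part of the repository; it only reuses the variable names, defaults and task names given in the list.

```python
import shlex

# Defaults and task names copied from the `make synth` description above.
DEFAULTS = {
    "VVD_MODE": "batch",
    "PART": "xc7z020clg484-1",
    "VVD_TASK": "all",
    "INPUT_FILE": "input.txt",
}
VALID_TASKS = {"all", "build", "program", "run", "lint"}


def synth_command(**overrides):
    """Return the argv list for `make synth`, applying any overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown make variable(s): {sorted(unknown)}")
    settings = {**DEFAULTS, **overrides}
    if settings["VVD_TASK"] not in VALID_TASKS:
        raise ValueError(f"unknown VVD_TASK: {settings['VVD_TASK']}")
    return ["make", "synth"] + [f"{key}={value}" for key, value in settings.items()]


# Example: only run the Vivado linting task, keeping the other defaults.
print(shlex.join(synth_command(VVD_TASK="lint")))
```

The resulting list can be passed to `subprocess.run` from within a puzzle directory, for example to script runs over several input files.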
Available in select puzzles.

```
explore.py [filename]
```

- `filename`: puzzle contents input file, default is `input.txt`

I opted to focus my efforts on the first part of the puzzles, so most of them do not have the second part implemented.
| Puzzle | Simulation | Synthesis | On-board | Remarks |
|---|---|---|---|---|
| 1.1 | 🟡 Design creation | 🟢 Integrate BSCANE2 primitive | 🟡 Get familiar with JTAG TAP | First attempts were rough |
| 1.2 | 🟡 Modulo arithmetic | 🔵 Synthesized right away | 🔵 Right out of the box | Part 2 |
| 4.1 | 🟡 Used smarter algorithm | 🔵 Synthesized right away | 🔵 Right out of the box | Two-dimensional neighbors comparison |
| 5.1 | 🔵 Brute force approach | 🟡 Barely fits in a Zynq-7020 | 🔵 Right out of the box | Comparison of value ranges |
| 5.2 | 🟡 Wrong initial intuition | 🟢 Barely fits in a Zynq-7020 | 🔵 Right out of the box | Part 2 |
| 6.1 | 🟢 Simple array-based design | 🟡 Some rework required | 🔵 Right out of the box | Arithmetic |
| 7.1 | 🔵 Combinatorial algorithm | 🔵 Synthesized right away | 🔵 Right out of the box | Binary graph |
| 9.1 | 🟢 Storage and readback | 🔵 Synthesized right away | 🟢 Initially got a sim/syn mismatch | Mismatch due to uninitialized enum types |
| 10.1 | 🔴 Forgot to check a blind side | 🔵 Synthesized right away | 🔵 Initially got a sim mismatch | Processing load fanned out across multiple units so that it runs at line rate |
| 11.1 | ⚫ Hello dynamic programming | ⚫ Cursed DPRAM inference | ⚫ Had to fix sim / synth mismatch | DAG with bottom-up dynamic programming 🤯 |
I figured I should start from the ground up.
| Puzzle | Simulation | Synthesis | On-board | Remarks |
|---|---|---|---|---|
| 1.1 | 🔵 Added Xilinx Xsim | 🔵 Synthesized right away | 🔵 Right out of the box | Didn't expect xsim to be so pedantic |
| 1.2 | 🔵 Simple changes | 🔵 Synthesized right away | 🔵 Right out of the box | Easiest part 2 |
| 2.1 | 🔵 Math ops pipelining | 🔵 Synthesized right away | 🔵 Right out of the box | Read description a bit too fast |
| 2.2 | 🔵 Math ops pipelining | 🔵 Synthesized right away | 🔵 Right out of the box | Even easier part 2 |
| 3.1 | 🔵 Quite simple | 🔵 Synthesized right away | 🔵 Right out of the box | Completely overhauled the JTAG TAP serialization logic |
| 3.2 | 🟢 Got a reset corner-case | 🔵 Synthesized right away | 🔵 Right out of the box | Mismatch from reset handling |
| 4.1 | 🟡 MD5 seriously?! | 🟡 Initially got a 78-level deep timing path | 🟡 Beware of Synth 8-87 | Had a lot of fun with this one |
| 4.2 | 🟢 Superscalar version of the above | 🟢 FPGA is fully packed | 🔵 Forgot to change to six leading zeroes | Used a free-running clock `cfgclk` for faster processing |
| 5.1 | 🔵 A breeze compared to the previous puzzles | 🔵 Synthesized right away | 🔵 Right out of the box | Straightforward |
| 5.2 | 🟢 Although non-trivial, was easy to implement | 🔵 Synthesized right away | 🔵 Right out of the box | An off-by-one error initially slipped into the RTL design |
| 6.1 | 🟡 Had to be clever for this one | 🔵 Synthesized right away | 🟢 CDC exposed a results strobe event firing way too early | Interesting on-board issue |
| 6.2 | 🟡 Modeling in Python saved me from trouble | 🔴 Xilinx lists RAM resources in RAMB18 not RAMB36 | 🔵 Right out of the box | Large rework after noticing I'd run out of RAM blocks |
| Symbol | Level | Description | Remarks |
|---|---|---|---|
| 🔵 | Trivial | Straightforward | Copy-paste; wiring or basic logic |
| 🟢 | Easy | No surprises | Worked as expected |
| 🟡 | Average | Some thoughts | Required multiple iterations |
| 🔴 | Challenging | Serious thinking | Required some serious thinking |
| ⚫ | Tedious | Cursed puzzle | Much harder than expected; learnt something new |
- Found an issue with the `run_state_hw_jtag` Vivado TCL command and opened a support request
- This repository was selected as a winning entry in Jane Street's Advent of FPGA Challenge 2025: recognized as an "elegant framework" and as a submission which "demonstrates the real-world engineering challenges of getting hardware designs working correctly, not just in simulation."