I had Claude do all this for me. I've done 0 debugging on my own.
Here is the output of DECISIONS.md:
Ternip
Ternip is a custom fixed-point ternary matrix-multiply inference processor written in SystemVerilog. It requires native SV synthesis via yosys-slang (`SYNTH_HDL_FRONTEND: slang`) and three FakeRAM macros replacing the behavioral `ternip_pipelined_mem` module.
asap7
Status: not finishing — detail routing does not converge
Last updated: 2026-05-08
Configuration
- `SYNTH_HDL_FRONTEND = slang` (native SystemVerilog, no sv2v)
- `SYNTH_HIERARCHICAL = 0` (hierarchical mode caused CTS ODB-1200 InsertBufferBeforeLoads failure)
- `CORE_UTILIZATION = 25`, `PLACE_DENSITY = 0.55`
- `MACRO_PLACE_HALO = 12 12`
- `TNS_END_PERCENT = 100`
- Clock `clk_i`, 5000 ps (200 MHz)
- `CONFIG_FILENAME` set via `VERILOG_DEFINES`; `hightide.svh` resolved from `VERILOG_INCLUDE_DIRS`

FakeRAM macros (asap7)
`ternip_pipelined_mem` is the sole memory primitive, parameterized by `DATA_WIDTH` and `NUM_ENTRIES`. Three instances are synthesized; each is replaced by a `fakeram7_*` macro via `ternip_pipelined_mem_fakeram7.v`.

| Macro | Instance | Source file | Notes |
|---|---|---|---|
| `fakeram7_4096x16` | `vector_registers.pipelined_mem` | `ternip_vector_registers.sv` | `FixedPointPrecision × (D × NumVectorRegisters)` = 16 × 4096 |
| `fakeram7_1024x16` | `tmatmul/exportvector` | `fus/ternip_tmatmul.sv` | export vector buffer, `DATA_WIDTH=FixedPointPrecision`, `NUM_ENTRIES=D` |
| `fakeram7_16x1024` | `tmatmul/importvector` | `fus/ternip_tmatmul.sv` | import vector buffer, `DATA_WIDTH=D×FixedPointPrecision/TmatmulParallelism`, `NUM_ENTRIES=TmatmulParallelism` |

With `D=1024`, `FixedPointPrecision=16`, `TmatmulParallelism=64`: importvector `DATA_WIDTH` = 1024×16/64 = 256... see note below.

LEF/LIB files generated by `designs/src/ternip/dev/gen_fakeram.py --platform asap7`. Macro geometry targets a 2:1 aspect ratio; pin pitch matches bsg_fakeram's proven asap7 format (M4, 0.144 µm pitch, 0.072 µm protrusion).
Floorplan: macro placement
Die: 514.9 × 514.9 µm at 25% utilization. RTLMP places all three macros automatically:

| Macro | Instance |
|---|---|
| `fakeram7_4096x16` | `vector_registers.pipelined_mem` |
| `fakeram7_1024x16` | `tmatmul/exportvector` |
| `fakeram7_16x1024` | `tmatmul/importvector` |

Detail routing: convergence failure
Global routing passes cleanly (0 overflow, 1.79% resource usage). Detail routing does not converge; the router reaches the 50-iteration limit with ~4,150 `eolKeepOut` violations remaining.
Selected per-iteration violation counts:
The count drops sharply in iterations 0–3 (general routing cleanup), then plateaus at ~4,150 `eolKeepOut` violations from iteration 4 onward with no further improvement.
Root cause:
`fakeram7_16x1024` has 2 × 1024 data pins + 4 address pins + 3 control pins = 2055 signal pins at 0.144 µm pitch on a 148.8 µm-tall body. The macro sits in the upper-right corner of the die (x = 502 µm in a 515 µm-wide die) in R180 orientation. The resulting pin clusters at the macro edges create a local routing hot spot that the detail router cannot escape: every attempted reroute around one `eolKeepOut` violation displaces another.

Global routing sees no overflow because the congestion is localized to the pin-access layer directly adjacent to the macro edge; the global router operates at a coarser granularity and does not model per-pin `eolKeepOut` constraints.
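
The pin arithmetic in the root cause can be reproduced with a short Python sketch. It is only an illustration: the function names are hypothetical, and the pin model (separate read and write data buses, `ceil(log2(entries))` address pins, three control pins) is an assumption inferred from the numbers quoted above, not taken from `gen_fakeram.py`.

```python
# Sketch: reproduce the signal-pin count quoted for fakeram7_16x1024.
# Assumed pin model (not read from gen_fakeram.py): one read and one write
# data bus, ceil(log2(entries)) address pins, and 3 control pins.

def signal_pin_count(num_entries, data_width, control_pins=3):
    data_pins = 2 * data_width                      # read bus + write bus
    address_pins = (num_entries - 1).bit_length()   # ceil(log2(num_entries))
    return data_pins + address_pins + control_pins

def pin_slots_per_edge(edge_length_um, pin_pitch_um):
    # Number of whole pin pitches that fit along one macro edge.
    return int(edge_length_um / pin_pitch_um)

pins = signal_pin_count(num_entries=16, data_width=1024)
slots = pin_slots_per_edge(148.8, 0.144)
print(pins, slots)  # prints "2055 1033"
```

Under the same assumed model, the two 16-bit-wide macros come out at 47 and 45 signal pins, which is consistent with only the wide import-vector macro causing a pin-access hot spot. One 148.8 µm edge at 0.144 µm pitch offers roughly half the slots that 2055 pins need.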
Open fix
Increase `pin_track_count` from 3 to 6 in `gen_fakeram.py` for `fakeram7_16x1024`, doubling the pin pitch from 0.144 µm to 0.288 µm. This grows the macro height from 148.8 µm to ~296 µm but gives the detail router twice the routing space between adjacent pins. The change requires regenerating the LEF/LIB and rerunning the flow from floorplan.
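
The effect of the proposed change can be estimated with a simple linear model. This is a sketch under stated assumptions, not the logic of `gen_fakeram.py`: it assumes the pin pitch is proportional to `pin_track_count` and that the macro height scales with the pin pitch. The function name is hypothetical.

```python
# Sketch: linear estimate of pin pitch and macro height versus pin_track_count.
# Assumptions: pitch is proportional to pin_track_count, and macro height is
# proportional to pitch. Baseline values are the ones quoted in the text.

BASE_TRACK_COUNT = 3
BASE_PITCH_UM = 0.144
BASE_HEIGHT_UM = 148.8

def scaled_geometry(pin_track_count):
    scale = pin_track_count / BASE_TRACK_COUNT
    return BASE_PITCH_UM * scale, BASE_HEIGHT_UM * scale

pitch_um, height_um = scaled_geometry(6)
print(f"pitch = {pitch_um:.3f} um, height = {height_um:.1f} um")
# prints "pitch = 0.288 um, height = 297.6 um"
```

The linear estimate of 297.6 µm is close to the ~296 µm quoted above. This model does not account for the small difference.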