Added Ternip #125

Draft
sifferman wants to merge 1 commit into main from ternip

@sifferman
Collaborator

I had Claude do all this for me. I've done 0 debugging on my own.

Here is the output of DECISIONS.md:

Ternip

Ternip is a custom fixed-point ternary matrix-multiply inference processor written in SystemVerilog. It requires native SV synthesis via yosys-slang (SYNTH_HDL_FRONTEND: slang) and three FakeRAM macros replacing the behavioral ternip_pipelined_mem module.

asap7

Status: not finishing — detail routing does not converge
Last updated: 2026-05-08

Configuration

  • SYNTH_HDL_FRONTEND = slang (native SystemVerilog — no sv2v)
  • SYNTH_HIERARCHICAL = 0 (hierarchical mode caused CTS ODB-1200 InsertBufferBeforeLoads failure)
  • CORE_UTILIZATION = 25, PLACE_DENSITY = 0.55
  • MACRO_PLACE_HALO = 12 12
  • TNS_END_PERCENT = 100
  • Clock: clk_i, 5000 ps (200 MHz)
  • CONFIG_FILENAME set via VERILOG_DEFINES; hightide.svh resolved from VERILOG_INCLUDE_DIRS

FakeRAM macros (asap7)

ternip_pipelined_mem is the sole memory primitive, parameterized by DATA_WIDTH and NUM_ENTRIES. Three instances are synthesized; each is replaced by a fakeram7_* macro via ternip_pipelined_mem_fakeram7.v.

| Macro | DATA_WIDTH | NUM_ENTRIES | Instance | Source in ternip repo |
|---|---|---|---|---|
| fakeram7_4096x16 | 16 | 4096 | vector_registers.pipelined_mem | ternip_vector_registers.sv: FixedPointPrecision × (D × NumVectorRegisters) = 16 × 4096 |
| fakeram7_1024x16 | 16 | 1024 | tmatmul/exportvector | fus/ternip_tmatmul.sv: export vector buffer, DATA_WIDTH=FixedPointPrecision, NUM_ENTRIES=D |
| fakeram7_16x1024 | 1024 | 16 | tmatmul/importvector | fus/ternip_tmatmul.sv: import vector buffer, DATA_WIDTH=D×FixedPointPrecision/TmatmulParallelism, NUM_ENTRIES=TmatmulParallelism |

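With D=1024, FixedPointPrecision=16, TmatmulParallelism=64: importvector DATA_WIDTH = 1024×16/64 = 256 and NUM_ENTRIES = 64. This does not match the 1024-bit × 16-entry macro listed in the table, whose shape corresponds to TmatmulParallelism=16.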
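The shape formulas in the table can be checked with a short sketch. This is only an illustration of the arithmetic: the class and function names are hypothetical, and NumVectorRegisters=4 is inferred from 4096 = D × NumVectorRegisters with D=1024 rather than stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemShape:
    data_width: int
    num_entries: int

    @property
    def macro_name(self) -> str:
        # Naming follows the table: fakeram7_<NUM_ENTRIES>x<DATA_WIDTH>.
        return f"fakeram7_{self.num_entries}x{self.data_width}"


def ternip_mem_shapes(d, fixed_point_precision, num_vector_registers,
                      tmatmul_parallelism):
    """Apply the DATA_WIDTH / NUM_ENTRIES formulas from the table (hypothetical helper)."""
    return {
        "vector_registers": MemShape(fixed_point_precision,
                                     d * num_vector_registers),
        "exportvector": MemShape(fixed_point_precision, d),
        "importvector": MemShape(d * fixed_point_precision // tmatmul_parallelism,
                                 tmatmul_parallelism),
    }


# NumVectorRegisters=4 is an inferred value, not one given in the source.
for parallelism in (16, 64):
    shapes = ternip_mem_shapes(1024, 16, 4, parallelism)
    print(parallelism, {k: v.macro_name for k, v in shapes.items()})
```

Under these assumptions, TmatmulParallelism=16 reproduces the three macro names in the table, while TmatmulParallelism=64 gives a 256-bit × 64-entry import vector buffer instead.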

LEF/LIB files generated by designs/src/ternip/dev/gen_fakeram.py --platform asap7. Macro geometry targets a 2:1 aspect ratio; pin pitch matches bsg_fakeram's proven asap7 format (M4, 0.144 µm pitch, 0.072 µm protrusion).

Floorplan — macro placement

Die: 514.9 × 514.9 µm at 25% utilization. RTLMP places all three macros automatically:

| Macro | Instance | Origin (x, y) µm | Orient | Size (w × h) µm |
|---|---|---|---|---|
| fakeram7_4096x16 | vector_registers.pipelined_mem | 13.0, 101.3 | R0 | 256.0 × 128.3 |
| fakeram7_1024x16 | tmatmul/exportvector | 141.1, 77.3 | R180 | 128.0 × 64.3 |
| fakeram7_16x1024 | tmatmul/importvector | 501.9, 161.9 | R180 | 128.0 × 148.8 |

Detail routing — convergence failure

Global routing passes cleanly (0 overflow, 1.79% resource usage). Detail routing does not converge; the router reaches the 50-iteration limit with ~4,150 eolKeepOut violations remaining.

Selected per-iteration violation counts:

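| Iteration | Total violations | eolKeepOut |
|---|---|---|
| 0 | 13,992 | ~13,992 |
| 1 | 5,225 | ~5,225 |
| 3 | 4,322 | ~4,322 |
| 4 | 4,275 | 4,150 |
| 10 | 4,222 | 4,160 |
| 16 | 4,184 | 4,162 |
| 24 | 4,155 | 4,150 |
| 45 | 4,151 | 4,150 |
| 47 | 4,147 | 4,146 |
| 50 | ~4,204 | ~4,150 |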

The count drops sharply in iterations 0–3 (general routing cleanup), then plateaus at ~4,150 eolKeepOut violations from iteration 4 onward with no further improvement.
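The plateau can be located mechanically from the listed counts. The sketch below is a minimal illustration, not part of the flow: the function name and the 1% tolerance are hypothetical choices, and the approximate ("~") table values are taken at face value.

```python
# (iteration, total violations, eolKeepOut) rows copied from the table above.
ROWS = [
    (0, 13992, 13992),
    (1, 5225, 5225),
    (3, 4322, 4322),
    (4, 4275, 4150),
    (10, 4222, 4160),
    (16, 4184, 4162),
    (24, 4155, 4150),
    (45, 4151, 4150),
    (47, 4147, 4146),
    (50, 4204, 4150),
]


def first_plateau_iteration(rows, tolerance=0.01):
    """Return the first listed iteration from which the eolKeepOut count
    stays within a relative `tolerance` band for all later listed rows."""
    for i, (iteration, _total, eol) in enumerate(rows):
        later = [r[2] for r in rows[i:]]
        if max(later) - min(later) <= tolerance * eol:
            return iteration
    return None


print(first_plateau_iteration(ROWS))
```

With a 1% band this picks out iteration 4, matching the description above; a tighter or looser tolerance would move the boundary.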

Root cause: fakeram7_16x1024 has 2 × 1024 data pins + 4 address pins + 3 control pins = 2055 signal pins at 0.144 µm pitch on a 148.8 µm-tall body. The macro sits in the upper-right corner of the die (x = 502 µm in a 515 µm-wide die) in R180 orientation. The resulting pin clusters at the macro edges create a local routing hot spot that the detail router cannot escape — every attempted reroute around one eolKeepOut violation displaces another.
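The pin-count figures can be sanity-checked with plain arithmetic. This sketch uses only the numbers quoted above; the function name is hypothetical, and how gen_fakeram.py actually distributes pins across the macro edges is not stated in the source.

```python
import math


def pin_budget(data_bits=1024, addr_pins=4, ctrl_pins=3,
               pitch_um=0.144, body_height_um=148.8):
    """Pin-count arithmetic for fakeram7_16x1024, using figures from the text."""
    signal_pins = 2 * data_bits + addr_pins + ctrl_pins
    # Length needed if every pin sat in a single column at this pitch.
    single_column_um = signal_pins * pitch_um
    # Number of pin slots that fit along one full-height edge of the body.
    slots_per_edge = math.floor(body_height_um / pitch_um)
    # Minimum number of full-height pin columns needed to hold every pin.
    columns_needed = math.ceil(signal_pins / slots_per_edge)
    return signal_pins, single_column_um, slots_per_edge, columns_needed


pins, span_um, slots, columns = pin_budget()
print(f"{pins} pins, {span_um:.2f} um single-column span, "
      f"{slots} slots per edge, {columns} columns needed")
```

At 0.144 µm pitch the pins would span roughly twice the 148.8 µm body height if stacked in one column, so they cannot fit along a single edge, which is in line with the dense pin clusters described above.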

Global routing sees no overflow because the congestion is localized to the pin-access layer directly adjacent to the macro edge; the global router operates at a coarser granularity and does not model per-pin eolKeepOut constraints.

Open fix

Increase pin_track_count from 3 to 6 in gen_fakeram.py for fakeram7_16x1024 (doubling the pin pitch from 0.144 µm to 0.288 µm). This grows the macro height from 148.8 µm to ~296 µm but gives the detail router 2× more routing space between adjacent pins. Requires regenerating the LEF/LIB and rerunning from floorplan.
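The effect of the proposed change can be estimated with a rough scaling sketch. It assumes, as the text implies, that pin pitch is proportional to pin_track_count and that macro height scales linearly with pin pitch; the function name is hypothetical and this is not how gen_fakeram.py is known to compute geometry.

```python
def scaled_macro(pin_track_count, base_track_count=3,
                 base_pitch_um=0.144, base_height_um=148.8):
    """Estimate pin pitch and macro height for a new pin_track_count.

    Assumes pitch is proportional to pin_track_count and height is
    proportional to pitch (both are simplifying assumptions).
    """
    pitch_um = base_pitch_um * pin_track_count / base_track_count
    height_um = base_height_um * pitch_um / base_pitch_um
    return pitch_um, height_um


pitch, height = scaled_macro(6)
print(f"pitch = {pitch:.3f} um, estimated height = {height:.1f} um")
```

Pure linear scaling gives a height slightly above the ~296 µm quoted above, so the estimate should be treated as approximate until the LEF is actually regenerated.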
