
flow/docs/rcx: add GRT-vs-RCX parasitic divergence study#4302

Closed
oharboe wants to merge 1 commit into The-OpenROAD-Project:master from oharboe:rcx-fanout-study

Conversation

@oharboe oharboe commented Jun 23, 2026

Collaborator

Rendered copy of flow/docs/rcx/README.md for review.

Global-route WNS vs RCX sign-off: a fan-out study

As a design moves through ORFS, timing is reported several times, each time with
a different parasitic model. The pre-route numbers are estimates; only the
post-route finish number uses parasitics extracted from the routed geometry
(OpenRCX). This study measures how far the global-route (GRT) estimate sits from
the RCX sign-off as a function of net fan-out and wire length, across four PDKs.

The short version: on a 7 nm node the gap is large enough that GRT and RCX
disagree about whether a design meets timing. At 130 nm it is negligible. At
2 nm there is no extraction model shipped at all, so there is nothing to check
the estimate against.

Three parasitic models, three answers

OpenROAD uses three parasitic models at three points in the flow
(discussion #3943):

| Stage | Parasitics | Command | Notes |
| --- | --- | --- | --- |
| placement | one default layer's R/C × estimated length | set_wire_rc | roughest |
| global route | per-layer R/C × global-route length | set_layer_rc / estimate_parasitics -global_routing | better topology, still lumped, coupling-blind |
| finish | extracted from detailed-routed geometry | OpenRCX extract_parasitics -ext_model_file $RCX_RULES | sign-off |

The global-route estimate is a linear per-layer model — capacitance ≈ (per-layer
C) × (wire length) — whose constants live in each platform's setRC.tcl. It
ignores coupling capacitance (which depends on neighbouring routing that does not
exist yet at global route) and is only as accurate as those hand-entered
constants. The resizer documentation is explicit about the consequence:

"Placement-based parasitics cannot accurately predict routed parasitics, so a
margin can be used to 'over-repair' the design to compensate." — rsz README
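
As a minimal sketch of what such a linear per-layer model computes, the snippet below sums per-layer resistance and capacitance over the segments of one net. The layer names, the per-micron constants and the function name `estimate_net_rc` are illustrative assumptions, not values from any platform's setRC.tcl and not OpenROAD's implementation. Note that no coupling term appears anywhere in the sum.

```python
# Hypothetical per-layer constants: ohm/um and fF/um. These are made-up
# numbers for illustration, not taken from any platform's setRC.tcl.
LAYER_RC = {
    "M2": {"r_per_um": 2.0, "c_per_um": 0.10},
    "M3": {"r_per_um": 1.5, "c_per_um": 0.12},
}


def estimate_net_rc(segments):
    """Lumped linear estimate for one net.

    segments: iterable of (layer_name, length_um) pairs.
    Returns (resistance_ohm, capacitance_fF). Neighbouring wires are
    never consulted, so coupling capacitance is invisible to the model.
    """
    res = 0.0
    cap = 0.0
    for layer, length_um in segments:
        rc = LAYER_RC[layer]
        res += rc["r_per_um"] * length_um
        cap += rc["c_per_um"] * length_um
    return res, cap


# 10 um on M2 plus 20 um on M3
r, c = estimate_net_rc([("M2", 10.0), ("M3", 20.0)])
print(f"R = {r:.1f} ohm, C = {c:.2f} fF")
```

Because the estimate is a plain weighted sum of lengths, any error in a constant scales directly with wire length, which is one reason long nets are the ones to watch.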

ORFS ships flow/util/correlateRC.py (make correlate_rc) to fit those
constants to RCX and report the residual gap, but it is an offline, whole-PDK
calibration step. Nothing in the flow reports whether the nets in a particular
design fall inside the range where the linear model holds. The platform
bring-up guide (PlatformBringUp) documents how to write setRC.tcl
but not how to validate it against extraction, and the coarseness of the model
is a known issue (#3969).

The experiment

docs/rcx/gen_study.py emits deliberately minimal designs: one launch flop (the
hub) drives a single net that fans out to N capture flops. Inputs are
pinned to the west edge and outputs to the east edge, so the fan-out net spans
the die; the hub is dont_touch so the resizer cannot clone the driver and chop
the net into short, easily-estimated segments. We sweep N = 1 … 128, run the
full flow, and read WNS at each stage plus the per-net GRT and RCX parasitics
(make write_net_rc). WNS is normalized by the clock period so curve shape is
comparable across PDKs.

The shaded bands on the plots mark where the estimate is expected to hold:
fan-out ≤ 16 (green) is at or below the logical-effort FO4 region and the usual
set_max_fanout floor, where the Steiner-tree topology is accurate; 16–64
(amber) is where buffer trees and coupling error grow; > 64 (red) is past the
point where the lumped model is trustworthy. Fan-out here is a stand-in for the
real driver, long coupling-heavy nets, which the wide die also exercises.
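
The banding above can be written down as a small classifier. This is only a sketch of the thresholds stated in the text (16 and 64); the function name `fanout_band` is hypothetical and is not claimed to be how the plotting script draws the bands.

```python
def fanout_band(fanout):
    """Map a net's fan-out to the plot band described in the text.

    green: fan-out <= 16
    amber: 16 < fan-out <= 64
    red:   fan-out > 64
    """
    if fanout < 1:
        raise ValueError("fan-out must be at least 1")
    if fanout <= 16:
        return "green"
    if fanout <= 64:
        return "amber"
    return "red"


# The sweep in this study doubles N from 1 to 128.
for n in (1, 2, 4, 8, 16, 32, 64, 128):
    print(n, fanout_band(n))
```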

Results

Plots and plots/study_data.csv are produced by bazelisk run //flow/docs/rcx:update and committed so they render here directly.

WNS gap (RCX − GRT) across PDKs, normalized by clock period

On asap7 (7 nm) the gap is large and changes sign with fan-out. At fan-out 16
GRT is about 50 ps pessimistic; at fan-out 128 the design reports +21 ps slack
at global route but −2 ps after extraction: GRT says it closes, RCX says it
does not. On sky130hd and ihp-sg13g2 (130 nm) the gap stays within a fraction
of a picosecond across the whole sweep. gt2n (2 nm) ships no OpenRCX deck, so
finish repeats the GRT estimate and the gap is identically zero; there is no
extraction to compare against.
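
For concreteness, the normalized gap plotted above can be sketched as follows. The WNS values are the asap7 fan-out-128 numbers quoted in this section; the 350 ps clock period is an assumed placeholder, since the period is not stated here, and both function names are hypothetical.

```python
def normalized_gap(wns_grt_ps, wns_rcx_ps, clock_period_ps):
    """(RCX - GRT) WNS gap as a fraction of the clock period."""
    if clock_period_ps <= 0:
        raise ValueError("clock period must be positive")
    return (wns_rcx_ps - wns_grt_ps) / clock_period_ps


def verdict_flips(wns_grt_ps, wns_rcx_ps):
    """True when GRT and RCX disagree on whether timing is met."""
    return (wns_grt_ps >= 0) != (wns_rcx_ps >= 0)


# asap7, fan-out 128: +21 ps at global route, -2 ps after extraction.
# ASSUMPTION: 350 ps clock period, chosen only for illustration.
gap = normalized_gap(21.0, -2.0, 350.0)
print(f"normalized gap = {gap:+.3f}, verdict flips = {verdict_flips(21.0, -2.0)}")
```

Dividing by the period is what lets a picosecond-scale node and a nanosecond-scale node share one axis; it also means the raw numbers must be in the same time unit as the period before they are compared.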

normalized WNS per flow stage vs fan-out, asap7

Per net, the GRT capacitance estimate is off by tens of percent even where the
WNS summary looks healthy; the errors partly cancel along a path, which is why
they do not surface in the WNS number alone.
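
The cancellation effect can be illustrated with made-up numbers. The sketch below uses summed wire capacitance along a path as a crude proxy for path delay; the capacitance values and the helper name `cap_error_pct` are hypothetical and are not data from this study.

```python
def cap_error_pct(grt_cap, rcx_cap):
    """Signed GRT error relative to RCX, in percent."""
    return 100.0 * (grt_cap - rcx_cap) / rcx_cap


# Hypothetical (grt_cap_fF, rcx_cap_fF) pairs for four nets on one path.
path = [(1.3, 1.0), (0.7, 1.0), (2.4, 2.0), (1.6, 2.0)]

per_net = [cap_error_pct(g, r) for g, r in path]
total_grt = sum(g for g, _ in path)
total_rcx = sum(r for _, r in path)
path_err = cap_error_pct(total_grt, total_rcx)

print("per-net error %:", [round(e) for e in per_net])
print("path-level error %:", round(path_err, 6))
```

Each net is off by 20 to 30 percent, yet the path total is essentially unchanged, so a summary metric computed over the path hides the per-net error.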

per-net GRT vs RCX wire capacitance, asap7

docs/rcx/rcx_divergence_report.py ranks the worst nets for one design (name,
fan-out, routed length, GRT vs RCX capacitance, % error).
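
A ranking of this kind can be sketched in a few lines. The row keys (`net`, `fanout`, `grt_cap`, `rcx_cap`) follow the parser excerpt quoted in the review comments on this PR; the function name `rank_worst_nets` and the sample rows are hypothetical, and this is not claimed to be the script's actual logic.

```python
def rank_worst_nets(rows, top=10):
    """Return the `top` rows with the largest absolute GRT-vs-RCX cap error."""
    scored = []
    for row in rows:
        if row["rcx_cap"] <= 0:
            # No extracted capacitance to compare against; skip the net.
            continue
        err = 100.0 * (row["grt_cap"] - row["rcx_cap"]) / row["rcx_cap"]
        scored.append({**row, "cap_err_pct": err})
    scored.sort(key=lambda r: abs(r["cap_err_pct"]), reverse=True)
    return scored[:top]


# Made-up rows for illustration only.
rows = [
    {"net": "a", "fanout": 1, "grt_cap": 1.1, "rcx_cap": 1.0},
    {"net": "b", "fanout": 128, "grt_cap": 3.0, "rcx_cap": 5.0},
    {"net": "c", "fanout": 16, "grt_cap": 2.5, "rcx_cap": 2.0},
    {"net": "d", "fanout": 2, "grt_cap": 1.0, "rcx_cap": 0.0},
]
worst = rank_worst_nets(rows, top=3)
for r in worst:
    print(f"{r['net']:>3} fanout={r['fanout']:<4} err={r['cap_err_pct']:+.1f}%")
```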

Tuning, and whether it is a bug

It is a known calibration limitation, not a defect. The available levers are
blunt and global: re-derive setRC.tcl from the RCX deck with make correlate_rc, toggle ENABLE_RESISTANCE_AWARE, or apply repair_design -cap_margin/-slew_margin to over-repair. None of them report which net in a
given design is mis-estimated, and OpenRCX is calibrated to a field solver
(calibration) while the GRT estimate is not calibrated per design.

A proposal

Global route already maintains a DRC-marker database (dbMarkerCategory "Global route") and tags the offending nets on its congestion markers. The same
machinery could surface parasitic-estimation risk:

  1. a per-net GRT-vs-RCX divergence report (or an estimate_parasitics mode that
    emits per-net estimated R/C beside the routed extraction), so the gap is
    queryable without dumping and diffing two SPEFs;
  2. a global-route DRC-marker subcategory, "parasitic estimation out of range",
    that tags nets whose fan-out / Steiner length / layer span fall outside the
    model's range, visible in the DRC viewer like congestion markers, including
    on designs that route cleanly;
  3. a warning naming those nets, with the standard fixes (split the net, pipeline,
    add a fan-out buffer stage).
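
To make the third item concrete, here is a rough sketch of the check such a warning might perform. Everything in it is an assumption for illustration: the thresholds, the input keys `name`, `fanout` and `steiner_um`, and the message wording. None of it is an existing OpenROAD API or message.

```python
def out_of_range_warnings(nets, max_fanout=64, max_steiner_um=500.0):
    """Return one warning string per net outside the assumed model range.

    nets: iterable of dicts with keys "name", "fanout", "steiner_um".
    Both thresholds are hypothetical defaults, not calibrated values.
    """
    warnings = []
    for net in nets:
        reasons = []
        if net["fanout"] > max_fanout:
            reasons.append(f"fan-out {net['fanout']} > {max_fanout}")
        if net["steiner_um"] > max_steiner_um:
            reasons.append(
                f"Steiner length {net['steiner_um']:.0f} um > {max_steiner_um:.0f} um"
            )
        if reasons:
            detail = "; ".join(reasons)
            warnings.append(
                f"{net['name']}: parasitic estimation out of range ({detail}); "
                "consider splitting the net, pipelining, or adding a buffer stage"
            )
    return warnings


nets = [
    {"name": "hub_q", "fanout": 128, "steiner_um": 820.0},
    {"name": "local", "fanout": 3, "steiner_um": 12.0},
]
msgs = out_of_range_warnings(nets)
print("\n".join(msgs))
```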

That turns an end-of-flow surprise into an early signal.

Reproduce

python3 flow/docs/rcx/gen_study.py                 # generate the designs (all PDKs)
flow/docs/rcx/run_study.sh asap7                   # run the flow + collect data
bazelisk run //flow/docs/rcx:update                # regenerate plots + study_data.csv
python3 flow/docs/rcx/rcx_divergence_report.py \
    results/asap7/rcx-fanout-128/base/6_net_rc.csv  # worst-net report for one design

References

  • I. Sutherland, R. Sproull, D. Harris, Logical Effort: Designing Fast CMOS
    Circuits, Morgan Kaufmann, 1999. (FO4, set_max_fanout.)
  • C. Chu, Y.-C. Wong, "FLUTE: Fast Lookup Table Based Rectilinear Steiner Minimal
    Tree Algorithm," ICCAD 2004 / IEEE TCAD 2008. (Steiner wire-length accuracy.)
  • Liu et al., "Bridging the Gap between Global Route and Detailed Route … for
    Wire Parasitics and Delay Prediction," arXiv:2305.06917, 2023.
  • L. Clark et al., "ASAP7: A 7-nm FinFET predictive PDK," Microelectronics
    Journal, 2016.

Synthetic left-to-right fan-out designs swept 1..128 across asap7, sky130hd,
ihp-sg13g2 and gt2n, measuring WNS at every flow stage plus per-net GRT-estimated
vs OpenRCX-extracted parasitics. Shows the global-route WNS estimate diverges
from the RCX sign-off for long/high-fan-out nets on asap7 (closes at GRT, fails
at finish), is negligible at 130nm, and unmeasurable on gt2n (no OpenRCX deck).
Adds a fan-out column to write_net_rc.tcl (used by correlateRC.py), a normalized
cross-PDK plotter (bazelisk run //flow/docs/rcx:update), an actionable per-net
divergence report, and a README proposing that global route flag out-of-envelope nets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request introduces a study on the divergence between global-route parasitic estimates and post-route OpenRCX extractions, adding synthetic fan-out designs across multiple PDKs, generation and plotting scripts, and updating utilities to report net fan-out. The review feedback suggests using a context manager for file operations in 'plot_rcx_study.py' to avoid resource leaks, hardening the CSV parser in 'rcx_divergence_report.py' against malformed lines, and ensuring the fan-out calculation in 'write_net_rc.tcl' accurately counts design output ports as sinks.

)
if not os.path.isfile(meta):
    continue
d = json.load(open(meta))

Contributor

medium

The file is opened using open(meta) without a context manager (with statement). This leaves the file descriptor open until the garbage collector runs, which can lead to resource leaks. It is recommended to use a with statement to ensure the file is properly closed.

Suggested change
- d = json.load(open(meta))
+ with open(meta) as f:
+     d = json.load(f)

Comment on lines +40 to +57
try:
    lengths = [float(x) for x in t[7:]]
except ValueError:
    continue
wl = sum(
    ln for i, ln in enumerate(lengths)
    if i < len(stack) and stack[i][1]
)
yield {
    "net": t[0],
    "type": t[1],
    "fanout": int(t[2]),
    "grt_res": float(t[3]),
    "grt_cap": float(t[4]),
    "rcx_res": float(t[5]),
    "rcx_cap": float(t[6]),
    "wire_length": wl,
}

Contributor

medium

Only the lengths parsing is wrapped in a try-except ValueError block. If the CSV line is malformed or has fewer than 7 columns, accessing t[0] through t[6] or converting them (e.g., int(t[2]), float(t[3])) will raise IndexError or ValueError and crash the script. Wrapping the entire parsing and yielding logic in a single try-except (ValueError, IndexError) block makes the parser much more robust.

Suggested change
- try:
-     lengths = [float(x) for x in t[7:]]
- except ValueError:
-     continue
- wl = sum(
-     ln for i, ln in enumerate(lengths)
-     if i < len(stack) and stack[i][1]
- )
- yield {
-     "net": t[0],
-     "type": t[1],
-     "fanout": int(t[2]),
-     "grt_res": float(t[3]),
-     "grt_cap": float(t[4]),
-     "rcx_res": float(t[5]),
-     "rcx_cap": float(t[6]),
-     "wire_length": wl,
- }
+ try:
+     lengths = [float(x) for x in t[7:]]
+     wl = sum(
+         ln for i, ln in enumerate(lengths)
+         if i < len(stack) and stack[i][1]
+     )
+     yield {
+         'net': t[0],
+         'type': t[1],
+         'fanout': int(t[2]),
+         'grt_res': float(t[3]),
+         'grt_cap': float(t[4]),
+         'rcx_res': float(t[5]),
+         'rcx_cap': float(t[6]),
+         'wire_length': wl,
+     }
+ except (ValueError, IndexError):
+     continue

lassign $rcx_net_name_to_rc($net_name) rcx_net_res rcx_net_cap
set net_type [expr { [string equal $type "CLOCK"] ? "clock" : "signal" }]
puts -nonewline $stream "[get_full_name $net],$net_type,"
set fanout [llength [get_pins -of $net -filter "direction == input"]]

Contributor

medium

The fan-out calculation currently only counts instance input pins connected to the net. However, if a net is connected to a design output port, that port also acts as a sink (fan-out) for the net. To ensure accurate fan-out reporting, design output ports should also be counted.

      set fanout [expr {[llength [get_pins -quiet -of $net -filter {direction == input}]] + [llength [get_ports -quiet -of $net -filter {direction == output}]]}]

@maliberty

Member

> On sky130hd and ihp-sg13g2 (130 nm) the gap stays within a fraction of a picosecond across the whole sweep.

This fails the sniff test (I wish we were that accurate). I looked at sky130 fanout-128 and the worst path delay is 2.256 (grt) vs 2.516 (drt). Note that the time unit in these PDKs is ns, not ps, so your statement is 1000X off.
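
The size of the discrepancy in this comment can be checked with a few lines of arithmetic. The two path delays are the ones quoted above, taken to be in nanoseconds as the comment states; the variable names are illustrative.

```python
# Worst path delay on sky130 fanout-128, in ns, as quoted in the comment.
grt_ns = 2.256
drt_ns = 2.516

diff_ps = (drt_ns - grt_ns) * 1000.0          # about 260 ps
pct = 100.0 * (drt_ns - grt_ns) / grt_ns      # about 11.5 % of the GRT delay

print(f"difference = {diff_ps:.0f} ps ({pct:.1f}% of the GRT path delay)")
```

A difference of roughly 260 ps is far from "a fraction of a picosecond", which is consistent with the report values having been read in the wrong time unit.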

@maliberty

Member

On asap7, the case that goes from passing to failing at fanout=128 has a data arrival time of 321.70 (grt) vs 336.17 (drt). The required time is 344.07 (grt) vs 334.21 (drt). There is no single large error, just an accumulation of small errors. Given the path delays, the 24.33 ps of slack difference is not an unreasonable percentage. It's a bit of bad luck that arrival slows down while required speeds up, so the errors add rather than cancel.

In short I don't see much to fix. The fact that the error bar straddles 0 isn't consequential by itself.
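
The slack arithmetic in this comment can be reproduced directly from the four quoted numbers, in ps, using slack = required time − arrival time. The variable names are illustrative.

```python
# asap7 fanout=128, values in ps as quoted in the comment.
arrival_grt, arrival_drt = 321.70, 336.17
required_grt, required_drt = 344.07, 334.21

slack_grt = required_grt - arrival_grt        # about +22.37 (passes)
slack_drt = required_drt - arrival_drt        # about -1.96 (fails)
slack_diff = slack_grt - slack_drt            # about 24.33

# The two contributions have the same sign, so they add up.
arrival_shift = arrival_drt - arrival_grt     # arrival gets later: about 14.47
required_shift = required_grt - required_drt  # required gets earlier: about 9.86

print(f"slack grt = {slack_grt:+.2f} ps, slack drt = {slack_drt:+.2f} ps")
print(f"difference = {slack_diff:.2f} ps "
      f"= {arrival_shift:.2f} (arrival) + {required_shift:.2f} (required)")
```

Relative to an arrival time of about 322 ps, the 14.47 ps arrival shift is roughly 4.5 percent, which supports the point that no single error is large.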

@oharboe

oharboe commented Jun 25, 2026

Collaborator Author

needs work

@oharboe oharboe closed this Jun 25, 2026