Expand TRT decoder YAML config for composite decoding [depends on PR #524] by wsttiger · Pull Request #536 · NVIDIA/cudaqx

wsttiger · 2026-05-08T16:05:45Z

Add YAML/config support for TRT decoder runtime options including batch size,
CUDA graph execution, global decoder selection, and PyMatching-specific global
decoder parameters. Wire realtime decoder construction so TRT configs receive
the top-level observable matrix from O_sparse, and pass the same O matrix into
PyMatching global decoder params for composite observable decoding.

Expose the new config fields through Python bindings and heterogeneous_map
round-tripping. Extend YAML tests for TRT config round-trip, runtime parameter
conversion, and O_sparse-to-O injection.

Update test_trt_decoder_composite to support an optional --config-yaml path,
allowing the existing composite demo to construct and run a real TRT+PyMatching
decoder directly from YAML while preserving the original manual CLI path.
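
The O_sparse-to-O injection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PR's implementation: it assumes O_sparse is a flat list of column indices in which -1 terminates each row, and the helper names (`sparse_rows_to_dense`, `inject_observables`) and dictionary keys other than `O_sparse`, `O`, `global_decoder`, and `global_decoder_params` are hypothetical.

```python
from typing import Dict, List


def sparse_rows_to_dense(sparse: List[int], num_cols: int) -> List[List[int]]:
    """Expand a flat index list into a dense 0/1 matrix.

    Assumption: each row is a run of column indices terminated by -1.
    """
    rows: List[List[int]] = []
    current = [0] * num_cols
    for idx in sparse:
        if idx == -1:
            # Row terminator: store the finished row and start a new one.
            rows.append(current)
            current = [0] * num_cols
        else:
            if not 0 <= idx < num_cols:
                raise ValueError(f"column index {idx} out of range")
            current[idx] = 1
    if any(current):
        raise ValueError("last row is missing its -1 terminator")
    return rows


def inject_observables(top_level: Dict, trt_config: Dict, num_cols: int) -> Dict:
    """Return a copy of the TRT parameters with the dense O matrix injected.

    The same O is also passed to the PyMatching global decoder parameters
    when PyMatching is selected as the global decoder.
    """
    dense = sparse_rows_to_dense(top_level["O_sparse"], num_cols)
    params = dict(trt_config)
    params["O"] = dense
    if params.get("global_decoder") == "pymatching":
        global_params = dict(params.get("global_decoder_params", {}))
        global_params["O"] = dense
        params["global_decoder_params"] = global_params
    return params


if __name__ == "__main__":
    top = {"O_sparse": [0, 2, -1, 1, -1]}
    trt = {"batch_size": 4, "global_decoder": "pymatching"}
    print(inject_observables(top, trt, num_cols=3))
```

Copying the dictionaries rather than mutating them keeps the top-level config reusable for other decoders that do not take an O matrix.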

…output

Add a "predecoder" execution mode to the TensorRT decoder so it can be chained with a second decoder (e.g. PyMatching) and return logical-frame observables directly. The TRT model is assumed to emit a single output that concatenates [pre_L (num_observables entries), residual_dets (rest)].

New constructor parameters:
- "batch_size": required when the ONNX model has a dynamic batch dim. Used to size the optimization profile and pre-allocate I/O buffers.
- "global_decoder" + "global_decoder_params": optional decoder name and params for a follow-up decoder run on the residual_dets portion of the TRT output. Created with the same H passed to trt_decoder.
- "O": observables matrix (num_observables x block_size). Enables decode()/decode_batch() to return the predicted logical frame. Number of observables is inferred from O.shape()[0].

Decode behavior matrix:
- no global_decoder, no O -> raw TRT output (unchanged).
- no global_decoder, O -> return the pre_L prefix only.
- global_decoder, no O -> entire output -> global_decoder.result.
- global_decoder, O -> residual -> global_decoder; return pre_L XOR global_decoder.logical_frame.

Constructor validation when O is set:
- output_size_per_sample >= num_observables, and
- when global_decoder_ is set, output_size_per_sample == num_observables + global_decoder.syndrome_size.

Other changes:
- Dynamic batch support: setInputShape per call when the model's batch dim is -1; ONNX builder now installs a min/opt/max optimization profile when "batch_size" is provided.
- Split decode_batch into a typed decode_batch_impl<float|uint8_t> for cleaner dtype dispatch (engine I/O dtypes float32 / uint8 unchanged).
- Better INFO logging: total non-zero input vs residual detector counts per batch to help diagnose predecoder behavior.

Signed-off-by: Ben Howe <bhowe@nvidia.com>
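
The decode behavior matrix and the size validation in this commit message can be read as the following Python sketch. It is a simplified model for illustration, not the decoder's code: the function names are hypothetical, the global decoder is stood in for by a plain callable, and the TRT output is assumed to already be a list of 0/1 values.

```python
from typing import Callable, List, Optional, Sequence

# Stand-in type for the follow-up decoder: bits in, bits out.
GlobalDecoder = Callable[[List[int]], List[int]]


def validate_sizes(output_size: int, num_observables: int,
                   global_syndrome_size: Optional[int] = None) -> None:
    """Mirror the constructor checks that apply when O is set."""
    if output_size < num_observables:
        raise ValueError("TRT output is shorter than num_observables")
    if (global_syndrome_size is not None
            and output_size != num_observables + global_syndrome_size):
        raise ValueError("TRT output must be pre_L plus the global syndrome")


def composite_decode(trt_output: Sequence[int],
                     num_observables: Optional[int] = None,
                     global_decoder: Optional[GlobalDecoder] = None) -> List[int]:
    """Apply the four-case behavior matrix to one sample.

    num_observables is None when no O matrix was supplied.
    """
    out = list(trt_output)
    if num_observables is None:
        if global_decoder is None:
            return out                      # raw TRT output, unchanged
        return global_decoder(out)          # entire output goes to the global decoder
    pre_l = out[:num_observables]
    residual = out[num_observables:]
    if global_decoder is None:
        return pre_l                        # pre_L prefix only
    frame = global_decoder(residual)        # logical frame from the residual detectors
    return [a ^ b for a, b in zip(pre_l, frame)]


if __name__ == "__main__":
    # Hypothetical global decoder that always flips both observables.
    flip_all = lambda residual: [1, 1]
    validate_sizes(output_size=5, num_observables=2, global_syndrome_size=3)
    print(composite_decode([1, 0, 1, 1, 0], 2, flip_all))
```

In the fourth case the two decoders each contribute a correction to the logical frame, so the results are combined with XOR rather than one overriding the other.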

Add a realtime test/demo that initializes the TensorRT decoder from an ONNX predecoder model with PyMatching configured as the global decoder. The driver loads detector, observable, parity-check, and prior data from the Stim export bundle, decodes samples through the composite TRT+PyMatching path, and reports latency, throughput, correctness, and residual-syndrome diagnostics.

Register the new test_trt_decoder_composite target when TensorRT, realtime, and the TRT decoder plugin are available.

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

Add YAML/config support for TRT decoder runtime options including batch size, CUDA graph execution, global decoder selection, and PyMatching-specific global decoder parameters. Wire realtime decoder construction so TRT configs receive the top-level observable matrix from O_sparse, and pass the same O matrix into PyMatching global decoder params for composite observable decoding.

Expose the new config fields through Python bindings and heterogeneous_map round-tripping. Extend YAML tests for TRT config round-trip, runtime parameter conversion, and O_sparse-to-O injection.

Update test_trt_decoder_composite to support an optional --config-yaml path, allowing the existing composite demo to construct and run a real TRT+PyMatching decoder directly from YAML while preserving the original manual CLI path.

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

…yaml # Conflicts: # libs/qec/unittests/realtime/CMakeLists.txt # libs/qec/unittests/realtime/test_trt_decoder_composite.cpp

Replace the TRT decoder's hardcoded optional PyMatching global decoder params with a tagged global_decoder_config variant. Preserve PyMatching as the current supported concrete config while using std::monostate for the unset case.

Update heterogeneous-map conversion, YAML mapping, and Python bindings so the existing PyMatching YAML/Python surface continues to round-trip. Extend the YAML unit test to verify the PyMatching variant arm is selected and still produces the expected runtime parameter map.

Signed-off-by: Scott Thornton <wsttiger@gmail.com>
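
As a rough Python analogue of the tagged-variant design in this commit, the sketch below uses `None` in the role that std::monostate plays in the C++ variant and a dataclass for the PyMatching arm. The class name, the two fields, and `to_param_map` are hypothetical and chosen only for illustration; the real config fields are defined in the PR's sources.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class PyMatchingGlobalConfig:
    # Hypothetical fields, used here only to show the conversion.
    merge_strategy: Optional[str] = None
    error_rate_vec: Optional[List[float]] = None


# None is the "unset" arm; further decoder configs could be added to the Union.
GlobalDecoderConfig = Union[None, PyMatchingGlobalConfig]


def to_param_map(cfg: GlobalDecoderConfig) -> Dict[str, object]:
    """Convert whichever arm is active into a runtime parameter map."""
    if cfg is None:
        return {}
    if isinstance(cfg, PyMatchingGlobalConfig):
        # Only fields that were explicitly set are forwarded.
        return {k: v for k, v in vars(cfg).items() if v is not None}
    raise TypeError(f"unsupported global decoder config: {type(cfg).__name__}")


if __name__ == "__main__":
    print(to_param_map(None))
    print(to_param_map(PyMatchingGlobalConfig(error_rate_vec=[0.01, 0.02])))
```

Dispatching on the active arm means a new global decoder can be supported by adding one config type and one branch, without touching the unset or PyMatching paths.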

…yaml # Conflicts: # libs/qec/python/bindings/py_decoding_config.cpp

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

bmhowe23 and others added 5 commits April 29, 2026 23:57

Merge branch 'main' into pr-composite-decoder

6da6ff1

Merge remote-tracking branch 'upstream/main' into update_trt_decoder_…

d3a387e

…yaml # Conflicts: # libs/qec/unittests/realtime/CMakeLists.txt # libs/qec/unittests/realtime/test_trt_decoder_composite.cpp

wsttiger marked this pull request as ready for review May 11, 2026 22:10

wsttiger added 5 commits May 12, 2026 00:45

Merge remote-tracking branch 'upstream/main' into update_trt_decoder_…

054f748

…yaml # Conflicts: # libs/qec/python/bindings/py_decoding_config.cpp

Merge branch 'main' into update_trt_decoder_yaml

d60c4e2

Merge branch 'main' into update_trt_decoder_yaml

5d1f2af

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

Restore optional None setter helper

26be6b4

Signed-off-by: Scott Thornton <wsttiger@gmail.com>

wsttiger force-pushed the update_trt_decoder_yaml branch from 6c2eefc to 26be6b4 Compare May 29, 2026 17:15

wsttiger requested a review from melody-ren May 29, 2026 18:51

wsttiger wants to merge 10 commits into NVIDIA:main from wsttiger:update_trt_decoder_yaml
