Commit e23d64b

kvmto authored and vedika-saravanan committed
updates to validation
Signed-off-by: kvmto <kmato@nvidia.com>
1 parent 7e8acb5 commit e23d64b

5 files changed

Lines changed: 978 additions & 40 deletions

.claude/skills/cuda-qx-qec/SKILL.md

Lines changed: 9 additions & 4 deletions
@@ -13,11 +13,11 @@ description: >-
 surface code, Steane code, repetition code, detector error model, DEM,
 dem_sampling, sliding window, predecoder, real-time decoding, Helios, or
 Quantinuum.
-version: "0.1.0"
+version: "0.2.0"
 author: "CUDA-QX"
-license: "Apache License 2.0"
-compatibility: "Linux x86_64/aarch64, Python 3.10+, C++ 20"
-tags: [cuda-qx, cudaq-qec, quantum-error-correction, decoders, surface-code, real-time-decoding, dem-sampling, nvidia]
+license: "LicenseRef-NVIDIA-Proprietary"
+compatibility: "Python 3.11+, C++ 20, Linux x86_64/aarch64"
+tags: [cuda-qx, cudaq-qec, qec, quantum-error-correction, decoders, surface-code, real-time-decoding, dem-sampling, nvidia]
 tools: [Read, Glob, Grep, Bash]
 metadata:
 author: "CUDA-QX"
@@ -218,3 +218,8 @@ runs but reports the wrong logical error rate.
 5. If the symptom is "LER looks wrong", go to the
 **Troubleshooting** section in `REFERENCE.md`. The first three
 causes there account for roughly 90% of cases.
+
+## Additional resources
+
+- Benchmark / eval prompts: [benchmark.md](benchmark.md)
+- Scoring helper: `.claude/skills/scripts/score_benchmark.py --skill qec --responses responses.json`
Lines changed: 162 additions & 0 deletions
@@ -0,0 +1,162 @@
+# CUDA-QX QEC Skill Benchmark
+
+Evaluation prompts for the `cuda-qx-qec` skill. Same methodology as the
+solvers benchmark; runnable via `scripts/score_benchmark.py`.
+
+## Methodology
+
+Run three passes:
+
+1. With skill enabled
+2. Without skill (control)
+3. Activation pass (see below)
+
+Two scoring layers:
+
+**Human rubric** (per scenario, 0–8):
+
+- Correctness (0–2): facts true, paths/APIs real
+- Specificity (0–2): cites files, exact API names, exact kwargs
+- Coverage (0–2): hits each "must include" item
+- No hallucinations (0–2): no "must not include" items
+
+12 scenarios × 8 + 10 activation = 106 max.
+
+**Substring proxy** (`scripts/score_benchmark.py`):
+
+- Coverage max for QEC = 45 (sum of `must_include` items)
+- Purity max for QEC = 13 (1 per scenario, more for some)
+- Activation = 10
+- Substring total max = 68 for QEC. Always pair with a human pass.
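Editorial sketch (not part of the commit): the substring proxy described above could be approximated as follows. The function name `score_response` and the case-insensitive matching are assumptions made for illustration; the actual `scripts/score_benchmark.py` is not shown in this diff and may work differently.

```python
def score_response(response, must_include, must_not_include):
    """Return (coverage, purity) for one scenario.

    Hypothetical helper: coverage counts required substrings that appear,
    purity counts forbidden substrings that do not appear.
    Matching is case-insensitive here, which is an assumption.
    """
    text = response.lower()
    coverage = sum(1 for item in must_include if item.lower() in text)
    purity = sum(1 for item in must_not_include if item.lower() not in text)
    return coverage, purity


# Scenario 1 from this benchmark, scored against a made-up response.
must_include = ["C-order", "C-contiguous"]
must_not_include = ["F-order is supported"]
response = "pymatching needs a C-contiguous (C-order) uint8 array; convert H first."

coverage, purity = score_response(response, must_include, must_not_include)
print(coverage, purity)
```

A check like this cannot tell a correct answer from one that merely contains the right tokens. For example, a short item such as "no" also matches inside "know" or "not", which is one reason the proxy needs a human pass alongside it.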
+
+---
+
+## Scenario Prompts
+
+### 1. F-order Parity Matrix
+
+**prompt:** "My `qec.get_decoder('pymatching', H)` raises a runtime error.
+H is `dtype=uint8` but ordered as Fortran. Help."
+
+- must_include: "C-order", "C-contiguous"
+- must_not_include: "F-order is supported"
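Editorial sketch (not part of the commit): the expected answer for scenario 1 amounts to converting the matrix to C-contiguous layout before building the decoder. The example below uses only NumPy; the small matrix `H_f` is made up for illustration, and the decoder call is left as a comment because it needs `cudaq_qec` installed.

```python
import numpy as np

# A made-up 2x3 parity-check matrix stored in Fortran (column-major) order.
H_f = np.asfortranarray(np.array([[1, 1, 0],
                                  [0, 1, 1]], dtype=np.uint8))
print(H_f.flags["C_CONTIGUOUS"])

# np.ascontiguousarray copies into row-major layout and keeps the dtype.
H_c = np.ascontiguousarray(H_f)
print(H_c.flags["C_CONTIGUOUS"], H_c.dtype)

# The converted matrix is what would be passed on, e.g.
# decoder = qec.get_decoder('pymatching', H_c)
```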
+
+### 2. cuStabilizer Missing
+
+**prompt:** "Importing `cudaq_qec` fails with `libcustabilizer` not found."
+
+- must_include: `cuquantum-python-cu12`, `cuquantum-python-cu13`,
+`26.03.0`
+- must_not_include: "uninstall cudaq_qec"
+
+### 3. dem_sampling Backend Choice
+
+**prompt:** "How do I force GPU sampling in `dem_sampling`, and what
+happens when GPU isn't available?"
+
+- must_include: `backend="gpu"`, `RuntimeError`, `auto`, `cpu`
+- must_not_include: "silently falls back when gpu requested"
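Editorial sketch (not part of the commit): the behavior that scenario 3 expects an answer to describe can be modelled in a few lines. `resolve_backend` is a hypothetical helper, not the library's implementation. The rule that `"auto"` prefers the GPU and otherwise uses the CPU is an assumption inferred from the expected items above.

```python
def resolve_backend(requested, gpu_available):
    """Model of the expected backend rules (hypothetical, for illustration).

    - "gpu": use the GPU, or raise RuntimeError if none is available
      (no silent fallback).
    - "cpu": always use the CPU.
    - "auto": assumed to pick the GPU when available, else the CPU.
    """
    if requested not in ("auto", "cpu", "gpu"):
        raise ValueError(f"unknown backend: {requested!r}")
    if requested == "gpu" and not gpu_available:
        raise RuntimeError("backend='gpu' requested but no GPU is available")
    if requested == "auto":
        return "gpu" if gpu_available else "cpu"
    return requested


print(resolve_backend("auto", gpu_available=False))
```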
+
+### 4. PyTorch CPU Tensor
+
+**prompt:** "Can I pass a PyTorch CPU tensor to `dem_sampling`?"
+
+- must_include: "no", "convert to NumPy"
+- must_not_include: "yes, supported"
+
+### 5. Steane Memory Circuit Shape
+
+**prompt:** "What is the shape of `syndromes` returned by
+`sample_memory_circuit(steane, prep0, numShots=10, numRounds=4)`?"
+
+- must_include: `(40, 6)`, "numShots * numRounds"
+- must_not_include: `(10, 4, 6)`
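Editorial sketch (not part of the commit): the shape arithmetic behind scenario 5 can be checked with a placeholder array. The Steane code has 6 stabilizer generators, so 10 shots of 4 rounds give `10 * 4 = 40` rows of 6 bits. The array of zeros below stands in for real sampler output. The per-shot reshape assumes rows are grouped shot by shot, which this diff does not confirm.

```python
import numpy as np

num_shots, num_rounds, num_stabilizers = 10, 4, 6

# Placeholder with the flattened layout the benchmark expects: (numShots * numRounds, 6).
syndromes = np.zeros((num_shots * num_rounds, num_stabilizers), dtype=np.uint8)
print(syndromes.shape)

# ASSUMPTION: rows are ordered shot by shot, so a per-shot view is a plain reshape.
per_shot = syndromes.reshape(num_shots, num_rounds, num_stabilizers)
print(per_shot.shape)
```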
+
+### 6. Custom Python Code Requirements
+
+**prompt:** "What attributes must my `@qec.code` class expose?"
+
+- must_include: `stabilizers`, `pauli_observables`, `operation_encodings`
+- must_include: `get_num_data_qubits`, `get_num_ancilla_qubits`
+- must_not_include: "only stabilizers are required"
+
+### 7. Sliding Window Decoder Setup
+
+**prompt:** "How do I wrap pymatching in a sliding window decoder over a
+distance-5 surface code?"
+
+- must_include: `sliding_window`, `window_size`, `step_size`,
+`num_syndromes_per_round`, `inner_decoder_name`
+- must_include: `error_rate_vec`
+- must_not_include: "sliding_window does not require an inner decoder"
+
+### 8. NV-QLDPC Config Fields
+
+**prompt:** "What does `nv_qldpc_decoder_config` accept?"
+
+- must_include: `use_osd`, `bp_method`, `proc_float`, `error_rate_vec`,
+`max_iterations`
+- must_include: "default to None"
+- must_not_include: "all fields are required"
+
+### 9. Tensor Network Decoder Install
+
+**prompt:** "I get `ModuleNotFoundError: quimb` when loading the tensor
+network decoder. Fix?"
+
+- must_include: `cudaq-qec[tensor_network_decoder]` (or `[all]`),
+`quimb`, `cuquantum-python`
+- must_not_include: "quimb is included by default"
+
+### 10. License Surprise
+
+**prompt:** "Is `cudaq-qec` Apache 2.0 like `cudaq-solvers`?"
+
+- must_include: "no", `LicenseRef-NVIDIA-Proprietary`,
+`libs/qec/pyproject.toml`
+- must_not_include: "yes, both are Apache 2.0"
+
+### 11. Operation Enum
+
+**prompt:** "What logical operations does `qec.operation` expose?"
+
+- must_include: `prep0`, `prep1`, `prepp`, `prepm`, `stabilizer_round`
+- must_include: `cx`, `cz`, `h`, `s`
+- must_not_include: "swap", "measure_x" (not in the enum)
+
+### 12. DEM Variants
+
+**prompt:** "What's the difference between `dem_from_memory_circuit`,
+`x_dem_from_memory_circuit`, and `z_dem_from_memory_circuit`?"
+
+- must_include: "X errors only", "Z errors only", "full"
+- must_include: `DetectorErrorModel`, `detector_error_matrix`
+- must_not_include: "they are aliases"
+
+---
+
+## Activation Tests
+
+| # | prompt | should_activate |
+| --- | --- | --- |
+| A1 | "Build a Steane memory experiment" | Y |
+| A2 | "Wrap pymatching in a sliding window decoder" | Y |
+| A3 | "Run dem_sampling on a parity check matrix" | Y |
+| A4 | "Configure NV-QLDPC for a custom code" | Y |
+| A5 | "Install CUDA-Q from pip" | N |
+| A6 | "Write a VQE for H2" | N |
+| A7 | "Run a Bell state kernel" | N |
+| A8 | "Generate a tensor-network decoder" | Y |
+| A9 | "Use ADAPT-VQE on a molecule" | N |
+| A10 | "Construct a detector error model from memory rounds" | Y |
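Editorial sketch (not part of the commit): one way to turn the activation table into the 10-point activation score is to award one point per prompt whose observed activation matches `should_activate`. The names `EXPECTED` and `activation_score` are hypothetical, and the one-point-per-row rule is an assumption based on "Activation = 10".

```python
# should_activate column from the table above (Y -> True, N -> False).
EXPECTED = {
    "A1": True, "A2": True, "A3": True, "A4": True, "A5": False,
    "A6": False, "A7": False, "A8": True, "A9": False, "A10": True,
}


def activation_score(observed):
    """Count prompts whose observed activation matches the expected value.

    `observed` maps prompt id -> bool; missing ids score zero.
    """
    return sum(1 for key, want in EXPECTED.items() if observed.get(key) == want)


# A skill that activates on every prompt is only right on the six "Y" rows.
always_on = {key: True for key in EXPECTED}
print(activation_score(always_on))
```

Scoring both directions matters here: a skill that always activates still loses the four "N" rows, so over-triggering is penalized as well as under-triggering.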
+
+## Sources
+
+- `libs/qec/python/cudaq_qec/__init__.py`
+- `libs/qec/python/bindings/py_decoder.cpp`, `py_code.cpp`,
+`py_dem_sampling.cpp`
+- `libs/qec/python/cudaq_qec/dem_sampling.py`
+- `libs/qec/python/cudaq_qec/plugins/decoders/tensor_network_decoder.py`
+- `libs/qec/python/tests/test_*.py`
+- `libs/qec/pyproject.toml.cu12`
+- `docs/sphinx/components/qec/introduction.rst`