Skip to content

Latest commit

 

History

History
265 lines (189 loc) · 8.81 KB

File metadata and controls

265 lines (189 loc) · 8.81 KB

Qiskit Code Validation with Instruct-Validate-Repair

This example demonstrates using Mellea's Instruct-Validate-Repair (IVR) pattern to generate Qiskit quantum computing code that automatically passes flake8-qiskit-migration validation rules (QKT rules).

What This Example Does

Takes a prompt containing deprecated Qiskit code and:

  1. Generates corrected code using the LLM
  2. Validates the output against QKT rules
  3. Automatically repairs the code if validation fails (up to 10 attempts)

Quick Start

# Run the example (uses default deprecated code prompt)
uv run docs/examples/instruct_validate_repair/qiskit_code_validation/qiskit_code_validation.py

Dependencies (mellea, flake8-qiskit-migration) are automatically installed.

Requirements

  • Ollama backend running locally (ollama serve)
  • Compatible model: hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF:latest (recommended — domain-specialized; see Changing the Model)
  • flake8-qiskit-migration: Automatically installed when using uv run

How It Works

The IVR Pipeline

  1. Instruction: LLM generates code following structured requirements
  2. Post-condition validation: Validates generated code against QKT rules (see Qiskit Migration Guide)
  3. Repair loop: Automatically repairs code that fails validation (up to 10 attempts)

Sampling Strategies

The example supports two repair strategies (see Sampling Strategies):

  • RepairTemplateStrategy (default): Adds validation failure reasons directly to the instruction and retries generation
  • MultiTurnStrategy: Builds conversation history by adding validation failures as new user messages

To switch strategies, edit the use_multiturn_strategy variable in test_qiskit_code_validation()

Note: MultiTurnStrategy requires ChatContext() while RepairTemplateStrategy works with SimpleContext(). The example automatically selects the appropriate context based on your strategy choice.

Strategy Performance Comparison

Benchmarks on mistral-small-3.2-24b-qiskit model:

Dataset Strategy First Pass (QKT) Post-Repair (QKT)
QHE RepairTemplate 97.4% 100%
MultiTurn 95.4% 100%
QKT RepairTemplate 88.9% 100%
MultiTurn 97.8% 100%

Datasets:

  • QHE (QiskitHumanEval): 151 general Qiskit code generation tasks
  • QKT: 45 Qiskit version migration tasks requiring fixes to deprecated APIs

Note: Pass rates measure whether generated code passes QKT validation rules, not whether the code correctly solves the prompt. On QHE, the model achieves ~27.8% correctness when running the QHE check() test suite against the generated code. Full benchmark data and analysis are available in @ajbozarth's toolbox repo.

Code Structure

qiskit_code_validation/
├── qiskit_code_validation.py   # Main example
├── validation_helpers.py       # Validation utilities
└── README.md                   # This file

validation_helpers.py provides:

  • extract_code_from_markdown(): Extracts code from markdown blocks
  • validate_qiskit_migration(): Validates against QKT rules
  • validate_input_code(): Pre-validates input prompts

Trying Different Prompts

To try different prompts, edit the prompt variable in test_qiskit_code_validation() function. Here are some examples you can copy/paste:

Simple Prompts

Bell State Circuit:

prompt = "create a bell state circuit"

List Backends:

prompt = "use qiskit to list fake backends"

Random Circuit:

prompt = "give me a random qiskit circuit"

Code Completion Prompts

Toffoli Gate:

prompt = """Complete this code:
```python
from qiskit import QuantumCircuit

qc = QuantumCircuit(3)
qc.toffoli(0, 1, 2)

# draw the circuit
```
"""

Entanglement Circuit:

prompt = """from qiskit import QuantumCircuit

# create an entanglement state circuit
"""

Deprecated Code (Default)

The default prompt demonstrates fixing deprecated Qiskit APIs:

prompt = """from qiskit import BasicAer, QuantumCircuit, execute

backend = BasicAer.get_backend('qasm_simulator')

qc = QuantumCircuit(5, 5)
qc.h(0)
qc.cnot(0, range(1, 5))
qc.measure_all()

# run circuit on the simulator"""

This code uses deprecated APIs (BasicAer, execute) that the LLM will automatically fix to use modern Qiskit APIs.

Complex Prompts

Runtime Service with Estimator:

prompt = """from qiskit.circuit.random import random_circuit
from qiskit.quantum_info import SparsePauliOp
from qiskit_ibm_runtime import Estimator, Options, QiskitRuntimeService, Session

# create a Qiskit random circuit named "circuit" with 2 qubits, depth 2, seed 1.
# After that, generate an observable type SparsePauliOp("IY"). Run it in the backend "ibm_sherbrooke" using QiskitRuntimeService inside a session
# Instantiate the runtime Estimator primitive using the session and the options optimization level 3 and resilience level 2. Run the estimator
# Conclude the code printing the observable, expectation value and the metadata of the job."""

Bell Circuit with Runtime Service:

prompt = """from qiskit import QuantumCircuit
from qiskit_ibm_runtime import QiskitRuntimeService

# define a Bell circuit and run it in ibm_salamanca using QiskitRuntimeService"""

Expected Output

When you run the example with the default deprecated code prompt, you'll see:

====== Prompt ======
from qiskit import BasicAer, QuantumCircuit, execute

backend = BasicAer.get_backend('qasm_simulator')

qc = QuantumCircuit(5, 5)
qc.h(0)
qc.cnot(0, range(1, 5))
qc.measure_all()

# run circuit on the simulator
======================

Validation failed with 1 error(s):
QKT101: QuantumCircuit.cnot() has been removed in Qiskit 1.0; use `.cx()` instead

====== Result (23.1s, 2 attempt(s)) ======
```python
from qiskit_aer import AerSimulator, QuantumCircuit

backend = AerSimulator()

qc = QuantumCircuit(5, 5)
qc.h(0)
qc.cx(0, range(1, 5))
qc.measure_all()
```
======================

✓ Code passes Qiskit migration validation

Note: The exact output may vary depending on the model and its interpretation of the prompt.

Changing the Model

To try a different model, edit the model_id variable in the test_qiskit_code_validation() function:

model_id = "hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF:latest"

The default model is a Qiskit-specialized fine-tune of Mistral Small. It requires a large initial download (~15GB) but produces reliable results without a system prompt.

General-purpose models (e.g. granite4:micro-h) can be used as a lighter alternative but have significantly lower correctness on Qiskit tasks. When using a non-specialized model, set system_prompt = QISKIT_SYSTEM_PROMPT to improve results.

Using Grounding Context

The grounding_context parameter accepts a dict[str, str] of additional context passed to the LLM alongside the prompt. Keys act as section labels and values are the content. This is useful for injecting relevant documentation snippets, RAG results, or API references at inference time.

Example — injecting migration guide excerpts:

grounding_context = {
    "primitives_migration": (
        "SamplerV2 replaces the legacy execute() function. "
        "Use: sampler = SamplerV2(backend); job = sampler.run([circuit]); result = job.result()"
    ),
    "transpilation": (
        "Use generate_preset_pass_manager() instead of transpile(). "
        "Example: pm = generate_preset_pass_manager(optimization_level=1, backend=backend); isa_circuit = pm.run(circuit)"
    ),
}

code, success, attempts = generate_validated_qiskit_code(
    m, prompt, strategy, grounding_context=grounding_context
)

Troubleshooting

Ollama Connection Refused

Error: Connection refused

Solution: Start Ollama with ollama serve

Model Not Found

Error: model 'hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF:latest' not found

Solution: Pull the model first:

ollama pull hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF:latest

Validation Always Fails

If using a general-purpose model, it may not have enough Qiskit knowledge to pass validation consistently. Try:

  • Switching to the Qiskit-specialized model (hf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUF:latest)
  • Setting system_prompt = QISKIT_SYSTEM_PROMPT to guide the model toward modern Qiskit APIs
  • Using simpler prompts

Import Error: flake8-qiskit-migration

ModuleNotFoundError: No module named 'flake8_qiskit_migration'

Solution: Use uv run which auto-installs dependencies