feat: Add Lean 4 formalization with hybrid Lean/Coq resolution#3
Conversation
Adds three modules following the existing Coq prover patterns: - LeanTranslator: translates claims to Lean 4 theorems via IR or regex, using omega/native_decide tactics - LeanProver: invokes the lean binary with sorry/admit detection, caching, timeout handling, and proper ProofStatus assignment - HybridLeanCoqResolver: tries one backend first with configurable fallback, exposing prove_specification() for drop-in use 16 new tests covering translation, prover status paths, assumption detection, timeout, and hybrid fallback behavior. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR SummaryMedium Risk Overview Introduces Reviewed by Cursor Bugbot for commit 9b8792a. Bugbot is set up for automated code reviews on this repo. Configure here. |
| lean = ( | ||
| f"theorem exists_claim : ∃ {variable} : Nat, " | ||
| f"{prop_lean} := ⟨1, by omega⟩" |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
- LeanProver: use ProofStatus.default_for_solver_result() for failed proofs instead of hardcoding REFUTED — a tactic insufficiency is INCONCLUSIVE, not a definitive disproof - LeanTranslator._exists: infer witness from property bounds (x > 5 → 6, x >= 10 → 10, x = 3 → 3) instead of hardcoding 1 - Add test for witness inference Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_implication now extracts single-letter free variables from the hypothesis/conclusion and wraps them in ∀ binders, matching the Coq translator's behavior. Without this, implications like "if x > 5 then x > 3" generated invalid Lean 4 with undeclared variables. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…atus - _multiplication: use native_decide instead of omega for nonlinear arithmetic, matching factorial/fibonacci/gcd - HybridLeanCoqResolver: translation failures now return INCONCLUSIVE instead of UNAVAILABLE (which means binary-not-found) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|
||
| return self.coq_prover.prove_specification(spec) |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
When Lean fails and Coq fallback also fails (e.g. UNAVAILABLE), return the Lean result instead of discarding it. Symmetric logic applied to both _lean_then_coq and _coq_then_lean — the initial prover's result is preserved when the fallback doesn't prove either. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| error_message="Proof attempt timed out", | ||
| counter_example=None, | ||
| proof_output="", | ||
| prover_name="lean", | ||
| solver_status=ProofStatus.TIMEOUT.value, | ||
| ) | ||
|
|
||
| if self.use_cache and self.cache: | ||
| self.cache.put(spec, result) | ||
|
|
||
| return result |
There was a problem hiding this comment.
Bug: An unhandled UnicodeDecodeError in _run_lean can cause an UnboundLocalError in prove_specification because the result variable is used before it is assigned.
Severity: MEDIUM
Suggested Fix
Expand the except block in prove_specification to catch other potential exceptions from _run_lean, such as UnicodeDecodeError. When such an exception is caught, assign a sensible default or error state to the result variable before the function attempts to use it, preventing the UnboundLocalError.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: formal_verification/lean_prover.py#L113-L131
Potential issue: In `prove_specification`, the `try...except` block only handles
`subprocess.TimeoutExpired`. If the `_run_lean` function raises a different exception,
such as a `UnicodeDecodeError` from decoding non-UTF-8 output from the Lean binary, the
exception will not be caught. This prevents the `result` variable from being assigned.
The function then proceeds to the `self.cache.put(spec, result)` and `return result`
statements, which will raise an `UnboundLocalError` because `result` was never assigned,
causing a crash.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is ON, but it could not run because the spend limit has been reached. To enable Bugbot Autofix, have a team admin raise the spend limit in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 9b8792a. Configure here.
| ) | ||
|
|
||
| if self.use_cache and self.cache: | ||
| self.cache.put(spec, result) |
There was a problem hiding this comment.
Timeout results incorrectly cached unlike CoqProver
Medium Severity
The cache.put call is placed after the try/except block, which means timeout results are cached (including to disk). In CoqProver, caching happens inside the try block before the except, so timeout results are intentionally not cached. This inconsistency means a transient timeout in LeanProver permanently prevents retries for the same spec, since subsequent calls return the cached TIMEOUT result without ever invoking lean again.
Reviewed by Cursor Bugbot for commit 9b8792a. Configure here.


Summary
omegaandnative_decidetacticsleanbinary withsorry/admit/axiomdetection, result caching, timeout handling, and properProofStatusassignment (MACHINE_CHECKED,REFUTED,FORMALIZED_UNPROVED, etc.)prove_specification()for drop-in use with the existing detectorReplaces the reverted #1 with a clean implementation that follows existing patterns (same
FormalSpec→ProofResultpipeline, same test style, same lint compliance).Test plan
ruff check --select ...) passes cleanruff checkpasses clean (no E501)🤖 Generated with Claude Code