Skip to content

Improving error localization in Pulse#4247

Merged
mtzguido merged 17 commits into
masterfrom
_nik_err_loc
May 16, 2026
Merged

Improving error localization in Pulse#4247
mtzguido merged 17 commits into
masterfrom
_nik_err_loc

Conversation

@nikswamy
Copy link
Copy Markdown
Collaborator

@nikswamy nikswamy commented May 5, 2026

Pulse Error Localization: Comparison Report

Comparing error locations between:

  • Baseline: ~/.local/bin/fstar.exe (last binary release)
  • This branch: stage3/out/bin/fstar.exe (_nik_err_loc)

Summary

Category Count
Improved localization 6
New tests (no baseline) 0
Same location, new SMT detail format 18
Identical output 12
Total 36

Improved Error Locations

These tests show tighter error ranges on this branch due to fixes in:

  1. Parser ranges for assert/fold/unfold/rewrite (withBindersOpt fix in pulseparser.mly)
  2. Captured error_range_bound in deferred prove_pure/prove_with_pure closures
  3. Captured error_range_bound in deferred prove_atom_result closure (atom equivalence check)

FailAssertion.fst

Version Error Location
Baseline FailAssertion.fst(10,1-11,16)
This branch FailAssertion.fst(11,2-11,16)

Improvement: Range narrowed from 2 lines to 1 line(s).

Full baseline output
* Info at FailAssertion.fst(10,1-11,16):
  - Expected failure:
  - Tactic failed
  - Cannot prove:
      x > 0
  - In the context:
Full this-branch output
* Info at FailAssertion.fst(11,2-11,16):
  - Expected failure:
  - Tactic failed
  - Cannot prove:
      x > 0
  - In the context:

FoldError.fst

Version Error Location
Baseline FoldError.fst(12,1-14,4)
This branch FoldError.fst(13,2-13,22)

Improvement: Range narrowed from 3 lines to 1 line(s).

Full baseline output
* Info at FoldError.fst(12,1-14,4):
  - Expected failure:
  - Failed to prove pure property: false
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
Full this-branch output
* Info at FoldError.fst(13,2-13,22):
  - Expected failure:
  - Failed to prove pure property: false
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env = (r : Pulse.Lib.Reference.ref Prims.int)
  - VC = Prims.l_False

IfBranchMismatch.fst

Version Error Location
Baseline IfBranchMismatch.fst(11,2-14,8)
This branch IfBranchMismatch.fst(14,4-14,8)

Improvement: Range narrowed from 4 lines to 1 line(s).

Full baseline output
* Info at IfBranchMismatch.fst(11,2-14,8):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
  - See also IfBranchMismatch.fst(11,2-14,10)

* Info at IfBranchMismatch.fst(30,4-30,6):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
Full this-branch output
* Info at IfBranchMismatch.fst(14,4-14,8):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env =
      (r : Pulse.Lib.Reference.ref Prims.int) (b : Prims.bool)
      (_if_hyp : Prims.squash (Pulse.Lib.Core.rewrites_to_p b false))
      (_if_br : Prims.unit)
  - VC = Pulse.Lib.Reference.pts_to r 2 == Pulse.Lib.Reference.pts_to r 1
  - See also IfBranchMismatch.fst(14,4-14,10)

* Info at IfBranchMismatch.fst(30,4-30,6):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env =
      (r : Pulse.Lib.Reference.ref Prims.int) (b : Prims.bool)
      (_if_hyp : Prims.squash (Pulse.Lib.Core.rewrites_to_p b false))
      (__ : Prims.unit)
  - VC = Pulse.Lib.Reference.pts_to r 2 == Pulse.Lib.Reference.pts_to r 1

InvariantPayload.fst

Version Error Location
Baseline InvariantPayload.fst(12,1-14,4)
This branch InvariantPayload.fst(13,2-13,16)

Improvement: Range narrowed from 3 lines to 1 line(s).

Full baseline output
* Info at InvariantPayload.fst(12,1-14,4):
  - Expected failure:
  - Failed to prove pure property: 0 > 0
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
Full this-branch output
* Info at InvariantPayload.fst(13,2-13,16):
  - Expected failure:
  - Failed to prove pure property: 0 > 0
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env = (r : Pulse.Lib.Reference.ref Prims.int)
  - VC = Prims.l_False

MatchBranchMismatch.fst

Version Error Location
Baseline MatchBranchMismatch.fst(11,2-13,15)
This branch MatchBranchMismatch.fst(13,11-13,15)

Improvement: Range narrowed from 3 lines to 1 line(s).

Full baseline output
* Info at MatchBranchMismatch.fst(11,2-13,15):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
  - See also MatchBranchMismatch.fst(11,2-13,17)
Full this-branch output
* Info at MatchBranchMismatch.fst(13,11-13,15):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env =
      (r : Pulse.Lib.Reference.ref Prims.int) (x : Prims.int)
      (branch equality : Prims.squash (x == 1)) (_br : Prims.unit)
  - VC = Pulse.Lib.Reference.pts_to r 2 == Pulse.Lib.Reference.pts_to r 1
  - See also MatchBranchMismatch.fst(13,11-13,17)

TerminalActionPostConditionMismatch.fst

Version Error Location
Baseline TerminalActionPostConditionMismatch.fst(11,2-12,6)
This branch TerminalActionPostConditionMismatch.fst(12,2-12,6)

Improvement: Range narrowed from 2 lines to 1 line(s).

Full baseline output
* Info at TerminalActionPostConditionMismatch.fst(11,2-12,6):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
  - See also TerminalActionPostConditionMismatch.fst(11,2-12,8)

* Info at TerminalActionPostConditionMismatch.fst(23,2-23,4):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query. Use --query_stats for more
    details.
Full this-branch output
* Info at TerminalActionPostConditionMismatch.fst(12,2-12,6):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env =
      (r : Pulse.Lib.Reference.ref Prims.int) (__ : Prims.unit)
      (__ : Prims.unit)
  - VC = Pulse.Lib.Reference.pts_to r 2 == Pulse.Lib.Reference.pts_to r 1
  - See also TerminalActionPostConditionMismatch.fst(12,2-12,8)

* Info at TerminalActionPostConditionMismatch.fst(23,2-23,4):
  - Expected failure:
  - Could not prove equality of:
  - Pulse.Lib.Reference.pts_to r 2
  - Pulse.Lib.Reference.pts_to r 1
  - Assertion failed
  - The SMT solver could not prove the query.
  - Env =
      (r : Pulse.Lib.Reference.ref Prims.int) (__ : Prims.unit)
      (__ : Prims.unit)
  - VC = Pulse.Lib.Reference.pts_to r 2 == Pulse.Lib.Reference.pts_to r 1

Same Location, New SMT Detail Format

These tests report the same error location on both versions. The only difference is the
upstream change (PR #4258) that now shows Env = and VC = inline instead of
"Use --query_stats for more details."

Test Error Location
AssertExtraContext.fst AssertExtraContext.fst(11,7-11,43)
ClosureError.fst ClosureError.fst(17,2-17,8)
ExistsIntro.fst ExistsIntro.fst(12,2-12,4)
IllTypedInvariant.fst IllTypedInvariant.fst(13,43-13,44)
IllTypedSpec.fst IllTypedSpec.fst(8,11-8,12)
IntroExistsFail.fst IntroExistsFail.fst(11,2-11,4)
LambdaArg.fst LambdaArg.fst(18,2-18,11)
MultiError.fst MultiError.fst(12,18-12,19)
NestedBlock.fst NestedBlock.fst(14,2-14,4)
NestedCall.fst NestedCall.fst(18,2-18,9)
OpenScopeError.fst OpenScopeError.fst(11,2-11,25)
PrePostMismatch.fst PrePostMismatch.fst(12,2-12,4)
ReturnImplicit.fst ReturnImplicit.fst(12,2-12,4)
RewriteFail.fst RewriteFail.fst(11,11-11,33)
SequenceError.fst SequenceError.fst(13,2-13,4)
SubtypingFailure.fst SubtypingFailure.fst(18,2-18,10)
WhileInvPreservation.fst WhileInvPreservation.fst(15,4-15,8)
WithLocalError.fst WithLocalError.fst(13,21-13,22)

Identical Output

These tests produce identical output on both versions (no SMT errors, so no format change).

Test Error Location
AtomicMismatch.fst AtomicMismatch.fst(12,2-12,8)
BindTypeMismatch.fst BindTypeMismatch.fst(19,17-19,26)
DuplicateResource.fst DuplicateResource.fst(11,2-11,4)
FramingFailure.fst FramingFailure.fst(11,10-11,13)
GhostEffect.fst GhostEffect.fst(21,10-21,18)
IfCondType.fst IfCondType.fst(11,5-11,6)
LeftoverResources.fst LeftoverResources.fst(11,2-11,4)
LetBindingType.fst LetBindingType.fst(11,17-11,19)
MissingPost.fst MissingPost.fst(12,2-12,4)
MutualExcl.fst MutualExcl.fst(11,2-11,4)
ReturnTypeMismatch.fst ReturnTypeMismatch.fst(12,2-12,3)
UnfoldFailure.fst UnfoldFailure.fst(13,4-13,6)

Remaining Things to Improve

1. Multi-line ranges that could potentially be tighter

These are cases where the reported range is wide. Some may be inherently correct
(e.g., a rewrite that spans multiple lines), while others may benefit from further
narrowing in the future.


Fixes in This Branch

Commit Fix Impact
766f80a Pulse.Checker.If: wrap branch checking with with_error_bound branch_range If/Match branches get branch-level error bound
3cedfb6 pulseparser.mly: split withBindersOpt alternatives to fix $startpos inheritance assert/fold/unfold/rewrite ranges no longer include enclosing block start
b537cbc Pulse.Checker.Prover: with_saved_bound in prove_pure/prove_with_pure Pure proof obligations report correct range even when deferred
4f753d2 Pulse.Checker.Prover: with_saved_bound in prove_atom_result Atom equiv checks report correct range even when deferred
ad4d24c FStarC.Parser.Dep: sort valid_namespaces for deterministic output Namespace list in error messages is stable across runs

nikswamy and others added 14 commits May 1, 2026 12:01
The Pulse prover uses lazy continuations (cont_elab_thunk) where the
check_slprop_equiv_ext call inside prove_atom_result is deferred until
elaboration time. By then, the with_error_bound scope from the branch
checker has already exited, causing errors to be localized to the
entire if-expression rather than the offending branch.

Fix: capture the current error_range_bound when prove_atom_result is
created (while the correct bound is still active), and restore it
inside the lazy closure via RU.with_error_bound.

Also wrap each branch check in Pulse.Checker.If with
RU.with_error_bound using the branch's source range, so that errors
within a branch are bounded to that branch.

Before: error at (10,2-13,8) pointing to the whole if-expression
After:  error at (13,4-13,8) pointing to the specific branch statement

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
16 test files in pulse/test/error_messages/ covering various error
categories: assertion failures, framing, ghost effects, if-branch
mismatches, type mismatches, subtyping, unfolding, etc.

Each file uses [@@expect_failure [code]] to verify the expected error
is produced. These serve as a regression suite for error localization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When assert, fold, unfold, assume, or rewrite statements appear as the
first statement in a function body (after '{'), the Menhir parser was
producing ranges that started at the '{' character rather than at the
keyword.

Root cause: withBindersOpt could derive epsilon, causing Menhir's
$startpos to inherit $endpos of the preceding token (LBRACE). This
propagated through pulseStmtNonempty -> pulseStmtNoSeq -> pulseStmt and
into the statement's range.

Fix: Split each 'bs=withBindersOpt KEYWORD ...' alternative into two:
one starting directly with the keyword (no binders), and one with
explicit 'bs=withBinders KEYWORD ...' (which starts with WITH, a real
token). Remove the now-unused withBindersOpt non-terminal.

Before: assert errors at (10,1-11,16) including the '{' line
After:  assert errors at (11,2-11,16) pinpointing the assert

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add comprehensive regression test suite for Pulse error localization.
Each test file triggers a specific error category and pins down the
exact error range via .fst.output.expected files.

Test categories covered (35 tests):
- Assertion failures (assert, fold, unfold, rewrite, nested block)
- Type mismatches (if condition, let binding, return, spec, with-local)
- Framing/resource errors (leftover, duplicate, missing post, mutual excl)
- Branch mismatches (if, match)
- Effect errors (ghost in non-ghost context)
- Prover failures (exists intro, invariant payload, subtyping)
- Scope/binding errors (open nonexistent, arity mismatch)
- Loop errors (while invariant preservation)
- Sequence/bind errors (postcondition at continuation)
- Closure/local function errors (pre-condition mismatch at call site)
- Multiple errors in single function

The Makefile uses diff-based comparison (same approach as
FStar/tests/error-messages) with:
- --print_expected_failures for clean output
- --ext fstar:no_absolute_paths for reproducibility
- Warning suppression for non-localization noise
- Integrated into pulse/test as a SUBDIRS entry

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The prover's prove_pure and prove_with_pure functions create lazy
continuations (via cont_elab_thunk) that defer check_prop_validity calls
until elaboration time. By then, the with_error_bound scope has exited,
causing errors to be localized to the outer term (e.g., entire function
body) rather than the specific statement.

Fix: capture the current error_range_bound when the deferred closure is
created and restore it when check_prop_validity runs inside the closure.
Same pattern as the if-branch fix for prove_atom_result.

Example: 'fold (my_inv r (-1)); ()' now correctly shows error at the
fold statement (13,2-13,22) instead of spanning both statements (13,2-14,4).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add continuation statements (; ()) after error-causing statements in
test files to ensure error localization doesn't bleed into following
code. This catches regressions where the error range expands to include
the continuation.

Also uncomment [@@expect_failure] attributes on FoldError, IllTypedInvariant,
and NestedBlock so __verify and __diff both succeed in the test harness.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…lization

Previously, prove_atom_result deferred check_slprop_equiv_ext to a
continuation closure. This caused errors from mismatched atoms to be
mislocalized when the closure executed after with_error_bound had
exited. In particular, 'assert (P); continuation' would show errors
at the continuation rather than at the assert statement, because the
postcondition prover's deferred closure ran first in continuation
order.

By running check_slprop_equiv_ext eagerly (at prove time rather than
elaboration time), errors are reported with the correct
error_range_bound context. This is safe because the check uses the
original environment g (not the extended g'), so it doesn't depend on
later prover state.

The saved_bound/get_error_bound mechanism is retained in prove_pure
and prove_with_pure, where the deferred checks need the extended
environment g''.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the prover matches an atom (slprop) from context to goal, it needs to
verify the equivalence via check_slprop_equiv_ext. Previously this was always
deferred in a continuation closure, which caused error mislocalization: if the
continuation ran after with_error_bound exited, the error range was wrong.

The fix uses a hybrid approach:
- If both terms are uvar-free (checked via deep_compress_safe + no_uvars_in_term),
  run the check eagerly while the correct error_range_bound is active.
- If either term has unresolved uvars (which later prover steps may instantiate),
  defer the check in a continuation but capture/restore the error_range_bound.

Also factors out shared helpers (with_saved_bound, check_atom_equiv) to reduce
duplication across the eager/deferred paths and the prove_pure/prove_with_pure
functions that use the same saved-bound pattern.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When the prover matches atoms or pure goals, the equivalence/validity checks
are deferred in continuation closures (cont_elab_thunk). These closures run
after the original with_error_bound scope has exited, causing errors to be
reported with incorrect ranges.

Fix: capture the active error_range_bound at match time (via get_error_bound)
and restore it inside the deferred closure (via with_error_bound). This is done
in prove_pure and prove_with_pure via a shared with_saved_bound helper.

The atom equivalence check in prove_atom_result remains deferred (inside the
continuation) as before—this is necessary because the prover may match atoms
before all unification variables are resolved.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Regenerate .output.expected files for both pulse/test/error_messages (21 files)
and pulse/test/bug-reports (21 files) to account for:

1. Upstream PR #4258 (gebner_default_smtfail): SMT failure messages now include
   inline Env/VC details instead of 'Use --query_stats for more details'

2. Parser range improvements from earlier commits in this branch: Bug267.fst
   and IntroGhost.fst show tighter error ranges due to the
   withBindersOpt parser fix

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Leftover from the reverted eager atom equivalence check experiment.
The inline logic in prove_atom_result's deferred closure does the same work.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
nikswamy and others added 2 commits May 14, 2026 17:22
- RewriteNopWarning.fst: (12,1-13,21) -> (13,2-13,21) due to parser fix
- Test.RewriteBy.fst: (31,1-37,8) -> (32,2-37,8) due to parser fix
- FStarC.Parser.Dep: sort valid_namespaces list for deterministic output
- OpenScopeError.fst: regenerated with sorted namespace list

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The prover's prove_atom_result defers check_slprop_equiv_ext into a
continuation closure that executes after the with_error_bound scope
has exited, causing errors to inherit the wider enclosing range.

Fix: capture the active error_range_bound when the atom match is made,
and restore it (via with_saved_bound) when the deferred closure runs.
This is the same pattern already used for prove_pure/prove_with_pure.

Impact on error localization:
- IfBranchMismatch: (11,2-14,8) -> (14,4-14,8) — points to the exact
  mismatched statement instead of the whole if-else block
- MatchBranchMismatch: (11,2-13,15) -> (13,11-13,15) — points to the
  exact mismatched statement instead of the whole match
- TerminalAction: (11,2-12,6) -> (12,2-12,6) — points to the terminal
  statement instead of the enclosing block

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@nikswamy nikswamy marked this pull request as ready for review May 15, 2026 05:16
@mtzguido mtzguido merged commit f6aab05 into master May 16, 2026
3 checks passed
@mtzguido mtzguido deleted the _nik_err_loc branch May 16, 2026 03:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants