Skip to content

Add Concrete Syntax for Unstructured Programs in Strata Core#1196

Open
PROgram52bc wants to merge 101 commits into
main2from
htd/unstructured-procedure
Open

Add Concrete Syntax for Unstructured Programs in Strata Core#1196
PROgram52bc wants to merge 101 commits into
main2from
htd/unstructured-procedure

Conversation

@PROgram52bc

Copy link
Copy Markdown
Contributor

Add Support for Unstructured Programs in Strata Core

Description of changes

  • Adds syntax and parser support for unstructured (CFG-based) procedure bodies in Strata Core. For example,
cfg entry { 
    // Deterministic block (default): 
    entry: {  
      x := 0; 
      if (x < n) goto entry else done;       
    } 
    done: {
      return; 
    }
  }
  
cfg entry {
    // Nondeterministic block:   
    entry: {   
      y := y + 1;     
      goto entry, done;   // nondeterministic choice 
    }
    done: {
      return; 
    }
  }

Prior to this PR, the only way to obtain unstructured programs was to apply StructuredToUnstructured transformation.

  • Adds .core.st file examples illustrating the syntax of unstructured programs.
  • Introduces Procedure.Body as a sum type:
  inductive Procedure.Body where
    | structured : List Statement → Procedure.Body
    | cfg : DetCFG → Procedure.Body
  • Adapts uses of Procedure.body to handle the cfg case based on the context
  • Adds metadata support to unstructured programs (new since initial PR draft)
  • Propagates loop invariants/decreases measure to unstructured CFG during transformation (new since initial PR draft)
  • Adds CFG-based Core-to-GOTO pipeline alongside direct path (new since initial PR draft)

Adaptation Methods for Procedure.body Uses

  1. CFG (Implemented) — Functions that handle both structured and CFG bodies with meaningful separate logic for each case.
  2. CFG (N/A) — Operations where CFGs legitimately contribute nothing (e.g., no local funcDecls, no structured loops), so returning [] or skipping is semantically correct.
  3. CFG (deferred) — Operations that currently error on CFGs or produce placeholder results, where proper CFG support is desirable but requires non-trivial additional work (e.g., dominator analysis, two-stage
    pipelines, graph-level inlining).
  4. Proof (Implemented) — Formal correctness proofs adapted to the Procedure.Body sum type.
  5. Proof (deferred) — Proof obligations where CFG correctness is not yet captured; the current formulation is trivially true for CFGs and is left for future PRs.

(Bolded items are updates since the initial PR draft)

Category Specific Strategy File Function/Location
Proof (Implemented) postconditionsValid adapted with CoreBodyExec Transform/CoreSpecification.lean ProcedureCorrect.postconditionsValid
Proves structured if transform succeeds Transform/ProcBodyVerifyCorrect.lean procToVerifyStmt_is_structured (new helper theorem)
Unifies two ss witnesses via subst Transform/ProcBodyVerifyCorrect.lean procBodyVerify_procedureCorrect
Adjusted case splits Languages/Core/ObligationExtraction.lean extractGo_ok proof
CFG (Implemented) Symbolic evaluation of CFG bodies using fuel measure, without path merging. Languages/Core/ProcedureEval.lean eval
Type-check CFG bodies (labels, vars, targets) implemented. Languages/Core/ProcedureType.lean typeCheck, checkModificationRights
Alternative two-stage pipeline implemented by procedureToGotoCtxViaCFG Backends/CBMC/GOTO/CoreToCProverGOTO.lean transformToGoto
Alternative two-stage pipeline implemented by procedureToGotoCtxViaCFG Backends/CBMC/GOTO/CoreToGOTOPipeline.lean procedureToGotoCtx
Alternative two-stage pipeline implemented by procedureToGotoCtxViaCFG StrataTest/.../E2E_CoreToGOTO.lean coreToGotoJsonWithSummary
Alternative two-stage pipeline implemented by procedureToGotoCtxViaCFG Backends/CBMC/CoreToCBMC.lean createImplementationSymbolFromAST
extractCallsFromStatements vs extractCallsFromDetCFG Languages/Core/CallGraph.lean extractCallsFromProcedure
blockToCST vs detCFGToCST Languages/Core/DDMTransform/FormatCore.lean procToCST
runStmt vs runCFG Languages/Core/StatementEval.lean Command.runCall
Use CoreBodyExec with two constructors, and CoreStepStar vs CoreCFGStepStar Languages/Core/StatementSemantics.lean EvalCommand.call_sem
Lang.core vs Lang.coreCFG Transform/CoreSpecification.lean AssertValidInProcedure
transformStmts vs transformDetCFG Transform/PrecondElim.lean precondElim
eraseTypes, stripMetaData, getVars implemented and used for CFG Languages/Core/Procedure.lean eraseTypes, stripMetaData, getVars
extractFromStatements vs extractFromDetCFG Languages/Core/ObligationExtraction.lean extractObligations
CFG (N/A) return .none on CFG, skip inlining, because the inlined procedure could be structured or unstructured. Semantics is largely undefined. Transform/ProcedureInlining.lean inlineCallCmd
stmtsToCFG vs identity, latter is a no-op. StrataTest/.../Loops.lean singleCFG
Remove loops in structured, pass CFG through because it has no loops Transform/LoopElim.lean loopElim
[] for CFG (no local funcDecls in CFG bodies) Languages/Core/Core.lean buildEnv
CFG (deferred) Dominator-based path-condition propagation to be implemented Languages/Core/ObligationExtraction.lean extractFromDetCFG
throw on CFG, due to incompatible return type of List Statement instead of List Command Transform/CoreTransform.lean runProgram
throw on CFG, VC for CFG bodies to be implemented, proof to be adapted as well. Transform/ProcBodyVerify.lean procToVerifyStmt
Encode structured procedures, handling for CFG procedures to be implemented. Transform/ANFEncoder.lean anfEncodeProgram
CFG body renaming for inlining to be implemented Transform/ProcedureInlining.lean renameAllLocalNames
Call extraction for CFG bodies to be implemented StrataTest/Boole/global_readonly_call.lean callHelper
Proof (deferred) WF Properties wfstmts, wfloclnd, bodyExitsCovered conditioned on structured procedures, props for CFG to be implemented Languages/Core/WF.lean WFProcedureProp

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

PROgram52bc added 30 commits May 5, 2026 11:24
Address PR review comments: note that the GOTO backend could be
refactored into a two-stage pipeline (structured→cfg, then cfg→GOTO)
to eliminate the pattern matching on Procedure.Body.
- Add Body.getCfg accessor (mirrors getStructured)
- CoreToCBMC: throw error on CFG body instead of returning []
- StatementEval: interpret CFG bodies by linearizing blocks
- Grammar: add comment explaining why 'branch' is used instead of 'if'
  (DDM registers tokens globally, causing conflict with if-statement)
- WF.lean: replace .stmts with case-split premises (bodyIsStructured)
- Procedure.lean: replace isEmpty with isAbstract/isStructured/isCfg,
  remove Body.stmts definition
- ProcedureInlining test: remove unused getStmts helper
- Boole test: use explicit pattern matching with comment
- Update all .body.stmts usages to explicit match expressions
- Fix ProcBodyVerifyCorrect proof to use new WF structure
…r main2 merge

The synthesized PR-merge commit triggered three #guard_msgs failures in
StrataTest/Languages/Core/Examples/Loops.lean because byte-offset
provenance annotations baked into the expected docstrings drifted when
main2 content shifted the source file:

  - line 82  (1354-1460 → 1312-1418, Δ -42)
  - line 175 (3322-3478 → 3280-3436, Δ -42)
  - line 449 (9337-9590 → 9390-9643, Δ +53)
  - line 449 (9457-9570 → 9510-9623, Δ +53)

Mechanical update: the structured-to-unstructured translator emits the
correct provenance ranges; only the test's expected output needed to
catch up with the post-merge source positions.

Reproduces locally with: `git merge origin/main2 && lake build
StrataTest.Languages.Core.Examples.Loops`

Build clean (491 jobs) after fix.
Same byte-offset drift class as the prior Loops.lean fix
(cb658ca): main2's content shifted source positions by -64 bytes
uniformly across two #guard_msgs sites in Exit.lean (lines 130 and
167), totaling 13 provenance annotations.

Mechanical update: 13 byte-range corrections, all -64 from the prior
expected values. Build clean (491 jobs).
PROgram52bc added a commit that referenced this pull request Jun 9, 2026
The Examples revert in e4d4ca5 (inherit main2 spelling) brought back main2's #guard_msgs expected outputs, but main2's content has byte-offsets that don't match this branch's source layout (which has additional small-step CFG infrastructure changes shifting the provenance numbers).

Loops.lean (4 sites at lines 80, 173, 447): top-of-file shifts -42 bytes, bottom-of-file shifts +53 bytes (same drift signature as the recent PR #1196 fix).

Exit.lean (13 sites at lines 130, 167): uniform -64 byte shift (also matches PR #1196's drift pattern for this file).

Mechanical update: byte-range corrections only. Build clean (StrataTest target: 519 jobs).
keyboardDrummer pushed a commit to keyboardDrummer/Strata that referenced this pull request Jun 11, 2026
…ng (split of strata-org#1196) (strata-org#1342)

## Summary

This PR introduces `Procedure.Body := .structured (List Statement) |
.cfg DetCFG` — a sum type letting a Strata Core procedure carry either a
structured statement-list body or a deterministic control-flow graph. It
includes the type infrastructure, consumer threading across the
verification pipeline, the DDM parser/translator support for the new
`cfg` syntax, and CFG example programs.

This PR is split out of
[strata-org#1196](strata-org#1196) as a
self-contained PR that can land independently on `main2`. It contributes
~714 LoC of additions vs `main2` (compared to ~5,567 LoC on
[strata-org#1196](strata-org#1196)). It is
**independent** of the companion split [strata-org#1341
(htd/unstructured-infra)](strata-org#1341)
— the two PRs touch disjoint files (only `ProcBodyVerifyCorrect.lean`
overlaps in different regions, mergeable cleanly).

This PR does **not** introduce the imperative-DL infrastructure
(metadata-bearing transfers, `EvalDetBlock` constructor renames, etc.) —
those live in the companion [strata-org#1341
`htd/unstructured-infra`](strata-org#1341)
PR. Without that, this PR's `CoreBodyExec` semantic relation has only
the `.structured` constructor; the `.cfg` constructor is added when both
branches merge.

## What's added

The 44 files modified group into 5 layers:

### Type infrastructure (`Procedure.lean`, ~110 LoC)
- `inductive Procedure.Body := .structured (List Statement) | .cfg
DetCFG` plus `Inhabited` instance
- `abbrev DetCFG := Imperative.CFG String (Imperative.DetBlock String
Command Expression)`
- 7 projection helpers: `getStructured`, `getCfg`, `getVars`,
`isAbstract`, `isStructured`, `isCfg`, `structuredLength`
- `HasVarsPure`/`HasVarsImp` instances on `Body` and `DetCFG`
- `DetCFG.eraseTypes`, `DetCFG.stripMetaData`
- `Procedure.body` field flips from `List Statement` to `Procedure.Body`
(default `.structured []`)
- `Procedure.eraseTypes`/`stripMetaData`/`getVars` retrofits

### WF predicate adaptation
- `WF.lean`: `wfstmts`/`wfloclnd`/`bodyExitsCovered` rewritten as `∀ ss,
body = .structured ss → ...`
- `ProcBodyVerifyCorrect.lean`: new `procToVerifyStmt_is_structured`
lemma bridging the sum type to the verification pipeline's `.structured`
requirement

### Semantic relation
- `StatementSemantics.lean`: `CoreBodyExec` with `.structured`
constructor; `EvalCommand.call_sem` updated to use it
- The `.cfg` constructor is left for
[strata-org#1196](strata-org#1196) since it depends
on `EvalDetBlock` from the companion infra PR

### Consumer threading (~150 LoC across 15 files)
**Real CFG handling** (cannot be safely stubbed):
- `CallGraph.lean`: traverses CFG arms to extract calls
- `ProcedureInlining.lean`: handles `.structured` arm with the existing
logic; throws on `.cfg`

**`.cfg`-arm stubs** (error/passthrough/throw, replaceable when
[strata-org#1196](strata-org#1196) merges):
- `StatementEval.lean`, `ProcedureType.lean`, `ProcedureEval.lean`,
`Verifier.lean`, `ObligationExtraction.lean`, `FormatCore.lean`
- `Transform/PrecondElim.lean`, `CoreTransform.lean`, `LoopElim.lean`,
`ANFEncoder.lean`, `TerminationCheck.lean`, `ProcBodyVerify.lean`,
`CoreSpecification.lean`

### DDM parser support (`DDMTransform/Grammar.lean`, ~54 LoC)
- New parser categories: `Transfer`, `CFGBlock`, `CFGBlocks`, `CFGBody`
- `command_cfg_procedure` operator parsing `procedure name ... cfg ENTRY
{ ... }` syntax
- Transfer commands: `transfer_goto`, `transfer_nondet_goto`,
`transfer_cond_goto` (using `branch (cond) goto LT else goto LF` to
avoid `if`-collision with structured syntax), `transfer_return`

### DDM translator (`DDMTransform/Translate.lean`, ~115 LoC)
- `translateCFGBlock`, `translateCFGBlocks`, `translateCFGBody`,
`translateTransfer` build `Procedure.Body.cfg` from parsed AST
- `translateProcedure`/`translateBlockCommand` updated to route
`.structured` vs `.cfg` body shapes

### Examples and tooling
- `Examples/CFGSimple.core.st` — sample procedure (`Max(x,y)` computing
maximum of two integers via CFG)
- `Examples/CFGNondet.core.st` — sample with nondeterministic transfer
- `docs/Architecture.md`, `docs/verso/IRTranslationPhilosophyDoc.lean` —
documentation updates
- `editors/emacs/core-st-mode.el`,
`editors/vscode/syntaxes/core-st.tmLanguage.json` — keyword highlighting
for `cfg`/`branch`/`goto`/`return` literals

## Comparison vs [strata-org#1196](strata-org#1196)

| Feature | This PR |
[strata-org#1196](strata-org#1196) |
|---|:---:|:---:|
| `Procedure.Body` sum type + `DetCFG` abbreviation + `body` field flip
| ✅ | ✅ |
| Body projection helpers (`getStructured`, `getVars`, etc.) | ✅ | ✅ |
| `WF.lean` adaptation for sum-type body | ✅ | ✅ |
| `procToVerifyStmt_is_structured` bridge lemma | ✅ | ✅ |
| `CoreBodyExec` (`.structured` arm) | ✅ | ✅ |
| `CoreBodyExec` (`.cfg` arm) | — (needs infra) | ✅ |
| `CallGraph.lean` CFG traversal | ✅ | ✅ |
| `ProcedureInlining.lean` sum-type handling | ✅ | ✅ |
| `.cfg`-arm stubs in 13 consumers | ✅ | ✅ replaced by full impl |
| Full `.cfg`-arm implementations (PrecondElim, ProcedureEval, etc.) | —
| ✅ |
| DDM `cfg ENTRY { ... }` parser syntax | ✅ | ✅ |
| DDM `translateCFG*` translator | ✅ | ✅ |
| CFG examples (`Examples/CFG{Simple,Nondet}.core.st`) | ✅ | ✅ |
| Editor/docs syntax-highlighting updates | ✅ | ✅ |
| Imperative DL: metadata-bearing transfers, `EvalDetBlock` rename | —
(in companion PR) | ✅ |
| Translator metadata propagation | — (in companion PR) | ✅ |
| GOTO backend CFG pipeline | — | ✅ |
| CFG-specific test suites | — | ✅ |
| Lambda framework theorems | — | ✅ (via main2 merges) |

## Build status

- `lake build`: green (490 jobs)
- `lake test --exclude Languages.Python`: green (modulo the pre-existing
CI-managed `ion-java-1.11.11.jar` test fixture)
- 0 sorries, 0 axioms across all modified files

## Test plan

- [x] `lake build` succeeds locally on a fresh worktree
- [x] `lake test --exclude Languages.Python` succeeds (the only failure
is the env-managed `ion-java` jar download which CI handles)
- [x] No new sorries or axioms in any file
- [x] Existing structured-procedure tests continue to pass (the `.body
:= .structured body` adapter at `Translate.lean:1683`/`1702` keeps
existing parsing/verification paths working)
- [ ] CI passes
- [ ] Pairs with the companion [strata-org#1341
`htd/unstructured-infra`](strata-org#1341)
PR — either can land first; see comparison table above
- [ ] After this and the infra PR land,
[strata-org#1196](strata-org#1196) can be rebased
to ~2,000 LoC of remaining CFG-pipeline-specific content (CFG test
suites, GOTO backend, full `.cfg`-arm implementations replacing stubs)
@atomb

atomb commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

We've changed plans and won't merge this PR as-is. Instead, we merged part of it yesterday (#1342) and will merge two other parts (#1347, #1348) after the restructure is finished.

@PROgram52bc PROgram52bc disabled auto-merge June 11, 2026 16:13
@github-actions github-actions Bot removed the Laurel label Jun 12, 2026
shigoel pushed a commit to strata-org/Strata-Boole that referenced this pull request Jun 15, 2026
…ng (split of #1196) (#1342)

## Summary

This PR introduces `Procedure.Body := .structured (List Statement) |
.cfg DetCFG` — a sum type letting a Strata Core procedure carry either a
structured statement-list body or a deterministic control-flow graph. It
includes the type infrastructure, consumer threading across the
verification pipeline, the DDM parser/translator support for the new
`cfg` syntax, and CFG example programs.

This PR is split out of
[#1196](strata-org/Strata#1196) as a
self-contained PR that can land independently on `main2`. It contributes
~714 LoC of additions vs `main2` (compared to ~5,567 LoC on
[#1196](strata-org/Strata#1196)). It is
**independent** of the companion split [#1341
(htd/unstructured-infra)](strata-org/Strata#1341)
— the two PRs touch disjoint files (only `ProcBodyVerifyCorrect.lean`
overlaps in different regions, mergeable cleanly).

This PR does **not** introduce the imperative-DL infrastructure
(metadata-bearing transfers, `EvalDetBlock` constructor renames, etc.) —
those live in the companion [#1341
`htd/unstructured-infra`](strata-org/Strata#1341)
PR. Without that, this PR's `CoreBodyExec` semantic relation has only
the `.structured` constructor; the `.cfg` constructor is added when both
branches merge.

## What's added

The 44 files modified group into 5 layers:

### Type infrastructure (`Procedure.lean`, ~110 LoC)
- `inductive Procedure.Body := .structured (List Statement) | .cfg
DetCFG` plus `Inhabited` instance
- `abbrev DetCFG := Imperative.CFG String (Imperative.DetBlock String
Command Expression)`
- 7 projection helpers: `getStructured`, `getCfg`, `getVars`,
`isAbstract`, `isStructured`, `isCfg`, `structuredLength`
- `HasVarsPure`/`HasVarsImp` instances on `Body` and `DetCFG`
- `DetCFG.eraseTypes`, `DetCFG.stripMetaData`
- `Procedure.body` field flips from `List Statement` to `Procedure.Body`
(default `.structured []`)
- `Procedure.eraseTypes`/`stripMetaData`/`getVars` retrofits

### WF predicate adaptation
- `WF.lean`: `wfstmts`/`wfloclnd`/`bodyExitsCovered` rewritten as `∀ ss,
body = .structured ss → ...`
- `ProcBodyVerifyCorrect.lean`: new `procToVerifyStmt_is_structured`
lemma bridging the sum type to the verification pipeline's `.structured`
requirement

### Semantic relation
- `StatementSemantics.lean`: `CoreBodyExec` with `.structured`
constructor; `EvalCommand.call_sem` updated to use it
- The `.cfg` constructor is left for
[#1196](strata-org/Strata#1196) since it depends
on `EvalDetBlock` from the companion infra PR

### Consumer threading (~150 LoC across 15 files)
**Real CFG handling** (cannot be safely stubbed):
- `CallGraph.lean`: traverses CFG arms to extract calls
- `ProcedureInlining.lean`: handles `.structured` arm with the existing
logic; throws on `.cfg`

**`.cfg`-arm stubs** (error/passthrough/throw, replaceable when
[#1196](strata-org/Strata#1196) merges):
- `StatementEval.lean`, `ProcedureType.lean`, `ProcedureEval.lean`,
`Verifier.lean`, `ObligationExtraction.lean`, `FormatCore.lean`
- `Transform/PrecondElim.lean`, `CoreTransform.lean`, `LoopElim.lean`,
`ANFEncoder.lean`, `TerminationCheck.lean`, `ProcBodyVerify.lean`,
`CoreSpecification.lean`

### DDM parser support (`DDMTransform/Grammar.lean`, ~54 LoC)
- New parser categories: `Transfer`, `CFGBlock`, `CFGBlocks`, `CFGBody`
- `command_cfg_procedure` operator parsing `procedure name ... cfg ENTRY
{ ... }` syntax
- Transfer commands: `transfer_goto`, `transfer_nondet_goto`,
`transfer_cond_goto` (using `branch (cond) goto LT else goto LF` to
avoid `if`-collision with structured syntax), `transfer_return`

### DDM translator (`DDMTransform/Translate.lean`, ~115 LoC)
- `translateCFGBlock`, `translateCFGBlocks`, `translateCFGBody`,
`translateTransfer` build `Procedure.Body.cfg` from parsed AST
- `translateProcedure`/`translateBlockCommand` updated to route
`.structured` vs `.cfg` body shapes

### Examples and tooling
- `Examples/CFGSimple.core.st` — sample procedure (`Max(x,y)` computing
maximum of two integers via CFG)
- `Examples/CFGNondet.core.st` — sample with nondeterministic transfer
- `docs/Architecture.md`, `docs/verso/IRTranslationPhilosophyDoc.lean` —
documentation updates
- `editors/emacs/core-st-mode.el`,
`editors/vscode/syntaxes/core-st.tmLanguage.json` — keyword highlighting
for `cfg`/`branch`/`goto`/`return` literals

## Comparison vs [#1196](strata-org/Strata#1196)

| Feature | This PR |
[#1196](strata-org/Strata#1196) |
|---|:---:|:---:|
| `Procedure.Body` sum type + `DetCFG` abbreviation + `body` field flip
| ✅ | ✅ |
| Body projection helpers (`getStructured`, `getVars`, etc.) | ✅ | ✅ |
| `WF.lean` adaptation for sum-type body | ✅ | ✅ |
| `procToVerifyStmt_is_structured` bridge lemma | ✅ | ✅ |
| `CoreBodyExec` (`.structured` arm) | ✅ | ✅ |
| `CoreBodyExec` (`.cfg` arm) | — (needs infra) | ✅ |
| `CallGraph.lean` CFG traversal | ✅ | ✅ |
| `ProcedureInlining.lean` sum-type handling | ✅ | ✅ |
| `.cfg`-arm stubs in 13 consumers | ✅ | ✅ replaced by full impl |
| Full `.cfg`-arm implementations (PrecondElim, ProcedureEval, etc.) | —
| ✅ |
| DDM `cfg ENTRY { ... }` parser syntax | ✅ | ✅ |
| DDM `translateCFG*` translator | ✅ | ✅ |
| CFG examples (`Examples/CFG{Simple,Nondet}.core.st`) | ✅ | ✅ |
| Editor/docs syntax-highlighting updates | ✅ | ✅ |
| Imperative DL: metadata-bearing transfers, `EvalDetBlock` rename | —
(in companion PR) | ✅ |
| Translator metadata propagation | — (in companion PR) | ✅ |
| GOTO backend CFG pipeline | — | ✅ |
| CFG-specific test suites | — | ✅ |
| Lambda framework theorems | — | ✅ (via main2 merges) |

## Build status

- `lake build`: green (490 jobs)
- `lake test --exclude Languages.Python`: green (modulo the pre-existing
CI-managed `ion-java-1.11.11.jar` test fixture)
- 0 sorries, 0 axioms across all modified files

## Test plan

- [x] `lake build` succeeds locally on a fresh worktree
- [x] `lake test --exclude Languages.Python` succeeds (the only failure
is the env-managed `ion-java` jar download which CI handles)
- [x] No new sorries or axioms in any file
- [x] Existing structured-procedure tests continue to pass (the `.body
:= .structured body` adapter at `Translate.lean:1683`/`1702` keeps
existing parsing/verification paths working)
- [ ] CI passes
- [ ] Pairs with the companion [#1341
`htd/unstructured-infra`](strata-org/Strata#1341)
PR — either can land first; see comparison table above
- [ ] After this and the infra PR land,
[#1196](strata-org/Strata#1196) can be rebased
to ~2,000 LoC of remaining CFG-pipeline-specific content (CFG test
suites, GOTO backend, full `.cfg`-arm implementations replacing stubs)
shigoel pushed a commit that referenced this pull request Jun 16, 2026
…ng (split of #1196) (#1342)

## Summary

This PR introduces `Procedure.Body := .structured (List Statement) |
.cfg DetCFG` — a sum type letting a Strata Core procedure carry either a
structured statement-list body or a deterministic control-flow graph. It
includes the type infrastructure, consumer threading across the
verification pipeline, the DDM parser/translator support for the new
`cfg` syntax, and CFG example programs.

This PR is split out of
[#1196](#1196) as a
self-contained PR that can land independently on `main2`. It contributes
~714 LoC of additions vs `main2` (compared to ~5,567 LoC on
[#1196](#1196)). It is
**independent** of the companion split [#1341
(htd/unstructured-infra)](#1341)
— the two PRs touch disjoint files (only `ProcBodyVerifyCorrect.lean`
overlaps in different regions, mergeable cleanly).

This PR does **not** introduce the imperative-DL infrastructure
(metadata-bearing transfers, `EvalDetBlock` constructor renames, etc.) —
those live in the companion [#1341
`htd/unstructured-infra`](#1341)
PR. Without that, this PR's `CoreBodyExec` semantic relation has only
the `.structured` constructor; the `.cfg` constructor is added when both
branches merge.

## What's added

The 44 files modified group into 5 layers:

### Type infrastructure (`Procedure.lean`, ~110 LoC)
- `inductive Procedure.Body := .structured (List Statement) | .cfg
DetCFG` plus `Inhabited` instance
- `abbrev DetCFG := Imperative.CFG String (Imperative.DetBlock String
Command Expression)`
- 7 projection helpers: `getStructured`, `getCfg`, `getVars`,
`isAbstract`, `isStructured`, `isCfg`, `structuredLength`
- `HasVarsPure`/`HasVarsImp` instances on `Body` and `DetCFG`
- `DetCFG.eraseTypes`, `DetCFG.stripMetaData`
- `Procedure.body` field flips from `List Statement` to `Procedure.Body`
(default `.structured []`)
- `Procedure.eraseTypes`/`stripMetaData`/`getVars` retrofits

### WF predicate adaptation
- `WF.lean`: `wfstmts`/`wfloclnd`/`bodyExitsCovered` rewritten as `∀ ss,
body = .structured ss → ...`
- `ProcBodyVerifyCorrect.lean`: new `procToVerifyStmt_is_structured`
lemma bridging the sum type to the verification pipeline's `.structured`
requirement

### Semantic relation
- `StatementSemantics.lean`: `CoreBodyExec` with `.structured`
constructor; `EvalCommand.call_sem` updated to use it
- The `.cfg` constructor is left for
[#1196](#1196) since it depends
on `EvalDetBlock` from the companion infra PR

### Consumer threading (~150 LoC across 15 files)
**Real CFG handling** (cannot be safely stubbed):
- `CallGraph.lean`: traverses CFG arms to extract calls
- `ProcedureInlining.lean`: handles `.structured` arm with the existing
logic; throws on `.cfg`

**`.cfg`-arm stubs** (error/passthrough/throw, replaceable when
[#1196](#1196) merges):
- `StatementEval.lean`, `ProcedureType.lean`, `ProcedureEval.lean`,
`Verifier.lean`, `ObligationExtraction.lean`, `FormatCore.lean`
- `Transform/PrecondElim.lean`, `CoreTransform.lean`, `LoopElim.lean`,
`ANFEncoder.lean`, `TerminationCheck.lean`, `ProcBodyVerify.lean`,
`CoreSpecification.lean`

### DDM parser support (`DDMTransform/Grammar.lean`, ~54 LoC)
- New parser categories: `Transfer`, `CFGBlock`, `CFGBlocks`, `CFGBody`
- `command_cfg_procedure` operator parsing `procedure name ... cfg ENTRY
{ ... }` syntax
- Transfer commands: `transfer_goto`, `transfer_nondet_goto`,
`transfer_cond_goto` (using `branch (cond) goto LT else goto LF` to
avoid `if`-collision with structured syntax), `transfer_return`

### DDM translator (`DDMTransform/Translate.lean`, ~115 LoC)
- `translateCFGBlock`, `translateCFGBlocks`, `translateCFGBody`,
`translateTransfer` build `Procedure.Body.cfg` from parsed AST
- `translateProcedure`/`translateBlockCommand` updated to route
`.structured` vs `.cfg` body shapes

### Examples and tooling
- `Examples/CFGSimple.core.st` — sample procedure (`Max(x,y)` computing
maximum of two integers via CFG)
- `Examples/CFGNondet.core.st` — sample with nondeterministic transfer
- `docs/Architecture.md`, `docs/verso/IRTranslationPhilosophyDoc.lean` —
documentation updates
- `editors/emacs/core-st-mode.el`,
`editors/vscode/syntaxes/core-st.tmLanguage.json` — keyword highlighting
for `cfg`/`branch`/`goto`/`return` literals

## Comparison vs [#1196](#1196)

| Feature | This PR |
[#1196](#1196) |
|---|:---:|:---:|
| `Procedure.Body` sum type + `DetCFG` abbreviation + `body` field flip
| ✅ | ✅ |
| Body projection helpers (`getStructured`, `getVars`, etc.) | ✅ | ✅ |
| `WF.lean` adaptation for sum-type body | ✅ | ✅ |
| `procToVerifyStmt_is_structured` bridge lemma | ✅ | ✅ |
| `CoreBodyExec` (`.structured` arm) | ✅ | ✅ |
| `CoreBodyExec` (`.cfg` arm) | — (needs infra) | ✅ |
| `CallGraph.lean` CFG traversal | ✅ | ✅ |
| `ProcedureInlining.lean` sum-type handling | ✅ | ✅ |
| `.cfg`-arm stubs in 13 consumers | ✅ | ✅ replaced by full impl |
| Full `.cfg`-arm implementations (PrecondElim, ProcedureEval, etc.) | —
| ✅ |
| DDM `cfg ENTRY { ... }` parser syntax | ✅ | ✅ |
| DDM `translateCFG*` translator | ✅ | ✅ |
| CFG examples (`Examples/CFG{Simple,Nondet}.core.st`) | ✅ | ✅ |
| Editor/docs syntax-highlighting updates | ✅ | ✅ |
| Imperative DL: metadata-bearing transfers, `EvalDetBlock` rename | —
(in companion PR) | ✅ |
| Translator metadata propagation | — (in companion PR) | ✅ |
| GOTO backend CFG pipeline | — | ✅ |
| CFG-specific test suites | — | ✅ |
| Lambda framework theorems | — | ✅ (via main2 merges) |

## Build status

- `lake build`: green (490 jobs)
- `lake test --exclude Languages.Python`: green (modulo the pre-existing
CI-managed `ion-java-1.11.11.jar` test fixture)
- 0 sorries, 0 axioms across all modified files

## Test plan

- [x] `lake build` succeeds locally on a fresh worktree
- [x] `lake test --exclude Languages.Python` succeeds (the only failure
is the env-managed `ion-java` jar download which CI handles)
- [x] No new sorries or axioms in any file
- [x] Existing structured-procedure tests continue to pass (the `.body
:= .structured body` adapter at `Translate.lean:1683`/`1702` keeps
existing parsing/verification paths working)
- [ ] CI passes
- [ ] Pairs with the companion [#1341
`htd/unstructured-infra`](#1341)
PR — either can land first; see comparison table above
- [ ] After this and the infra PR land,
[#1196](#1196) can be rebased
to ~2,000 LoC of remaining CFG-pipeline-specific content (CFG test
suites, GOTO backend, full `.cfg`-arm implementations replacing stubs)
shigoel pushed a commit that referenced this pull request Jun 16, 2026
…ng (split of #1196) (#1342)

## Summary

This PR introduces `Procedure.Body := .structured (List Statement) |
.cfg DetCFG` — a sum type letting a Strata Core procedure carry either a
structured statement-list body or a deterministic control-flow graph. It
includes the type infrastructure, consumer threading across the
verification pipeline, the DDM parser/translator support for the new
`cfg` syntax, and CFG example programs.

This PR is split out of
[#1196](#1196) as a
self-contained PR that can land independently on `main2`. It contributes
~714 LoC of additions vs `main2` (compared to ~5,567 LoC on
[#1196](#1196)). It is
**independent** of the companion split [#1341
(htd/unstructured-infra)](#1341)
— the two PRs touch disjoint files (only `ProcBodyVerifyCorrect.lean`
overlaps in different regions, mergeable cleanly).

This PR does **not** introduce the imperative-DL infrastructure
(metadata-bearing transfers, `EvalDetBlock` constructor renames, etc.) —
those live in the companion [#1341
`htd/unstructured-infra`](#1341)
PR. Without that, this PR's `CoreBodyExec` semantic relation has only
the `.structured` constructor; the `.cfg` constructor is added when both
branches merge.

## What's added

The 44 files modified group into 5 layers:

### Type infrastructure (`Procedure.lean`, ~110 LoC)
- `inductive Procedure.Body := .structured (List Statement) | .cfg
DetCFG` plus `Inhabited` instance
- `abbrev DetCFG := Imperative.CFG String (Imperative.DetBlock String
Command Expression)`
- 7 projection helpers: `getStructured`, `getCfg`, `getVars`,
`isAbstract`, `isStructured`, `isCfg`, `structuredLength`
- `HasVarsPure`/`HasVarsImp` instances on `Body` and `DetCFG`
- `DetCFG.eraseTypes`, `DetCFG.stripMetaData`
- `Procedure.body` field flips from `List Statement` to `Procedure.Body`
(default `.structured []`)
- `Procedure.eraseTypes`/`stripMetaData`/`getVars` retrofits

### WF predicate adaptation
- `WF.lean`: `wfstmts`/`wfloclnd`/`bodyExitsCovered` rewritten as `∀ ss,
body = .structured ss → ...`
- `ProcBodyVerifyCorrect.lean`: new `procToVerifyStmt_is_structured`
lemma bridging the sum type to the verification pipeline's `.structured`
requirement

### Semantic relation
- `StatementSemantics.lean`: `CoreBodyExec` with `.structured`
constructor; `EvalCommand.call_sem` updated to use it
- The `.cfg` constructor is left for
[#1196](#1196) since it depends
on `EvalDetBlock` from the companion infra PR

### Consumer threading (~150 LoC across 15 files)
**Real CFG handling** (cannot be safely stubbed):
- `CallGraph.lean`: traverses CFG arms to extract calls
- `ProcedureInlining.lean`: handles `.structured` arm with the existing
logic; throws on `.cfg`

**`.cfg`-arm stubs** (error/passthrough/throw, replaceable when
[#1196](#1196) merges):
- `StatementEval.lean`, `ProcedureType.lean`, `ProcedureEval.lean`,
`Verifier.lean`, `ObligationExtraction.lean`, `FormatCore.lean`
- `Transform/PrecondElim.lean`, `CoreTransform.lean`, `LoopElim.lean`,
`ANFEncoder.lean`, `TerminationCheck.lean`, `ProcBodyVerify.lean`,
`CoreSpecification.lean`

### DDM parser support (`DDMTransform/Grammar.lean`, ~54 LoC)
- New parser categories: `Transfer`, `CFGBlock`, `CFGBlocks`, `CFGBody`
- `command_cfg_procedure` operator parsing `procedure name ... cfg ENTRY
{ ... }` syntax
- Transfer commands: `transfer_goto`, `transfer_nondet_goto`,
`transfer_cond_goto` (using `branch (cond) goto LT else goto LF` to
avoid `if`-collision with structured syntax), `transfer_return`

### DDM translator (`DDMTransform/Translate.lean`, ~115 LoC)
- `translateCFGBlock`, `translateCFGBlocks`, `translateCFGBody`,
`translateTransfer` build `Procedure.Body.cfg` from parsed AST
- `translateProcedure`/`translateBlockCommand` updated to route
`.structured` vs `.cfg` body shapes

### Examples and tooling
- `Examples/CFGSimple.core.st` — sample procedure (`Max(x,y)` computing
maximum of two integers via CFG)
- `Examples/CFGNondet.core.st` — sample with nondeterministic transfer
- `docs/Architecture.md`, `docs/verso/IRTranslationPhilosophyDoc.lean` —
documentation updates
- `editors/emacs/core-st-mode.el`,
`editors/vscode/syntaxes/core-st.tmLanguage.json` — keyword highlighting
for `cfg`/`branch`/`goto`/`return` literals

## Comparison vs [#1196](#1196)

| Feature | This PR |
[#1196](#1196) |
|---|:---:|:---:|
| `Procedure.Body` sum type + `DetCFG` abbreviation + `body` field flip
| ✅ | ✅ |
| Body projection helpers (`getStructured`, `getVars`, etc.) | ✅ | ✅ |
| `WF.lean` adaptation for sum-type body | ✅ | ✅ |
| `procToVerifyStmt_is_structured` bridge lemma | ✅ | ✅ |
| `CoreBodyExec` (`.structured` arm) | ✅ | ✅ |
| `CoreBodyExec` (`.cfg` arm) | — (needs infra) | ✅ |
| `CallGraph.lean` CFG traversal | ✅ | ✅ |
| `ProcedureInlining.lean` sum-type handling | ✅ | ✅ |
| `.cfg`-arm stubs in 13 consumers | ✅ | ✅ replaced by full impl |
| Full `.cfg`-arm implementations (PrecondElim, ProcedureEval, etc.) | —
| ✅ |
| DDM `cfg ENTRY { ... }` parser syntax | ✅ | ✅ |
| DDM `translateCFG*` translator | ✅ | ✅ |
| CFG examples (`Examples/CFG{Simple,Nondet}.core.st`) | ✅ | ✅ |
| Editor/docs syntax-highlighting updates | ✅ | ✅ |
| Imperative DL: metadata-bearing transfers, `EvalDetBlock` rename | —
(in companion PR) | ✅ |
| Translator metadata propagation | — (in companion PR) | ✅ |
| GOTO backend CFG pipeline | — | ✅ |
| CFG-specific test suites | — | ✅ |
| Lambda framework theorems | — | ✅ (via main2 merges) |

## Build status

- `lake build`: green (490 jobs)
- `lake test --exclude Languages.Python`: green (modulo the pre-existing
CI-managed `ion-java-1.11.11.jar` test fixture)
- 0 sorries, 0 axioms across all modified files

## Test plan

- [x] `lake build` succeeds locally on a fresh worktree
- [x] `lake test --exclude Languages.Python` succeeds (the only failure
is the env-managed `ion-java` jar download which CI handles)
- [x] No new sorries or axioms in any file
- [x] Existing structured-procedure tests continue to pass (the `.body
:= .structured body` adapter at `Translate.lean:1683`/`1702` keeps
existing parsing/verification paths working)
- [ ] CI passes
- [ ] Pairs with the companion [#1341
`htd/unstructured-infra`](#1341)
PR — either can land first; see comparison table above
- [ ] After this and the infra PR land,
[#1196](#1196) can be rebased
to ~2,000 LoC of remaining CFG-pipeline-specific content (CFG test
suites, GOTO backend, full `.cfg`-arm implementations replacing stubs)
joehendrix pushed a commit that referenced this pull request Jun 16, 2026
…ng (split of #1196) (#1342)

## Summary

This PR introduces `Procedure.Body := .structured (List Statement) |
.cfg DetCFG` — a sum type letting a Strata Core procedure carry either a
structured statement-list body or a deterministic control-flow graph. It
includes the type infrastructure, consumer threading across the
verification pipeline, the DDM parser/translator support for the new
`cfg` syntax, and CFG example programs.

This PR is split out of
[#1196](#1196) as a
self-contained PR that can land independently on `main2`. It contributes
~714 LoC of additions vs `main2` (compared to ~5,567 LoC on
[#1196](#1196)). It is
**independent** of the companion split [#1341
(htd/unstructured-infra)](#1341)
— the two PRs touch disjoint files (only `ProcBodyVerifyCorrect.lean`
overlaps in different regions, mergeable cleanly).

This PR does **not** introduce the imperative-DL infrastructure
(metadata-bearing transfers, `EvalDetBlock` constructor renames, etc.) —
those live in the companion [#1341
`htd/unstructured-infra`](#1341)
PR. Without that, this PR's `CoreBodyExec` semantic relation has only
the `.structured` constructor; the `.cfg` constructor is added when both
branches merge.

## What's added

The 44 files modified group into 5 layers:

### Type infrastructure (`Procedure.lean`, ~110 LoC)
- `inductive Procedure.Body := .structured (List Statement) | .cfg
DetCFG` plus `Inhabited` instance
- `abbrev DetCFG := Imperative.CFG String (Imperative.DetBlock String
Command Expression)`
- 7 projection helpers: `getStructured`, `getCfg`, `getVars`,
`isAbstract`, `isStructured`, `isCfg`, `structuredLength`
- `HasVarsPure`/`HasVarsImp` instances on `Body` and `DetCFG`
- `DetCFG.eraseTypes`, `DetCFG.stripMetaData`
- `Procedure.body` field flips from `List Statement` to `Procedure.Body`
(default `.structured []`)
- `Procedure.eraseTypes`/`stripMetaData`/`getVars` retrofits

### WF predicate adaptation
- `WF.lean`: `wfstmts`/`wfloclnd`/`bodyExitsCovered` rewritten as `∀ ss,
body = .structured ss → ...`
- `ProcBodyVerifyCorrect.lean`: new `procToVerifyStmt_is_structured`
lemma bridging the sum type to the verification pipeline's `.structured`
requirement

### Semantic relation
- `StatementSemantics.lean`: `CoreBodyExec` with `.structured`
constructor; `EvalCommand.call_sem` updated to use it
- The `.cfg` constructor is left for
[#1196](#1196) since it depends
on `EvalDetBlock` from the companion infra PR

### Consumer threading (~150 LoC across 15 files)
**Real CFG handling** (cannot be safely stubbed):
- `CallGraph.lean`: traverses CFG arms to extract calls
- `ProcedureInlining.lean`: handles `.structured` arm with the existing
logic; throws on `.cfg`

**`.cfg`-arm stubs** (error/passthrough/throw, replaceable when
[#1196](#1196) merges):
- `StatementEval.lean`, `ProcedureType.lean`, `ProcedureEval.lean`,
`Verifier.lean`, `ObligationExtraction.lean`, `FormatCore.lean`
- `Transform/PrecondElim.lean`, `CoreTransform.lean`, `LoopElim.lean`,
`ANFEncoder.lean`, `TerminationCheck.lean`, `ProcBodyVerify.lean`,
`CoreSpecification.lean`

### DDM parser support (`DDMTransform/Grammar.lean`, ~54 LoC)
- New parser categories: `Transfer`, `CFGBlock`, `CFGBlocks`, `CFGBody`
- `command_cfg_procedure` operator parsing `procedure name ... cfg ENTRY
{ ... }` syntax
- Transfer commands: `transfer_goto`, `transfer_nondet_goto`,
`transfer_cond_goto` (using `branch (cond) goto LT else goto LF` to
avoid `if`-collision with structured syntax), `transfer_return`

### DDM translator (`DDMTransform/Translate.lean`, ~115 LoC)
- `translateCFGBlock`, `translateCFGBlocks`, `translateCFGBody`,
`translateTransfer` build `Procedure.Body.cfg` from parsed AST
- `translateProcedure`/`translateBlockCommand` updated to route
`.structured` vs `.cfg` body shapes

### Examples and tooling
- `Examples/CFGSimple.core.st` — sample procedure (`Max(x,y)` computing
maximum of two integers via CFG)
- `Examples/CFGNondet.core.st` — sample with nondeterministic transfer
- `docs/Architecture.md`, `docs/verso/IRTranslationPhilosophyDoc.lean` —
documentation updates
- `editors/emacs/core-st-mode.el`,
`editors/vscode/syntaxes/core-st.tmLanguage.json` — keyword highlighting
for `cfg`/`branch`/`goto`/`return` literals

## Comparison vs [#1196](#1196)

| Feature | This PR |
[#1196](#1196) |
|---|:---:|:---:|
| `Procedure.Body` sum type + `DetCFG` abbreviation + `body` field flip
| ✅ | ✅ |
| Body projection helpers (`getStructured`, `getVars`, etc.) | ✅ | ✅ |
| `WF.lean` adaptation for sum-type body | ✅ | ✅ |
| `procToVerifyStmt_is_structured` bridge lemma | ✅ | ✅ |
| `CoreBodyExec` (`.structured` arm) | ✅ | ✅ |
| `CoreBodyExec` (`.cfg` arm) | — (needs infra) | ✅ |
| `CallGraph.lean` CFG traversal | ✅ | ✅ |
| `ProcedureInlining.lean` sum-type handling | ✅ | ✅ |
| `.cfg`-arm stubs in 13 consumers | ✅ | ✅ replaced by full impl |
| Full `.cfg`-arm implementations (PrecondElim, ProcedureEval, etc.) | —
| ✅ |
| DDM `cfg ENTRY { ... }` parser syntax | ✅ | ✅ |
| DDM `translateCFG*` translator | ✅ | ✅ |
| CFG examples (`Examples/CFG{Simple,Nondet}.core.st`) | ✅ | ✅ |
| Editor/docs syntax-highlighting updates | ✅ | ✅ |
| Imperative DL: metadata-bearing transfers, `EvalDetBlock` rename | —
(in companion PR) | ✅ |
| Translator metadata propagation | — (in companion PR) | ✅ |
| GOTO backend CFG pipeline | — | ✅ |
| CFG-specific test suites | — | ✅ |
| Lambda framework theorems | — | ✅ (via main2 merges) |

## Build status

- `lake build`: green (490 jobs)
- `lake test --exclude Languages.Python`: green (modulo the pre-existing
CI-managed `ion-java-1.11.11.jar` test fixture)
- 0 sorries, 0 axioms across all modified files

## Test plan

- [x] `lake build` succeeds locally on a fresh worktree
- [x] `lake test --exclude Languages.Python` succeeds (the only failure
is the env-managed `ion-java` jar download which CI handles)
- [x] No new sorries or axioms in any file
- [x] Existing structured-procedure tests continue to pass (the `.body
:= .structured body` adapter at `Translate.lean:1683`/`1702` keeps
existing parsing/verification paths working)
- [ ] CI passes
- [ ] Pairs with the companion [#1341
`htd/unstructured-infra`](#1341)
PR — either can land first; see comparison table above
- [ ] After this and the infra PR land,
[#1196](#1196) can be rebased
to ~2,000 LoC of remaining CFG-pipeline-specific content (CFG test
suites, GOTO backend, full `.cfg`-arm implementations replacing stubs)
…lass hierarchy

main2's f7f6dee refactor (merged here) removed HasNot/HasBoolVal/HasIntOrder
and made HasBool extend HasVal + require HasFvars for HasInt. Migrate the test's
P-instance fixture accordingly:
- add HasVal P (value := True) and HasFvars P (getFvars := LExpr.getVars)
- HasBool gains boolIsVal
- HasIntOrder split into HasInt (zero/intTy/isNumeral + proof obligations,
  discharged via the trivial value predicate and LExpr.getVars=[] on consts)
  and HasIntOps (eq/lt)
- HasNot -> HasBoolOps (not/and/imp)

Full build green (496 jobs); #guard_msgs goldens unchanged.
The nondeterministic goto lowering now prepends an `init $__nondet_N : bool`
command to the block so the branch variable is declared in scope for the type
checker. The NondetChoice entry block therefore has 1 cmd, not 0. Guarded
cond_gotos (if/while) emit no synthesized command, so their goldens are unchanged.
A complete command_cfg_procedure arm (with source-located error and procedure
name) already precedes this one in toCoreDecls. The duplicate catch-all arm is
now unreachable, which Lean rejects as a redundant alternative. Remove it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants