Skip to content

Python FE (Re)architecture#1293

Draft
ssomayyajula wants to merge 426 commits into
strata-org:mainfrom
ssomayyajula:ssomayyajula/python-fe-refactor-v2
Draft

Python FE (Re)architecture#1293
ssomayyajula wants to merge 426 commits into
strata-org:mainfrom
ssomayyajula:ssomayyajula/python-fe-refactor-v2

Conversation

@ssomayyajula

Copy link
Copy Markdown
Contributor

TODO

ssomayyajula and others added 30 commits May 7, 2026 16:57
1. Translation: Emit LocalVariable for nested tuple unpack temps (unpack_N)
2. Elaborate: Restore needsDecl (check if var in env before declaring)
   - Hole assigns: declare only if NOT already in scope (avoids dupes)
   - Regular assigns: declare if not in scope (fixes bare assigns)

32/54 pass. 5 internal_error remain (class_field, power, multi-output procs).
15 inconclusive. 1 user_error (correct). 1 pre-existing Core gap (float64).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…overy)

synthValue for FieldSelect with no heapVar now fails instead of
producing a raw fieldAccess. This forces grade discovery to try heap
grade, which correctly discovers the proc needs heap and rewrites
its signature with $heap_in/$heap params.

Fixes test_class_field_use (internal_error → inconclusive).
4 internal_error remain (power=Core gap, class_field_any, 2 multi-output).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
procGrades no longer rolls back when Option fails during grade trials.
- discoverGrade reads from reader, uses local (trialEnv) for coinduction
- fullElaborate accumulates knownGrades externally, passes via reader
- No state mutation for grade discovery — safe to call from anywhere

32/54 pass (no regressions). This unblocks synthValue grade check.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per architecture line 123-125: a call is a VALUE only if grade(f) = 1.
synthValue for StaticCall now calls discoverGrade and fails if grade > 1.
This is safe because procGrades is now in the reader (no state mutation).

No regressions. 32/54 pass. 4 internal_error remain (need ANF lifting
for effectful calls nested in value position).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
checkArgsK handles effectful args by binding them via smart constructors.
However, using it in elabAssign causes regressions because discoverGrade
on simple runtime functions (from_int, NoError) triggers trial runs that
fail unexpectedly.

The ANF issue (2 tests: procedure_in_assert, unsupported_config) needs
a targeted approach: only lift args that are KNOWN multi-output procs,
not all StaticCall args.

32/54 pass. procGrades in reader + synthValue grade check working.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. lookupProcBody now finds bodies in .Opaque procs (not just .Transparent)
2. fullElaborate discovers grades for .Opaque procs with implementations
3. checkArgsK with ANF fallback (triggered only when hasEffectfulArg)
4. elabAssign tries checkArgs first, falls back to checkArgsK if effectful

Root cause of procedure_in_assert/unsupported_config identified:
timedelta_func is a RUNTIME proc (not in Translation output).
fullElaborate never sees it. discoverGrade returns pure (no body found).
Fix requires passing runtime to fullElaborate or pre-computing runtime grades.

32/54 pass. 4 internal_error remain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ElabEnv now has a `runtime` field (passed from pipeline)
- lookupProcBody handles .Opaque bodies with implementations
- fullElaborate discovers grades for .Opaque procs
- checkArgsK with targeted ANF fallback (hasEffectfulArg guard)
- Runtime grade lookup reverted (causes regressions when runtime
  function bodies are evaluated during grade discovery)

The procedure_in_assert/unsupported_config tests need a different
approach: pre-compute multi-output runtime proc grades in the pipeline
before calling fullElaborate, and pass them as initial knownGrades.

32/54 pass. 4 internal_error. 16 inconclusive. 1 user_error (correct).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pipeline pre-computes grades for runtime procs with Error outputs
and passes them as initialGrades to fullElaborate. This allows
elaboration to know that runtime multi-output procs are grade err
without needing to evaluate their bodies.

Fixes test_procedure_in_assert partially (timedelta_func is a user
proc issue, not runtime). 32/54 pass. 4 internal_error remain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
checkArgsK breaks class/with tests when used as default because
discoverGrade inside it fails for method calls. Reverted to simple
checkArgs.

Root cause of procedure_in_assert fully understood: main's GRADE
DISCOVERY fails because synthValue's grade check makes checkArgs
fail during the trial. The fix needs the grade check to be disabled
during discovery trials (trials are for grade, not for correctness).

32/54 pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
discoveryMode=true suppresses synthValue's grade check during grade
discovery trials. This prevents discovery from failing due to nested
effectful calls that haven't been ANF-lifted yet.

The procedure_in_assert fix still needs checkArgsK to work during
actual elaboration (not just discovery). checkArgsK as default causes
4 class/with regressions that need separate investigation.

32/54 pass. 4 internal_error. discoveryMode infrastructure in place.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When args exceed params in checkArgsK, synth without coercion (same
as checkArgs). Fixes Composite→Any coercion issue for self args.

checkArgsK still not used as default — causes regex regressions for
unknown reason. Infrastructure ready for when root cause is found.

32/54 pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. LaurelToCoreTranslator: Add .TFloat64 => real case (was falling to
   NotSupportedYet catch-all). Fixes test_power.
2. Elaborate: Prefix Field constructor names with "$field." to avoid
   collision with Laurel's field resolution. Fixes test_class_field_any.

33/54 pass. 2 internal_error remain (procedure_in_assert, unsupported_config).
18 inconclusive. 1 user_error (correct).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- discoverGrade returns .pure immediately for isFunctional procs
  (Laurel functions are pure by definition, no body evaluation needed)
- checkArgsK handles excess args (self) via synthValue without coercion
- checkArgsK still not default (regex regression: Translation marks
  user functions as isFunctional=false, causing body cascade)

The ANF fix requires Translation to set isFunctional=true for
single-output deterministic user functions (re_fullmatch etc).

33/54 pass. 2 internal_error (procedure_in_assert, unsupported_config).
18 inconclusive. 1 user_error (correct).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…thExpr

- discoverGrade uses output signature for RUNTIME procs (gradeFromOutputs):
  Error output → err, Heap output → heap. No body evaluation needed.
- User procs still use body-based discovery (outputs not yet rewritten).
- Added SynthResult (defunctionalized producer synthesis) and synthExpr.
- checkArgsK rewritten to use synthExpr (not yet enabled as default —
  still causes cascade for user procs not yet in knownGrades).

33/54 pass. 2 internal_error (procedure_in_assert, unsupported_config).
18 inconclusive. 1 user_error (correct).

The last 2 tests need a two-pass approach: discover ALL proc grades
first (including user procs that come after main in the list), then
elaborate with full knowledge. checkArgsK is ready for when that's done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Architecture-correct separation of synth and check:
- Pass 1 (synth): discover all proc grades via tryGrades. No FGL produced.
  All grades stored in knownGrades before any elaboration.
- Pass 2 (check): elaborate each proc with FULL knownGrades in reader.
  discoverGrade is just a HashMap lookup — no body evaluation cascade.

Also: gradeFromOutputs for runtime procs (signature determines grade).
Also: synthExpr (defunctionalized SynthResult) + checkArgsK ready but
not enabled as default (still has a subtle bug causing regex regression).

33/54 pass. 2 internal_error (procedure_in_assert, unsupported_config).
These 2 need checkArgsK which needs the regex bug fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ping rules

Major architecture update:
- Grade inference is coinductive fixpoint iteration over call graph
  (standard technique from functional language type checkers)
- Typing rules (synthValue, checkValue, synthExpr, checkProducer) are
  textbook — pure, no state mutation, no boolean flags
- discoverGrades is separate: iterates typing rules until convergence
- fullElaborate = discoverGrades (fixpoint) + checkProducer (terms)
- SynthResult is defunctionalized producer synthesis (no closures)
- checkArgsK applies to-rule at expression level (ANF for nested effects)
- No discoveryMode. No on-demand body evaluation during elaboration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major restructuring per updated architecture:
- Remove discoveryMode boolean (boolean blindness eliminated)
- Remove grade check from synthValue (values have no grades)
- synthExpr uses pure HashMap lookup for grades (no body evaluation)
- tryGrades is standalone (not in mutual block)
- Pass 1 is proper fixpoint iteration (while changed, iterate all procs)
- Pass 2 elaborates with ALL grades known (no cascading possible)
- checkArgsK is default in elabAssign (to-rule at expression level)

32/54 pass. Regex regression from checkArgsK needs investigation.
The architecture is correct — the bug is mechanical, not theoretical.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
checkArgsK used when sig is known (ANF lifting for effectful args).
Fallback to checkValue against Any when sig unknown (type deficit).

Next: proper separation of user vs trusted procs via data structure
(userBodies map — elaborator can only touch what's in the map).

33/54 pass. 2 internal_error (need full TypeEnv for callee types).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ctural grades

- Pass translationEnv (all sigs) to elaboration so checkArgsK inserts
  coercions at runtime call boundaries (re_fullmatch, timedelta_func, etc.)
- gradeFromSignature derives grades structurally from proc signature via
  eraseType (no ad-hoc boolean checks on type names)
- eraseType handles UserDefined "Any"/"Error"/etc. from Laurel parser
- Translation emits maybe_except in outputs (prelude convention), not as
  local var declaration
- Signature rewriting matches calling convention: err→[result,err],
  heap→[heap,result], heapErr→[heap,result,err]
- Architecture doc: user/runtime separation principle

32/54 pass (was 28 old pipeline). 0 regressions vs old. +4 new passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every FGLValue/FGLProducer constructor now has an md field from the
source StmtExprMd that produced it. Projection extracts md structurally —
no parameter needed, impossible to forget. Source locations in output
now match the original Python source positions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nature, tech debt

- eraseType handles UserDefined "Any"/"Error"/etc. from Laurel parser
- FGL terms carry md (correct by construction), projection is parameter-free
- gradeFromSignature documented (structural via eraseType)
- Translation declares maybe_except in outputs (prelude convention)
- Subsumption coercions carry Md
- Smart constructors take md parameter
- Known tech debt: if/then/else rest duplication causes exponential VC blowup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… background

Add introductory prose explaining:
- Pipeline overview: what each pass does and why they're separate
- FGCBV: values vs producers, the sequencing construct
- Graded effects: the monoid classifies effects, determines calling convention
- Bidirectional typing: synthesis vs checking, proof-relevant witnesses
- Elaboration as derivation construction (not term transformation)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…status

Add Current Status section comparing pyAnalyzeV2 vs pyAnalyzeLaurel:
- Test outcome table (32 vs 28 pass, strict domination)
- VC generation differences (tighter encoding, fewer VCs)
- Source location fidelity (metadata by construction)
- CI compatibility notes
- What the 4 new passes gain (coercion/effect tracking resolves ambiguity)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… VCs)

The 4 tests that went inconclusive→success produce 0 VCs. They "pass"
because elaboration silently discards proc bodies it can't elaborate,
producing empty blocks that Core trivially accepts. This is NOT a genuine
improvement — it's a soundness concern (silent body erasure).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Labeled blocks (from try/except translation) use Exit to jump to end of
block. Previously, `rest` was appended inside the block's stmt list where
Exit would skip over it — causing assert statements after try/except to
be silently dropped (0 VCs, vacuous pass).

Fix: labeledBlock now has an `after` field. Block rule elaborates the
block body standalone, then elaborates `rest` as the continuation after
the block. Projection emits [block] ++ after.

test_try_except: 0 VCs → 5 passed + 1 inconclusive (genuine)
test_multiple_except: 0 VCs → 8 passed (genuine)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
IfThenElse now has an `after : FGLProducer` field. Both branches
elaborate standalone (no rest threading), rest goes in after (elaborated
once). Projection omits else block when els is .unit (matching old
pipeline's guard pattern: `if cond then exit; next`).

Eliminates exponential VC blowup from nested if/else.
test_boolean_logic: 54 VCs → 27 (matches old pipeline's 28)

Known: test_try_except_scoping still inconclusive (27/30 pass, 3 solver
timeout). The VCs are correct but Core's path enumeration on nested
labeled blocks exceeds solver budget. Old pipeline: 6 VCs for same test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- elabAssign: when RHS is IfThenElse (ternary), desugar into
  statement-level if/else with assignment in both branches
- Translation: subscript assignment (x[i] = v) emits
  Assign [x] (Any_sets(i, x, v)) instead of invalid Assign [Any_get(x,i)] v

Both fixes prepare for __main__ metadata (blocked on making elaboration
handle all module-level constructs without failing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- __main__ gets proper metadata (sourceRangeToMd) so Core generates VCs
  from module-level assertions
- Slice reads: from_Slice(Any..as_int!(start), OptSome(Any..as_int!(stop)))
  matching old pipeline's type-correct encoding
- Subscript assignment: collectSubscriptChain flattens nested subscripts,
  emits Any_sets(ListAny indices, root, value)
- Constructor FuncSigs added to prelude: from_Slice, OptSome, OptNone,
  Any..as_float!, Any..as_Composite!, Any_sets

Remaining 5 crashes: Any/Composite mismatch (self typing), missing user
function sigs (datetime_now, test_helper_create_client).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Method FuncSigs include self with type UserDefined className (enables
  elaborator to insert Any→Composite coercion at call boundaries)
- Translation strips self from FuncSig when building proc inputs (avoids
  duplicate self param)
- Box protocol: BoxAny(anyVal: Any) for Any-typed fields instead of
  invalid Box..AnyVal!

Remaining: test_class_field_any still fails (Any/Composite mismatch
from a different source — needs deeper investigation of Core type-checker
error location).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- .New in elabAssign: coerce MkComposite result to target type via
  applySubsume (fixes Any/Composite mismatch for class instantiation)
- Block RHS: desugar x := Block[stmts; last] into Block[stmts; x:=last]

Fixes test_class_field_any and test_method_call_with_kwargs crashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ssomayyajula and others added 4 commits May 20, 2026 23:52
pyAnalyzeLaurelV2 now attempts to load imported module stubs:
- Extracts ImportFrom module names from AST
- Looks for <module>.python.st.ion adjacent to the input file
- Resolves the stub to get function/class context entries
- Passes importedContext to Resolution.resolve

Limitations (TODO):
- Only handles ImportFrom, not plain Import
- Path lookup is naive (same directory as input)
- Module-qualified calls (module.func) not handled yet

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extracts module names from both Import and ImportFrom statements
- Looks for <module>.python.st.ion in input directory and parent
- Resolves stub to get function/class context entries
- Merges into Resolution context so imported names resolve correctly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Source stubs for import resolution. Generate Ions via:
  python3 -m strata.gen py_to_strata <stub>.py <module>.python.st.ion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolution is now monadic (ResolveM) and loads imported modules on demand
when it encounters Import/ImportFrom statements. Modules are resolved by
converting dotted paths to filesystem paths and running the full Resolution
fold on the loaded Ion files. Results are memoized per-run to avoid
re-resolving the same module multiple times.

Key changes:
- Resolution.lean: CtxEntry.module_ carries resolved Ctx, resolveStmt/
  resolveBlock/resolveFuncDef are monadic, resolveModuleComponent loads
  and resolves modules from disk, resolveMethodCall looks up names in
  module contexts for dotted access (boto3.client(...))
- PySpecPipeline.lean: Removed step 1.5 (manual stub loading). Pipeline
  now calls monadic resolve which handles imports internally. Imported
  modules are translated to Laurel with filesystem caching (.laurel.st).
- PythonRuntimeLaurelPart.lean: Added 26 missing builtin function stubs
  (Any_len_to_Any, Any_list_to_Any, etc.) and 10 missing operator stubs
  (PDiv, PBitAnd, PIs, etc.)
- Translation.lean: Map Python 'object' type to Any
- PythonDoc.lean: Architecture doc updated with Import Resolution and
  Compiled Module Cache sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ssomayyajula and others added 4 commits June 1, 2026 12:38
Make all of Resolution monadic (resolveExpr, typeOfExpr, resolveMethodCall,
and all helpers). This enables demand-driven module loading: when typeOfExpr
traverses a qualified type annotation through a module Ctx and a name is
missing, it loads the corresponding submodule from disk on demand.

Key changes:
- resolveExpr, resolveOptExpr, resolveKeyword, resolveArg, resolveArguments,
  resolveComprehension, resolveTypeParam, resolveWithitem, extractParamList,
  extractFuncSig, resolveFunctionBody all become monadic (ResolveM)
- typeOfExpr triggers demand-driven loads when traversing module.name chains
- All mapAnnArr/mapAnnOpt calls replaced with explicit monadic loops
- boto3 __init__.py post-processed to use qualified return types (boto3.S3)
  without importing all 421 submodules

Performance: `import boto3` now takes 0.3s (was >2 min with eager loading)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Resolution encounters consecutive @overload-decorated FunctionDefs,
it stores them as CtxEntry.overloadedFunction (ordered list of signatures).
At call sites, it walks the list in order and picks the first signature
whose parameter types match the arguments (Literal matching for string
constants, broad matching otherwise). The matched overload carries an
index that Translation uses for disambiguated procedure names (client$0,
client$1, etc.).

Key changes:
- New CtxEntry.overloadedFunction variant
- hasOverloadDecorator helper
- matchOverload / argMatchesParam for signature matching
- FuncSig.overloadIndex field for disambiguated naming
- FuncSig.laurelName includes $N suffix for overloaded functions
- resolveMethodCall accepts callArgs for module-level overload matching
- Non-@overload def after overloads preserves the overload list (doesn't overwrite)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…d sigs

Replace full Resolution fold on imported modules with a shallow index scan.
Loading a 2841-line module now takes milliseconds (just reads declaration
names and type annotations) instead of 6+ seconds (was resolving every
statement including TypedDicts).

Key changes:
- CtxEntry.class_ methods are now List (PythonIdentifier × Thunk FuncSig)
- resolveModuleComponent does index-only scan: extracts class/function names
  and builds thunked signatures without resolving bodies or defaults
- extractParamListShallow: lightweight param extraction (names + types only)
- Method signatures resolved lazily when Thunk is forced (on first access)
- All benchmarks complete within 10s (was timing out at 15s+)
- Translation handles .irrelevant on Name nodes (module refs in attr chains)
- Imported procedures separated from user code for Elaboration (imported
  stubs treated as runtime — not elaborated)

Performance: `delete_s3_object` benchmark 0.06s (was 6+ seconds)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the broken .laurel.st caching and full-fold module resolution with
query-based resolution:

- resolveModuleComponent does index-only scan: top-level functions resolved
  eagerly (needed for overload matching), class methods stored as raw ASTs
  (CtxEntry.class_ gains methodAsts field), TypedDicts/assignments skipped.
- resolveMethodCall resolves a class method on demand from its raw AST via
  resolveMethodAstSig, recording the resolved FunctionDef in
  ResolveState.demandedMethods for the pipeline to translate.
- Stub leading-asserts become preconditions via the existing translateFunction
  path (stub asserts = specs; user-program asserts unchanged).
- Pipeline translates only demanded imported methods, not whole modules.
  Removed the .laurel.st cache read/write loop entirely.

Performance: delete_s3_object benchmark 0.07s (was 22s+ / timeout).
Unit tests: same 1 pre-existing regression, no new ones.

Known remaining: client overload procedures and service class types not yet
emitted (next commit) — boto3 benchmarks resolve fast but error at Core with
'S3 is not defined' / 'client\$N is not defined'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot removed the Python label Jun 5, 2026
ssomayyajula and others added 17 commits June 5, 2026 01:25
…ck to Any

- ResolveState records demandedFunctions (matched overloads/top-level fns) and
  demandedClasses (id + fields) in addition to demandedMethods.
- overloadedFunction carries each overload's raw AST so the matched one is
  resolved + recorded on demand (client$N now emitted).
- Pipeline emits Composite type decls for demanded classes into the trusted
  runtime; translates demanded function/method ASTs.
- pythonTypeToHighType maps Literal/Unpack/NotRequired/Required subscripts to Any
  (fixes 'Literal is not defined').

Builds clean; unit tests at 1 pre-existing regression. boto3 benchmarks resolve
in <0.1s. Remaining: variable typed by a bare return-type name (s3_client: S3)
doesn't yet demand the class; in progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AnnAssign now types a variable by its RHS call's resolved return type (e.g.
boto3.S3) rather than the bare written annotation, so method calls on the
variable resolve through the module.

resolveMethodCall handles a qualified class type (boto3.S3) by loading the
submodule boto3/S3 on demand, finding the class, and resolving the called
method (recording it + the class as demanded). This clears 'S3 is not defined'.

Builds; unit tests at 1 pre-existing regression. delete_s3_object now gets
past S3 resolution; remaining: **kwargs param scoping in stub method asserts
and a hole-lifting issue (next).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e class resolution

Checkpoint: import resolution architecture complete and fast. Benchmark sweep:
5 pass, 47 internal-error, 0 timeout. Dominant root cause (23/47) is precondition
variable scoping: stub method preconditions reference body-local param names
(kwargs) but Laurel 'requires' is evaluated in the $in_ input scope. Fix next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…_ scope

Two fixes that together unlock stub methods with **kwargs preconditions:
- FuncSig.kwargName: the **kwargs param is now a declared Any-typed input
  (in laurelDeclInputs) but NOT matched positionally by matchArgs. At call
  sites matchArgs appends a havoc value for it (mkKwargs).
- translateFunction rewrites precondition identifiers: a param `x` becomes
  `$in_x`, since Laurel `requires` is evaluated in the input scope, not the
  body-local scope. renameParamsToInputs handles the expression forms.

Clears 'kwargs is not defined' (was 23/47 benchmark failures). Builds; unit
tests at 1 pre-existing regression. delete_s3_object now only blocked by a
hole-lifting issue on the client() assignment (next).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…orator)

Bug I introduced: demanded imported methods/functions were placed in the
trusted (un-elaborated) runtime. Their bodies contain holes (stub `...` and
nondeterministic values), which the elaborator is responsible for converting
into hole-procedure calls. Bypassing elaboration let bare `<?>` holes reach
Laurel→Core, violating the elaborator's guarantee ("holes should have been
eliminated").

Fix: merge importedLaurel procedures/types into the elaboration INPUT
(toElaborate) alongside user code; trusted runtime is just pythonRuntimeLaurelPart.

Clears the hole-elimination errors. Unit tests at 1 pre-existing regression.
delete_s3_object now reaches a type-check error (arrow/Any unification) — next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Invariant: a .static sig has no receiver slot; a .instance sig has exactly one.
Translation was prepending translateExpr(obj) for ANY .Attribute callee, so
boto3.client('s3', region_name=...) prepended the module `boto3` (a havoc hole)
as the first positional arg, shifting 's3' into region_name and feeding a
havoc-module value into a typed slot (also the arrow/Any Core type error).

Fix: prepend the receiver only when sig.params is .instance. Verified
delete_s3_object now emits client$147(from_str("s3"), from_str("us-east-1"))
and S3@delete_object(from_Composite(s3_client), kwargs) with correct ordering.

Unit tests: 1 pre-existing regression, no new.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two bypasses of the elaborator's typability invariant, both surfacing as the
Core error `Impossible to unify Any with (arrow Any (arrow Any Any))`:

1. Preconditions were carried to Core un-elaborated, so literals reached Core
   uncoerced (bare `intConst 1` where `Any` is required) and an Any-typed
   operator result sat in a bool position. A `requires` is a pure bool value, so
   pass 2 now elaborates each via the value judgment `checkValue .TBool`, which
   inserts from_int/from_str on arguments and Any_to_bool on the result, then
   projectValue yields the single Core expression.

2. `checkProducer`'s `.Hole` case recorded a deterministic hole (declared with
   the procedure's inputs) but invoked it with zero arguments via mkGradedCall.
   Declaration/call-site arity disagreed, so the 2-arg havoc function appeared as
   a bare value (`arrow Any (arrow Any Any)`). It is now applied to procInputs,
   matching the value-judgment `.Hole` case and the emission convention.

delete_s3_object now translates to Core without internal error.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
pythonTypeToHighType matched the typing aliases Dict/List/Tuple/Set but not the
lowercase builtin generics dict[...]/list[...], so `body: dict[str, Any]` fell to
the user-defined-class branch and was typed Composite. Its dict-literal value
elaborated to DictStrAny, and Core failed to unify Composite with DictStrAny (and
likewise Composite with ListAny for list[...]). The subscript cases now agree
with the bare-name cases: dict/Dict → DictStrAny; list/List/tuple/Tuple/set/Set/
frozenset → ListAny. Clears the Composite-vs-container internal errors across 12
kiro benchmarks (32 → 20 internal).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ors are pure

Two fixes to value-judgment elaboration, both surfacing as a contract that kept a
bare hole and failed Core with "holes should have been eliminated":

1. checkValue's `.Hole` case rejected nondeterministic holes (`guard
   deterministic`), so a stub assert like `re.compile(...).search(...) is not
   None` — where `re` is unmodeled, yielding a nondeterministic hole — fell back
   to the raw precondition with a bare `<??>`. In pure value position
   nondeterminism is meaningless: the value is a deterministic function of what is
   in scope. checkValue now elaborates any hole as the deterministic
   `hole_N(inputs)`. Sound, uninterpretable (inconclusive), no conjunct dropped.

2. synthValueStaticCall required the callee to be in procGrades (`| failure`), but
   datatype constructors (from_None, from_int, ...) and pure runtime functions
   carry a function sig in typeEnv.names without an explicit grade. An explicit
   `from_None()` in a precondition therefore failed synthesis and dropped the
   whole clause back to raw. It now defaults the grade to pure (as elaborateCall
   and lookupProcOutputs already do), rejecting only names graded above pure.

kiro internal errors 20 → 10.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rojection dest

A procedure body is a command in destination-passing style, not a value: Python
statements don't return their last expression, and `return e` was already lowered
by Translation to `LaurelResult := e; exit`. Elaborating the body against the
procedure's return type conflated the two — a loop body or branch arm ending in a
void call (`print(...)`) had that `()` result coerced toward the return type and
projected as a spurious `LaurelResult := from_None()`, failing Core with e.g.
`Impossible to unify Any with string`.

Two coordinated changes:
- fullElaborate (both passes) checks the body at .TVoid, not the first non-error
  output's type. The return value still reaches LaurelResult through the explicit
  `return` assignment, which checkAssign types against LaurelResult's own type.
- proj's destination becomes Option StmtExprMd. projProduce with none emits no
  assignment (a void command has nowhere to put a value); with some d emits
  d := v as before. projectProducer projects the body with none; assignment RHS
  and varDecl init subproducers pass some target, so `x := f()` still writes x.

This matches the projection's documented signature (the x : A destination is now
genuinely optional). kiro internal errors 10 → 7.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
subtype listed a handful of pairs and ended in `| _, _ => .unrelated`, which
conflated "genuinely unrelated" with "case not written". Python implicit
conversions that exist (truthiness, numeric widening) fell through silently as
.unrelated, so a typed value reached a slot of the wrong type — e.g. `if
description:` (string) or `if services:` (list) in a bool position, failing Core
with `Impossible to unify bool with string` / `with ListAny`.

subtype is now a total analysis: every (LowType, LowType) pair is decided.
.refl when equal; .coerce w when Python implicitly converts, witnessed by one
direct runtime function; .unrelated otherwise as a deliberate verdict. Unknown
TCore names (outside the finite set eraseType produces) are .unrelated. Grouped
by target type, covering four families:
- box T<:Any and unbox Any<:T (unchanged witnesses)
- truthiness T<:bool: int/str/float/list/dict_to_bool, None->false, Composite->true
- numeric bool<:int<:float: bool_to_int, int_to_real, bool_to_real

Truthiness runtime functions (int_to_bool, str_to_bool, float_to_bool,
list_to_bool, dict_to_bool) added to PythonRuntimeLaurelPart; numeric witnesses
already existed. PythonDoc updated to describe the total relation and families.

kiro internal errors 7 -> 5; apigateway_key_manager and ecs_utils clear.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… to a hole

A name referring to a function, overloaded function, or class — used in VALUE
position rather than as a call callee — was annotated `.variable`, which
Translation renders as a bare Laurel identifier (e.g. `str` from
`isinstance(x, str)`). That identifier is never bound in Laurel, so the
elaborator's lookup failed, which aborted elaboration of the WHOLE procedure; pass
2's fallback then emitted the proc unchanged, and its holes survived into Core
("holes should have been eliminated").

Resolution's `.Name` case now annotates only a bound local/param as `.variable`;
a function/overloaded/class entry in value position becomes `.unresolved`, which
Translation already renders as a sound hole (Laurel has no first-class
function/class values). Call sites are unaffected — a call computes its own
`.funcCall`/`.classNew` from the callee, independent of the callee name's
value-position annotation.

This restores the saturation invariant: the elaborator only ever receives
well-scoped Laurel — every name is a bound variable or was turned into a hole
upstream. No elaborator change.

PythonDoc updated: `.variable` means bound local/param only; the saturation
invariant; and the unmodeled-library names (defaultdict, argparse, logging, bytes,
sys.argv, boto3 KMS) that resolve to sound holes.

kms_client_manager now elaborates; kiro internal errors 5 -> 4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…gassign

Systematic analysis of why holes survived elaboration: the elaborator is
Option-monadic, so any local failure deep in a proc body aborted the whole proc;
pass 2 then emitted it unchanged (holes intact) and the pipeline treated that as a
stderr warning, letting un-elaborated Laurel reach Core ("holes should have been
eliminated", far from the cause). Two upstream triggers fed it.

Amplifier (PySpecPipeline): a non-empty elaboration-failure list is now a hard
error listing every un-elaboratable proc, not a warning. No Core is emitted on
failure. A failure now localizes to a named proc at the elaboration step instead
of surfacing as a mysterious downstream Core error.

Trigger A — sys.argv (Resolution .Attribute): an attribute access annotated
.attribute unconditionally, even when the object is a module (sys -> .irrelevant),
producing FieldSelect(hole, argv) the elaborator can't type. A field access
requires a value receiver; an attribute on a module/unresolved object now resolves
to .unresolved (-> hole). Field reads on real values (object .variable) unchanged.

Trigger B — subscript augmented assignment (Translation .AugAssign): `a[i] op= v`
translated the subscript target to Any_get(...) and assigned TO that StaticCall,
which is not an lvalue. A subscript is not an lvalue identifier; both plain and
augmented subscript assignment now write back through Any_sets (factored into
subscriptWriteBack, reused by the .Assign subscript case).

kiro internal errors: 4 -> 0 (all 56 benchmarks translate to Core; 0 internal).
PythonDoc updated: fail-fast elaboration contract, .attribute requires a value
receiver, subscript assignment (incl. augmented) writes back via Any_sets.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…on doc

Drop two leftover dbg_trace lines in checkProducer's catch-all (debug noise, no
behavior change). Add `lake exe python` to docs/verso/generate.sh so the Python
pipeline doc (PythonDoc) is generated alongside the others.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Delete the working docs written during the refactor — they don't belong in the
final PR. Retained in commit history.

- docs/architecture/ARCHITECTURE.md
- docs/architecture/EXECUTIVE_SUMMARY.md
- .claude/agent-preamble.md

The pre-existing docs/Architecture.md (added 2025-07-24, before this refactor) is
kept.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… debt

The V2 Laurel→Core path still uses the old Laurel `resolve` and `inferHoleTypes`,
not yet ported to the new Laurel resolver. Both are load-bearing as wired:
`resolve` builds the SemanticModel Core translation reads, and removing
`inferHoleTypes` produces ill-typed Core across the whole suite (it annotates
expression types Core translation depends on, despite its name). Comment-only;
records that these must be replaced by the new resolver and not deleted piecemeal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
translateMinimal still uses the old Laurel resolve + inferHoleTypes instead of the
new Laurel resolver; both are load-bearing (resolve builds the SemanticModel Core
translation reads; removing inferHoleTypes produces ill-typed Core suite-wide).
Recorded as a bullet in PythonDoc's Tech Debt section; trimmed the inline comment
in translateMinimal to a one-line pointer there.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@keyboardDrummer

Copy link
Copy Markdown
Contributor

If I understand correctly, this PR doesn't run existing Laurel tests using the new elaborator. Is that right? @ssomayyajula Glancing over Elaborate.lean, it seems to have only part of the feature set of the existing Laurel -> Core pipeline. I guess it supports whatever Python needs.

If you want to see how well it supports the upcoming Laurel features, you should rebase this onto keyboardDrummer#28, and then hookup all the tests to use the compilation pipeline this PR introduces.

A second thing, why does this PR modify Python->Laurel and Laurel->Core pipelines in one PR? Can't we scope it down to either of the two?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants