Commit 7263794

author
Douglas Jones
committed
v3.0 dispatches, docs, and case study
Dispatches: V3-1, V3-2, V3-2 design, V3-3, v3.0 roadmap, v3.0 close, AUD-OVERNIGHT-02, relay v2.0 case study, Gemini 2.5 Pro v3.0 case study (4/5), FIND-G1 fix, B-Team prompt v3.0 update, session closes

Docs: ROADMAP (v3.0 closed, V3-4 deferred), CHANGELOG (v3.0.0 entry), AGENT_QUICKREF (believe multiline arm, bottom reason), GPT4O_PROMPT (v3.0), RPC_API (V3-2 section), dispatch INDEX regenerated

Examples: case_study_v2 relay programs
1 parent 57a96cb commit 7263794

37 files changed

Lines changed: 2432 additions & 158 deletions
Lines changed: 180 additions & 0 deletions
@@ -0,0 +1,180 @@
# Sable audit — AUD-OVERNIGHT-02: parallel evaluator import handling

*By Sable. Gate: severity rating and fix/accept decision before v3.0 parallel work begins.*
*Date: 2026-05-14*

---

## Scope

The parallel evaluator in `crates/codifide-interpreter/src/interpreter.rs`
(`eval_parallel_exprs`) creates branch interpreters with
`resolved_imports: HashMap::new()`. This means imported symbols are not
available inside parallel branches. The gap is documented in the source with a
`// Note:` comment and tracked as AUD-OVERNIGHT-02. This audit rates the
severity, characterises the failure mode, and decides: fix before v3.0 or
formally accept as a known limitation.

---

## The gap — exact location

`interpreter.rs`, `eval_parallel_exprs`, inside the `rayon::scope` closure:

```rust
let mut branch_interp = Interpreter {
    module,
    max_depth,
    depth: current_depth,
    prims: build_default_registry(),
    trace: EffectTrace::fresh(),
    // Note: resolved_imports is not passed to branch interpreters.
    // Imported symbols are not available in parallel branches.
    // This is a known limitation (AUD-OVERNIGHT-02). If a parallel
    // branch calls an imported symbol, it will fail with
    // unknown_callable. Fix: pass resolved_imports here when the
    // parallel evaluator gains full import support.
    resolved_imports: HashMap::new(),
};
```

The Python interpreter has no parallel evaluator, so this gap is Rust-only.

---

## Failure mode

A program that:
1. Imports a symbol by content hash (`import f = sha256:...`), AND
2. Calls that symbol inside a `list(...)` or `++` expression that the
   parallel evaluator decides to parallelize

will fail at runtime with:

```
runtime error: unknown callable: "f"
```

The error is deterministic: it fires every time the parallel path is taken,
not intermittently. The sequential fallback path (which fires when
`should_parallelize` returns false) would succeed, because the sequential
`call()` method checks `self.resolved_imports` correctly.
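
The failure can be reproduced in miniature. The sketch below is a simplified model, not the project's code: `Definition`, the two-table `Interpreter`, the local-then-imports lookup order, and the helpers `branch_without_imports` and `parent_with_import` are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a callable definition; the real type is richer.
#[derive(Clone, Debug, PartialEq)]
pub struct Definition {
    pub body: i64,
}

/// Simplified interpreter holding only the two lookup tables this audit discusses.
pub struct Interpreter {
    pub symbols: HashMap<String, Definition>,          // stands in for module.symbols
    pub resolved_imports: HashMap<String, Definition>, // imports resolved by content hash
}

impl Interpreter {
    /// Look up a callable locally first, then among resolved imports (assumed order).
    pub fn call(&self, name: &str) -> Result<i64, String> {
        self.symbols
            .get(name)
            .or_else(|| self.resolved_imports.get(name))
            .map(|def| def.body)
            .ok_or_else(|| format!("runtime error: unknown callable: {:?}", name))
    }

    /// Mirrors the audited construction: the branch starts with an empty import table.
    pub fn branch_without_imports(&self) -> Interpreter {
        Interpreter {
            symbols: self.symbols.clone(),
            resolved_imports: HashMap::new(),
        }
    }
}

/// Build a parent interpreter with one imported symbol `f` and no local symbols.
pub fn parent_with_import() -> Interpreter {
    let mut resolved_imports = HashMap::new();
    resolved_imports.insert("f".to_string(), Definition { body: 42 });
    Interpreter {
        symbols: HashMap::new(),
        resolved_imports,
    }
}
```

In this model, `parent_with_import().call("f")` succeeds, while the same call on the interpreter returned by `branch_without_imports()` produces the `unknown callable` error on every run, which is the deterministic behavior described above.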

---

## Severity assessment

### Is the parallel path currently reachable with imports?

The `should_parallelize` threshold requires **all args to be direct `Call`
nodes to user-defined functions** (`is_direct_user_call`). A user-defined
function is one found in `module.symbols`, the local module's symbol table.
Imported symbols are not in `module.symbols`; they are in `resolved_imports`.

Therefore `is_direct_user_call` returns `false` for a call to an imported
symbol, and `should_parallelize` returns `false` as a result. The parallel
path is **never taken** when any arg is a call to an imported symbol.

This is the key finding. The gap is real but currently unreachable by
construction: the threshold that gates parallelism also excludes the case
that would expose the gap.
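
The gating argument can be sketched as follows. This is a hedged reconstruction from the description above, not the actual source: the `Expr` shape, the use of a `HashSet` for the local symbol table, and the non-empty check are assumptions, and the real threshold may include further conditions.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified expression node; the real AST has more variants.
pub enum Expr {
    Call { callee: String, args: Vec<Expr> },
    Literal(i64),
}

/// True only for a direct call whose callee is defined in the local module.
/// Imported symbols are not in `module_symbols`, so they always yield false.
pub fn is_direct_user_call(expr: &Expr, module_symbols: &HashSet<String>) -> bool {
    match expr {
        Expr::Call { callee, .. } => module_symbols.contains(callee),
        Expr::Literal(_) => false,
    }
}

/// Parallelize only when every argument is a direct call to a local symbol.
/// The non-empty check is an assumption added for this sketch.
pub fn should_parallelize(args: &[Expr], module_symbols: &HashSet<String>) -> bool {
    !args.is_empty() && args.iter().all(|arg| is_direct_user_call(arg, module_symbols))
}
```

Because a single imported callee makes `all` return false, any argument list that mentions an import falls back to the sequential path, where `resolved_imports` is consulted normally.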

### Severity: P3 (low)

- **Not reachable today.** The threshold check prevents the parallel path
  from firing on imported-symbol calls. No program can currently hit this
  error through normal use.
- **Deterministic when reachable.** If the threshold were relaxed (e.g., to
  support imported-symbol parallelism in v3.0), the failure would be
  immediate and obvious, not a data race or intermittent failure.
- **Well-documented.** The source comment names the limitation and the fix
  path. The CHANGELOG v2.0 known-limitations section documents it. This
  audit was scheduled before v3.0 parallel work begins.
- **No security surface.** The gap cannot be exploited to bypass effect
  checks or access unintended symbols: it fails closed (unknown callable
  error), not open.

Downgraded from the initial "unknown severity" to **P3**. It is a latent
gap, not an active defect.

---

## Fix path

When v3.0 parallel work begins and the threshold is relaxed to support
imported-symbol calls in parallel branches, the fix is:

```rust
let mut branch_interp = Interpreter {
    module,
    max_depth,
    depth: current_depth,
    prims: build_default_registry(),
    trace: EffectTrace::fresh(),
    resolved_imports: resolved_imports.clone(), // pass parent's imports
};
```

`resolved_imports` is a `HashMap<String, Definition>`. `Definition` derives
`Clone`. The clone cost is proportional to the number of imports; for the
current pipeline programs (3 imports) it is negligible. For programs with large
import sets, a shared `Arc<HashMap<...>>` would be preferable to avoid
repeated cloning.

The `eval_parallel_exprs` method signature would need to accept
`resolved_imports` as a parameter (or `Interpreter` would need to expose it
as a field accessible to the closure). The `rayon::scope` closure already
transmits `module` as a raw pointer; `resolved_imports` can be transmitted
the same way (it is immutable during parallel evaluation).
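
The `Arc` alternative mentioned above could look roughly like this. It is a minimal sketch under stated assumptions: `std::thread::scope` stands in for `rayon::scope` so the example needs no external crate, and `Definition`, `Imports`, and `eval_branches` are hypothetical names, not the project's API.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the real `Definition` type.
#[derive(Clone, Debug)]
pub struct Definition {
    pub body: i64,
}

/// Shared, read-only import table.
pub type Imports = Arc<HashMap<String, Definition>>;

/// Evaluate each callee name on its own scoped thread, sharing one import table.
pub fn eval_branches(imports: &Imports, callees: &[&str]) -> Vec<Result<i64, String>> {
    thread::scope(|scope| {
        // Spawn every branch first so they can run concurrently.
        let handles: Vec<_> = callees
            .iter()
            .map(|&name| {
                // Cloning an Arc bumps a reference count; the map is not copied.
                let branch_imports = Arc::clone(imports);
                scope.spawn(move || {
                    branch_imports
                        .get(name)
                        .map(|def| def.body)
                        .ok_or_else(|| format!("unknown callable: {:?}", name))
                })
            })
            .collect();
        // Join in spawn order so results line up with `callees`.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("branch panicked"))
            .collect::<Vec<_>>()
    })
}
```

Compared with cloning the `HashMap` per branch, the per-branch cost here is constant regardless of import count, and it avoids a raw pointer because `Arc` already guarantees the table outlives every branch.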

**Prerequisite:** before relaxing the threshold, add a regression test that:
1. Publishes a symbol to the store
2. Writes a module that imports it and calls it inside `list(...)`
3. Asserts the result is correct (not `unknown callable`)

This test will fail before the fix and pass after it, pinning the behavior.

---

## Effect check interaction

One subtlety: the transitive effect check (`check_transitive_effects`) runs
on the local module only. It does not walk into imported definitions. This
means a parallel branch that calls an imported effectful function would not
be caught by the static check; it would be caught at runtime by the effect
budget check in `call_with_vals`. This is the same behavior as the sequential
path, so the fix introduces no new risk.

---

## Decision

**Formally accept as P3: no fix required before v3.0 planning.**

Rationale:
- The gap is unreachable by construction under the current threshold.
- The fix is straightforward and well-understood.
- The correct time to apply the fix is when the threshold is relaxed as
  part of v3.0 parallel work, not before, because fixing it now would
  add code that is never exercised and cannot be tested.
- The regression test described above should be written as part of the
  v3.0 parallel work spec, not now.

**Action item for v3.0 spec:** include "pass `resolved_imports` to branch
interpreters" and the regression test as explicit requirements when the
parallel evaluator threshold is relaxed.

---

## Summary

| Item | Finding |
|---|---|
| Gap location | `eval_parallel_exprs`, branch interpreter construction |
| Failure mode | `unknown callable` when imported symbol called in parallel branch |
| Currently reachable | No: threshold excludes imported-symbol calls |
| Severity | P3 (latent, not active) |
| Fix complexity | Low: clone `resolved_imports` into branch interpreter |
| Fix timing | At v3.0 threshold relaxation, not before |
| Decision | Formally accepted as P3; fix deferred to v3.0 parallel work |
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
# AUD-OVERNIGHT-02 — parallel evaluator import handling: closed

**Date:** 2026-05-14
**Persona:** Quill
**Audit:** `dispatches/2026-05-14-aud-overnight-02-parallel-imports.md`

---

## What happened

Sable audited the parallel evaluator's import handling gap (AUD-OVERNIGHT-02).
The gap was documented in the Rust source and the v2.0 CHANGELOG known-limitations
section. This audit was the prerequisite for v3.0 parallel work.

## Key finding

The gap is **currently unreachable by construction.** The `should_parallelize`
threshold requires all args to be direct calls to symbols in `module.symbols`
(the local module's symbol table). Imported symbols live in `resolved_imports`,
not `module.symbols`, so `is_direct_user_call` returns false for them and
`should_parallelize` returns false. The parallel path never fires on
imported-symbol calls.

This means: no program can currently hit the `unknown callable` error through
the parallel path. The gap is latent, not active.

## Severity: P3

Downgraded from "unknown" to P3. Not reachable, deterministic when reachable,
well-documented, fails closed. No security surface.

## Decision: formally accepted, fix deferred to v3.0

The fix is straightforward: clone `resolved_imports` into branch interpreters.
The correct time to apply it is when the threshold is relaxed as part of v3.0
parallel work. Fixing it now would add untestable code.

The v3.0 parallel work spec must include:
1. Pass `resolved_imports` to branch interpreters when threshold is relaxed.
2. Regression test: import a symbol, call it inside `list(...)`, assert correct result.

## Gate: cleared

AUD-OVERNIGHT-02 is formally closed. v3.0 planning may proceed.
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
dispatch:
  version: "1.0"
  date: "2026-05-14"
  persona: Glyph
  subject: "AUD-OVERNIGHT-02 — parallel evaluator import handling: severity rating and decision"
  capability_hash: "sha256:42d73647ba8de29a7d219bf2218bad0a42dc2a11d7878cac12ee931be2a1a185"
  audit_id: "AUD-OVERNIGHT-02"
  gap: "branch interpreters created with empty resolved_imports; imported symbols unavailable in parallel branches"
  currently_reachable: false
  reason_unreachable: "should_parallelize threshold requires all args to be direct calls to module.symbols; imported symbols are in resolved_imports, not module.symbols"
  severity: "P3"
  decision: "formally accepted; fix deferred to v3.0 parallel work"
  fix_path: "clone resolved_imports into branch interpreter when threshold is relaxed"
  v3_requirement: "pass resolved_imports to branch interpreters + regression test when threshold relaxed"
  gate: "cleared — v3.0 planning may proceed"
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
# B-Team Prompt Update — v3.0

**Date:** 2026-05-14
**Persona:** Quill
**Trigger:** v3.0 closed; prompt needs updating before next case study

---

## What changed

`docs/GPT4O_PROMPT.md` updated from v2.0 to v3.0.

### Title
`T1-2 (updated for v2.0)` → `T1-2 (updated for v3.0)`

### Version string
`__version__` bumped from `1.0.0` to `3.0.0` (it was never bumped for v2.0; corrected now).
`docs/capability-0.1.json` regenerated. `dispatches/INDEX.md` regenerated.

### Manifest hash
`sha256:42d73647...` → `sha256:d900fe7e...`

### Generator
`codifide-python-2.0.0` → `codifide-python-3.0.0`

### RESOURCE 2 — surface rules

Added `bottom "reason"` entry:
> `bottom "reason"` (v3.0). `bottom` accepts an optional string payload. The reason is propagated through `RefusalError` for diagnostics. Bare `bottom` still works.

### RESOURCE 2 — `is_bottom` note

Updated to mention "with or without a reason string": `is_bottom` returns true for both.
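
The documented behavior can be modelled in a few lines. The sketch below is only an illustration under assumptions: the `Value` enum, the `unwrap_or_refuse` helper, and the shape of `RefusalError` are hypothetical and do not reflect the interpreter's real types.

```rust
/// Hypothetical value model: `bottom` carries an optional reason string.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Bottom(Option<String>),
}

/// Hypothetical error type named after the `RefusalError` mentioned above.
#[derive(Debug, PartialEq)]
pub struct RefusalError {
    pub reason: Option<String>,
}

/// Returns true for `bottom` with or without a reason string.
pub fn is_bottom(value: &Value) -> bool {
    matches!(value, Value::Bottom(_))
}

/// Propagate the optional reason into the error for diagnostics.
pub fn unwrap_or_refuse(value: Value) -> Result<Value, RefusalError> {
    match value {
        Value::Bottom(reason) => Err(RefusalError { reason }),
        other => Ok(other),
    }
}
```

Keeping the reason as an `Option<String>` is what lets bare `bottom` keep working: it is simply the `None` case, and `is_bottom` never inspects the payload.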

### RESOURCE 2 — content-addressed imports

RPC API section updated to `v2.0+`. New V3-2 remote symbol resolution section added:
- `codifide store push sha256:<hash> --registry <url>`
- `codifide run <file> --registry <url>`
- `codifide serve --read-only`

### Manifest JSON block

`is_bottom` primitive entry updated with `note` field documenting reason-string behavior.

---

## What was NOT changed

- Task spec (Programs 1–5): unchanged; still the right test surface
- FOR_AGENTS.md section: unchanged
- Primitive table: no new primitives in v3.0
- Surface keyword table: unchanged

---

*Filed by: Douglas Jones + Claude*
Lines changed: 19 additions & 0 deletions
@@ -0,0 +1,19 @@
id: 2026-05-14-bteam-prompt-v3
date: 2026-05-14
persona: Quill
kind: maintenance
title: "B-Team Prompt Update — v3.0"
status: complete
files_changed:
  - docs/GPT4O_PROMPT.md
  - codifide/__init__.py
  - docs/capability-0.1.json
  - dispatches/INDEX.md
manifest_hash_before: sha256:42d73647ba8de29a7d219bf2218bad0a42dc2a11d7878cac12ee931be2a1a185
manifest_hash_after: sha256:d900fe7e6d91300424b226cda0fd404bf281c4362a70131dbec116548b310ff2
version_before: "1.0.0"
version_after: "3.0.0"
summary: >
  GPT4O_PROMPT.md updated for v3.0: manifest hash, generator string, bottom "reason"
  syntax, is_bottom note, V3-2 remote resolution section. __version__ bumped to 3.0.0
  (was never bumped from 1.0.0). Capability manifest and dispatch index regenerated.
Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
# FIND-G1 — believe arm value on continuation line

**Date:** 2026-05-14
**Persona:** Quill
**Trigger:** Gemini 2.5 Pro v3.0 case study — P3 failure

---

## Finding

A `believe` arm whose value appears on the next indented line after `=>` failed
with `ParseError: unexpected end of expression`. The parser split each arm on
`=>` within a single physical line; a line ending with `=>` had an empty
right-hand side.

```codifide
# Previously failed — value on next line
believe label
  ge(conf(label), 0.0) =>
    if eq(label, "unsafe") then "blocked"
    else if eq(label, "safe") then "approved"
    else "escalate-to-human"
  else => bottom
```

This is a natural formatting choice when the arm value is a long
`if/then/else` expression. The error was not obvious from the docs.

---

## Fix

### Python parser (`codifide/parser/parser.py`)

`_parse_believe` updated: when the right-hand side of a `=>` arm is empty,
call `_gather_expr` on the next indented line to collect the value (including
multi-line `if/then/else` continuations). Same logic applied to `else =>` arms.

### Rust parser (`crates/codifide-interpreter/src/parser/mod.rs`)

`parse_believe` updated with the same logic: empty right-hand side after `=>`
triggers `gather_expr` on the next line.
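
The continuation rule can be sketched as follows. This is a minimal illustration of the idea, not the actual `parse_believe` or `gather_expr`: the function names `parse_arm` and `indent_of`, the indentation-based stopping rule, and the naive split on the first `=>` are all assumptions.

```rust
/// Parse one `believe` arm starting at `lines[start]`.
/// Returns ((condition, value), index of the next unread line), or None on failure.
pub fn parse_arm(lines: &[&str], start: usize) -> Option<((String, String), usize)> {
    let line = *lines.get(start)?;
    // Naive split on the first `=>`; a real parser would tokenize instead.
    let (cond, rhs) = line.split_once("=>")?;
    let cond = cond.trim().to_string();
    let rhs = rhs.trim();
    if !rhs.is_empty() {
        // Value is on the same physical line as `=>`.
        return Some(((cond, rhs.to_string()), start + 1));
    }
    // Empty right-hand side: gather following lines indented deeper than the arm.
    let arm_indent = indent_of(line);
    let mut parts = Vec::new();
    let mut next = start + 1;
    while next < lines.len()
        && !lines[next].trim().is_empty()
        && indent_of(lines[next]) > arm_indent
    {
        parts.push(lines[next].trim());
        next += 1;
    }
    if parts.is_empty() {
        return None; // stands in for "unexpected end of expression"
    }
    Some(((cond, parts.join(" ")), next))
}

/// Number of leading whitespace bytes on a line.
fn indent_of(line: &str) -> usize {
    line.len() - line.trim_start().len()
}
```

Applied to the example in the Finding section, the first arm's value is collected from the three deeper-indented lines and joined into one expression, and parsing resumes at the `else => bottom` line, which takes the same-line path.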

### Documentation (`docs/AGENT_QUICKREF.md`)

Added to "Surface rules that surprised other agents":

> `believe` arm values must be on the same line as `=>` — OR on the next
> indented line. (After this fix, both work.)

Note: the quickref was updated to document the constraint before the fix was
confirmed. The fix makes the constraint moot; both forms now work.

### Tests (`tests/test_parser.py`)

3 regression tests added:
- `test_believe_arm_value_on_next_line`: simple value on next line
- `test_believe_arm_value_multiline_if_on_next_line`: multi-line if/then/else (Gemini's exact pattern)
- `test_believe_else_arm_value_on_next_line`: else arm value on next line

---

## Tests at close

386 passing, 0 skipped (383 + 3 new).

---

*Filed by: Douglas Jones + Claude*
