Phase H.5 host integration: safe becomes a first-class OMC keyword

RandomCoder-lab · claude · RandomCoder-lab · commit 6e1b5f43c4a8 · 2026-05-14T15:52:34.000-05:00
Until now, `safe a / b` and `safe arr_get(a, idx)` only worked inside the OMC-written self-healing compiler demos (self_healing_h4.omc, h5.omc), which carry their own OMC-side parser/AST/encoder/executor. The host Rust parser/interpreter didn't know `safe` as a keyword; it would tokenize as an unknown identifier.

This integration brings `safe` into the language proper:

- parser.rs: Token::Safe variant + "safe" keyword recognition
- ast.rs: Expression::Safe(Box<Expression>) variant
- parser.rs: parse_expression peeks for Safe and wraps the rest. Bare `safe arr_set(buf, i, v);` statements work via the existing expression-statement fallback.
- interpreter.rs: eval_expr dispatches Safe by inner shape:
    Div(l, r)          → safe_divide(l, r)
    arr_get(a, idx)    → safe_arr_get(a, idx)
    arr_set(a, idx, v) → safe_arr_set(a, idx, v)
  Unknown shapes evaluate inner directly.
- compiler.rs: Same dispatch for the Rust VM bytecode path, plus type inference delegates Safe to its inner.

examples/safe_keyword_host.omc: new minimal smoke test that runs without any self-healing-compiler infrastructure. Eight outputs, all correct:

    safe 89 / 0                  → 89
    compute(144, 0) [dynamic /0] → 144
    compute(144, 3)              → 48
    safe arr_get(xs, 999)        → 20   [fold(999)=610, 610%3=1]
    safe arr_get(xs, 1)          → 20
    safe arr_set(xs, 999, 99)    → xs becomes [10, 99, 30]

The mutation case works cleanly through tree-walk because the interpreter pattern-matches Safe(Call("arr_set", ...)) before any synthetic-arg shim runs: safe_arr_set receives the actual Expression::Variable it needs and writes back to the caller's scope.

Known gap: Safe(Call("arr_set", ...)) compiled to bytecode and run through the Rust VM still routes via vm_call_builtin's synthetic-arg shim and loses the mutation. This is the same shape gap V.7c closed for arr_set with Op::ArrSetNamed; a future Op::SafeArrSetNamed would close it here too. Tree-walk works cleanly today; the named-mutation gap is documented and bounded.

Regression: V.9b ✓✓✓ still passes. H.5 OMC demo file (six sub-demos) all converge. No breakage of the existing surface.

The Phase H story is no longer "fork the self-healing-compiler demo file." It's "write `safe` where you'd write a runtime guard."

This commit also includes the (slightly delayed) Phase H.5.1 CHANGELOG entry covering the bytecode-VM gap closure shipped in commit 1deae52.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
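
The inner-shape dispatch described in the commit message can be sketched on its own, outside the interpreter. The following is a minimal illustration, not the code from this commit: the trimmed `Expression` enum, the `safe_target` helper, and the small constructor functions are all hypothetical names introduced here. It only reports which primitive a `Safe` node would route to; it does not evaluate anything or implement `fold`.

```rust
// Hypothetical, trimmed-down AST: only the variants needed to show the dispatch.
#[derive(Debug, Clone, PartialEq)]
enum Expression {
    Number(i64),
    Variable(String),
    Div(Box<Expression>, Box<Expression>),
    Call { name: String, args: Vec<Expression> },
    Safe(Box<Expression>),
}

// Hypothetical helper (not in the commit): names the primitive a `Safe` node
// routes to, or None when the inner expression would be evaluated unwrapped.
fn safe_target(expr: &Expression) -> Option<&'static str> {
    match expr {
        Expression::Safe(inner) => match inner.as_ref() {
            Expression::Div(_, _) => Some("safe_divide"),
            // The arity guards mirror the `args.len()` checks in the diff.
            Expression::Call { name, args } if name == "arr_get" && args.len() == 2 => {
                Some("safe_arr_get")
            }
            Expression::Call { name, args } if name == "arr_set" && args.len() == 3 => {
                Some("safe_arr_set")
            }
            _ => None,
        },
        _ => None,
    }
}

// Small constructors (illustrative only) to keep the examples readable.
fn num(n: i64) -> Expression {
    Expression::Number(n)
}
fn var(name: &str) -> Expression {
    Expression::Variable(name.to_string())
}
fn call(name: &str, args: Vec<Expression>) -> Expression {
    Expression::Call { name: name.to_string(), args }
}
fn safe(inner: Expression) -> Expression {
    Expression::Safe(Box::new(inner))
}

fn main() {
    let cases = vec![
        safe(Expression::Div(Box::new(num(89)), Box::new(num(0)))),
        safe(call("arr_get", vec![var("xs"), num(999)])),
        safe(call("arr_set", vec![var("xs"), num(999), num(99)])),
        safe(call("arr_get", vec![var("xs")])), // wrong arity: falls through
        safe(num(7)),                           // unknown shape: falls through
    ];
    for case in &cases {
        println!("{:?} -> {:?}", case, safe_target(case));
    }
}
```

Because the guards check both the call name and the argument count, a `safe arr_get(xs)` with the wrong arity takes the same fall-through path as any other unknown shape, which matches the "unknown shapes evaluate inner directly" rule above.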
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,47 @@ All notable changes to OMNIcode will be documented in this file.
 
 ## [Unreleased]
 
+### Added (Phase H.5 host-language integration: `safe` as a first-class keyword, 2026-05-14)
+
+🎯 **`safe` is now a host-level OMC keyword — no self-healing-demo infrastructure required.**
+
+Until now, `safe a / b` and `safe arr_get(a, idx)` only worked inside the OMC-written self-healing compiler demos (`examples/self_healing_h4.omc`, `h5.omc`), which carry their own OMC-side parser, AST, encoder, and executor. The host Rust parser/interpreter didn't know `safe` as a keyword — it would tokenize as an unknown identifier.
+
+This integration brings `safe` into the language proper:
+
+| Layer | Change |
+|---|---|
+| Lexer (`parser.rs`) | New `Token::Safe`; `"safe"` keyword recognized |
+| AST (`ast.rs`) | New `Expression::Safe(Box<Expression>)` variant |
+| Parser (`parser.rs`) | `parse_expression` peeks for `Token::Safe`, wraps the rest of the expression. Bare statements (`safe arr_set(buf, i, v);`) work via the existing expression-statement fallback |
+| Interpreter (`interpreter.rs`) | `Expression::Safe(inner)` pattern-matches the inner shape: `Div(l, r)` → `safe_divide(l, r)`, `Call("arr_get", ...)` → `safe_arr_get(...)`, `Call("arr_set", ...)` → `safe_arr_set(...)`; unknown shapes evaluate the inner directly |
+| Compiler (`compiler.rs`) | `Expression::Safe(inner)` lowers to the matching `Op::Call("safe_*", n)` for known shapes; type inference delegates to the inner expression |
+
+#### Smoke test (`examples/safe_keyword_host.omc`)
+
+Eight assertions, all pass on the host interpreter without any OMC-written self-healing wrapper:
+
+- `safe 89 / 0 → 89`
+- `compute(144, 0) → 144` (dynamic zero healed)
+- `compute(144, 3) → 48`
+- `safe arr_get([10,20,30], 999) → 20` (fold(999)=610, 610%3=1)
+- `safe arr_get([10,20,30], 1) → 20`
+- `safe arr_set(xs, 999, 99)` writes xs[1]=99; xs[0] and xs[2] unchanged
+
+The mutation case (the H.5 named-store fix in OMC bytecode) is naturally clean through tree-walk because the interpreter pattern-matches `Safe(Call("arr_set", [Variable(name), ...]))` before any synthetic-arg shim runs — `safe_arr_set` receives the actual `Expression::Variable(name)` it needs and writes back to the caller's scope.
+
+#### What still doesn't work
+
+`Safe(Call("arr_set", ...))` compiled to bytecode and run through the Rust VM lowers to `Op::Call("safe_arr_set", 3)`, which routes via `vm_call_builtin`'s synthetic-arg shim → mutation lost. This is the same gap V.7c closed for `arr_set` with `Op::ArrSetNamed`. A future `Op::SafeArrSetNamed(String)` would close it here too. Tonight's scope kept the Rust-VM bytecode path on the existing call shim — tree-walk works cleanly, the named-mutation gap is documented and bounded.
+
+#### Why this matters
+
+The H.4/H.5 OMC-written demos remain the architecturally pure proof — the bytecode VM rewrites and executes `safe` semantics end-to-end on the φ-math substrate. But for a developer who just wants the feature in their OMC code, it's now a one-keyword opt-in at the language level. The Phase H story is no longer "fork the self-healing-compiler demo file." It's "write `safe` where you'd write a runtime guard."
+
+### Added (Phase H.5.1: close the safe arr_set bytecode-VM gap, 2026-05-14)
+
+`examples/self_healing_h5.omc` — `safe arr_set(VAR, idx, val)` works through the OMC bytecode VM, not just under tree-walk. New `SAFE_ARR_SET_NAMED varname` opcode in the OMC-written executor mirrors V.7c's `ARR_SET_NAMED` pattern: the variable name rides on the opcode itself rather than going through `CALL_BUILTIN`'s synthetic-arg shim that copies array arguments. Encoder detects bare-VAR first-arg shape and emits the named form; executor pops idx/val, looks up array in scope, computes fold-and-mod healed index, mutates, writes back. Demo 4b verifies: `[55, 13, 0, 0, 34]` buffer state after four `safe arr_set` writes with `idx ∈ {0, 100, -1, 6}`. Six demos, six convergences.
+
 ### Added (Phase H.5: array-bounds healing via fold_escape on the index, 2026-05-14)
 
 🎯 **`examples/self_healing_h5.omc` — `safe arr_get(a, idx)` and `safe arr_set(a, idx, v)` make out-of-bounds accesses total.**
diff --git a/README.md b/README.md
@@ -42,6 +42,7 @@ What this is **not**: a fast runtime, a production toolchain, a stable API, a de
 | Self-healing across two stages (token + AST), 5 bugs healed in one source | `examples/self_healing_h3.omc` | All four demos converge; `safe(8) → 8` on the integrated case |
 | User-declared runtime self-healing via `safe` keyword | `examples/self_healing_h4.omc` | `compute(144, 0) → 144` — runtime crash converted to finite answer on attractor |
 | Array-bounds healing — out-of-bounds reads become attractor-landing | `examples/self_healing_h5.omc` | Loop walking 8 indices off a 5-element array; every output has `φ=1.000` |
+| Host-level `safe` keyword — works in any OMC program, not just the self-healing demos | `examples/safe_keyword_host.omc` | `safe 89/0 → 89`, `safe arr_get(xs, 999) → 20`, `safe arr_set(xs, 999, 99)` mutates xs[1] |
 
 Run any of these with the binary built from this repo:
 
diff --git a/examples/safe_keyword_host.omc b/examples/safe_keyword_host.omc
@@ -0,0 +1,38 @@
+# =============================================================================
+# Host-level `safe` keyword demo
+# =============================================================================
+# As of Phase H.5 (host-language integration), the `safe` keyword is part of
+# OMC the language — recognised by the Rust parser, AST, and interpreter
+# directly. No self-healing compiler infrastructure (H.1-H.5 OMC-written
+# demo files) is required; any OMC program can use `safe` as a one-keyword
+# opt-in to runtime self-healing semantics.
+#
+# Three supported shapes today:
+#   safe a / b              → safe_divide(a, b)
+#   safe arr_get(a, idx)    → safe_arr_get(a, idx)
+#   safe arr_set(a, idx, v) → safe_arr_set(a, idx, v)
+#
+# This file is a minimal smoke test. Run with:
+#   ./target/release/omnimcode-standalone examples/safe_keyword_host.omc
+# =============================================================================
+
+# --- safe divide ---
+
+print(safe 89 / 0);                     # 89: fold(0)=1, 89/1=89
+
+fn compute(a, b) { return safe a / b; }
+print(compute(144, 0));                 # 144: dynamic zero healed at runtime
+print(compute(144, 3));                 # 48:  in-bounds division unchanged
+
+# --- safe array read out of bounds ---
+
+h xs = [10, 20, 30];
+print(safe arr_get(xs, 999));           # 20:  fold(999)=610, 610%3=1, xs[1]
+print(safe arr_get(xs, 1));             # 20:  in-bounds, no rewrite needed
+
+# --- safe array write out of bounds ---
+
+safe arr_set(xs, 999, 99);              # writes xs[fold(999) % 3] = xs[1] = 99
+print(arr_get(xs, 0));                  # 10:  unchanged
+print(arr_get(xs, 1));                  # 99:  the write landed here
+print(arr_get(xs, 2));                  # 30:  unchanged
diff --git a/omnimcode-core/src/ast.rs b/omnimcode-core/src/ast.rs
@@ -113,6 +113,17 @@ pub enum Expression {
     // Harmonic operations
     Resonance(Box<Expression>),
     Fold(Box<Expression>),
+
+    // H.5: user-declared runtime self-healing intent.
+    // `safe <expr>` wraps an expression in self-healing semantics.
+    // The interpreter pattern-matches the inner expression at eval
+    // time and routes to the appropriate ONN primitive:
+    //   safe a / b              → safe_divide(a, b)
+    //   safe arr_get(a, idx)    → safe_arr_get(a, idx)
+    //   safe arr_set(a, idx, v) → safe_arr_set(a, idx, v)
+    // Other shapes fall through to evaluating the inner expression
+    // directly (no-op), reserving the slot for future runtime guards.
+    Safe(Box<Expression>),
 }
 
 impl Expression {
diff --git a/omnimcode-core/src/compiler.rs b/omnimcode-core/src/compiler.rs
@@ -134,6 +134,12 @@ impl Compiler {
                 })
             }
             Expression::Index { .. } => None,
+            // H.5: `safe <expr>` evaluates to the same type as the inner
+            // expression after self-healing dispatch. For Div the result is
+            // int-or-float same as Div itself; for arr_get/arr_set the
+            // result mirrors the wrapped call. Delegating to the inner
+            // gives the right answer in every supported shape.
+            Expression::Safe(inner) => self.infer_type(inner),
         }
     }
 
@@ -405,6 +411,40 @@ impl Compiler {
                 }
                 self.emit(Op::Call(name.clone(), args.len()));
             }
+            Expression::Safe(inner) => {
+                // H.5 host-level: lower `safe <expr>` to the matching
+                // ONN primitive call. The host primitives (safe_divide,
+                // safe_arr_get, safe_arr_set) handle the fold-and-mod /
+                // fold-escape logic at runtime. For shapes we don't have
+                // a primitive for, just compile the inner directly.
+                //
+                // KNOWN GAP: Safe(arr_set(VAR, ...)) goes through Op::Call
+                // which routes via the vm_call_builtin shim — the mutation
+                // is lost when run through the Rust VM. Tree-walk works
+                // fine because the interpreter pattern-matches Safe before
+                // any shim. A future Op::SafeArrSetNamed would close this
+                // gap (same shape as Op::ArrSetNamed in the existing VM).
+                match inner.as_ref() {
+                    Expression::Div(l, r) => {
+                        self.compile_expr(l)?;
+                        self.compile_expr(r)?;
+                        self.emit(Op::Call("safe_divide".to_string(), 2));
+                    }
+                    Expression::Call { name, args } if name == "arr_get" && args.len() == 2 => {
+                        for arg in args {
+                            self.compile_expr(arg)?;
+                        }
+                        self.emit(Op::Call("safe_arr_get".to_string(), 2));
+                    }
+                    Expression::Call { name, args } if name == "arr_set" && args.len() == 3 => {
+                        for arg in args {
+                            self.compile_expr(arg)?;
+                        }
+                        self.emit(Op::Call("safe_arr_set".to_string(), 3));
+                    }
+                    _ => self.compile_expr(inner)?,
+                }
+            }
         }
         Ok(())
     }
diff --git a/omnimcode-core/src/interpreter.rs b/omnimcode-core/src/interpreter.rs
@@ -517,6 +517,25 @@ impl Interpreter {
                     _ => Ok(Value::HInt(HInt::new(0))),
                 }
             }
+            Expression::Safe(inner) => {
+                // H.5: dispatch user-declared safe semantics by inner shape.
+                // Known shapes route to the matching ONN primitive; everything
+                // else is evaluated unwrapped (reserves the slot for future
+                // runtime guards on more call patterns).
+                match inner.as_ref() {
+                    Expression::Div(l, r) => {
+                        let args = vec![(**l).clone(), (**r).clone()];
+                        self.call_function("safe_divide", &args)
+                    }
+                    Expression::Call { name, args } if name == "arr_get" && args.len() == 2 => {
+                        self.call_function("safe_arr_get", args)
+                    }
+                    Expression::Call { name, args } if name == "arr_set" && args.len() == 3 => {
+                        self.call_function("safe_arr_set", args)
+                    }
+                    _ => self.eval_expr(inner),
+                }
+            }
         }
     }
 
diff --git a/omnimcode-core/src/parser.rs b/omnimcode-core/src/parser.rs
@@ -23,7 +23,8 @@ pub enum Token {
     As,
     Res,
     Fold,
-    
+    Safe,        // H.5 host-level support: `safe <expr>` prefix
+
     // Identifiers and literals
     Ident(String),
     Number(i64),
@@ -326,6 +327,7 @@ impl Lexer {
                         "as" => Token::As,
                         "res" => Token::Res,
                         "fold" => Token::Fold,
+                        "safe" => Token::Safe,
                         "and" => Token::And,
                         "or" => Token::Or,
                         "not" => Token::Not,
@@ -968,6 +970,16 @@ impl Parser {
     }
 
     fn parse_expression(&mut self) -> Result<Expression, String> {
+        // H.5: `safe <expr>` prefix wraps the rest of the expression in
+        // self-healing semantics. The interpreter dispatches at eval time
+        // based on the inner shape (Div → safe_divide, arr_get → safe_arr_get,
+        // etc). Mirrors the OMC-written parser's behaviour in
+        // examples/self_healing_h5.omc.
+        if self.current() == Token::Safe {
+            self.advance();
+            let inner = self.parse_or()?;
+            return Ok(Expression::Safe(Box::new(inner)));
+        }
         self.parse_or()
     }