Commit 143e671

Phase V.3: self-hosting parser + type-aware equality fix
examples/self_hosting_parser.omc: a recursive-descent parser written in OMNIcode that consumes the token stream from V.1/V.2 and emits an AST as nested tagged arrays (the canonical Python OMC convention).

The language can now READ its own source (lexer V.1/V.2) and STRUCTURE it (parser V.3). Two of four self-hosting steps in place.

## AST node shapes

    ["NUMBER", "42"]
    ["FLOAT", "3.14"]
    ["STRING", "hello"]
    ["BOOL", "true"]
    ["VAR", "x"]
    ["BINOP", "+", left, right]
    ["CALL", name, [arg1, arg2, ...]]
    ["VARDECL", name, value]
    ["ASSIGN", name, value]
    ["IF", cond, then_body, else_body]
    ["WHILE", cond, body]
    ["RETURN", value_or_null]
    ["PRINT", expr]
    ["FNDEF", name, params, body]
    ["EXPRSTMT", expr]

## Precedence ladder

    parse_comparison (==, !=, <, <=, >, >=)
      -> parse_additive (+, -)
      -> parse_multiplicative (*, /, %)
      -> parse_primary (literals, parens, calls, variables)

Mutually recursive across statements and expressions. Position is threaded explicitly as a return-array pair `[ast_node, next_pos]` because OMC has no mutable references.

## Verified on 4 demo inputs

1. h x = 89 + 144;
   → VARDECL x = (BINOP + 89 144)
2. if x == 89 { return x; } else { return 0; }
   → IF (==) then: RETURN(VAR x) else: RETURN(NUMBER 0)
3. fn fib(n) { return fib(n-1) + fib(n-2); }
   → FNDEF fib(n) body: RETURN(BINOP + (CALL fib (n-1)) (CALL fib (n-2)))
4. while i < 10 { sum = sum + i; i = i + 1; }
   → WHILE (< i 10) body: ASSIGN sum ... ASSIGN i ...

Tree-walk and VM produce bit-identical output.

## Bug surfaced: type-aware equality

The parser writes `if v == "null"` to check for body-less return. With the old equality rule (coerce both sides to int, compare), this was TRUE for any array v because:

    to_int(["VAR", "x"]) -> 0
    to_int("null")       -> 0
    0 == 0               -> true

Every RETURN value was being rendered as "(no value)". V.1 (commit e85bb01) fixed the narrower String==String case. This commit fixes the BROAD form via a values_equal helper used by both the tree-walk interpreter and the VM's cmp_op:

- Same-type values: structural equality (recursive for arrays).
- String vs non-string: only equal if the string parses as the corresponding numeric.
- Mixed Array / Circuit / Singularity vs anything else: never equal.
- Numeric / Bool / Null: standard int-or-float coercion.

This is the THIRD silent bug self-hosting work has flushed out. The language stress-tests itself by being asked to do real work in itself.

## Tests

141 still passing. Canonical sweep still 22/30 in both modes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 4740d76 commit 143e671

4 files changed

Lines changed: 806 additions & 22 deletions

CHANGELOG.md

Lines changed: 42 additions & 0 deletions
@@ -4,6 +4,48 @@ All notable changes to OMNIcode will be documented in this file.

## [Unreleased]

### Added (Phase V.3: self-hosting parser, 2026-05-13)

`examples/self_hosting_parser.omc`: a recursive-descent parser written in OMNIcode that consumes a token stream from V.1/V.2 and emits an AST as **nested tagged arrays** (the canonical Python OMC convention). The OMC language can now both *read* its own source (lexer) and *structure* it (parser). Two of four steps toward true self-hosting are in place.

**AST node shapes:**
- `["NUMBER", "42"]`, `["FLOAT", "3.14"]`, `["STRING", "hello"]`, `["BOOL", "true"]`
- `["VAR", "x"]`
- `["BINOP", "+", left, right]`
- `["CALL", name, [arg1, arg2, ...]]`
- `["VARDECL", name, value]`, `["ASSIGN", name, value]`
- `["IF", cond, then_body, else_body]`
- `["WHILE", cond, body]`
- `["RETURN", value_or_null]`, `["PRINT", expr]`
- `["FNDEF", name, params, body]`, `["EXPRSTMT", expr]`

**Precedence ladder:** `parse_comparison` (==, !=, <, <=, >, >=) → `parse_additive` (+, -) → `parse_multiplicative` (*, /, %) → `parse_primary`. Mutually recursive across statements and expressions. Position-threading via return-array pairs (no mutable references in OMC).
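
The ladder and the position-threading convention can be illustrated with a minimal Python sketch. This is not the OMC source: the token shape `[kind, text]`, the token kinds (`NUMBER`, `IDENT`, `OP`, `LPAREN`, `RPAREN`), the helpers `peek` and `parse_binary_level`, and left-associativity are all assumptions made for illustration. Calls and statements are omitted. Each function returns `[ast_node, next_pos]` instead of mutating a shared cursor.

```python
# Hypothetical token shape: [kind, text]. The real V.1/V.2 lexer output may differ.
def peek(tokens, pos):
    """Return the token at pos, or a synthetic EOF token past the end."""
    return tokens[pos] if pos < len(tokens) else ["EOF", ""]


def parse_primary(tokens, pos):
    """Literals, variables, and parenthesised expressions (calls omitted)."""
    kind, text = peek(tokens, pos)
    if kind == "NUMBER":
        return [["NUMBER", text], pos + 1]
    if kind == "IDENT":
        return [["VAR", text], pos + 1]
    if kind == "LPAREN":
        node, pos = parse_comparison(tokens, pos + 1)
        if peek(tokens, pos)[0] != "RPAREN":
            raise SyntaxError("expected ')'")
        return [node, pos + 1]
    raise SyntaxError(f"unexpected token {text!r}")


def parse_binary_level(tokens, pos, ops, parse_operand):
    """One rung of the ladder: operand (op operand)*, folded left to right."""
    left, pos = parse_operand(tokens, pos)
    while True:
        kind, text = peek(tokens, pos)
        if kind != "OP" or text not in ops:
            return [left, pos]
        right, pos = parse_operand(tokens, pos + 1)
        left = ["BINOP", text, left, right]


def parse_multiplicative(tokens, pos):
    return parse_binary_level(tokens, pos, ("*", "/", "%"), parse_primary)


def parse_additive(tokens, pos):
    return parse_binary_level(tokens, pos, ("+", "-"), parse_multiplicative)


def parse_comparison(tokens, pos):
    ops = ("==", "!=", "<", "<=", ">", ">=")
    return parse_binary_level(tokens, pos, ops, parse_additive)


tokens = [["NUMBER", "89"], ["OP", "+"], ["NUMBER", "144"]]
ast, next_pos = parse_comparison(tokens, 0)
print(ast)  # ['BINOP', '+', ['NUMBER', '89'], ['NUMBER', '144']]
```

Because every function hands back the next position alongside the node, the caller always knows where to resume, which is what makes the approach workable in a language without mutable references.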

**Verified on 4 demo inputs:**
1. `h x = 89 + 144;` → correct VARDECL with nested BINOP.
2. `if x == 89 { return x; } else { return 0; }` → IF with proper then/else bodies, RETURN children intact.
3. `fn fib(n) { return fib(n-1) + fib(n-2); }` → FNDEF with recursive CALL inside BINOP inside RETURN. The parser handles the full recursive depth.
4. `while i < 10 { sum = sum + i; i = i + 1; }` → WHILE with assignment body.

Tree-walk and VM produce **bit-identical output**. 141 tests still pass.

### Fixed (surfaced by Phase V.3)

**Silent type-coercion bug in `==` / `!=`.** The string-vs-string case was already fixed in V.1 (commit `e85bb01`). The parser surfaced the broader form: `["VAR", "x"] == "null"` was returning *true* because:
- `to_int(["VAR", "x"])` → 0 (arrays don't parse)
- `to_int("null")` → 0 (string doesn't parse)
- 0 == 0 → true

The parser's `print_ast` had `if v == "null"` to detect bodyless `RETURN;` — and every RETURN body was being rendered as `(no value)` because of this.
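
The failure mode is easy to reproduce outside OMC. The following Python sketch models the old rule under stated assumptions: `to_int` and `old_equal` are hypothetical stand-ins, not the interpreter's actual functions, and the only behaviour taken from the changelog is that unparseable values coerce to 0.

```python
def to_int(value):
    """Hypothetical model of the old coercion: anything that fails to parse becomes 0."""
    if isinstance(value, (bool, int, float, str)):
        try:
            return int(value)
        except (ValueError, OverflowError):
            return 0  # e.g. the string "null"
    return 0  # arrays and every other non-scalar value


def old_equal(a, b):
    """Old rule: coerce both sides to int, then compare."""
    return to_int(a) == to_int(b)


print(old_equal(["VAR", "x"], "null"))  # True
```

Under this rule any array compares equal to any non-numeric string, so a sentinel check such as `v == "null"` cannot distinguish a real AST node from the sentinel.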

Fixed in both the tree-walk interpreter and the VM with a type-aware `values_equal` helper:
- Same-type values: structural equality (recursive for arrays).
- `String` vs non-string: only equal if the string parses as the corresponding numeric.
- Mixed Array / Circuit / Singularity vs anything else: never equal.
- All-numeric / Bool / Null: standard int-or-float coercion.
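
As a rough illustration of these four rules, here is a Python sketch that shares the helper's name but is not its actual code. Several points are assumptions the changelog does not settle: Python lists stand in for OMC arrays, `None` stands in for null and coerces to 0, a string never equals a bool or null, and a string only matches an int if it parses as an int. Circuit and Singularity values are left out entirely.

```python
def _as_number(value):
    """Assumed coercion for the numeric / bool / null group: null counts as 0."""
    return 0 if value is None else value


def values_equal(a, b):
    # Arrays: equal only to another array with pairwise-equal elements.
    if isinstance(a, list) or isinstance(b, list):
        if not (isinstance(a, list) and isinstance(b, list)):
            return False
        return len(a) == len(b) and all(values_equal(x, y) for x, y in zip(a, b))

    # String vs string: exact comparison, no numeric coercion.
    if isinstance(a, str) and isinstance(b, str):
        return a == b

    # String vs non-string: equal only if the string parses as that kind of number.
    if isinstance(a, str) or isinstance(b, str):
        text, other = (a, b) if isinstance(a, str) else (b, a)
        if other is None or isinstance(other, bool):
            return False  # assumption: strings never equal null or bool
        try:
            parsed = float(text) if isinstance(other, float) else int(text)
        except ValueError:
            return False
        return parsed == other

    # Numeric / bool / null: ordinary int-or-float comparison.
    return _as_number(a) == _as_number(b)


print(values_equal(["VAR", "x"], "null"))  # False
print(values_equal("null", "null"))        # True
print(values_equal("42", 42))              # True
```

The array check comes first so that an array can never fall through to a coercing branch, which is the path that produced the original false positive.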

This is the third class of silent bug self-hosting work has flushed out (after string equality in V.1 and the VM array-mutation shim, also in V.1). The water keeps sanding.

### Added (Phase V.2: self-hosting lexer polish, 2026-05-13)

`examples/self_hosting_lexer_v2.omc` — the milestone-1 lexer extended with everything needed to tokenize real-world OMC programs:
