Releases: RandomCoder-lab/OMC

v0.0.2 — Language core: parser, VM, self-hosting

17 May 16:47

WHAT CHANGED

  • Phase A-G: HFloat, phi.X modules, pragmas, types, HSingularity as
    Value variant, stdlib + conformance tests, triple-quoted strings,
    imports, real module resolution.
  • Phase H-M: bytecode VM with tree-walk parity, bitwise ops, optimizer
    (constant folding + peephole), resonance caching, typed HIR.
  • Phase N: Phi-Field LLM kernel demo with OMNIweights.
  • Phase O: ONN self-healing primitives (Fibonacci alignment auto-repair).
  • Phase P-U: bytecode disassembler, VM inline cache, source positions,
    criterion bench suite.
  • Phase V (V.1 → V.9b): self-hosting lexer → parser → codegen → SELF-
    HOSTING FIXPOINT (OMC compiles its own compiler) → bytecode bootstrap
    fixpoint → UTF-8 safety → gen2 == gen3 byte-identical.
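
The Phase H-M optimizer (constant folding + peephole) can be pictured with a small sketch. The `Expr` type and `fold` function below are hypothetical stand-ins, since these notes do not show OMC's IR, and the wrapping integer arithmetic is an assumption rather than OMC's documented semantics.

```rust
// Hypothetical mini-AST for illustration; not OMC's actual IR.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Fold constant subexpressions bottom-up, then apply two peephole
/// identities: x + 0 => x and x * 1 => x.
fn fold(e: Expr) -> Expr {
    match e {
        Expr::Add(a, b) => match (fold(*a), fold(*b)) {
            // Both sides constant: compute at compile time.
            (Expr::Int(x), Expr::Int(y)) => Expr::Int(x.wrapping_add(y)),
            // Adding zero is a no-op on either side.
            (x, Expr::Int(0)) | (Expr::Int(0), x) => x,
            (x, y) => Expr::Add(Box::new(x), Box::new(y)),
        },
        Expr::Mul(a, b) => match (fold(*a), fold(*b)) {
            (Expr::Int(x), Expr::Int(y)) => Expr::Int(x.wrapping_mul(y)),
            // Multiplying by one is a no-op on either side.
            (x, Expr::Int(1)) | (Expr::Int(1), x) => x,
            (x, y) => Expr::Mul(Box::new(x), Box::new(y)),
        },
        // Literals and variables are already in simplest form.
        other => other,
    }
}
```

Under these assumptions, `(2 * 3) + x` folds to `6 + x`, and `y * (1 + 0)` folds to `y`.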

WHY IT MATTERS
This chapter is the foundation: a language exists, with two execution
engines (tree-walk interpreter and bytecode VM) held to identical
results, a self-hosting compiler that is reflexively stable
(gen2 == gen3, byte-identical), HInt as the substrate primitive carrying
φ-resonance at construction, and conformance tests locking the semantics.

NOW POSSIBLE

  • Write OMC programs that exercise both engines and get identical
    results.
  • The compiler can recompile itself indefinitely without drift.
  • Everything that comes next builds on this — JIT, substrate
    algorithms, ML framework — all assume the core is stable.

See CHANGELOG.md#v0.0.2-language-core for the chapter index.

v0.8 — Substrate-Q: 4th attention component, -16.7% cumulative

17 May 18:51

THE PATH
v0.1 shipped K+S-MOD+V stacked for -8.94%. Q was the obvious 4th.
First attempt (Q1 = same post-projection resample as V) lost on
3 seeds — substrate-V's recipe doesn't generalize.

Following the user's hint that "Possible outcomes may relate to
different integral pieces to phi_pi_fib", a broader sweep over
Q3-Q6 tried different substrate operations. Q6 (phi_pi log-distance
scaling) wins decisively.

6-SEED Q6 CONFIRMATION
mean: Q0 3.128 vs Q6 2.748 (-12.15%, 6/6 seeds beat baseline)

THE RECIPE
log_d = log(|q|+1) / (π · ln φ)
modulation = exp(-γ · log_d) with γ=0.5
q_full = (x @ W_q) * modulation

Smooth magnitude regularizer keyed on phi_pi_fib structure.
NOT a snap-to-attractor — that's V's recipe and breaks Q.
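
A minimal sketch of the recipe above, on plain f64 slices. It assumes `log` in the recipe is the natural log and that the modulation is applied elementwise to the already-projected query (x @ W_q). The function names are hypothetical, not Prometheus APIs.

```rust
use std::f64::consts::PI;

/// Golden ratio φ = (1 + √5) / 2.
fn phi() -> f64 {
    (1.0 + 5.0_f64.sqrt()) / 2.0
}

/// modulation = exp(-γ · log_d), with log_d = ln(|q| + 1) / (π · ln φ).
/// Equivalent to (|q| + 1)^(-γ / (π · ln φ)): smooth and equal to 1 at q = 0.
fn q6_modulation(q: f64, gamma: f64) -> f64 {
    let log_d = (q.abs() + 1.0).ln() / (PI * phi().ln());
    (-gamma * log_d).exp()
}

/// Scale each element of a projected query row by its own modulation.
fn q6_scale(q: &[f64], gamma: f64) -> Vec<f64> {
    q.iter().map(|&v| v * q6_modulation(v, gamma)).collect()
}
```

Because the factor depends only on |q| and decays smoothly, large-magnitude query entries are shrunk more than small ones and signs are preserved. That is the "smooth magnitude regularizer" behaviour, as opposed to snapping values to an attractor.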

PRINCIPLE

  • snap-to-attractor: helps quantities being AGGREGATED (V, K)
  • log-distance scaling: helps quantities that STEER (Q)

CUMULATIVE STACK
L0 vanilla: 3.301
L1-MH + S-MOD α=1.0: 3.084
+ V1 (v0.1): 3.006
+ Q6 (v0.8): 2.748 (-16.7% vs L0)

NOW POSSIBLE

  • 4 substrate-attention components stack at TinyShakespeare scale
  • Different phi_pi_fib operations for different roles in attention
  • Production attention block: K from CRT-Fib + S-MOD softmax +
    V resample + Q log-distance scaling

NOT IN v0.8

  • OMC-side cross-validation (needs tape_abs + tape_log)
  • Larger-scale verification (TinyShakespeare 1.1MB only)
  • γ tuning

See CHANGELOG.md#v0.8-substrate-q + SUBSTRATE_Q_WINS.md.

v0.7 — GPU scaffold: omnimcode-gpu + 4.04× on RX 580 via wgpu Vulkan

17 May 18:47

ARCHITECTURE

  • omnimcode-gpu crate (new):
    • ComputeBackend trait (one method: matmul)
    • CpuBackend (always-available ground truth)
    • WgpuBackend (feature wgpu) — Vulkan/Metal/DX12/OpenGL
    • pick_backend() — feature + OMC_GPU_BACKEND env override
  • Naive WGSL matmul (16x16 workgroup, no tiling)
  • 11/11 tests pass, including wgpu parity on real GPU
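
As a rough illustration of the one-method trait with a CPU ground truth, here is a sketch. The `matmul` signature (row-major f32 slices plus explicit dimensions) and the `parity_error` helper are assumptions for illustration; the real omnimcode-gpu signatures are not given in these notes.

```rust
/// Hypothetical shape of the one-method backend trait.
trait ComputeBackend {
    /// Row-major (m x k) times (k x n), returning row-major (m x n).
    fn matmul(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32>;
}

/// Always-available reference implementation.
struct CpuBackend;

impl ComputeBackend for CpuBackend {
    fn matmul(&self, a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
        assert_eq!(a.len(), m * k, "a must hold m*k elements");
        assert_eq!(b.len(), k * n, "b must hold k*n elements");
        let mut out = vec![0.0f32; m * n];
        for i in 0..m {
            for p in 0..k {
                let a_ip = a[i * k + p];
                for j in 0..n {
                    out[i * n + j] += a_ip * b[p * n + j];
                }
            }
        }
        out
    }
}

/// Largest elementwise absolute difference between two result buffers.
fn max_abs_diff(x: &[f32], y: &[f32]) -> f32 {
    x.iter().zip(y).map(|(a, b)| (a - b).abs()).fold(0.0, f32::max)
}

/// Parity check: run any backend and compare it against the CPU result.
fn parity_error(
    candidate: &dyn ComputeBackend,
    a: &[f32],
    b: &[f32],
    m: usize,
    k: usize,
    n: usize,
) -> f32 {
    let reference = CpuBackend.matmul(a, b, m, k, n);
    max_abs_diff(&candidate.matmul(a, b, m, k, n), &reference)
}
```

A GPU backend would implement the same trait, and a parity test would assert that `parity_error` stays under a small tolerance. The tolerance is a design choice, since GPU and CPU float summation order can differ.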

MEASURED ON AMD RX 580 (RADV POLARIS10 / Vulkan)
64x64: 0.23x (overhead dominates)
128x128: 0.83x (~crossover)
256x256: 2.24x
512x512: 3.39x
1024x1024: 4.04x

WHY WGPU OVER ROCM

  • Official ROCm dropped Polaris (gfx803) at version 4.0
  • Unofficial Polaris ROCm builds are fragile (Ollama "gets fussy")
  • wgpu via Vulkan works out of the box on RADV driver
  • Trait ready for ROCm/CUDA/Metal plug-ins when on supported hw

NOW POSSIBLE

  • Prometheus tape_matmul can route through this backend (v0.8 work)
  • Cross-vendor GPU compute via one trait + feature flag
  • The user's existing hardware actually gets GPU acceleration
    without driver pain

WHAT'S NOT IN v0.7

  • Prometheus integration (next chapter)
  • GPU backward pass
  • Tiled/shared-memory kernels
  • f16/bf16

See CHANGELOG.md#v0.7-gpu-scaffold + omnimcode-gpu/README.md.

v0.6 — Fibtier-memory: bounded eviction with hash-recoverable evictions

17 May 18:35

WHAT CHANGED

  • MemoryStore::max_entries_per_namespace: Option
  • FIBTIER_DEFAULT_SIZES mirrors fibtier.omc
  • FIBTIER_DEFAULT_MAX_ENTRIES = 232 (sum of first 10 tier sizes)
  • OMC_MEMORY_MAX_ENTRIES env var (0 = unbounded)
  • evict_to_cap(namespace, keep) helper
  • Index-only eviction — body files stay on disk so recall(hash)
    still works for entries that fell out of the chronological list
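
The index-only eviction idea can be sketched in memory. `NamespaceIndex` is a hypothetical type: the `bodies` map stands in for the body files on disk, and dropping the oldest entries first is an assumption about eviction order.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory model of one namespace.
struct NamespaceIndex {
    /// Chronological list of content hashes, oldest first.
    index: Vec<u64>,
    /// Stand-in for body files on disk, keyed by content hash.
    bodies: HashMap<u64, String>,
}

impl NamespaceIndex {
    fn new() -> Self {
        NamespaceIndex { index: Vec::new(), bodies: HashMap::new() }
    }

    fn store(&mut self, hash: u64, text: &str) {
        self.index.push(hash);
        self.bodies.insert(hash, text.to_string());
    }

    /// Trim the index to at most `keep` entries, dropping the oldest.
    /// Bodies are left untouched. Returns how many entries were dropped.
    fn evict_to_cap(&mut self, keep: usize) -> usize {
        let excess = self.index.len().saturating_sub(keep);
        self.index.drain(..excess).count()
    }

    /// Recall by hash works whether or not the entry is still listed.
    fn recall(&self, hash: u64) -> Option<&str> {
        self.bodies.get(&hash).map(|s| s.as_str())
    }
}
```

After `evict_to_cap`, the listing shrinks but `recall` on an evicted hash still returns the body, which is the recoverability property described above.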

NEW MCP TOOL

  • omc_memory_evict(namespace, keep) → {namespace, dropped, kept}
  • omc_memory_stats now includes fibtier_cap

TESTS
32/32 MCP integration tests pass. 15/15 memory module unit tests.

WHY IT MATTERS
A 100-turn agent session now uses BOUNDED memory rather than the
10MB+ it would otherwise accumulate. The default 232-entry cap
covers ~hour-long conversations; v0.5's 10x context compression
holds across arbitrarily long sessions as a result.

HONEST FRAMING
Index-only eviction, not full deletion. Long-running agents would
benefit from external file cleanup. v0.6.1 candidate: physical
eviction with optional cold-storage archival.

See CHANGELOG.md#v0.6-fibtier-memory for the chapter.

v0.5 — Substrate-memory: 10.61× LLM context-budget reduction (target hit)

17 May 18:15

WHAT CHANGED

  • New module omnimcode-core/src/memory.rs:
    • MemoryStore { root } — filesystem at ~/.omc/memory//.txt
    • store / recall / list / stats
    • Namespace sanitization (alphanumeric + _-) prevents path traversal
    • OMC_MEMORY_ROOT env for isolation
  • Four new MCP tools:
    • omc_memory_store(text, namespace?) → {content_hash, namespace, bytes}
    • omc_memory_recall(content_hash, namespace?) → {found, text, bytes}
    • omc_memory_list(namespace?, limit?) → entries with preview, no body
    • omc_memory_stats(namespace?) → diagnostics
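
A small sketch of why restricting namespaces to alphanumerics plus `_` and `-` blocks path traversal. Whether the real implementation strips or rejects bad characters is not stated. This sketch strips them, restricts to ASCII, and falls back to a hypothetical "default" name when nothing is left.

```rust
use std::path::{Path, PathBuf};

/// Keep only ASCII alphanumerics, '_' and '-'. With '.', '/' and '\\'
/// removed, the result can never contain a ".." or a path separator.
fn sanitize_namespace(raw: &str) -> String {
    let kept: String = raw
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '_' || *c == '-')
        .collect();
    if kept.is_empty() {
        // Hypothetical fallback name for the sketch.
        "default".to_string()
    } else {
        kept
    }
}

/// Directory for a namespace: always a direct child of `root`.
fn namespace_dir(root: &Path, raw: &str) -> PathBuf {
    root.join(sanitize_namespace(raw))
}
```

With this filter, an input such as `../../etc/passwd` collapses to `etcpasswd`, so the resulting directory stays under the store root.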

MEASURED COMPRESSION (20-turn agent task, top_k=10, examples/lib)
Baseline (full transcript inline): 869,761 B (100%)
v0.4 only (compressed predict + transcript): 423,030 B (48.6%, 2.06x)
v0.5 full (memory hashes + compressed): 82,008 B (9.4%, 10.61x)

Baseline grows quadratically; v0.5 grows linearly. Crossover at
turn ~5, 10x by turn 20.
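
The quadratic-versus-linear contrast can be seen in a toy cost model. This is an illustration only, not the measured experiment: it assumes every turn adds the same number of bytes, that the baseline resends the whole transcript each turn, and that the hash-keyed variant sends only the new turn plus a fixed overhead.

```rust
/// Baseline: turn t resends all t turns so far, so the running total is
/// b * (1 + 2 + ... + n) = b * n * (n + 1) / 2, which grows quadratically.
fn baseline_total_bytes(turns: u64, bytes_per_turn: u64) -> u64 {
    bytes_per_turn * turns * (turns + 1) / 2
}

/// Hash-keyed: each turn sends its own content plus a fixed overhead,
/// so the running total grows linearly in the number of turns.
fn hashed_total_bytes(turns: u64, bytes_per_turn: u64, overhead: u64) -> u64 {
    turns * (bytes_per_turn + overhead)
}
```

In this model, doubling the session length roughly quadruples the baseline total and only doubles the hash-keyed total, so the ratio between them keeps widening with conversation length.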

WHY IT COMPOSES
The substrate's identity primitive (tokenizer::fnv1a_64) is shared
across all chapters — v0.3 predict, v0.3.1 fetch, v0.4
compress/decompress, v0.5 memory. An LLM agent mixes tools freely;
no tool needs to know which other tool produced a hash. That's
what makes the 10x win COMPOSE across the chapters instead of
being an isolated effect.
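
For reference, a sketch of the standard 64-bit FNV-1a hash. It assumes that tokenizer::fnv1a_64 follows the published algorithm and constants, which these notes do not confirm.

```rust
/// Standard FNV-1a 64-bit parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// FNV-1a: for each byte, XOR it into the state, then multiply by the
/// prime (wrapping modulo 2^64).
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut h = FNV_OFFSET_BASIS;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(FNV_PRIME);
    }
    h
}
```

Since the hash depends only on the bytes, any tool that hashes the same content gets the same key, which is what lets the tools exchange hashes without knowing about each other.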

NOW POSSIBLE

  • LLM agents can run multi-turn conversations at ~10% of baseline
    context budget.
  • Each turn's content survives MCP process restart (filesystem
    persistence) — agents can be paused/resumed without losing
    substrate-keyed state.
  • Different conversation threads stay isolated via namespaces.

HONEST FRAMING

  • The 10x is the COMBINED v0.4 + v0.5 stack. Either alone tops
    out at 2-3x.
  • Win scales with conversation length; at 5 turns v0.5 is at
    parity, 10x kicks in around turn 15+.
  • Memory grows unbounded — long-running agents need pruning
    (v0.6 candidate: wire fibtier's tier-bounded eviction).

TESTS
27/27 MCP integration + 10/10 memory unit tests.

See CHANGELOG.md#v0.5-substrate-memory +
experiments/substrate_context/FINDING_v05.md for the chapter.

v0.4 — Substrate-context: symbolic compression end-to-end, 2-3× LLM budget reduction

17 May 17:59

WHAT CHANGED

  • omc_compress_context(text, every_n?) — substrate-keyed codec
    payload for arbitrary OMC source.
  • omc_decompress(paths, codec | canonical_hash) — generalization
    of omc_fetch_by_hash. Recovers source via library lookup
    against corpus (alpha-rename invariant).
  • omc_predict format=codec — bounded substrate-thumbnail (≤16
    sampled tokens + canonical hash). Sits between signature
    (text-only) and full (everything).
  • paths can now be DIRECTORIES — recursively walked for *.omc
    files. Cross-corpus blending: ["examples/lib"] ingests 320
    fns across 16 files.
  • Hash unification: omc_predict's canonical_hash and
    omc_compress_context's content_hash use the same primitive
    (tokenizer::code_hash) and are interchangeable.

MEASURED COMPRESSION
10-task representative LLM workflow against examples/lib (320 fns):
top_k=5, 1 fetch: 14142 B → 6864 B (2.06x smaller)
top_k=10, 1 fetch: 27828 B → 10318 B (2.70x smaller)
top_k=20, 1 fetch: 39902 B → 14188 B (2.81x smaller)

The win amplifies with browse depth: per-candidate cost stays
at the substrate floor (~50 B for the hash), and a function body
costs nothing until the agent commits to fetching it.

WHY IT MATTERS
Three primitives already in OMC compose without modification:
canonicalize (alpha-rename invariance), tokenizer::encode +
code_hash (substrate-aware identity), the substrate codec from
v0.0.5 (library-lookup recovery). v0.4 wires them through the
MCP surface so an LLM client has them as first-class tools.
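
Alpha-rename invariance can be illustrated with a toy canonicalizer over a pre-split token stream. Everything here is hypothetical: the keyword list is invented for the sketch, and it renames every identifier (function names included), which may be coarser than what OMC's canonicalize does.

```rust
use std::collections::HashMap;

/// Hypothetical keyword set; OMC's real keyword list is not given here.
const KEYWORDS: [&str; 4] = ["fn", "return", "if", "else"];

/// Replace each distinct identifier with v0, v1, ... in order of first
/// appearance, so two sources that differ only in names map to the
/// same canonical token stream.
fn canonicalize(tokens: &[&str]) -> Vec<String> {
    let mut names: HashMap<&str, usize> = HashMap::new();
    tokens
        .iter()
        .map(|&tok| {
            let starts_like_ident = tok
                .chars()
                .next()
                .map_or(false, |c| c.is_alphabetic() || c == '_');
            if starts_like_ident && !KEYWORDS.contains(&tok) {
                let next = names.len();
                let id = *names.entry(tok).or_insert(next);
                format!("v{}", id)
            } else {
                // Keywords, punctuation and literals pass through unchanged.
                tok.to_string()
            }
        })
        .collect()
}
```

Hashing the canonical stream rather than the raw text means a renamed copy of a function resolves to the same hash, so a library lookup can recover it.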

NOW POSSIBLE

  • LLM agents can hold 20 candidate continuations in context for
    the byte cost previously required for 7 full bodies.
  • Branching is free at the context-budget level — agents can
    explore wider without burning their window.
  • Cross-corpus queries (project + stdlib + registry) cost the
    same as single-file queries because hashes are global.
  • An LLM can "remember" arbitrary code chunks via
    omc_compress_context and get them back losslessly via library lookup.

HONEST LIMITS
The original ask was 10% of the context budget (~10x). The
structural ceiling for hash-browse + on-demand-fetch alone is
closer to 3x; the 10x claim requires v0.5 (substrate-keyed
conversation memory). v0.4 ships the primitives; v0.5 wires
conversation transcripts through the same substrate.

TESTS
20/20 MCP integration tests pass.

See CHANGELOG.md#v0.4-substrate-context +
experiments/substrate_context/FINDING.md for the full chapter.

v0.3 — Symbolic prediction: substrate-indexed code completion

17 May 17:24

v0.3 synthesizes two earlier substrates, tokenizer::encode (symbol
stream) and canonical_hash + attractor_distance (substrate metric),
into one primitive. LLM agents (and humans) can use it while writing
OMC to ask "what could come next here?". Each result carries a
substrate-distance score and a pointer back to the source function it
came from. Branching is first-class: every result is a viable
continuation.

WHAT CHANGED

  • New module omnimcode-core/src/predict.rs (~370 lines):
    • CorpusEntry { fn_name, source, file, symbol_stream,
      canonical_hash, attractor }
    • PrefixTrie — each node accumulates corpus indices whose stream
      passes through it
    • CodeCorpus — entries + trie; ingest_fn and ingest_file
    • predict_continuations(corpus, prefix_source, top_k)
    • Ranking: (longest prefix match desc, smallest substrate
      distance asc, corpus index asc)
  • Two new builtins:
    • omc_predict_files(paths_array, prefix_source, top_k) → array
      of dicts (stateless)
    • omc_corpus_size(paths_array) → int (diagnostic)
  • Result dict fields: fn_name, source, file, canonical_hash,
    attractor, prefix_match_len, substrate_distance, query_attractor.
  • 10 Rust unit tests + 11 OMC end-to-end tests.
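
A compact sketch of the two moving parts listed above: a prefix trie whose nodes accumulate corpus indices, and the three-key ranking. The types are simplified stand-ins for those in predict.rs. Tokens are plain strings here, whereas the real trie is keyed on the substrate symbol stream, so partial-identifier prefixes are out of scope for this sketch.

```rust
use std::collections::HashMap;

#[derive(Default)]
struct TrieNode {
    children: HashMap<String, TrieNode>,
    /// Corpus indices whose token stream passes through this node.
    entries: Vec<usize>,
}

#[derive(Default)]
struct PrefixTrie {
    root: TrieNode,
}

impl PrefixTrie {
    /// Record `corpus_index` on every node along the stream's path.
    fn insert(&mut self, corpus_index: usize, stream: &[&str]) {
        let mut node = &mut self.root;
        for tok in stream {
            node = node.children.entry(tok.to_string()).or_default();
            node.entries.push(corpus_index);
        }
    }

    /// Follow the prefix as far as it matches. Returns the matched
    /// length and the corpus indices stored at the deepest node reached.
    fn longest_match(&self, prefix: &[&str]) -> (usize, &[usize]) {
        let mut node = &self.root;
        let mut depth = 0;
        for tok in prefix {
            match node.children.get(*tok) {
                Some(child) => {
                    node = child;
                    depth += 1;
                }
                None => break,
            }
        }
        (depth, &node.entries)
    }
}

struct Candidate {
    corpus_index: usize,
    prefix_match_len: usize,
    substrate_distance: f64,
}

/// Ranking: longest prefix match first, then smallest substrate
/// distance, then lowest corpus index as a deterministic tie-break.
fn rank(mut cands: Vec<Candidate>, top_k: usize) -> Vec<Candidate> {
    cands.sort_by(|a, b| {
        b.prefix_match_len
            .cmp(&a.prefix_match_len)
            .then(a.substrate_distance.total_cmp(&b.substrate_distance))
            .then(a.corpus_index.cmp(&b.corpus_index))
    });
    cands.truncate(top_k);
    cands
}
```

The final tie-break on corpus index is what makes the output fully deterministic for a fixed corpus and prefix.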

WIN CONDITION (verified)
The prefix "fn prom_linear_" against the 70-fn Prometheus corpus
returns exactly the three prom_linear_* fns, ranked by substrate
distance. The wider prefix "fn prom_attention_" surfaces 5 attention
fns with substrate distances ~3 orders of magnitude tighter than
the linear namespace: substrate distance reflects code-shape
similarity inside a namespace.

WHY IT MATTERS
Three primitives already in OMC — canonicalize (alpha-rename
invariance), tokenizer::encode (substrate-aware symbol stream),
code_hash (substrate-routed identity) — combine without modification.
The trie is a 50-line data structure on top. No embedding model,
no neural inference. Deterministic: same corpus + same prefix
→ same top-k, every run.

NOW POSSIBLE

  • An LLM agent can query "what previous code came next at this
    shape?" as a single tool call.
  • Branching is first-class — each result is a viable continuation.
  • Provenance is content-addressed: every suggestion includes its
    source file path AND its canonical hash, so a downstream agent
    can verify integrity by recompute.
  • The corpus is just file paths; no index-build step, no
    maintenance overhead.

TESTS
223 Rust pass, 1087/1087 OMC pass (up from 213 Rust and 1076 OMC tests in v0.2).

DEFERRED

  • Prometheus rerank pass (structural substrate ranking + learned
    probability overlay)
  • Stateful corpus API for repeated queries
  • MCP tool surface
  • Streaming + cross-corpus blending

See CHANGELOG.md#v0.3-symbolic-prediction +
experiments/symbolic_prediction/FINDING.md for the chapter detail.

v0.2 — Ergonomics: OMC becomes forgiving

17 May 16:48

WHAT CHANGED

  • Python-idiom builtins: len() polymorphic over array/string/dict/null;
    range(start, end, step) with negative step; getenv(name, default);
    to_hex / from_hex round-trip; parse_int / parse_float aliases.
  • Negative array indexing (Python-style): xs[-1], arr_get(xs, -1),
    arr_set(xs, -1, v) all work. Out-of-bounds errors name the array,
    report length, hint at safe_arr_get for wrap-around.
  • Compound assignment: +=, -=, *=, /=, %= desugared at parse time.
  • For-loop iterables expanded: for k in dict iterates keys; for c in
    string iterates chars. Anything else errors instead of no-op'ing.
  • Self-healing pass: two new classes — null_arith (null + 5 → 0 + 5)
    and if_numeric (if 0 flagged as constant branch). 11 heal classes
    total.
  • Did-you-mean for undefined variables (substrate-bucketed close-name
    lookup over current scope).
  • Cross-container hints: arr_get(some_dict, k) suggests dict_get;
    symmetric for dict_get(arr, k).
  • Parser hints: h h = 1 → "'h' is a reserved keyword; try hval".
    if x = 5 → "did you mean ==?". Friendlier unexpected-token msgs.
  • Runtime errors carry call-stack traces in the CLI.
  • Type-mismatch errors report received type with hint.
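
The Python-style index and range rules above can be sketched as two small helpers. The names, the error wording, and the exclusive `end` bound are assumptions for illustration; the actual OMC builtins and messages may differ.

```rust
/// Resolve a possibly negative index against an array length:
/// -1 is the last element, -len is the first.
fn resolve_index(len: usize, idx: i64) -> Result<usize, String> {
    let n = len as i64;
    let resolved = if idx < 0 { idx + n } else { idx };
    if resolved >= 0 && resolved < n {
        Ok(resolved as usize)
    } else {
        Err(format!("index {} out of bounds for array of length {}", idx, len))
    }
}

/// range(start, end, step): counts up for a positive step and down for
/// a negative one; `end` is treated as exclusive, as in Python.
fn range(start: i64, end: i64, step: i64) -> Result<Vec<i64>, String> {
    if step == 0 {
        return Err("range step must not be zero".to_string());
    }
    let mut out = Vec::new();
    let mut i = start;
    while (step > 0 && i < end) || (step < 0 && i > end) {
        out.push(i);
        i += step;
    }
    Ok(out)
}
```

Under these assumptions, `xs[-1]` on a three-element array resolves to index 2, and `range(5, 0, -2)` yields 5, 3, 1.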

WHY IT MATTERS
The most common bites a Python user hits on first contact (cryptic
{:?} token names in parser errors, no +=, silent no-op for-loops over
dicts, undefined-variable errors with no suggestion) are gone. The
language now lives up to its "forgiving by default" pitch instead of
just promising it.

NOW POSSIBLE

  • A new user can write OMC reaching for Python intuitions (len(d),
    range(0, 10, 2), x += 1, for key in scores) and have it Just Work.
  • Runtime errors are debuggable from the message alone, including the call chain.
  • Mistakes surface at the right layer (parser vs heal-pass vs runtime).

TESTS
+29 new Rust tests, +28 new OMC tests. Final: 213 Rust pass,
1073/1076 OMC pass.

See CHANGELOG.md#v0.2-ergonomics for the chapter index.

v0.1 — Substrate attention: K + S-MOD + V stack to -8.94% val on TinyShakespeare

17 May 16:48

The substrate-attention thesis — that the K matrix, attention
softmax, and V projection can each be replaced by substrate-derived
alternatives that match or beat learned components — finally lands
as a STACK. None of these wins are individually new; the chapter's
point is that they stack inside one transformer block at real scale.

WHAT CHANGED

  • Substrate-K (1462d45, SUBSTRATE_K_FINDING.md): replace learned W_K
    with CRT-Fibonacci positional table. K structurally pre-built; Q
    and V stay learned. -6.3% val at multi-head TinyShakespeare scale
    (2/3 seeds), ~10% fewer attention parameters.
  • S-MOD softmax (761180f, SUBSTRATE_SOFTMAX_FINDING.md): replace
    softmax(s) with softmax(s) × 1/(1 + α·attractor_distance(s)),
    then renormalize. Off-attractor weights dampened. 3-seed α sweep
    found α=1.0 wins -6.57% vs vanilla softmax.
  • Substrate-V resample (1080da2, SUBSTRATE_V_FINDING.md): apply
    substrate_resample(x @ W_v) to V post-projection (W_v stays
    learned). Off-attractor V-magnitudes dampened. -2.52% on top of
    L1-MH + S-MOD (3/3 seeds).
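
The S-MOD formula above, softmax(s) × 1/(1 + α·attractor_distance(s)) followed by renormalization, can be sketched for one row of scores. The definition of attractor_distance is not given in these notes, so the sketch takes it as a caller-supplied function. It also assumes a non-empty score row.

```rust
/// Numerically stable softmax over one row of attention scores.
fn softmax(scores: &[f64]) -> Vec<f64> {
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = scores.iter().map(|s| (s - max).exp()).collect();
    let sum: f64 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// S-MOD: damp each softmax weight by 1 / (1 + α · distance(score)),
/// then renormalize so the weights sum to 1 again.
fn smod_softmax<F: Fn(f64) -> f64>(
    scores: &[f64],
    alpha: f64,
    attractor_distance: F,
) -> Vec<f64> {
    let base = softmax(scores);
    let damped: Vec<f64> = base
        .iter()
        .zip(scores)
        .map(|(p, &s)| p / (1.0 + alpha * attractor_distance(s)))
        .collect();
    let total: f64 = damped.iter().sum();
    damped.iter().map(|d| d / total).collect()
}
```

With α = 0 this reduces to plain softmax. For α > 0, scores with a larger distance lose weight relative to the others, which matches the "off-attractor weights dampened" description.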

CUMULATIVE RESULT
L0 (vanilla softmax + learned V): 3.301
L1-MH + S-MOD α=1.0: 3.084
L1-MH + S-MOD α=1.0 + V1 (production): 3.006 = -8.94% val
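
The percentages in this table are relative changes in validation loss against the L0 baseline. A one-line helper (hypothetical name) makes the arithmetic explicit:

```rust
/// Relative change of `value` against `baseline`, in percent.
/// Negative means the loss went down.
fn relative_change_pct(baseline: f64, value: f64) -> f64 {
    (value - baseline) / baseline * 100.0
}
```

For the rows above, (3.006 - 3.301) / 3.301 rounds to -8.94%, and (3.084 - 3.301) / 3.301 rounds to -6.57%.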

WHY IT MATTERS
Each substrate replacement is a MODULATION, not a wholesale swap of
the learned projection. The substrate composes with task learning
instead of replacing it. The opposite recipe (substrate-V with no
learned W_v and no S-MOD) lost decisively the day prior. The
principle: substrate modulation works when applied to a quantity
that already has integer-coherent structure; substrate replacement
of learned projections does not.

NOW POSSIBLE

  • Substrate-aware attention is the production default in Prometheus.
  • Three substrate-component wins now stack in a single transformer
    block on real data (TinyShakespeare 1.1MB).
  • Future component swaps (Q, FF, layernorm) measured against this
    stacked baseline rather than vanilla.
  • Cross-runtime parity: every result reproduced in pure-OMC
    Prometheus AND PyTorch.

See CHANGELOG.md#v0.1-substrate-attention for the chapter index.

V0.0.1

01 May 04:03
f76fb4b

OMNIcode is a native, standalone Rust implementation of a harmonic
computing language designed for genetic circuit evolution. It compiles
to a single 509 KB portable binary with zero external dependencies,
making it ideal for embedded systems and game engines. Tiers 1-4
complete: circuit evaluation, optimization, Fibonacci search, and LRU
caching. 49/49 tests passing. Real benchmarks show 50-230× speedup over
Python DEAP depending on circuit complexity.

GitHub Repository Description (60 chars max)
Fast native circuit evolution. Zero deps. 509 KB binary.

For Technical Audiences
OMNIcode is a zero-dependency genetic algorithm framework for evolving
Boolean circuits. Written in pure Rust with LTO optimization, it
delivers 4.64M fitness evaluations/second with no interpreter overhead.
Features constant folding, algebraic simplification, and O(log_φ n)
search. Portable across Linux/Unix systems via single static binary.

For Game Developers
OMNIcode - Embeddable circuit evolution engine for game AI and
procedural generation. 509 KB binary, zero dependencies, no runtime
overhead. Evolve logic circuits 50-230× faster than Python frameworks.
Perfect for game mods, tools, and offline processing pipelines.

For Researchers
A benchmarked implementation of genetic programming for Boolean
circuits, with real Criterion data confirming native performance
advantages. Honest documentation, transparent limitations, reproducible
results. Suitable for research and experimentation; not
production-grade.

Full Changelog: https://github.com/RandomCoder-lab/OMC/commits/V0.0.1