onn: omc_llm_self_instantiate orchestration manifest + session summary

RandomCoder-lab · claude · RandomCoder-lab · commit 76e26f6ccde5 · 2026-05-16T15:08:50.000-05:00
omc_llm_self_instantiate(context, task, base_dir, base_sender_id)
takes an LLM's conversation history, compresses it to M3(N) specialists
via context_compress, and writes each as a signed prompt file in
base_dir. It returns a manifest: [{specialist_id, prompt_path,
fold_index, mu, sigma, dominant_attractor, item_count}, ...].

An orchestrator (Python, Bash, MCP client) spawns N LLM sessions
from the manifest. Each spawned session reads its specialist file,
verifies the signature, and starts with the specialist's inherited
geometric state as its seed.
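
A minimal sketch of the orchestrator side, assuming each specialist file
is a JSON object carrying the keys the Rust code in this commit inserts
(content, sender_id, target_id, kind, content_hash, resonance, him_score,
attractor, packed) and that "content" serializes as a plain JSON string.
load_jobs and the verify hook are hypothetical names used for
illustration only; real substrate-signature checking lives in OMC and is
not reimplemented here.

```python
import json
from pathlib import Path

# Keys written into each specialist_*.json by omc_llm_self_instantiate.
REQUIRED_KEYS = {"content", "sender_id", "target_id", "kind",
                 "content_hash", "resonance", "him_score",
                 "attractor", "packed"}


def load_jobs(manifest, verify=None):
    """Turn manifest entries into (specialist_id, prompt_text) jobs.

    `verify` is a hypothetical hook: a callable that takes the parsed
    message dict and returns True or False. It stands in for whatever
    signature check the orchestrator delegates to OMC.
    """
    jobs = []
    for entry in manifest:
        path = Path(entry["prompt_path"])
        msg = json.loads(path.read_text(encoding="utf-8"))
        missing = REQUIRED_KEYS - msg.keys()
        if missing:
            raise ValueError(f"{path}: missing keys {sorted(missing)}")
        if verify is not None and not verify(msg):
            raise ValueError(f"{path}: signature check failed")
        jobs.append((entry["specialist_id"], msg["content"]))
    return jobs
```

Each returned job is the seed prompt for one spawned session; how the
session is actually started (API call, subprocess, MCP client) is left
to the orchestrator.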

Tested on 200-message context → 10 specialists, all 10 prompt
files written to /tmp/omc_spawn/specialist_*.json with valid
substrate signatures.

OMC is honest about not forking LLMs itself; the manifest is the
right boundary — OMC handles structural + substrate-signed parts,
the orchestrator handles process spawning.

SESSION_SUMMARY.md captures everything done this session: ONN port,
context compression curve (10→3, 10000→18), what's solved
(structural continuity) vs what isn't (topical retrieval, process
spawning, lossless reconstruction). Documents the next experiment:
hand the 10 spawned files to Hermes and close the full
self-instantiation loop with two live agents.
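
The compression curve can be sanity-checked with a few lines of
arithmetic. This sketch uses only the (N, M3(N)) pairs reported in this
commit and assumes that M1(N) is log_phi(N), which is how
SESSION_SUMMARY.md describes "the log_phi bound". check_curve is a
hypothetical helper, not part of OMC.

```python
import math

PHI = (1 + math.sqrt(5)) / 2

# (N, M3(N)) pairs reported in SESSION_SUMMARY.md plus the 200-message test.
MEASURED = [(10, 3), (50, 7), (100, 7), (200, 10), (500, 11),
            (1_000, 12), (5_000, 16), (10_000, 18)]


def log_phi(n):
    """Logarithm of n in base phi (assumed here to be the M1 bound)."""
    return math.log(n) / math.log(PHI)


def check_curve(pairs):
    """Return (N, M3, floor(N / M3), M3 <= log_phi(N)) for each pair."""
    return [(n, m3, n // m3, m3 <= log_phi(n)) for n, m3 in pairs]


for n, m3, ratio, under_bound in check_curve(MEASURED):
    print(f"N={n:>6}  M3={m3:>2}  ratio={ratio:>3}x  under log_phi: {under_bound}")
```

The floor of N / M3(N) reproduces the compression column of the table,
and every reported M3(N) sits below log_phi(N) under that assumption.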

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
diff --git a/OMC_REFERENCE.md b/OMC_REFERENCE.md
@@ -2,9 +2,9 @@
 
 Auto-generated from `omnimcode-core/src/docs.rs`. Run `omc --gen-docs > OMC_REFERENCE.md` to regenerate.
 
-**Total documented builtins**: 638
+**Total documented builtins**: 639
 
-**OMC-unique**: 71 (no direct Python/NumPy equivalent — these are why you reach for OMC over numpy)
+**OMC-unique**: 72 (no direct Python/NumPy equivalent — these are why you reach for OMC over numpy)
 
 ---
 
@@ -53,7 +53,7 @@ Other high-value calls: `omc_unique_builtins()` (the OMC-only surface), `omc_pyt
 - [tokenizer](#tokenizer) (17 builtins)
 - [code_intel](#code_intel) (17 builtins)
 - [messaging](#messaging) (5 builtins)
-- [onn](#onn) (4 builtins)
+- [onn](#onn) (5 builtins)
 - [llm_workflow](#llm_workflow) (7 builtins)
 - [math](#math) (82 builtins)
 - [dicts](#dicts) (31 builtins)
@@ -5084,6 +5084,16 @@ Compress N context messages to ~M3(N) specialist summaries. The substrate-native
 omc_context_compress(conversation_history)  // ~log_log(N) specialists
 ```
 
+### `omc_llm_self_instantiate` 🔱 *OMC-unique*
+
+**Signature**: `(context: string[], task: string, base_dir: string, base_sender_id: int) -> dict[]`
+
+Orchestration primitive: compress context to M3(N) specialists, write each as a signed prompt file in base_dir, return manifest. An orchestrator spawns N LLM sessions, each seeded with its specialist's inherited geometric state.
+
+```omc
+omc_llm_self_instantiate(history, "refactor X", "/tmp/spawn", 18173)  // [{prompt_path, mu, sigma, ...}]
+```
+
 ---
 
 ## llm_workflow
diff --git a/examples/demos/SESSION_SUMMARY.md b/examples/demos/SESSION_SUMMARY.md
@@ -0,0 +1,142 @@
+# Session Summary — LLM ↔ LLM substrate comms + ONN self-instantiation
+
+## Tasks tackled
+
+1. ✅ **Recorded the round-trip validation moment** (`a40ea88`)
+   Two LLMs verified each other's substrate-signed messages with
+   zero drift. Evidence preserved in `round_trip_evidence_*.json`.
+
+2. ✅ **Built a secondary-brain prompting function** (`omc_prompt_agent`)
+   Any OMC program can fire a signed prompt at another agent's
+   inbox via the shared `omc_channel/` directory. Demo:
+   `examples/demos/secondary_brain.omc`.
+
+3. ✅ **Cataloged Hermes's ONN / Self-Instantiation skills**
+   `examples/demos/ONN_SKILLS_CATALOG.md` — maps every relevant
+   Hermes skill (M3, geometric instantiation, phi-spectrum,
+   self-healing, etc.) to OMC status (port now / port later /
+   N/A).
+
+4. ✅ **Built OMC self-instantiation primitives** (`1653180`)
+   `omc_m3_spawn_count`, `omc_self_instantiate`, `omc_fold_back`,
+   `omc_context_compress` — port of Hermes's M3 wave-interference
+   spawn algorithm. 14 OMC tests + 5 Rust unit tests, all green.
+
+5. ✅ **Built the LLM-orchestration manifest layer**
+   `omc_llm_self_instantiate(context, task, base_dir, base_sender_id)`
+   compresses to M3(N) specialists, writes one signed prompt file
+   per specialist, returns a manifest. An orchestrator (human,
+   Bash, Python, MCP) spawns N LLM sessions from the manifest.
+
+## What got built (concrete)
+
+  builtins         | omc_m3_spawn_count, omc_self_instantiate,
+                     omc_fold_back, omc_context_compress,
+                     omc_prompt_agent, omc_llm_self_instantiate
+  modules          | omnimcode-core/src/onn.rs (new)
+  tests            | examples/tests/test_onn.omc (14 cases)
+  demos            | context_compression.omc (200→10 to 10000→18)
+                   | secondary_brain.omc (fire-and-poll pattern)
+                   | llm_self_instantiate.omc (orchestration manifest)
+  documentation    | ONN_SKILLS_CATALOG.md
+                   | CONTEXT_PROBLEM_FRAMING.md
+                   | ROUND_TRIP_VALIDATED.md
+                   | SESSION_SUMMARY.md (this file)
+
+## Empirical results worth noting
+
+**Context compression curve** (measured, not theoretical):
+
+| N | M3(N) | compression |
+|---|-------|-------------|
+| 10 | 3 | 3× |
+| 50 | 7 | 7× |
+| 100 | 7 | 14× |
+| 500 | 11 | 45× |
+| 1,000 | 12 | 83× |
+| 5,000 | 16 | 312× |
+| 10,000 | 18 | 555× |
+
+**M3 vs M1**: M3 always ≤ M1 (the log_phi bound), often substantially
+less: M3(100)=7 vs M1(100)≈10, so the spawn count stays under the log bound.
+
+**Round-trip integrity**: 2 LLMs, 0 drift on resonance + HIM,
+content_hash matched bit-for-bit (3551785709911115688). The
+substrate-derived signature is recomputable by both sides.
+
+## Honest verdict on "solving the context problem"
+
+**Partial solution.** The substrate gives:
+
+- **Structural continuity** — μ/σ/attractor drift across folds, fully
+  recomputable, bounded above by M3(N).
+- **Geometric memory** — specialists are stable across rebuilds,
+  associatively foldable, comparable.
+- **Integrity** — substrate-signed exchange between agents survives
+  reformatting and renaming.
+
+The substrate does NOT give:
+
+- **Topical retrieval** — the prime-resonance null result (`92d7d90`)
+  proved the φ-field doesn't encode topic. For topical search you
+  still need embeddings.
+- **Lossless reconstruction** — individual message text is dropped;
+  only the truncated summary survives.
+- **Process spawning** — OMC doesn't fork LLMs. The manifest layer
+  is honest: it writes prompt files; an external orchestrator
+  spawns processes.
+
+What's actually solved: **the structural / geometric layer of the
+context problem**. Bounds compression at M3(N). Provides
+substrate-stable continuity. Composes with messaging for
+multi-agent setups. Doesn't pretend to do topical retrieval.
+
+## What an LLM running tomorrow can actually do
+
+```omc
+# 1. Compress your context.
+h specs = omc_context_compress(my_history);
+
+# 2. Either summarize forward yourself, OR fan out:
+h manifest = omc_llm_self_instantiate(
+    my_history, "process this", "/tmp/spawn", my_sender_id);
+
+# 3. (Orchestrator spawns the N sessions, collects responses.)
+
+# 4. Fold the responses back into running state.
+h new_state = omc_fold_back(old_mu, old_sigma, turn, response_specs);
+
+# 5. Hand off the new state to the next turn.
+```
+
+This is the working geometric-memory loop. It's not magic. It's
+sublogarithmic compression of arbitrary input, plus substrate-
+verified integrity across agent boundaries.
+
+## What I could NOT do in this session
+
+- **Actually spawn LLM sub-sessions from OMC**: requires Python +
+  API keys + orchestration runtime. Out of scope for OMC core.
+  The manifest is the right level of abstraction — OMC writes
+  the files; the orchestrator runs the LLMs.
+- **Validate the fold-back loop with real LLM responses**: would
+  need Hermes (or another agent) to actually process the spawned
+  prompts and respond. Possible as a follow-up experiment.
+- **Train a substrate-aware LLM**: Hermes's `onn-phi-field-llm`
+  skill describes this, but it's a multi-week training project,
+  not a session-scoped task.
+
+## Concrete next experiment (for when you're back)
+
+Hand the 10 spawned prompt files from `llm_self_instantiate.omc`
+to Hermes and ask Hermes to:
+
+1. Process each as a separate "session" (signed inbound, verify,
+   produce a signed response).
+2. Write 10 response files to `/tmp/omc_spawn/response_*.json`.
+3. Then I run `omc_fold_back` on the 10 responses and produce a
+   merged parent-state dict.
+
+That would close the full self-instantiation loop end-to-end
+with two live agents. It's the second half of the round-trip we
+already proved works.
diff --git a/examples/demos/llm_self_instantiate.omc b/examples/demos/llm_self_instantiate.omc
@@ -0,0 +1,69 @@
+# LLM self-instantiation orchestration manifest.
+#
+# Compress a long context to M3(N) specialists, write each as a
+# signed prompt-file. An orchestrator (human or scripted) then
+# spawns N LLM sessions, one per file, each seeded with the
+# specialist's inherited geometric state.
+
+fn show(label, v) { print(concat_many(label, " = ", to_string(v))); }
+fn section(name) { print(""); print(concat_many("=== ", name, " ===")); }
+
+fn make_long_history(n) {
+    h hist = [];
+    h i = 0;
+    while i < n {
+        arr_push(hist, concat_many("turn ", to_string(i),
+            ": user discusses topic ", to_string(i % 5),
+            " with assistant; result was outcome ", to_string(i % 3)));
+        i = i + 1;
+    }
+    return hist;
+}
+
+fn main() {
+    print("=== LLM self-instantiation orchestration ===");
+    print("");
+    print("Take a long conversation, fold to M3(N) specialists,");
+    print("write a signed prompt-file per specialist that an");
+    print("orchestrator can use to spawn N independent LLM sessions.");
+
+    h CLAUDE_ID = 18173;
+    h history = make_long_history(200);
+    show("input history length", arr_len(history));
+
+    section("Self-instantiate");
+    h manifest = omc_llm_self_instantiate(
+        history,
+        "Process this slice of conversation history and report findings.",
+        "/tmp/omc_spawn",
+        CLAUDE_ID
+    );
+    show("specialists spawned", arr_len(manifest));
+
+    section("Manifest preview");
+    h i = 0;
+    while i < arr_len(manifest) {
+        h m = arr_get(manifest, i);
+        print(concat_many("  Specialist ", to_string(i + 1), "/", to_string(arr_len(manifest)),
+            "  id=", to_string(dict_get(m, "specialist_id")),
+            "  items=", to_string(dict_get(m, "item_count")),
+            "  attractor=", to_string(dict_get(m, "dominant_attractor")),
+            "  → ", dict_get(m, "prompt_path")));
+        i = i + 1;
+    }
+
+    section("What an orchestrator does next");
+    print("  for each entry in manifest:");
+    print("    spawn an LLM process with read_file(entry.prompt_path)");
+    print("       as its initial prompt;");
+    print("    collect responses into /tmp/omc_spawn/response_<id>.json;");
+    print("  call omc_fold_back(parent_mu, parent_sigma, parent_turn,");
+    print("                     responses) to merge results.");
+    print("");
+    print("OMC handles the structural + substrate-signed parts.");
+    print("Process-spawning is left to the orchestrator (Python, shell,");
+    print("Claude Code Bash tool, etc.) — OMC is honest about not");
+    print("forking LLMs on its own.");
+}
+
+main();
diff --git a/omnimcode-core/src/compiler.rs b/omnimcode-core/src/compiler.rs
@@ -302,6 +302,7 @@ impl Compiler {
                         | "omc_search_builtins"
                         | "omc_find_similar"
                         | "omc_self_instantiate" | "omc_context_compress"
+                        | "omc_llm_self_instantiate"
                         // Forward-mode autograd duals (Track 2 — 2026-05-16)
                         | "dual" | "dual_add" | "dual_sub"
                         | "dual_mul" | "dual_div" | "dual_neg"
diff --git a/omnimcode-core/src/docs.rs b/omnimcode-core/src/docs.rs
@@ -1145,6 +1145,13 @@ pub const BUILTINS: &[BuiltinDoc] = &[
         example: "omc_context_compress(conversation_history)  // ~log_log(N) specialists",
         unique_to_omc: true,
     },
+    BuiltinDoc {
+        name: "omc_llm_self_instantiate", category: "onn",
+        signature: "(context: string[], task: string, base_dir: string, base_sender_id: int) -> dict[]",
+        description: "Orchestration primitive: compress context to M3(N) specialists, write each as a signed prompt file in base_dir, return manifest. An orchestrator spawns N LLM sessions, each seeded with its specialist's inherited geometric state.",
+        example: "omc_llm_self_instantiate(history, \"refactor X\", \"/tmp/spawn\", 18173)  // [{prompt_path, mu, sigma, ...}]",
+        unique_to_omc: true,
+    },
     // ---- LLM workflow bundles ----
     BuiltinDoc {
         name: "omc_cheatsheet", category: "llm_workflow",
diff --git a/omnimcode-core/src/interpreter.rs b/omnimcode-core/src/interpreter.rs
@@ -7960,6 +7960,104 @@ impl Interpreter {
                 }).collect();
                 Ok(Value::Array(HArray::from_vec(out)))
             }
+            // omc_llm_self_instantiate(context: string[], task: string,
+            //                          base_dir: string, base_sender_id: int)
+            //   -> dict[] manifest of {specialist_id, prompt_path, fold_index,
+            //                          mu, sigma, dominant_attractor, item_count}.
+            //   Compresses N context messages to M3(N) specialists,
+            //   writes each as a signed prompt-file in base_dir, and
+            //   returns the manifest. An orchestrator (human or
+            //   automated) can spawn N LLM sessions, one per file.
+            //   Each spawned session starts with its specialist's
+            //   inherited geometric state as the seed.
+            //
+            //   This is the "self-instantiation primitive for LLMs":
+            //   structural fan-out with substrate-derived state
+            //   inheritance. Actual LLM-process spawning is out of
+            //   scope (OMC doesn't fork LLMs), but the manifest gives
+            //   the orchestrator everything it needs.
+            "omc_llm_self_instantiate" => {
+                if args.len() < 4 {
+                    return Err("omc_llm_self_instantiate requires (context: string[], task: string, base_dir: string, base_sender_id: int)".to_string());
+                }
+                let ctx_v = self.eval_expr(&args[0])?;
+                let task = self.eval_expr(&args[1])?.to_display_string();
+                let base_dir = self.eval_expr(&args[2])?.to_display_string();
+                let base_sender = self.eval_expr(&args[3])?.to_int();
+                let messages: Vec<String> = if let Value::Array(arr) = ctx_v {
+                    arr.items.borrow().iter().map(|v| v.to_display_string()).collect()
+                } else {
+                    return Err("omc_llm_self_instantiate: context must be a string array".to_string());
+                };
+                let specs = crate::onn::self_instantiate(&messages, &task);
+                std::fs::create_dir_all(&base_dir).map_err(|e|
+                    format!("omc_llm_self_instantiate: mkdir {}: {}", base_dir, e))?;
+                let mut manifest: Vec<Value> = Vec::with_capacity(specs.len());
+                for s in &specs {
+                    // Each specialist gets a derived sender_id so the
+                    // orchestrator can tell them apart.
+                    let specialist_id = base_sender.wrapping_add(s.fold_index as i64);
+                    // The prompt embeds the specialist's state + the
+                    // task hint so the spawned LLM has context.
+                    let prompt = format!(
+                        "[Self-instantiated specialist {}/{}]\n\
+                         Task: {}\n\
+                         Inherited geometric state:\n\
+                         - mu (mean φ-resonance): {:.6}\n\
+                         - sigma: {:.6}\n\
+                         - dominant_attractor: {}\n\
+                         - wave_amplitude: {:.6}\n\
+                         - items_in_slice: {}\n\n\
+                         Your slice of input:\n{}\n",
+                        s.fold_index + 1, specs.len(), task,
+                        s.mu, s.sigma, s.dominant_attractor,
+                        s.wave_amplitude, s.item_count, s.summary
+                    );
+                    let canon = crate::canonical::canonicalize(&prompt)
+                        .unwrap_or_else(|_| prompt.clone());
+                    let hash = crate::tokenizer::fnv1a_64(canon.as_bytes());
+                    let h = HInt::new(hash);
+                    let (attractor, _) = crate::phi_pi_fib::nearest_attractor_with_dist(hash);
+                    let moduli = crate::tokenizer::CRT_MODULI;
+                    let streams = [
+                        base_sender.rem_euclid(moduli[0]),
+                        1i64.rem_euclid(moduli[1]),  // kind=1 (request)
+                        hash.rem_euclid(moduli[2]),
+                    ];
+                    let packed = crate::tokenizer::crt_pack(&streams, moduli).unwrap_or(0);
+                    let mut msg = std::collections::BTreeMap::new();
+                    msg.insert("content".to_string(), Value::String(prompt));
+                    msg.insert("sender_id".to_string(), Value::HInt(HInt::new(base_sender)));
+                    msg.insert("target_id".to_string(), Value::HInt(HInt::new(specialist_id)));
+                    msg.insert("kind".to_string(), Value::HInt(HInt::new(1)));
+                    msg.insert("content_hash".to_string(), Value::HInt(HInt::new(hash)));
+                    msg.insert("resonance".to_string(), Value::HFloat(h.resonance));
+                    msg.insert("him_score".to_string(), Value::HFloat(h.him_score));
+                    msg.insert("attractor".to_string(), Value::HInt(HInt::new(attractor)));
+                    msg.insert("packed".to_string(), Value::HInt(HInt::new(packed)));
+                    let msg_value = Value::dict_from(msg);
+                    let wire = serde_json::to_string(&crate::interpreter::value_to_json(&msg_value))
+                        .unwrap_or_default();
+                    let path = format!("{}/specialist_{:02}.json", base_dir, s.fold_index);
+                    std::fs::write(&path, wire).map_err(|e|
+                        format!("omc_llm_self_instantiate: write {}: {}", path, e))?;
+                    // Manifest entry.
+                    let mut manifest_entry = std::collections::BTreeMap::new();
+                    manifest_entry.insert("specialist_id".to_string(),
+                        Value::HInt(HInt::new(specialist_id)));
+                    manifest_entry.insert("prompt_path".to_string(), Value::String(path));
+                    manifest_entry.insert("fold_index".to_string(),
+                        Value::HInt(HInt::new(s.fold_index as i64)));
+                    manifest_entry.insert("mu".to_string(), Value::HFloat(s.mu));
+                    manifest_entry.insert("sigma".to_string(), Value::HFloat(s.sigma));
+                    manifest_entry.insert("dominant_attractor".to_string(),
+                        Value::HInt(HInt::new(s.dominant_attractor)));
+                    manifest_entry.insert("item_count".to_string(),
+                        Value::HInt(HInt::new(s.item_count as i64)));
+                    manifest.push(Value::dict_from(manifest_entry));
+                }
+                Ok(Value::Array(HArray::from_vec(manifest)))
+            }
             // omc_prompt_agent(target_id, prompt, sender_id, channel_dir?)
             //   — write a signed message to target_id's inbox file.
             //     Returns the packed message ID. Caller polls for response