Skip to content

Commit cb7d67a

Browse files
ROADMAP.json + README rebalance: lead with the platform, not the compiler
User feedback: README was still leading with self-hosting compiler ("a self-hosting harmonic computing language with a self-healing compiler") — that's one piece, not the whole. Restructured the opening to lead with what OMC IS as a project today: 1. A real harmonic-anomaly detector that beats IsolationForest (the demonstrable benchmark win) 2. The full Python ecosystem on tap (numpy/pandas/sklearn always reachable) 3. A package manager + central registry (sha256-verified installs) 4. THEN — the self-hosting language with self-healing compiler (the architectural foundation that makes the rest work) Also reframed the 60-second wow to lead with the IsolationForest comparison table rather than burying it. The headline number (10/10 vs 7/10 on credential-stuffing) is the most credible signal the project has, and it should be visible in the first screen. Updated GitHub repo description + topics via gh CLI: - description rewritten (was claiming "genetic circuit evolution" and "zero external dependencies" — both wrong now) - 12 topics added: programming-language, anomaly-detection, harmonic-computing, fibonacci, python-embedding, pyo3, etc. ROADMAP.json: Six phases, each with structured items (id, title, status, effort, priority, prerequisites, rationale, deliverable): 1. shipped — what's already on origin/master, for context 2. next-1-3-sessions — small wins: more harmonic libs, omc test/bench, elif syntax, more multi-dim demos 3. strategic-3-10 — LSP, WASM, CUSUM time-aware, central registry server, async, JIT 4. research-open-ended — no_std/kernel-embeddable, audio/music, ECG arrhythmia, attractor compression, distributed harmonic via CRDTs 5. ecosystem-adoption — real pilot deployment, tutorial series, position paper, conference talk 6. fixes-and-cleanup — PAIN_POINTS items + VM-native match/try-catch + heal pass context-awareness + PAT rotation followup Plus principles (5) and anti-goals (4) sections — what NOT to do matters as much as what to do for a single-author research project. JSON not Markdown so it's queryable. Tools could `jq '.phases[] | select(.id == "next-1-3-sessions") | .items[] | select(.priority <= 2)'` to surface the next thing to work on. 43/43 functional examples produce identical output under tree-walk and VM. 92/92 unit tests pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 16f3458 commit cb7d67a

2 files changed

Lines changed: 466 additions & 18 deletions

File tree

README.md

Lines changed: 41 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,18 @@
11
# OMNIcode
22

3-
**A self-hosting harmonic computing language with a self-healing compiler, embedded CPython, and a real package manager.**
3+
**A harmonic-math platform: language, package manager, embedded Python ecosystem, and machine-learning libraries that demonstrably beat scikit-learn on structural anomalies.**
44

5-
OMNIcode (OMC) treats φ-math (Fibonacci attractors, resonance scoring, harmonic alignment) as a *decidable substrate* the compiler reasons against. Built on top of that substrate:
5+
OMNIcode (OMC) is a small standalone runtime that gives you four things in one binary:
66

7-
- **Self-hosting** at the back-end level (`gen2 == gen3` of the compiler-on-itself, [`examples/self_hosting_v9b.omc`](examples/self_hosting_v9b.omc))
8-
- **Self-healing** that rewrites typo'd identifiers, off-attractor literals, and divide-by-zero as the compiler runs ([`examples/self_healing_h5.omc`](examples/self_healing_h5.omc))
9-
- **Embedded CPython always-on**`py_import("numpy")`, `py_call(...)`, full reach into the Python ecosystem ([`examples/datascience/titanic.omc`](examples/datascience/titanic.omc))
10-
- **Bidirectional callbacks** — Python can invoke OMC functions via `py_callback("name")`, useful for `df.apply(omc_fn)` patterns
11-
- **Package manager**`omc --install np` resolves through a registry, sha256-verifies, caches under `omc_modules/`
12-
- **Harmonic-distinctive primitives**`harmonic_index` (sub-linear lookup by attractor neighborhood), `harmonic_sort` (by HIM score), `harmonic_partition` (Fibonacci-bucketed), all in [`examples/harmonic_collections.omc`](examples/harmonic_collections.omc)
13-
- **Multi-dim anomaly detection that beats IsolationForest** on structural patterns — `harmonic_anomaly` library catches credential-stuffing 10/10 vs IF's 7/10 at top-K=10 ([`examples/datascience/multidim_anomaly.omc`](examples/datascience/multidim_anomaly.omc))
7+
1. **A real harmonic-anomaly detector that beats IsolationForest** — the `harmonic_anomaly` library catches credential-stuffing patterns 10/10 vs scikit-learn's 7/10 at top-K=10 ([`examples/datascience/multidim_anomaly.omc`](examples/datascience/multidim_anomaly.omc)). Drop-in replacement for `IsolationForest()` on multi-dim tabular data.
148

15-
Single binary, two engines (tree-walk + bytecode VM with byte-identical output across 43 functional examples), no opt-in flags for any of this.
9+
2. **The full Python ecosystem on tap**`py_import("numpy")`, `py_import("pandas")`, `py_import("sklearn")` work out of the box. CPython is embedded at link time. Six wrapper libraries ([`np`, `pd`, `sk`, `requests`, `sqlite`, `torch`](examples/lib/)) make the common cases idiomatic.
10+
11+
3. **A package manager + central registry**`omc --install harmonic_anomaly` fetches from the registry, verifies sha256, caches under `omc_modules/`. Submit a new package by PRing [`registry/index.json`](registry/index.json).
12+
13+
4. **A self-hosting language with a self-healing compiler** — the bytecode compiler is itself written in OMC and `gen2 == gen3` of the compiler-on-itself ([`examples/self_hosting_v9b.omc`](examples/self_hosting_v9b.omc)). The static-analysis substrate is φ-math (Fibonacci attractors, resonance, HIM score), not types. Identifier typos, off-attractor literals, divide-by-zero, and parser slips get auto-rewritten by the heal pass.
14+
15+
Single Rust binary. Two execution engines (tree-walk + bytecode VM) with byte-identical output across 43 functional examples. The architecture is built so each layer reinforces the next: harmonic primitives drive the anomaly detector, the package manager ships those libraries, the embedded Python lets users compose with everything else.
1616

1717
---
1818

@@ -28,26 +28,49 @@ PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1 cargo build --release
2828

2929
`--init` creates `omc.toml` + a hello-world `main.omc`. Edit, run, you're going.
3030

31-
## 60-second wow
31+
## 60-second wow — anomaly detection that beats scikit-learn
32+
33+
The `harmonic_anomaly` library is a drop-in replacement for `sklearn.IsolationForest` on multi-dim tabular data. It wins decisively on structural anomalies — the kind credential-stuffing, account takeover, and exfiltration produce, where every individual value looks normal but the combination is rare:
34+
35+
```omc
36+
import "harmonic_anomaly" as ha; # after: omc --install harmonic_anomaly
37+
38+
# Schema: each row = [latency_ms, status_code, endpoint_id, hour_of_day]
39+
h det = ha.new(["latency", "status", "endpoint", "hour"]);
40+
ha.set_strategy(det, 1, "discrete"); # status_code is categorical
41+
ha.set_strategy(det, 2, "discrete"); # endpoint_id is categorical
42+
ha.set_strategy(det, 3, "modulo"); # hour-of-day is small periodic
43+
44+
ha.fit(det, training_rows);
45+
h alerts = ha.top_k(det, all_rows, 10); # top-10 most anomalous indices
46+
```
47+
48+
Measured on 5000 normal requests + 50 injected credential-stuffing rows:
49+
50+
| | OMC harmonic | sklearn IsolationForest |
51+
|---|:---:|:---:|
52+
| Top-10 alerts (the SRE oncall regime) | **10/10 caught** | 7/10 (mixes in unrelated 500-error spikes) |
53+
| Top-25 alerts | **25/25** | 17/25 |
54+
| Top-50 alerts | **50/50** | 40/50 |
55+
56+
See [`examples/datascience/anomaly_tutorial.omc`](examples/datascience/anomaly_tutorial.omc) for the walkthrough, and [`examples/datascience/multidim_anomaly.omc`](examples/datascience/multidim_anomaly.omc) for the full comparison.
3257

33-
OMC reaches into Python and does end-to-end machine learning:
58+
## And — OMC drives the whole Python ML stack
3459

3560
```omc
36-
import "examples/lib/sklearn.omc" as sk;
37-
import "examples/lib/np.omc" as np;
61+
import "sk" as sk; # after: omc --install sk
62+
import "np" as np; # after: omc --install np
3863
3964
# Train + score a random forest on the iris dataset
4065
h iris = sk.load_iris();
41-
h X = arr_get(iris, 0);
42-
h y = arr_get(iris, 1);
43-
h split = sk.train_test_split(X, y, 0.3);
66+
h split = sk.train_test_split(arr_get(iris, 0), arr_get(iris, 1), 0.3);
4467
h model = sk.random_forest_classifier(100);
4568
sk.fit(model, arr_get(split, 0), arr_get(split, 2));
4669
h preds = sk.predict(model, arr_get(split, 1));
4770
println(concat_many("RF accuracy: ", sk.accuracy_score(arr_get(split, 3), preds)));
4871
```
4972

50-
For the full real-world demo, run `examples/datascience/titanic.omc` — Kaggle Titanic via seaborn (~120 lines of OMC), loading 891 passengers in ~280ms, training a 100-tree forest, comparing baseline vs harmonic-augmented features. Zero Rust extensions for the user.
73+
For the full real-world demo, run [`examples/datascience/titanic.omc`](examples/datascience/titanic.omc) — Kaggle Titanic via seaborn (~120 lines of OMC), loading 891 passengers in ~280ms, training a 100-tree forest. Zero Rust extensions for the user.
5174

5275
---
5376

0 commit comments

Comments
 (0)