Replaces internal Gitea URLs (gitea.neuralempowerment.xyz,
git-ssh.neuralempowerment.xyz) with the public GitHub URLs.
Removes personal email from Cargo.toml authors. Applies cargo fmt
across the codebase. Adds markdown-explorer/ to .gitignore.
A follow-up filter-repo pass scrubs PII from past commit blobs.
1. `storage::load_examples` reads `routes.jsonl`; `embed_examples` produces `EmbeddedExample`s. Same for `hard_negatives.jsonl` → `EmbeddedHardNegative`.
44
44
2. `embedding::EmbeddingProvider` (trait) embeds the input; `normalize` scales each vector to unit length, so cosine similarity reduces to a dot product.
45
45
3. `scoring::score_routes` groups example similarities by route, averages the top-K, then subtracts a penalty for nearby hard negatives (`hard_negative_penalty` × max sim to any hard-negative).
46
-
4.`decision::make_decision` applies `minimum_score` and `minimum_margin` from config and emits a `RouteDecision` with status (`accepted` / `ambiguous` / `below_threshold` / `needs_review`) and candidate scores. semrouter is a pure classifier — risk assessment and confirmation gating belong in the consumer's plugin layer, not here.
46
+
4. `decision::make_decision` applies `minimum_score` and `minimum_margin` from config and emits a `RouteDecision` with status (`accepted` / `ambiguous` / `below_threshold` / `needs_review`) and candidate scores. semrouter is a pure classifier: risk assessment and confirmation gating belong in the consumer's plugin layer, not here.
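
Steps 2 and 3 above can be sketched roughly as follows. This is an illustration only, not the crate's `scoring::score_routes`: the function name `score_routes_sketch`, the `(route, embedding)` tuple layout, and the choice to match hard negatives to routes by label are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Scale a vector to unit length so cosine similarity reduces to a dot product.
fn normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Hypothetical scoring sketch. `examples` and `hard_negatives` are
/// (route label, unit-normalized embedding) pairs; `query` is unit-normalized.
fn score_routes_sketch(
    query: &[f32],
    examples: &[(&str, Vec<f32>)],
    hard_negatives: &[(&str, Vec<f32>)],
    top_k: usize,
    hard_negative_penalty: f32,
) -> Vec<(String, f32)> {
    // Group example similarities by route.
    let mut by_route: HashMap<&str, Vec<f32>> = HashMap::new();
    for (route, emb) in examples {
        by_route.entry(*route).or_default().push(dot(query, emb));
    }

    let mut scores: Vec<(String, f32)> = by_route
        .into_iter()
        .map(|(route, mut sims)| {
            // Average the top-K similarities for this route.
            sims.sort_by(|a, b| b.total_cmp(a));
            sims.truncate(top_k.max(1));
            let mean = sims.iter().sum::<f32>() / sims.len() as f32;

            // Assumption: the penalty uses the closest hard negative labeled
            // with this route. Starting the fold at 0.0 means a dissimilar
            // hard negative never turns into a bonus.
            let max_neg = hard_negatives
                .iter()
                .filter(|(r, _)| *r == route)
                .map(|(_, emb)| dot(query, emb))
                .fold(0.0_f32, f32::max);

            (route.to_string(), mean - hard_negative_penalty * max_neg)
        })
        .collect();

    // Best candidate first.
    scores.sort_by(|a, b| b.1.total_cmp(&a.1));
    scores
}
```

Because every vector is unit length, `dot` is the cosine similarity, which keeps the hot path to plain multiply-adds.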
47
47
48
48
**Embedders** (`src/embedding.rs`):
49
-
-`MockEmbedder` — 64-dim keyword-bag, deterministic, no network. Used by tests and as CLI default. Score range ~0.25–0.60.
50
-
-`FastEmbedEmbedder` — local ONNX `AllMiniLML6V2` via `fastembed` crate, 384-dim. Caches model under `.fastembed_cache/`. Score range ~0.22–0.62.
51
-
-`HttpEmbedder` — OpenAI-compatible `/v1/embeddings`. Endpoint from `[embedding].endpoint` in `router.toml` or `OPENAI_BASE_URL` env var.
49
+
- `MockEmbedder`: 64-dim keyword-bag, deterministic, no network. Used by tests and as CLI default. Score range ~0.25–0.60.
50
+
- `FastEmbedEmbedder`: local ONNX `AllMiniLML6V2` via `fastembed` crate, 384-dim. Caches model under `.fastembed_cache/`. Score range ~0.22–0.62.
51
+
- `HttpEmbedder`: OpenAI-compatible `/v1/embeddings`. Endpoint from `[embedding].endpoint` in `router.toml` or `OPENAI_BASE_URL` env var.
52
52
53
-
Thresholds in `router.toml` are tuned **per embedder**. The committed values target `fastembed` (`minimum_score = 0.22`, `minimum_margin = 0.005`); for `mock` use ~0.25 / 0.04. Changing embedder usually means re-tuning thresholds and re-running `eval`.
53
+
Thresholds in `router.toml` are tuned **per embedder**. The committed values target `fastembed` (`minimum_score = 0.22`, `minimum_margin = 0.005`). For `mock`, use ~0.25 / 0.04. Changing embedder usually means re-tuning thresholds and re-running `eval`.
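
To see why the two thresholds need re-tuning together, here is a hedged sketch of how `minimum_score` and `minimum_margin` might gate a decision. The `Status` enum and `decide` function are hypothetical stand-ins, not the crate's `decision::make_decision`; the order of the checks is an assumption, and `needs_review` is left out because the source does not say what triggers it.

```rust
/// Hypothetical status enum for this sketch (the crate's real type is `DecisionStatus`).
#[derive(Debug, PartialEq)]
enum Status {
    Accepted,
    Ambiguous,
    BelowThreshold,
}

/// `scores` must be sorted in descending order (best candidate first).
fn decide(scores: &[f32], minimum_score: f32, minimum_margin: f32) -> Status {
    let top = match scores.first() {
        Some(&s) => s,
        None => return Status::BelowThreshold, // no candidates at all
    };
    // Assumed order: check the absolute score before the margin.
    if top < minimum_score {
        return Status::BelowThreshold;
    }
    // Margin between the best and the runner-up; a lone candidate has no rival.
    match scores.get(1) {
        Some(&second) if top - second < minimum_margin => Status::Ambiguous,
        _ => Status::Accepted,
    }
}
```

With the `fastembed` values (0.22 / 0.005), scores of `[0.30, 0.28]` would be accepted under this sketch; with the `mock` values (0.25 / 0.04) the same pair would be ambiguous, which is why switching embedders without re-running `eval` can silently change outcomes.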
54
54
55
55
**Eval / experiments** (`src/eval.rs`, `src/experiment.rs`): `eval` command computes accuracy, top-2 accuracy, per-route precision/recall/F1, and confusion pairs against `eval.jsonl`. `--save-experiment` writes a timestamped JSON snapshot (config + metrics + embedder label) into `experiments/` for cross-run comparison.
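
As a rough illustration of the metrics named above, the sketch below computes accuracy and per-route precision/recall/F1 from `(expected, predicted)` label pairs. The types and function names are hypothetical and are not taken from `src/eval.rs`; top-2 accuracy and confusion pairs are omitted.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Default)]
struct Counts {
    true_pos: u32,
    false_pos: u32,
    false_neg: u32,
}

#[derive(Debug)]
struct RouteMetrics {
    precision: f64,
    recall: f64,
    f1: f64,
}

/// Division that treats an empty denominator as 0.0 instead of NaN.
fn ratio(num: u32, den: u32) -> f64 {
    if den == 0 {
        0.0
    } else {
        f64::from(num) / f64::from(den)
    }
}

/// Fraction of pairs where the predicted route equals the expected route.
fn accuracy(pairs: &[(&str, &str)]) -> f64 {
    if pairs.is_empty() {
        return 0.0;
    }
    let correct = pairs.iter().filter(|(e, p)| e == p).count();
    correct as f64 / pairs.len() as f64
}

/// Per-route precision, recall, and F1 from (expected, predicted) pairs.
fn per_route_metrics(pairs: &[(&str, &str)]) -> BTreeMap<String, RouteMetrics> {
    let mut counts: BTreeMap<&str, Counts> = BTreeMap::new();
    for &(expected, predicted) in pairs {
        if expected == predicted {
            counts.entry(expected).or_default().true_pos += 1;
        } else {
            // A miss counts against recall of the expected route
            // and against precision of the predicted route.
            counts.entry(expected).or_default().false_neg += 1;
            counts.entry(predicted).or_default().false_pos += 1;
        }
    }
    counts
        .into_iter()
        .map(|(route, c)| {
            let precision = ratio(c.true_pos, c.true_pos + c.false_pos);
            let recall = ratio(c.true_pos, c.true_pos + c.false_neg);
            let f1 = if precision + recall == 0.0 {
                0.0
            } else {
                2.0 * precision * recall / (precision + recall)
            };
            (route.to_string(), RouteMetrics { precision, recall, f1 })
        })
        .collect()
}
```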
56
56
@@ -60,12 +60,12 @@ Thresholds in `router.toml` are tuned **per embedder**. The committed values tar
There is no `feedback.jsonl` / `decisions.jsonl` / `index/` yet — those are reserved for later phases (see README "Implementation Phases").
68
+
There is no `feedback.jsonl` / `decisions.jsonl` / `index/` yet; those are reserved for later phases (see README "Implementation Phases").
69
69
70
70
## Testing principles
71
71
@@ -76,4 +76,8 @@ There is no `feedback.jsonl` / `decisions.jsonl` / `index/` yet — those are re
76
76
77
77
- Errors flow through `RouterError` (`src/error.rs`); `main.rs` prints and `exit(1)`s on each command boundary.
78
78
- All vectors are unit-normalized before scoring. If you add a new embedder, normalize in the provider or rely on the `normalize()` call in `SemanticRouter::route`.
79
-
-`storage::load_examples` skips blank lines and surfaces parse errors with line numbers — keep that behavior when extending JSONL formats.
79
+
- `storage::load_examples` skips blank lines and surfaces parse errors with line numbers; keep that behavior when extending JSONL formats.
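
The loader behavior described in the last bullet (skip blank lines, report the failing line number) could look roughly like this. `LineError` and `load_records` are hypothetical names for this sketch; the real loader reports through `RouterError`, and the per-line parser in practice would presumably be something like `serde_json::from_str` rather than the generic closure used here.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
struct LineError {
    line: usize, // 1-based line number within the file
    message: String,
}

/// Parse one record per non-blank line, stopping at the first failure.
fn load_records<T, F>(contents: &str, parse: F) -> Result<Vec<T>, LineError>
where
    F: Fn(&str) -> Result<T, String>,
{
    let mut records = Vec::new();
    for (idx, raw) in contents.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() {
            // Blank lines are skipped but still counted, so reported
            // line numbers match what an editor shows.
            continue;
        }
        let record = parse(line).map_err(|message| LineError {
            line: idx + 1,
            message,
        })?;
        records.push(record);
    }
    Ok(records)
}
```

Counting blank lines before skipping them is the detail worth preserving: an error on the fourth physical line should say line 4 even if line 2 was empty.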
80
+
81
+
## Writing style
82
+
83
+
**No em dashes (—) anywhere in source, docs, or commit messages.** Restructure the sentence instead: use a colon for an explanation, a comma for a brief aside, parentheses for a parenthetical, or split into two sentences. This applies to all Markdown, Rust doc comments, code comments, README, CHANGELOG, ADRs, and PR descriptions. En dashes (–) in numeric ranges like `0.25–0.60` are fine.
A lightweight, file-based semantic router for agent / model / workflow dispatch. Routes input text to a labeled route by comparing embeddings against a curated set of examples. **Zero default dependencies beyond `serde`, `serde_json`, `toml`, and `thiserror`**— bundle a local embedder via the `fastembed` feature, or bring your own. No LLM in the hot path. Sub-millisecond routing.
11
+
A lightweight, file-based semantic router for agent / model / workflow dispatch. Routes input text to a labeled route by comparing embeddings against a curated set of examples. **Zero default dependencies beyond `serde`, `serde_json`, `toml`, and `thiserror`.** Bundle a local embedder via the `fastembed` feature, or bring your own. No LLM in the hot path. Sub-millisecond routing.
12
12
13
13
<p align="center">
14
14
<img src="https://raw.githubusercontent.com/AgentParadise/semrouter/main/assets/flow-diagram.svg" alt="semrouter routing pipeline: input text → embed → cosine vs. examples → top-K per route → threshold + margin → decision" />
15
15
</p>
16
16
17
17
## Why
18
18
19
-
If you're building an AI agent, voice assistant, or workflow system, you need to dispatch user input to one of N specialized handlers. The naive options — keyword matching (brittle), LLM classifier (slow, expensive, cloud round-trip) — both have real costs. semrouter splits the difference: a tiny local embedding model gives you semantic understanding, and a flat file of labeled examples gives you a router you can edit and version-control.
19
+
If you're building an AI agent, voice assistant, or workflow system, you need to dispatch user input to one of N specialized handlers. The naive options, keyword matching (brittle) and LLM classifier (slow, expensive, cloud round-trip), both have real costs. semrouter splits the difference: a tiny local embedding model gives you semantic understanding, and a flat file of labeled examples gives you a router you can edit and version-control.
20
20
21
-
semrouter is a **pure classifier**. Risk classification, confirmation prompts, and dispatch live in your application — they don't belong in the router. This separation keeps risk policies next to the code that actually runs the dangerous thing.
21
+
semrouter is a **pure classifier**. Risk classification, confirmation prompts, and dispatch live in your application; they don't belong in the router. This separation keeps risk policies next to the code that actually runs the dangerous thing.
22
22
23
23
## Install
24
24
25
-
**Batteries-included (default — bundles fastembed local embedder):**
25
+
**Batteries-included (default, bundles fastembed local embedder):**
26
26
27
27
```toml
28
28
[dependencies]
29
29
semrouter = "0.1"
30
30
```
31
31
32
-
**Lean (lib only — bring your own `EmbeddingProvider`):**
32
+
**Lean (lib only, bring your own `EmbeddingProvider`):**
33
33
34
34
```toml
35
35
[dependencies]
@@ -125,7 +125,7 @@ impl EmbeddingProvider for OpenAIEmbedder {
125
125
}
126
126
```
127
127
128
-
That's the full HTTP-embedder surface. semrouter doesn't ship one because every consumer wants different things from their HTTP client (retry, batching, observability) — pick yours.
128
+
That's the full HTTP-embedder surface. semrouter doesn't ship one because every consumer wants different things from their HTTP client (retry, batching, observability); pick yours.
129
129
130
130
## Decision shape
131
131
@@ -200,16 +200,16 @@ The 9.1% "incorrect" cases are correctly routed to the `direct_llm` fallback int
200
200
201
201
## Status
202
202
203
-
semrouter is **pre-1.0**. The public API surface is unstable — minor version bumps may include breaking changes. Pin to a specific version (`semrouter = "=0.1.1"`) for exact reproducibility.
203
+
semrouter is **pre-1.0**. The public API surface is unstable; minor version bumps may include breaking changes. Pin to a specific version (`semrouter = "=0.1.1"`) for exact reproducibility.
204
204
205
205
`v1.0.0` will freeze the API.
206
206
207
207
## Roadmap
208
208
209
-
-**v0.2.0** — Configurable embedder. Pick any fastembed-supported model from `router.toml` (`fastembed/AllMiniLML6V2`, `fastembed/BGESmallENV15`, `fastembed/MiniLML12V2`, etc.) with a tradeoff guide in docs.
210
-
-**v0.3.0** — Closed-loop learning. `semrouter tag` (interactive CLI to mark recent decisions correct/wrong) + `semrouter promote` (ingest tagged feedback as new routing examples + run `EvalSuite` to gate regression). The router gets better the more you use it.
211
-
-**v1.0.0** — API freeze + crates.io 1.0.
209
+
- **v0.2.0**: Configurable embedder. Pick any fastembed-supported model from `router.toml` (`fastembed/AllMiniLML6V2`, `fastembed/BGESmallENV15`, `fastembed/MiniLML12V2`, etc.) with a tradeoff guide in docs.
210
+
- **v0.3.0**: Closed-loop learning. `semrouter tag` (interactive CLI to mark recent decisions correct/wrong) + `semrouter promote` (ingest tagged feedback as new routing examples + run `EvalSuite` to gate regression). The router gets better the more you use it.
docs/ADRs/0001-zero-dependencies-goal.md
+5 -5 (5 additions & 5 deletions)
@@ -4,9 +4,9 @@
4
4
5
5
## Context
6
6
7
-
Rust crates that target the AI/agent ecosystem often pull in 200+ transitive dependencies — async runtimes, HTTP clients, JSON/YAML/TOML parsers, ML frameworks, etc. Each transitive dep is a build-time cost (cold builds, CI minutes), a security surface (supply-chain attacks, CVEs to track), and a friction point for downstream consumers (especially those targeting embedded, WASM, or constrained CI environments).
7
+
Rust crates that target the AI/agent ecosystem often pull in 200+ transitive dependencies: async runtimes, HTTP clients, JSON/YAML/TOML parsers, ML frameworks, etc. Each transitive dep is a build-time cost (cold builds, CI minutes), a security surface (supply-chain attacks, CVEs to track), and a friction point for downstream consumers (especially those targeting embedded, WASM, or constrained CI environments).
8
8
9
-
semrouter is a small library — at its core, it does:
9
+
semrouter is a small library. At its core, it does:
10
10
1. Read JSONL + TOML files
11
11
2. Compute cosine similarity between embedding vectors
12
12
3. Apply thresholds and emit a decision struct
@@ -17,7 +17,7 @@ The pre-v0.1.1 dep tree was 254 transitive crates, dominated by fastembed (190),
17
17
18
18
## Decision
19
19
20
-
**The default dep tree must stay minimal — under 25 transitive crates for `default-features = false` builds.** Every dependency must justify its existence:
20
+
**The default dep tree must stay minimal: under 25 transitive crates for `default-features = false` builds.** Every dependency must justify its existence:
21
21
22
22
1. **Mandatory deps** (`serde`, `serde_json`, `toml`, `thiserror`): the data model is JSONL + TOML, and consumers need typed errors. ~12 transitive crates total.
23
23
2. **Optional deps behind feature flags** (`fastembed`, `clap`): batteries-included for users who want them, opt-out for users who don't.
@@ -29,7 +29,7 @@ CI enforces this with a regression guard: the `lean-build` job runs `cargo tree
29
29
30
30
- **`anyhow`** in the library. Public APIs deserve typed errors (`thiserror::Error` enums). `anyhow` is fine in binaries; not in library code.
31
31
- **`chrono`** for timestamps. `std::time::SystemTime` plus a 60-line `civil_from_days` helper covers our needs (ISO-8601 + compact filename formats).
32
-
-**`reqwest` / `tokio`** for HTTP embedding. semrouter is not async; its hot path is dot-product math. A user wanting an HTTP-backed embedder implements the public `EmbeddingProvider` trait themselves — the surface is one method, easy to roll with `ureq` (~5 deps) or whatever client they prefer.
32
+
- **`reqwest` / `tokio`** for HTTP embedding. semrouter is not async; its hot path is dot-product math. A user wanting an HTTP-backed embedder implements the public `EmbeddingProvider` trait themselves: the surface is one method, easy to roll with `ureq` (~5 deps) or whatever client they prefer.
33
33
- **`async_trait`** (entire ecosystem). `EmbeddingProvider::embed` is sync. Async-in-sync via `tokio::runtime::Runtime::new().block_on(...)` is a smell; we removed it.
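
For readers wondering what replacing `chrono` with `SystemTime` plus a `civil_from_days` helper involves, here is a minimal sketch. It is not the crate's helper: it uses the widely known days-to-civil-date conversion for the proleptic Gregorian calendar, and `iso8601_utc` / `now_iso8601` are hypothetical names chosen for the example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Convert days since 1970-01-01 into (year, month, day) in the
/// proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // 0..=399
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    // January and February belong to the next civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Format seconds since the Unix epoch as an ISO-8601 UTC timestamp.
fn iso8601_utc(unix_secs: u64) -> String {
    let days = (unix_secs / 86_400) as i64;
    let rem = unix_secs % 86_400;
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        y,
        m,
        d,
        rem / 3_600,
        (rem % 3_600) / 60,
        rem % 60
    )
}

/// Current time as ISO-8601; falls back to the epoch if the clock is before 1970.
fn now_iso8601() -> String {
    let secs = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0);
    iso8601_utc(secs)
}
```

The trade-off is scope: this handles UTC only, with no time zones, leap seconds, or parsing, which is what timestamping experiment snapshots would appear to need.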
34
34
35
35
## Consequences
@@ -49,7 +49,7 @@ CI enforces this with a regression guard: the `lean-build` job runs `cargo tree
49
49
50
50
### Neutral
51
51
52
-
-`fastembed` (default-on, ~190 transitive deps) is the elephant in the room. It's there because most users want batteries-included local embeddings, but consumers who bring their own embedder pay zero cost for it via `default-features = false`. This is the only acceptable form of "fat" dep — opt-out, never opt-in-required.
52
+
- `fastembed` (default-on, ~190 transitive deps) is the elephant in the room. It's there because most users want batteries-included local embeddings, but consumers who bring their own embedder pay zero cost for it via `default-features = false`. This is the only acceptable form of "fat" dep: opt-out, never opt-in-required.
docs/ADRs/0002-pure-classifier-architecture.md
+6 -6 (6 additions & 6 deletions)
@@ -4,7 +4,7 @@
4
4
5
5
## Context
6
6
7
-
Earlier semrouter prototypes tracked a "risk policy" — the router would emit a `policy` block on `RouteDecision` indicating whether a route required user confirmation, was high-risk, etc. The classification was driven by hardcoded substring matching against route names (`"execute_shell_command"` → high-risk, `"send_email"` → requires_confirmation, etc.).
7
+
Earlier semrouter prototypes tracked a "risk policy": the router would emit a `policy` block on `RouteDecision` indicating whether a route required user confirmation, was high-risk, etc. The classification was driven by hardcoded substring matching against route names (`"execute_shell_command"` → high-risk, `"send_email"` → requires_confirmation, etc.).
8
8
9
9
Two problems with this:
10
10
@@ -38,9 +38,9 @@ The risk logic sits **next to the dangerous code**, where the author cannot forg
38
38
39
39
### What semrouter still tracks
40
40
41
-
- Score thresholds (`minimum_score`, `minimum_margin` from `router.toml`) — these are about **classification confidence**, not policy.
42
-
- Hard negatives (counter-examples that penalize specific routes for specific inputs) — also classification, not policy.
43
-
- Latency metrics — observability of the classifier itself.
41
+
- Score thresholds (`minimum_score`, `minimum_margin` from `router.toml`): these are about **classification confidence**, not policy.
42
+
- Hard negatives (counter-examples that penalize specific routes for specific inputs): also classification, not policy.
43
+
- Latency metrics: observability of the classifier itself.
44
44
45
45
These all sit on the classification side of the line.
46
46
@@ -64,5 +64,5 @@ These all sit on the classification side of the line.
64
64
## References
65
65
66
66
- CHANGELOG.md v0.1.0 (the risk-policy removal, before public release)
67
-
-`src/decision.rs` — the `DecisionStatus` enum and `RouteDecision` struct (no `policy` field)
68
-
-`docs/integration-example.md` — the Plugin / Dispatcher pattern for consumers
67
+
- `src/decision.rs`: the `DecisionStatus` enum and `RouteDecision` struct (no `policy` field)
68
+
- `docs/integration-example.md`: the Plugin / Dispatcher pattern for consumers
docs/ADRs/0003-byo-embedder-trait.md
+6 -6 (6 additions & 6 deletions)
@@ -6,9 +6,9 @@
6
6
7
7
Pre-v0.1.1, semrouter shipped three embedder backends:
8
8
9
-
1.`MockEmbedder` — 64-dim keyword-bag, deterministic, zero deps. For testing.
10
-
2.`FastEmbedEmbedder` — local ONNX MiniLM via the `fastembed` crate. The recommended production embedder.
11
-
3.`HttpEmbedder` — OpenAI-compatible `/v1/embeddings` HTTP client built on `reqwest` + `tokio`.
9
+
1. `MockEmbedder`: 64-dim keyword-bag, deterministic, zero deps. For testing.
10
+
2. `FastEmbedEmbedder`: local ONNX MiniLM via the `fastembed` crate. The recommended production embedder.
11
+
3. `HttpEmbedder`: OpenAI-compatible `/v1/embeddings` HTTP client built on `reqwest` + `tokio`.
12
12
13
13
The HTTP embedder pulled in ~85 transitive crates (the entire async stack) for what is, in its core form, a 30-line HTTP request. Worse, it shipped a half-baked implementation: no connection pooling, no retry/backoff, no rate limiting, no observability hooks. Real HTTP-embedding consumers replace that on day one with their own implementation tuned for their service.
14
14
@@ -19,7 +19,7 @@ The HTTP embedder pulled in ~85 transitive crates (the entire async stack) for w
For any other backend — HTTP API, custom local model, candle-based embedder, GPU-accelerated ort directly — consumers implement the public `EmbeddingProvider` trait:
22
+
For any other backend (HTTP API, custom local model, candle-based embedder, GPU-accelerated ort directly), consumers implement the public `EmbeddingProvider` trait:
23
23
24
24
```rust
25
25
pub trait EmbeddingProvider: Send + Sync {
@@ -54,7 +54,7 @@ The full README has this example.
54
54
55
55
- semrouter's default dep tree drops by ~85 crates (no `reqwest`, no `tokio`).
56
56
- Consumers pick their own HTTP client (`ureq` for ~5 deps; `reqwest` for full bells and whistles; `hyper` for fanatics; `tokio` if they're already async).
57
-
- Consumers control retry, backoff, batching, observability — all the details that vary per service.
57
+
- Consumers control retry, backoff, batching, observability: all the details that vary per service.
58
58
- The `EmbeddingProvider` trait is sync, which keeps the library sync. Async creep is contained.
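
To make the bring-your-own-embedder idea concrete, here is a hedged sketch of a custom provider. The trait below is a local stand-in: only the `Send + Sync` bound and the sync, single-method shape come from this ADR, while the exact `embed` signature (and its `String` error type) is an assumption. `ByteBucketEmbedder` is a toy invented for the example, not something semrouter ships.

```rust
/// Local stand-in for semrouter's public trait. The method signature is
/// assumed; check the crate docs for the real one.
pub trait EmbeddingProvider: Send + Sync {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
}

/// Toy embedder: counts bytes into a fixed number of buckets, then
/// unit-normalizes. Deterministic and dependency-free, but not semantic.
pub struct ByteBucketEmbedder {
    pub dims: usize,
}

impl EmbeddingProvider for ByteBucketEmbedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        if self.dims == 0 {
            return Err("dims must be greater than zero".to_string());
        }
        let mut v = vec![0.0_f32; self.dims];
        for b in text.bytes() {
            v[b as usize % self.dims] += 1.0;
        }
        // Normalize in the provider so downstream scoring can use dot products.
        let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut v {
                *x /= norm;
            }
        }
        Ok(v)
    }
}

/// Callers can hold any provider behind a trait object.
fn embed_with(provider: &dyn EmbeddingProvider, text: &str) -> Result<Vec<f32>, String> {
    provider.embed(text)
}
```

An HTTP-backed provider would have the same shape: the body of `embed` would make a blocking request with whatever client the consumer prefers and map transport failures into the error type.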
59
59
60
60
### Negative
@@ -69,5 +69,5 @@ The full README has this example.