SEVO (Sacred EVolutionary Objective Search) is a population-based evolutionary training system that mutates not only scalar hyperparameters (learning rate, gradient clipping, warmup) but the training objective itself (NTP, JEPA, NCA, hybrid) across a population of workers. SEVO combines three mechanisms — ASHA successive halving, PBT hyperparameter mutation, and diversity-preserving objective injection — under sacred 3^k constraints derived from ternary resonance theory. Implemented in 6,700+ lines of pure Zig with zero dependencies, SEVO orchestrates 108 concurrent workers on commodity CPU infrastructure at $120/month total cost.
The name SEVO is clean — no conflicts exist in the ML/AI namespace.
Population: N workers {w_1, ..., w_N}, each with configuration:
config_i = (objective, lr, gc, warmup, context, schedule, batch)
Where:
objective∈ {NTP, NTP-81+, JEPA, hybrid} — the training loss functionlr∈ phi-grid: {base × φ^p | p ∈ {-3, -2, -1, 0, 1, 2, 3}}gc∈ {0.5, 1.0, 2.0} — gradient clipping thresholdwarmup∈ {243, 729, 2187} — sacred warmup steps (3^5, 3^6, 3^7)context∈ {27, 81, 243} — sacred context lengths (3^3, 3^4, 3^5)schedule= cosine (ONLY — flat schedule banned)batch∈ {27, 54, 66} — batch sizes
Evolution loop: At each SACRED_RUNG (3^k steps: 2187, 6561, 19683, 59049):
- Evaluate: collect PPL from all workers at rung
- Halve (ASHA): kill bottom fraction by PPL
- Mutate (PBT): top workers → copy weights → mutate config (including objective)
- Inject: compute quota deficit → inject new workers for underrepresented objectives
- Fresh detect: objective/context changed → FRESH=1, scalar-only → FRESH=0
| Dimension | PBT (Jaderberg 2017) | ASHA (Li 2020) | NAS (Zoph 2017) | SEVO |
|---|---|---|---|---|
| Mutates LR/batch | Yes | No | No | Yes |
| Mutates objective | No | No | No | Yes |
| Copies best weights | Yes | No | No | Yes |
| Successive halving | No | Yes | No | Yes |
| Diversity quotas | No | No | No | Yes |
| Sacred 3^k constraints | No | No | No | Yes |
| Fresh/resume detection | No | No | No | Yes |
Key insight: PBT searches a hyperparameter hypercube; SEVO searches a hyperparameter × objective product space with diversity enforcement.
SEVO(population_size=108, rungs=[2187, 6561, 19683, 59049]):
# Initialize population with quota-balanced objectives
population = inject_batch(population_size, DEFAULT_QUOTAS)
for rung in rungs:
# Wait for all workers to reach rung
wait_until(all workers at step >= rung)
# Collect fitness (PPL) from all workers
fitness = {w: w.ppl for w in population}
# ASHA: successive halving — kill bottom fraction
survivors = top_fraction(population, by=fitness)
killed = population - survivors
# PBT: mutate surviving workers
for w in survivors:
if w in top_k:
w.config = mutateConfigSacred(w.config)
# May change objective: NTP → JEPA, etc.
# Injection: restore diversity quotas
deficit = computeQuotaDeficit(survivors, DEFAULT_QUOTAS)
new_workers = inject_batch(deficit, quotas)
for w in new_workers:
w.ppl = 0.0 # Sentinel: prevents immediate re-kill
# Fresh detection
for w in mutated_or_new:
if w.obj_changed or w.ctx_changed:
w.FRESH = 1 # Restart from scratch
else:
w.FRESH = 0 # Resume from checkpoint
population = survivors + new_workers
return best(population, by=ppl)
A method for training neural networks comprising: (a) maintaining a population of N training workers, each with a configuration including a training objective field; (b) the training objective field accepting values from a set of distinct loss functions {NTP, JEPA, NCA, hybrid}; (c) during evolutionary mutation, modifying the training objective field of a worker configuration, thereby changing the loss function used for subsequent training; (d) selecting objective values according to predefined diversity quotas to maintain population diversity across objective types.
A method for maintaining population diversity in evolutionary training comprising: (a) detecting underrepresented training objective types via quota deficit computation; (b) injecting new workers with underrepresented objectives; (c) assigning a sentinel fitness value (PPL=0.0) to newly injected workers; (d) the sentinel value preventing freshly injected workers from being eliminated as worst-performing in the subsequent evolution cycle.
A method for determining training continuation strategy after evolutionary mutation comprising: (a) comparing the mutated configuration with the parent configuration; (b) if the training objective or context length has changed, setting FRESH=1 to trigger training restart from random initialization; (c) if only scalar hyperparameters (learning rate, gradient clipping, warmup) have changed, setting FRESH=0 to enable training resumption from the parent checkpoint; (d) thereby minimizing wasted compute by reusing checkpoints when mutation is compatible.
A method for constraining the hyperparameter search space in evolutionary training comprising: (a) restricting evaluation checkpoints (rungs) to steps that are powers of 3; (b) constraining learning rates to a golden-ratio grid: base × φ^p for integer p; (c) constraining warmup steps and context lengths to powers of 3; (d) these constraints reducing the search space to values empirically correlated with resonant performance in ternary neural networks.
The method of Claims 1 and 4, further comprising: (a) successive halving (ASHA) at sacred rungs to eliminate underperforming workers; (b) population-based training (PBT) mutation of surviving workers' configurations; (c) objective-type injection based on quota deficit computation; wherein all three mechanisms operate in a single evolutionary loop at each sacred rung.
The method of Claim 2, wherein: (a) diversity health H is computed as the product of standard deviations across hyperparameter dimensions: H = ∏(σ_lr, σ_gc, σ_wu); (b) when H falls below a predefined threshold, automatic injection is triggered; (c) injection targets the objective type with the largest quota deficit.
The method of Claim 1, further comprising: (a) tracking generation number (G0, G1, G2, ...) for each worker through evolutionary mutations; (b) computing mutation yield as the fraction of mutated workers that improve PPL over their parent; (c) using mutation yield to adapt mutation aggressiveness in subsequent generations.
All SEVO experiments were conducted on a CPU-only cloud farm built from low-cost Railway accounts. Each account provides a small pool of commodity x86 CPU containers with no GPU or TPU accelerators. A total of six Railway accounts were used, each on a 20 USD/month plan, yielding an effective compute budget of ≈120 USD/month for the entire training farm.
The HSLM-1.95M ternary model (1.58 bits/parameter, ≈390 KB weights) was trained using SEVO across 104 concurrent workers scheduled over these CPU resources. Population-based evolution was orchestrated entirely through lightweight process scheduling and log-based control, without any dedicated cluster manager or GPU queueing system.
Key point: The CPU/Railway cost is an implementation advantage demonstrating SEVO's efficiency, NOT a patent claim. The algorithm itself is hardware-agnostic.
- YES: Objective mutation within PBT population — novel search dimension
- YES: Diversity-preserving injection with PPL=0.0 sentinel — prevents premature elimination
- YES: FRESH/RESUME detection — automatic restart vs resume decision
- YES: Sacred 3^k constraint system — empirically-grounded search space restriction
- YES: Combined ASHA + PBT + objective injection — three mechanisms in one loop
- YES: Selection metrics: diversity health H, mutation yield
- NO: Railway deployment details — infrastructure choice, not algorithm
- NO: 6 accounts at $20/month — commercial arrangement
- NO: CPU-farm specifics — hardware platform, not method
- NO: Zig implementation — language choice
The engineering context is valuable for the paper narrative (accessibility, reproducibility, cost-efficiency) and go-to-market story, but patent claims should focus on the algorithmic innovations.
Filing: P5 — SEVO Sacred EVolutionary Objective Search Priority: HIGH Estimated cost: $1,500–2,000 (provisional patent application) Priority date: Establishes priority on the entire SEVO method
- Objective mutation in PBT (Claim 1)
- Diversity injection with sentinel (Claim 2)
- FRESH/RESUME detection (Claim 3)
- Sacred 3^k constraints (Claim 4)
- Combined ASHA+PBT+objective (Claim 5)
| Factor | Assessment |
|---|---|
| Novelty | No prior PBT system mutates training objective |
| Non-obviousness | Objective mutation requires FRESH detection — not trivial extension |
| Experimental evidence | 108 workers, 6 accounts, G0→G3 evolution, PPL=4.6 best |
| Prior art gap | PBT (US12314856B2) mutates scalars only; NAS searches architecture not objective |
| 101 risk | LOW — concrete algorithm with measurable metrics, not abstract idea |
| Reproducibility | Open source (MIT), fully specified in 6,700 LOC |
- Closed-loop integration with Arena for quality-aware evolution: using ELO-based model evaluation scores as an additional fitness signal beyond perplexity
- Integration with autonomous code improvement (Ouroboros) for end-to-end self-improvement: code quality metrics feeding back into SEVO population decisions
- File provisional: establishes priority date immediately
- Convert to utility patent within 12 months
- Consider PCT filing for international protection