Crystalka228/Dual-Frontier


Dual Frontier

A falsifiable test of LLM-augmented systems engineering at one-person scale.


Claim under test

The hypothesis is that a single developer, in the role of contract architect, can build a non-trivial systems-software artifact through structured work with a tiered LLM pipeline, where the LLMs operate as executors inside contract boundaries rather than as substitutes for engineering judgement.

The claim is operationalized as the production of a moddable simulation engine with declared invariants — a multithreaded ECS, capability-based mod isolation, and a replaceable native core — built solo, with measured pipeline throughput and a recorded defect rate. The colony-simulator content sits on top of the engine as a realistic load; the engine exists to stress-test the methodology under non-trivial workload.

Falsifiability conditions

The claim is rejected if any of the following hold over a sustained development window:

  1. Defect rate. The shipped artifact accumulates production-class defects that the contract-and-test infrastructure was supposed to prevent. Current state: 0 known production bugs across the closed phases; full test counts and acceptance criteria are recorded in docs/ROADMAP.md.
  2. Architectural integrity. The architecture drifts under sustained activity — locked specifications stop reflecting the code, contracts weaken to accommodate executor limitations, or isolation guarantees erode. Current state: architectural decisions and their rejected alternatives are recorded in docs/architecture/ARCHITECTURE.md, docs/architecture/MOD_OS_ARCHITECTURE.md, and the normalization audit in docs/reports/NORMALIZATION_REPORT.md.
  3. Pipeline economics. The pipeline cannot sustain its own throughput under a fixed monthly subscription and spills into pay-as-you-go API consumption to keep moving. Current state: two consecutive weekly windows under different operational profiles converge on the same headroom band; measurements are recorded in docs/methodology/PIPELINE_METRICS.md §3.

Each condition has a documented source of truth; the present README does not restate the numbers.

Pipeline

The pipeline configures N agents in an architect-executor split with rigid contracts at boundaries: a human as direction owner; one or more LLM instances operating as architect (deliberation, brief authoring, QA review) and executor (mechanical application against authored briefs). The architect-executor split with contracts at boundaries is invariant across configurations; specific N, the boundary type between architect and executor (model-tier boundary, session-mode boundary, or mixed), and tier mix vary by pipeline configuration.

The agents do not communicate directly; coordination happens through LOCKED documents in the repository and through the human as session router.

Current configuration (v1.6, 2026-05-10). N=2: Crystalka (direction owner) plus a unified Claude Desktop session that switches between deliberation mode (chat interface, architectural decision recording per K8.0 / K-L3.1 / A'.0.7 precedent) and execution mode (Claude Code agent, autonomous tool-loop per A'.0.5 precedent). Boundary type: session-mode.

The v1.x era (Phase 0–8, ending 2026-05-09) used a model-tier boundary with N=4 (local quantized Gemma executor + cloud Sonnet prompt-generator + cloud Opus architect + human direction owner). The empirical record is preserved in docs/methodology/PIPELINE_METRICS.md with per-metric transferability annotations.

Full pipeline configuration, empirical task metrics, subscription headroom data, and reproducibility requirements are documented in docs/methodology/PIPELINE_METRICS.md. The full methodology is documented in docs/methodology/METHODOLOGY.md. The methodology and deeper documents are authored under an agent-as-primary-reader assumption: readers unfamiliar with the project's cross-reference density should use AI tooling to navigate the documentation corpus.

If a contract is rigid enough that an executor produces correct code under it on the first build at a measurable rate (target <30% requiring second execution), the contract will hold under any stronger executor or any restructured boundary type. Isolation from executor errors is a structural property of the contract, not of the executor's specific capacity.
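The re-execution target above reduces to simple arithmetic over task records. The following is a minimal sketch, assuming a hypothetical `TaskRecord` shape and made-up sample data; the project's actual measurements and record format live in docs/methodology/PIPELINE_METRICS.md and are not reproduced here.

```python
from dataclasses import dataclass

# Target from the text: fewer than 30% of tasks need a second execution.
SECOND_EXECUTION_TARGET = 0.30


@dataclass(frozen=True)
class TaskRecord:
    """Hypothetical record of one executor task (not the project's schema)."""
    task_id: str
    executions: int  # number of executions before the result was accepted


def second_execution_rate(tasks: list[TaskRecord]) -> float:
    """Fraction of tasks that needed more than one execution."""
    if not tasks:
        raise ValueError("no tasks recorded")
    retried = sum(1 for t in tasks if t.executions > 1)
    return retried / len(tasks)


def meets_target(tasks: list[TaskRecord]) -> bool:
    """True when the retry fraction is strictly below the 30% target."""
    return second_execution_rate(tasks) < SECOND_EXECUTION_TARGET


# Made-up sample data for illustration only.
sample = [
    TaskRecord("t1", 1),
    TaskRecord("t2", 1),
    TaskRecord("t3", 2),
    TaskRecord("t4", 1),
    TaskRecord("t5", 1),
]
print(second_execution_rate(sample))  # 0.2
print(meets_target(sample))           # True
```

The comparison is strict, so a window in which exactly 30% of tasks need a second execution counts as missing the target under this reading.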

What the engine demonstrates

The engine is the stress test. Without a non-trivial workload, the pipeline claim reduces to a statement about toy problems. The engine carries three properties that make the workload non-trivial:

  • A multithreaded ECS with declarative system access ([SystemAccess]), a Kahn-sorted dependency graph, and compile-time isolation enforcement. [SystemAccess] declarations are consumed by DependencyGraph for edge-building; the future A'.9 Roslyn analyzer extends this enforcement to call sites. The runtime guard methods that previously threw IsolationViolationException were deleted in K8.3+K8.4 (A'.5 closure, 2026-05-14) — the safety model is compile-time + analyzer, not runtime.
  • Capability-based mod isolation: each mod loads into its own AssemblyLoadContext, sees only DualFrontier.Contracts, and interacts with the kernel through reflection-scanned capabilities. The architecture is documented as an OS-style design in docs/architecture/MOD_OS_ARCHITECTURE.md.
  • A native ECS storage backend (NativeWorld) — the sole production component-storage path after A'.5 K8.3+K8.4 (2026-05-14). The prior managed World is retired from production and survives only as a test fixture (ManagedTestWorld). An earlier exploration of a separate C++ kernel as a replaceable boundary produced a measured negative result with criterion reformulation, recorded in docs/reports/NATIVE_CORE_EXPERIMENT.md.
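To make the first property concrete, here is a minimal Python sketch of how declared access can feed a Kahn-sorted dependency graph. It is not the engine's code: `SystemDecl` is a hypothetical stand-in for a system carrying a [SystemAccess] declaration, and the edge rule (a writer of a component runs before its readers) is an assumption, since this README does not state how DependencyGraph builds edges. The sketch covers ordering only, not compile-time isolation enforcement.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemDecl:
    """Hypothetical stand-in for a system with a [SystemAccess] declaration."""
    name: str
    reads: frozenset[str] = frozenset()
    writes: frozenset[str] = frozenset()


def build_edges(systems: list[SystemDecl]) -> dict[str, set[str]]:
    """Assumed rule: a system writing a component precedes systems reading it."""
    edges: dict[str, set[str]] = {s.name: set() for s in systems}
    for writer in systems:
        for reader in systems:
            if writer.name != reader.name and writer.writes & reader.reads:
                edges[writer.name].add(reader.name)
    return edges


def kahn_order(systems: list[SystemDecl]) -> list[str]:
    """Topological order via Kahn's algorithm; raises ValueError on a cycle."""
    edges = build_edges(systems)
    indegree = {name: 0 for name in edges}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    # Start from systems with no prerequisites; sort for a deterministic order.
    ready = deque(sorted(n for n, d in indegree.items() if d == 0))
    order: list[str] = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in sorted(edges[node]):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(edges):
        raise ValueError("cycle in declared system access")
    return order


# Illustrative systems and component names, not taken from the engine.
systems = [
    SystemDecl("Render", reads=frozenset({"Position"})),
    SystemDecl("Movement", reads=frozenset({"Velocity"}), writes=frozenset({"Position"})),
    SystemDecl("Input", writes=frozenset({"Velocity"})),
]
print(kahn_order(systems))  # ['Input', 'Movement', 'Render']
```

Because the order is derived purely from declarations, two systems that each read what the other writes fail at graph-construction time instead of at run time, which is the general appeal of a declarative access model.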

What this is not

This repository is not a game release, a competitor to Bevy or Unity DOTS, or a claim that LLM pipelines can replace software engineers. It is also not a claim about generalizability beyond systems software with formal, machine-checkable contracts. The boundaries of applicability are recorded in docs/methodology/METHODOLOGY.md §6.

Hardware requirements

Dual Frontier targets modern GPU hardware with Vulkan 1.3 + async compute queue family support per the K-L19 architectural commitment (docs/architecture/KERNEL_ARCHITECTURE.md Part 0, K-L19).

Minimum tier:

  • NVIDIA: Turing or newer — GeForce GTX 1660 / RTX 20-series and later
  • AMD: RDNA 1 or newer — Radeon RX 5500 and later
  • Intel: Arc Alchemist or newer — Arc A380 and later
  • Integrated GPUs: most NOT supported (lack async compute queue family)

Pre-Turing NVIDIA, pre-RDNA AMD, and pre-Arc Intel hardware will fail at startup with a clear diagnostic message. This is an intentional architectural choice supporting clean implementation per K-L14 ("performance derives from architectural cleanliness"); it is not a hardware-discrimination decision. By the Dual Frontier release timeline, the target hardware tier represents the majority of gaming hardware.

Verification: launch Dual Frontier. If startup fails with HardwareCapabilityException, upgrade the GPU driver or the hardware. Run vulkaninfo.exe (from the Vulkan SDK or GPU driver) to verify that the host hardware supports Vulkan 1.3 + a compute-capable queue family.
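The two conditions checked above can be expressed over values a tool such as vulkaninfo reports. The sketch below is illustrative only and is not the engine's startup check: the function names are hypothetical, the sample values are made up, and it assumes that "async compute queue family" means a family with the compute bit set and the graphics bit clear. The bit values and the version packing follow the Vulkan specification.

```python
# Queue capability bits from the Vulkan specification (VkQueueFlagBits).
VK_QUEUE_GRAPHICS_BIT = 0x1
VK_QUEUE_COMPUTE_BIT = 0x2
VK_QUEUE_TRANSFER_BIT = 0x4


def make_api_version(major: int, minor: int, patch: int = 0) -> int:
    """Pack a version the way VK_MAKE_API_VERSION does with variant 0."""
    return (major << 22) | (minor << 12) | patch


def has_async_compute_family(queue_flags: list[int]) -> bool:
    """Assumption: async compute = compute-capable family without graphics."""
    return any(
        flags & VK_QUEUE_COMPUTE_BIT and not flags & VK_QUEUE_GRAPHICS_BIT
        for flags in queue_flags
    )


def meets_minimum_tier(api_version: int, queue_flags: list[int]) -> bool:
    """Hypothetical check: Vulkan 1.3 or newer plus an async compute family."""
    return api_version >= make_api_version(1, 3) and has_async_compute_family(queue_flags)


# Made-up device profiles: one flags value per queue family.
ALL = VK_QUEUE_GRAPHICS_BIT | VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT
COMPUTE_ONLY = VK_QUEUE_COMPUTE_BIT | VK_QUEUE_TRANSFER_BIT

print(meets_minimum_tier(make_api_version(1, 3, 280), [ALL, COMPUTE_ONLY]))  # True
print(meets_minimum_tier(make_api_version(1, 3, 280), [ALL]))                # False
print(meets_minimum_tier(make_api_version(1, 2, 198), [ALL, COMPUTE_ONLY]))  # False
```

Comparing packed integers works here because the major version occupies the higher bits, so any 1.3.x value compares greater than or equal to 1.3.0 and any 1.2.x value compares lower.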

OS support: Windows 10/11 x64. Linux/macOS deferred per docs/architecture/VULKAN_SUBSTRATE.md L7 LOCKED.

Three primary documents

Repository layout

The full documentation index is in docs/README.md. Source layout is described in docs/architecture/ARCHITECTURE.md; without it the assembly structure looks excessive.

License

This project is distributed under the PolyForm Noncommercial 1.0.0 license. Commercial use of the engine code requires a separate agreement.

Architecture documents

  • docs/methodology/METHODOLOGY.md — pipeline and methodology
  • docs/methodology/CODING_STANDARDS.md — coding conventions
  • docs/architecture/MOD_OS_ARCHITECTURE.md — modding architecture
  • docs/architecture/VULKAN_SUBSTRATE.md — Vulkan substrate (V) — rendering + compute use cases unified per Q-G-1 LOCK
  • docs/architecture/KERNEL_ARCHITECTURE.md — native ECS kernel layer (K0-K8)
  • docs/reports/CPP_KERNEL_BRANCH_REPORT.md — Discovery report (experimental branch)

Auto-generated from docs/governance/REGISTER.yaml — DO NOT EDIT MANUALLY

Manual edits overwritten by sync_register.ps1 on next sync.

register_id: DOC-G-README
category: G
tier: 2
lifecycle: Live
owner: Crystalka
version: "Live"
next_review_due: 2026-Q3
register_view_url: docs/governance/REGISTER_RENDER.md#DOC-G-README

About

Can one developer build complex systems software through structured LLM delegation? Repo as the test: agent pipeline (Opus + human) producing a simulation microkernel — multithreaded ECS, capability-declared mod isolation, OS-style architecture. Engine stresses the methodology; a colony sim stresses the engine.
