diff --git a/README.md b/README.md
index 707085d..0c5095d 100644
--- a/README.md
+++ b/README.md
@@ -2,348 +2,279 @@
-Comptextv7
+CompText V7
- Deterministic Operational Replay Evaluation Infrastructure
-
-
-
- A replay-native evaluation layer for long-horizon AI agents.
+ Deterministic replay-survivability validation for compressed operational state in long-horizon AI agents.
No embeddings • No vector DB • No semantic scoring • No LLM judges
-
- Comptextv7 evaluates whether compressed agent state
- remains operationally admissible
- under deterministic replay constraints.
-
-
-
- Not semantically.
- Operationally.
-
-
-
- Monaco Showcase →
- · Research Positioning
+ Research Positioning
· Benchmark Details
- · Replay Degradation
- · Validation Report
+ · Multi-Family Benchmark
+ · Failure Taxonomy
----
-
-## Why this exists
-
-Most AI memory systems optimize for:
-
-- semantic similarity
-- retrieval quality
-- conversational continuity
-- long-context recall
-
-Comptextv7 evaluates something different:
-
-> Can compressed operational state still remain replayable and admissible after deterministic reconstruction?
-
-This includes:
-
-- evidence preservation
-- constraint survival
-- blocker continuity
-- dependency integrity
-- tool-order preservation
-- recovery-path continuity
-- relational admissibility
-
-All deterministically validated.
+CompText V7 does not ask whether a compressed summary sounds good. It asks whether the compressed state can still replay the operational facts required to continue the work.
---
-## Core thesis
-
-Comptextv7 measures what operationally survives compression.
-
-Not whether outputs sound similar.
-
-But whether replayed state still preserves:
-
-- operational invariants
-- admissible execution structure
-- dependency continuity
-- reconstructable agent behavior
-
----
-
-## Invariant-first evaluation
-
-Comptextv7 does not ask:
-
-> “Does the replay semantically resemble the original?”
+## In 30 seconds
-It asks:
+Long-horizon agents compress prior work into smaller summaries. Those summaries can silently lose blockers, constraints, evidence, dependency order, recovery paths, and tool order.
-> “Which operational invariants survived replay?”
-
-Examples:
-
-- evidence integrity
-- dependency reachability
-- blocker attachment
-- causal continuity
-- tool sequencing
-- operational admissibility
-- constraint preservation
-
-This makes the system auditable, reproducible, CI-compatible, and deterministic.
+CompText V7 treats that as a deterministic replay-validation problem. It checks whether compressed operational state remains admissible after reconstruction using fixture-defined contracts, exact scoring, failure labels, committed artifacts, and CI gates.
---
-## Architecture
-
-```mermaid
-flowchart TD
- A[Raw Agent Trace
or Research Paper]
- --> B[Operational State Extraction]
-
- B --> C[Operational State
Evidence
Constraints
Dependencies
Tool Order
Recovery Paths]
-
- C --> D[Compression Profiles]
- D --> E[CONSERVATIVE
BALANCED
AGGRESSIVE]
+## What CompText V7 is
- E --> F[Compact Replay State]
- F --> G[Deterministic Replay Reconstruction]
+- Deterministic replay-validation infrastructure for operational state.
+- Fixture-bound and contract-linked.
+- Artifact-backed with reproducible JSON/SVG outputs.
+- CI-reproducible through repository checks.
+- Focused on operational admissibility, not prose quality.
- G --> H[Invariant Validation Engine
+ Failure Taxonomy]
+## What CompText V7 is not
- H --> I[Structural Drift
Relational Drift
Operational Drift]
-
- I --> J[Committed JSON Artifacts
+ Deterministic CI Validation]
-
- style F fill:#172554,stroke:#60a5fa,stroke-width:2px,color:#ffffff
- style H fill:#0f172a,stroke:#38bdf8,stroke-width:2px,color:#ffffff
- style J fill:#064e3b,stroke:#34d399,stroke-width:2px,color:#ffffff
-```
+- Agent framework.
+- Workflow orchestrator.
+- Learned compressor.
+- Vector memory system.
+- RAG replacement.
+- KV-cache optimizer.
+- Production telemetry platform.
+- Clinical-grade system.
+- Universal AI-memory solution.
+- LLM judge.
---
-## Deterministic replay degradation
+## Replay validation model
```mermaid
-flowchart TD
- A[Aggressive Compression]
- --> B[Structural Drift]
- B --> C[Relational Drift]
- C --> D[Operational Drift]
- D --> E[Admissibility Collapse]
-
- style A fill:#1e3a8a,stroke:#60a5fa,color:#ffffff
- style E fill:#7f1d1d,stroke:#f87171,color:#ffffff
+flowchart LR
+ A["Checked-in fixture"] --> B["Original operational state"]
+ B --> C["Reconstructed replay state"]
+ C --> D["Contract validator"]
+ D --> E["Admissibility scorer"]
+ E --> F["Failure labels"]
+ E --> G["Committed artifacts"]
+ G --> H["CI gates"]
+ F --> H
```
---
-## Current deterministic results
-
-| Profile | Replay Consistency | Evidence Survival | Operational Drift | Failure Labels |
-|---|---:|---:|---:|---|
-| `CONSERVATIVE` | `0.895833` | `0.916667` | `0.104167` | `EVIDENCE_LOSS` |
-| `BALANCED` | `0.250000` | `0.166667` | `0.750000` | `EVIDENCE_LOSS`, `CONSTRAINT_DRIFT` |
-| `AGGRESSIVE` | `0.125000` | `0.083333` | `0.875000` | `EVIDENCE_LOSS`, `CONSTRAINT_DRIFT`, `BLOCKER_DETACHMENT` |
+## Current fixture-bound signal
-Values are fixture-bound and CI-validated against committed replay artifacts.
+- Three manifest-registered operational fixture families.
+- Standard degradation levels: `baseline`, `mild`, `moderate`, `severe`.
+- Deterministic evaluation mode.
+- Exact rational scoring.
+- Reproducible artifacts.
+- No LLM judges or external APIs.
-### Additional replay baselines
+These are internal fixture-bound results, not external benchmark claims, production-readiness claims, or solved-memory claims.
| Signal | Current fixture-bound result |
-|---|---:|
+| --- | ---: |
| Agent trace replay consistency | `1.000000` |
-| Agent avg compression | `1.773954x` |
-| Agent operational drift | `0.000000` |
| Paper replay consistency | `0.791667` |
-| Paper avg compression | `1.347063x` |
-
-Interpretation: the profile comparison shows monotonic degradation under increasing compression pressure. These results are internal fixture-bound observations, not external benchmark, production-readiness, or solved-memory claims.
-
----
-
-## Failure taxonomy
-
-Comptextv7 classifies replay degradation into deterministic failure classes.
-
-| Failure Type | Meaning |
-|---|---|
-| `EVIDENCE_LOSS` | Critical evidence disappeared |
-| `HIGH_CRITICAL_EVIDENCE_LOSS` | High-priority evidence degraded |
-| `CONSTRAINT_DRIFT` | Operational constraints degraded |
-| `BLOCKER_DETACHMENT` | Blocking dependencies became orphaned |
-| `RELATIONAL_DRIFT` | Dependency graph fragmentation |
-| `TEMPORAL_DRIFT` | Replay ordering degraded |
-
-No probabilistic scoring.
-
-No hidden heuristics.
-
-Every failure is reproducible.
+| `CONSERVATIVE` replay consistency | `0.895833` |
+| `BALANCED` replay consistency | `0.250000` |
+| `AGGRESSIVE` replay consistency | `0.125000` |
+| Paper avg compression | `1.347063x` |
+| Agent avg compression | `1.773954x` |
+| Agent operational drift | `0.000000` |
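
Exact rational scoring can be illustrated with Python's standard `fractions.Fraction`. This is only a sketch under stated assumptions: the `score` and `render` helpers are hypothetical, and the counts 43 of 48 are illustrative values chosen because 43/48 rounds to `0.895833`. The actual numerators and denominators are defined by the committed artifacts, not by this example.

```python
from fractions import Fraction


def score(preserved: int, required: int) -> Fraction:
    """Return the exact share of required units that survived replay."""
    if required <= 0:
        raise ValueError("required must be positive")
    return Fraction(preserved, required)


def render(value: Fraction) -> str:
    """Render a fraction as a fixed six-decimal string for reports."""
    return f"{float(value):.6f}"


# Hypothetical counts: 43 of 48 required units preserved.
consistency = score(43, 48)
drift = 1 - consistency  # still an exact Fraction; rounding happens only in render()

print(consistency, render(consistency))
print(drift, render(drift))
```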
---
-## What makes Comptextv7 different
+## Artifact evidence pipeline
-| Traditional systems | Comptextv7 |
-|---|---|
-| Semantic similarity | Operational admissibility |
-| LLM judges | Deterministic validation |
-| Embedding recall | Invariant preservation |
-| Memory retrieval | Replay degradation analysis |
-| Conversational continuity | Operational continuity |
-| Black-box scoring | Auditable metrics |
+```mermaid
+flowchart LR
+ A["fixtures/manifest.json"] --> B["Fixture families"]
+ B --> C["DegradationCurveGenerator"]
+ B --> D["AdmissibilityScorer"]
+ C --> E["multi_family_admissibility_curves.svg"]
+ D --> F["layered_admissibility_results.json"]
+ D --> G["multi_family_admissibility_results.json"]
+ F --> H["Reproducibility tests"]
+ G --> H
+ E --> I["Progression tests"]
+ H --> J["GitHub Actions"]
+ I --> J
+```
---
-## What Comptextv7 is not
-
-Comptextv7 is not:
-
-- an agent runtime
-- a vector memory system
-- a RAG framework
-- a semantic evaluator
-- an orchestration engine
-- an LLM-judge benchmark
-- a universal memory layer
-
-It is deterministic replay evaluation infrastructure for operational state degradation analysis.
+## Minimal deterministic example
+
+```json
+{
+ "original_operational_state": {
+ "policy_steps": ["identify_owner", "collect_evidence", "execute_recovery"],
+ "causal_dependencies": [["alert", "triage"], ["triage", "recovery"]],
+ "recovery_paths": ["ack -> mitigation_runbook"]
+ },
+ "reconstructed_state": {
+ "policy_steps": ["collect_evidence", "identify_owner", "execute_recovery"],
+ "causal_dependencies": [["alert", "recovery"]],
+ "recovery_paths": []
+ },
+ "deterministic_validation_result": {
+ "admissible": false,
+ "failure_labels": [
+ "POLICY_ORDER_BROKEN",
+ "CAUSAL_DEPENDENCY_LOSS",
+ "RECOVERY_PATH_INVALID",
+ "INVARIANT_VIOLATION"
+ ]
+ }
+}
+```
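
A rough Python sketch of how a validator could derive labels like these from the two states. The function name `validate_replay` and all three comparison rules are assumptions made for illustration, as is the rule that any failed check also raises `INVARIANT_VIOLATION`. The repository's contract validator may apply different checks.

```python
from typing import Any


def validate_replay(original: dict[str, Any], reconstructed: dict[str, Any]) -> dict[str, Any]:
    """Compare reconstructed state against the original and collect failure labels."""
    labels: list[str] = []

    # Assumed rule: policy steps must match the original sequence exactly.
    if reconstructed.get("policy_steps", []) != original.get("policy_steps", []):
        labels.append("POLICY_ORDER_BROKEN")

    # Assumed rule: every original causal edge must still be present.
    kept_edges = {tuple(edge) for edge in reconstructed.get("causal_dependencies", [])}
    if any(tuple(edge) not in kept_edges for edge in original.get("causal_dependencies", [])):
        labels.append("CAUSAL_DEPENDENCY_LOSS")

    # Assumed rule: every original recovery path must still be listed.
    kept_paths = set(reconstructed.get("recovery_paths", []))
    if any(path not in kept_paths for path in original.get("recovery_paths", [])):
        labels.append("RECOVERY_PATH_INVALID")

    # Assumption: any failed check also counts as an invariant violation.
    if labels:
        labels.append("INVARIANT_VIOLATION")

    return {"admissible": not labels, "failure_labels": labels}
```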
---
-## Research positioning
-
-Current long-context benchmarks mainly focus on:
-
-- conversational memory
-- semantic retrieval
-- QA recall
-- context-window scaling
-
-Comptextv7 focuses on:
+## Proof artifacts
-- operational admissibility
-- invariant preservation
-- replay integrity
-- deterministic degradation analysis
-- relational continuity
-- execution-faithful reconstruction
-
-This positions Comptextv7 closer to replayable execution systems, event-sourced orchestration, execution lineage validation, operational semantics, and deterministic governance infrastructure.
-
-For conservative scope boundaries and benchmark interpretation, see [Research Positioning](docs/research_positioning.md), [Iterative Replay Degradation](docs/iterative_replay_degradation.md), the [Benchmark Explanation](docs/BENCHMARK_EXPLANATION.md), and the committed [iterative replay degradation summary](artifacts/iterative_replay_degradation_results.summary.md).
+| Artifact | Purpose |
+| --- | --- |
+| `artifacts/layered_admissibility_results.json` | Layered admissibility outputs. |
+| `artifacts/multi_family_admissibility_results.json` | Multi-family deterministic aggregates. |
+| `artifacts/multi_family_admissibility_curves.svg` | Deterministic degradation curve rendering. |
+| `docs/benchmarks/multi_family_admissibility_benchmark.md` | Benchmark method and interpretation boundaries. |
+| `docs/failure_taxonomy.md` | Failure label documentation. |
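
One way to check that a regenerated JSON artifact matches a committed one is to hash a canonical serialization. The sketch below uses only the Python standard library. The `canonical_digest` helper and the sample fields are hypothetical, and this is not necessarily the hashing scheme the repository uses.

```python
import hashlib
import json


def canonical_digest(payload: object) -> str:
    """Hash a JSON-serializable payload independently of dict key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical artifact entries with the same content in a different key order.
first_run = {"family": "incident_response_page_triage", "level": "mild", "admissible": True}
second_run = {"admissible": True, "level": "mild", "family": "incident_response_page_triage"}

# Identical content yields an identical digest regardless of key order.
print(canonical_digest(first_run) == canonical_digest(second_run))
```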
---
-## Benchmark family
-
-### Paper Replay Benchmark
+## Verify locally
-- Validates whether dense technical paper summaries preserve entities, metrics, limitations, and section structure after deterministic replay compression.
-- Artifact: [`artifacts/paper_replay_results.json`](artifacts/paper_replay_results.json).
-- Method: [`docs/benchmarks/paper_replay.md`](docs/benchmarks/paper_replay.md).
-- Current avg compression: `1.347063x`.
-- Current replay consistency: `0.791667`.
-
-### Agent Trace Replay Benchmark
-
-- Validates whether multi-step agent workflows preserve active tasks, constraints, dependencies, tool sequences, unresolved blockers, deployment requirements, and recovery actions.
-- Artifact: [`artifacts/agent_trace_replay_results.json`](artifacts/agent_trace_replay_results.json).
-- Method: [`docs/benchmarks/agent_trace_replay.md`](docs/benchmarks/agent_trace_replay.md).
-- Current avg compression: `1.773954x`.
-- Current replay consistency: `1.000000`.
-- Operational drift: `0.000000`.
-
-### Multi-Family Operational Admissibility Benchmark
-
-- Validates deterministic multi-family operational admissibility with manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks.
-- Method: [`docs/benchmarks/multi_family_admissibility_benchmark.md`](docs/benchmarks/multi_family_admissibility_benchmark.md).
-
-### Iterative Replay Degradation Prototype
-
-- Validates how checked-in paper and agent-trace fixtures degrade across bounded repeated compact/replay cycles.
-- Method: [`docs/iterative_replay_degradation.md`](docs/iterative_replay_degradation.md).
-- Profile comparison: fixture-bound aggregates for collapse rate, replay consistency, operational drift, evidence survival, and deterministic failure labels.
-- Sensitivity analysis: bounded variations of `max_context_units`, `max_families`, `max_bursts`, `replay_window_seconds`, `replay_cycles`, and `compression_budget_scale`.
+```bash
+python -m pip install -e '.[test]'
+npm install --no-save --no-package-lock
+npm run check
+pytest tests/test_failure_taxonomy.py -q
+pytest tests/test_multi_family_admissibility_artifact.py -q
+pytest tests/test_multi_family_svg_renderer.py -q
+pytest tests/test_paper_replay_bench.py tests/test_agent_trace_replay.py -q
+```
---
-## Integrity model
-
-- no LLM judging
-- no embeddings
-- no vector databases
-- no external APIs
-- artifact-backed JSON + CI checks
-- deterministic hashing foundation: [`docs/deterministic_hashing.md`](docs/deterministic_hashing.md)
-- audit-friendly and CI reproducible
+## Benchmark families
-Foundational deterministic components:
+- `coding_workflow_pr_review`
+- `incident_response_page_triage`
+- `cross_domain_operational_dependency_workflow`
-- `ReferenceIndex` and `EventLogArtifactAdapter`: track context references and deterministically fingerprint event payloads.
-- `ReplayArtifactWriter v1-alpha.1`: generates deterministic standalone JSON artifacts for verifiable snapshots.
+```mermaid
+flowchart LR
+ A["coding_workflow_pr_review"] --> L1["baseline"]
+ A --> L2["mild"]
+ A --> L3["moderate"]
+ A --> L4["severe"]
+ B["incident_response_page_triage"] --> L1
+ B --> L2
+ B --> L3
+ B --> L4
+ C["cross_domain_operational_dependency_workflow"] --> L1
+ C --> L2
+ C --> L3
+ C --> L4
+ L1 --> M["manifest registration"]
+ L2 --> M
+ L3 --> M
+ L4 --> M
+ M --> N["multi-family artifact"]
+ N --> O["deterministic SVG"]
+```
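
The family and level combinations in the diagram can be enumerated as a fixed grid. This is a minimal sketch: the family and level names come from this README, while the `evaluation_grid` helper and the tuple layout are assumptions, not the manifest's actual schema.

```python
from itertools import product

FAMILIES = (
    "coding_workflow_pr_review",
    "incident_response_page_triage",
    "cross_domain_operational_dependency_workflow",
)
LEVELS = ("baseline", "mild", "moderate", "severe")


def evaluation_grid() -> list[tuple[str, str]]:
    """Return every (family, level) pair in a stable, repeatable order."""
    return list(product(FAMILIES, LEVELS))


for family, level in evaluation_grid():
    print(f"{family}:{level}")
```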
---
-## Quick start
-
-Install the Python test dependency set:
+## Failure labels
+
+Primary registered labels used across deterministic admissibility validation:
+
+- `POLICY_ORDER_BROKEN`: the required order of policy steps was not preserved.
+- `TOOL_ORDER_VIOLATION`: the replayed tool sequence violated the required order.
+- `CAUSAL_DEPENDENCY_LOSS`: required causal edges were not preserved.
+- `DEPENDENCY_CHAIN_BREAK`: a required dependency chain was broken.
+- `RECOVERY_PATH_INVALID`: the recovery reachability contract failed.
+- `RECOVERY_PATH_LOSS`: a required recovery route was not preserved.
+- `INVARIANT_VIOLATION`: a declared invariant failed.
+- `EVIDENCE_LOSS`: required evidence did not survive replay.
+- `EVIDENCE_SURVIVAL_LOSS`: expected evidence units were not preserved.
+- `HIGH_CRITICAL_EVIDENCE_LOSS`: high-criticality evidence was lost.
+- `CONSTRAINT_DRIFT`: operational constraints degraded during replay.
+- `BLOCKER_DETACHMENT`: blocker attachment was lost.
+- `GOVERNANCE_DRIFT`: a governance constraint drifted.
+- `ARTIFACT_INTEGRITY_VIOLATION`: artifact integrity was not preserved.
+- `REPLAY_NON_REPRODUCIBLE`: deterministic replay was not reproducible.
-```bash
-python -m pip install -e '.[test]'
+```mermaid
+flowchart LR
+ O1["POLICY_ORDER_BROKEN"] --> C1["ordering"]
+ O2["TOOL_ORDER_VIOLATION"] --> C1
+ D1["CAUSAL_DEPENDENCY_LOSS"] --> C2["causality/dependency"]
+ D2["DEPENDENCY_CHAIN_BREAK"] --> C2
+ R1["RECOVERY_PATH_INVALID"] --> C3["recovery/reachability"]
+ R2["RECOVERY_PATH_LOSS"] --> C3
+ I1["INVARIANT_VIOLATION"] --> C4["invariant/no-orphan"]
+ E1["EVIDENCE_LOSS"] --> C5["evidence/criticality"]
+ E2["EVIDENCE_SURVIVAL_LOSS"] --> C5
+ E3["HIGH_CRITICAL_EVIDENCE_LOSS"] --> C5
+ E4["CONSTRAINT_DRIFT"] --> C5
+ E5["BLOCKER_DETACHMENT"] --> C5
+ E6["GOVERNANCE_DRIFT"] --> C5
+ A1["ARTIFACT_INTEGRITY_VIOLATION"] --> C6["artifact/reproducibility"]
+ A2["REPLAY_NON_REPRODUCIBLE"] --> C6
```
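
The label-to-class mapping in the diagram can also be written as a lookup table. In this sketch the label and class names are copied from the diagram; the `LABEL_CLASS` dictionary and the `group_by_class` helper are hypothetical names, not part of the repository's API.

```python
from collections import defaultdict

LABEL_CLASS = {
    "POLICY_ORDER_BROKEN": "ordering",
    "TOOL_ORDER_VIOLATION": "ordering",
    "CAUSAL_DEPENDENCY_LOSS": "causality/dependency",
    "DEPENDENCY_CHAIN_BREAK": "causality/dependency",
    "RECOVERY_PATH_INVALID": "recovery/reachability",
    "RECOVERY_PATH_LOSS": "recovery/reachability",
    "INVARIANT_VIOLATION": "invariant/no-orphan",
    "EVIDENCE_LOSS": "evidence/criticality",
    "EVIDENCE_SURVIVAL_LOSS": "evidence/criticality",
    "HIGH_CRITICAL_EVIDENCE_LOSS": "evidence/criticality",
    "CONSTRAINT_DRIFT": "evidence/criticality",
    "BLOCKER_DETACHMENT": "evidence/criticality",
    "GOVERNANCE_DRIFT": "evidence/criticality",
    "ARTIFACT_INTEGRITY_VIOLATION": "artifact/reproducibility",
    "REPLAY_NON_REPRODUCIBLE": "artifact/reproducibility",
}


def group_by_class(labels: list[str]) -> dict[str, list[str]]:
    """Group failure labels by class, rejecting any label not in the table."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for label in labels:
        if label not in LABEL_CLASS:
            raise KeyError(f"unregistered failure label: {label}")
        grouped[LABEL_CLASS[label]].append(label)
    return dict(grouped)
```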
-Run the full reviewer check:
+---
-```bash
-npm run check
-```
+## How this differs from adjacent systems
-Run core replay tests:
+| System type | Stores state | Compresses context | Orchestrates agents | Deterministically validates replay loss |
+| --- | --- | --- | --- | --- |
+| Workflow runtimes | Sometimes | No | Yes | No |
+| Agent frameworks | Sometimes | Sometimes | Yes | Usually no |
+| Vector memory / RAG | Yes | Retrieval-centric | No | No |
+| Learned prompt compressors | Sometimes | Yes | No | Usually no |
+| LLM-as-judge evaluators | Sometimes | N/A | No | No |
+| CompText V7 | Yes | Yes | No | Yes |
-```bash
-pytest tests/test_multi_family_admissibility_artifact.py -q
-pytest tests/test_failure_taxonomy.py -q
-pytest tests/test_evidence_metrics_adaptive_policy.py -q
-pytest tests/test_paper_replay_bench.py tests/test_agent_trace_replay.py -q
-```
+---
-Regenerate deterministic replay artifacts:
+## CI and merge gate
-```bash
-python tests/utils/paper_replay_runner.py
-python tests/utils/agent_trace_replay_runner.py
-python scripts/generate_iterative_replay_degradation_artifacts.py
+```mermaid
+flowchart LR
+ A["PR head SHA"] --> B["GitHub Actions"]
+ B --> C["Agent Workflow Checks"]
+ B --> D["hash-companion-validation"]
+ B --> E["CompText V7 Industrial Validation"]
+ C --> F["all success"]
+ D --> F
+ E --> F
+ F --> G["squash merge"]
```
-Additional validation helpers:
-
-```bash
-python scripts/validate.py replay
-python scripts/validate.py token
-python scripts/validate.py forensic
-python scripts/validate_contracts.py
-python scripts/validate_api_exports.py
-```
+Deployment previews (for example Vercel or Netlify) are not merge gates unless they are explicitly scoped as such.
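
The "all success" condition can be expressed as a small predicate. This is a sketch under assumptions: the three check names come from the diagram, while the `merge_allowed` helper, the dictionary shape, and the `"success"` conclusion string are illustrative and not taken from the repository's workflow files.

```python
REQUIRED_CHECKS = (
    "Agent Workflow Checks",
    "hash-companion-validation",
    "CompText V7 Industrial Validation",
)


def merge_allowed(conclusions: dict[str, str]) -> bool:
    """Allow a merge only when every required check concluded with success.

    Checks outside REQUIRED_CHECKS, such as deployment previews, are ignored.
    """
    return all(conclusions.get(name) == "success" for name in REQUIRED_CHECKS)


# Hypothetical results for one PR head SHA; the preview failure does not block.
results = {
    "Agent Workflow Checks": "success",
    "hash-companion-validation": "success",
    "CompText V7 Industrial Validation": "success",
    "deployment-preview": "failure",
}
print(merge_allowed(results))
```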
---
@@ -351,61 +282,45 @@ python scripts/validate_api_exports.py
```text
Comptextv7/
-├── artifacts/ # committed deterministic replay benchmark JSON
-├── benchmarks/ # deterministic compression, replay, and audit runners
-├── contracts/ # machine-readable validation and handoff contracts
-├── dashboard/ # backend plus React operations console
-├── docs/ # benchmark, artifact, research, and legacy showcase notes
-├── reports/replay_continuity/ # adversarial continuity metrics and SVG charts
-├── scripts/ # validation, reporting, and artifact tooling
-├── showcase/app/ # legacy in-repo Vite app; Monaco UI lives in external repo
-├── src/ # compression, audit, and validation modules
-├── tests/ # Python regression and replay validation tests
-└── README.md
+├── artifacts/
+├── docs/
+├── fixtures/
+├── reports/
+├── scripts/
+├── tests/
+└── src/
+ ├── core/
+ └── validation/
```
---
-## Cloud-first validation
+## Replay-validation roadmap
-Comptextv7 is biased toward artifact-backed review rather than local machine trust.
+```mermaid
+flowchart LR
+ A["failure taxonomy"] --> B["cross-domain fixture families"]
+ B --> C["forensic reports"]
+ C --> D["schema stabilization"]
+ D --> E["cross-family comparison"]
+ E --> F["integrity gates"]
+ F --> G["golden corpus"]
+ G --> H["offline import/export"]
+```
-| Workflow | Role |
-|---|---|
-| [`ci.yml`](.github/workflows/ci.yml) | Runs deterministic replay, tests, telemetry, and validation gates. |
-| [`agent-checks.yml`](.github/workflows/agent-checks.yml) | Runs repository, report, contract, and dashboard validation. |
-| [`validation_runner.yml`](.github/workflows/validation_runner.yml) | Publishes compact cloud validation result artifacts. |
+- Forensic audit reports with deterministic exports.
+- Artifact schema stabilization.
+- Cross-family degradation comparison.
+- Minimal artifact integrity gates.
+- Golden corpus foundation.
+- Offline import/export schemas only.
---
## Limitations
-- Metrics are fixture-bound baselines and do not reflect universal real-world correctness.
-- Fixtures are curated and checked in.
-- Structured agent traces currently replay near-losslessly.
-- This is not solved AI memory.
-- This is not production telemetry.
-- This is not an autonomous agent framework.
-- Iterative degradation remains a bounded fixture prototype.
-
----
-
-## Safety boundaries
-
-Do not commit:
-
-- proprietary customer data
-- secrets, API keys, tokens, cookies, or credentials
-- raw production logs
-- unsanitized replay fixtures
-- private deployment credentials or environment dumps
-
-Comptextv7 is a deterministic, synthetic-only research prototype for operational replay persistence and reviewable diagnostic infrastructure.
-
----
-
-## Final principle
-
-> Comptextv7 does not evaluate whether an agent remembers.
->
-> It evaluates whether compressed operational state remains admissible under deterministic replay.
+- Metrics are fixture-bound and apply only to the checked-in datasets.
+- Fixtures are curated and checked in; they are not live production traces.
+- This is a deterministic prototype and makes no production-readiness claim.
+- It does not claim to be a universal AI-memory solution.
+- It does not claim runtime integration or orchestration coverage.