From 0b46ec3e6064ffb6d53bcb1b0adb599810e6422d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alexander=20K=C3=B6lnberger?=
 <159939812+ProfRandom92@users.noreply.github.com>
Date: Wed, 20 May 2026 01:52:23 -0700
Subject: [PATCH 1/3] Resolve README conflict against current main

---
 README.md | 530 +++++++++++++++++++++++-------------------------------
 1 file changed, 221 insertions(+), 309 deletions(-)
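
A minimal sketch of how the labels in the "Minimal deterministic example" JSON added by this patch could be derived without any model in the loop. The rules and the `validate` helper below are assumptions made for illustration; the repository's actual contract validator is not part of this diff and may apply different checks.

```python
# Hypothetical sketch: `validate` and its rules are illustrative assumptions,
# not the project's contract validator.

def validate(original: dict, reconstructed: dict) -> dict:
    """Compare a reconstructed state against the original using exact checks."""
    labels = []

    # Assumption: any difference in the ordered step list is an ordering failure.
    if reconstructed.get("policy_steps", []) != original.get("policy_steps", []):
        labels.append("POLICY_ORDER_BROKEN")

    # Assumption: every original causal edge must survive replay unchanged.
    original_edges = {tuple(edge) for edge in original.get("causal_dependencies", [])}
    replayed_edges = {tuple(edge) for edge in reconstructed.get("causal_dependencies", [])}
    if not original_edges <= replayed_edges:
        labels.append("CAUSAL_DEPENDENCY_LOSS")

    # Assumption: every original recovery path must still be present.
    original_paths = set(original.get("recovery_paths", []))
    replayed_paths = set(reconstructed.get("recovery_paths", []))
    if not original_paths <= replayed_paths:
        labels.append("RECOVERY_PATH_INVALID")

    # Assumption: any failed check also counts as a declared-invariant failure.
    if labels:
        labels.append("INVARIANT_VIOLATION")

    return {"admissible": not labels, "failure_labels": labels}


# States copied from the README example in this patch.
original = {
    "policy_steps": ["identify_owner", "collect_evidence", "execute_recovery"],
    "causal_dependencies": [["alert", "triage"], ["triage", "recovery"]],
    "recovery_paths": ["ack -> mitigation_runbook"],
}
reconstructed = {
    "policy_steps": ["collect_evidence", "identify_owner", "execute_recovery"],
    "causal_dependencies": [["alert", "recovery"]],
    "recovery_paths": [],
}

result = validate(original, reconstructed)
print(result)
```

Under these assumed rules the same inputs always yield the same labels, which is the property the README attributes to the validation step. Replaying the original state against itself returns an admissible result with no labels.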
diff --git a/README.md b/README.md
index 707085d..b27dc4a 100644
--- a/README.md
+++ b/README.md
@@ -5,345 +5,273 @@
 <h1 align="center">Comptextv7</h1>
 
 <p align="center">
-  <strong>Deterministic Operational Replay Evaluation Infrastructure</strong>
-</p>
-
-<p align="center">
-  A replay-native evaluation layer for long-horizon AI agents.
+  <strong>Deterministic replay-survivability validation for compressed operational state in long-horizon AI agents.</strong>
 </p>
 
 <p align="center">
   <strong>No embeddings • No vector DB • No semantic scoring • No LLM judges</strong>
 </p>
 
-<p align="center">
-  Comptextv7 evaluates whether compressed agent state<br>
-  remains <strong>operationally admissible</strong><br>
-  under deterministic replay constraints.
-</p>
-
-<p align="center">
-  Not semantically.<br>
-  <strong>Operationally.</strong>
-</p>
-
 <p align="center">
   <a href="https://github.com/ProfRandom92/Comptextv7/actions/workflows/ci.yml"><img alt="CI" src="https://github.com/ProfRandom92/Comptextv7/actions/workflows/ci.yml/badge.svg" /></a>
   <img alt="Python" src="https://img.shields.io/badge/Python-3.11%2B-3776ab" />
   <img alt="Deterministic Replay" src="https://img.shields.io/badge/Deterministic%20Replay-CI%20audited-0f766e" />
   <img alt="Replay Native" src="https://img.shields.io/badge/Replay-Native%20Infrastructure-1d4ed8" />
-  <img alt="No LLM Judging" src="https://img.shields.io/badge/No%20LLM%20Judging-deterministic-7c3aed" />
   <img alt="Replay Artifacts" src="https://img.shields.io/badge/Committed%20Artifacts-JSON%20%2B%20CI-2563eb" />
 </p>
 
 <p align="center">
-  <a href="https://github.com/ProfRandom92/comptext-v7-monaco-showcase"><strong>Monaco Showcase →</strong></a>
-  · <a href="docs/research_positioning.md">Research Positioning</a>
+  <a href="docs/research_positioning.md">Research Positioning</a>
   · <a href="docs/BENCHMARK_EXPLANATION.md">Benchmark Details</a>
-  · <a href="docs/iterative_replay_degradation.md">Replay Degradation</a>
-  · <a href="reports/replay_continuity/validation_report.md">Validation Report</a>
+  · <a href="docs/benchmarks/multi_family_admissibility_benchmark.md">Multi-Family Benchmark</a>
+  · <a href="docs/failure_taxonomy.md">Failure Taxonomy</a>
 </p>
 
----
-
-## Why this exists
-
-Most AI memory systems optimize for:
-
-- semantic similarity
-- retrieval quality
-- conversational continuity
-- long-context recall
-
-Comptextv7 evaluates something different:
-
-> Can compressed operational state still remain replayable and admissible after deterministic reconstruction?
-
-This includes:
-
-- evidence preservation
-- constraint survival
-- blocker continuity
-- dependency integrity
-- tool-order preservation
-- recovery-path continuity
-- relational admissibility
-
-All deterministically validated.
+CompTextv7 does not ask whether a compressed summary sounds good. It asks whether the compressed state can still replay the operational facts required to continue the work.
 
 ---
 
-## Core thesis
-
-Comptextv7 measures what operationally survives compression.
-
-Not whether outputs sound similar.
-
-But whether replayed state still preserves:
-
-- operational invariants
-- admissible execution structure
-- dependency continuity
-- reconstructable agent behavior
-
----
-
-## Invariant-first evaluation
-
-Comptextv7 does not ask:
+## In 30 seconds
 
-> “Does the replay semantically resemble the original?”
+Long-horizon agents compress prior work into smaller summaries. Those summaries can silently lose blockers, constraints, evidence, dependency order, recovery paths, and tool order.
 
-It asks:
-
-> “Which operational invariants survived replay?”
-
-Examples:
-
-- evidence integrity
-- dependency reachability
-- blocker attachment
-- causal continuity
-- tool sequencing
-- operational admissibility
-- constraint preservation
-
-This makes the system auditable, reproducible, CI-compatible, and deterministic.
+CompTextv7 treats that as a deterministic replay-validation problem. It checks whether compressed operational state remains admissible after reconstruction using fixture-defined contracts, exact scoring, failure labels, committed artifacts, and CI gates.
 
 ---
 
-## Architecture
-
-```mermaid
-flowchart TD
-    A[Raw Agent Trace<br/>or Research Paper]
-    --> B[Operational State Extraction]
-
-    B --> C[Operational State<br/>Evidence<br/>Constraints<br/>Dependencies<br/>Tool Order<br/>Recovery Paths]
+## What CompTextv7 is
 
-    C --> D[Compression Profiles]
-    D --> E[CONSERVATIVE<br/>BALANCED<br/>AGGRESSIVE]
+- Deterministic replay-validation infrastructure for operational state.
+- Fixture-bound and contract-linked.
+- Artifact-backed with reproducible JSON/SVG outputs.
+- CI-reproducible through repository checks.
+- Focused on operational admissibility, not prose quality.
 
-    E --> F[Compact Replay State]
-    F --> G[Deterministic Replay Reconstruction]
+## What CompTextv7 is not
 
-    G --> H[Invariant Validation Engine<br/>+ Failure Taxonomy]
-
-    H --> I[Structural Drift<br/>Relational Drift<br/>Operational Drift]
-
-    I --> J[Committed JSON Artifacts<br/>+ Deterministic CI Validation]
-
-    style F fill:#172554,stroke:#60a5fa,stroke-width:2px,color:#ffffff
-    style H fill:#0f172a,stroke:#38bdf8,stroke-width:2px,color:#ffffff
-    style J fill:#064e3b,stroke:#34d399,stroke-width:2px,color:#ffffff
-```
+- Agent framework.
+- Workflow orchestrator.
+- Learned compressor.
+- Vector memory system.
+- RAG replacement.
+- KV-cache optimizer.
+- Production telemetry platform.
+- Clinical-grade system.
+- Universal AI-memory solution.
+- LLM judge.
 
 ---
 
-## Deterministic replay degradation
+## Replay validation model
 
 ```mermaid
-flowchart TD
-    A[Aggressive Compression]
-    --> B[Structural Drift]
-    B --> C[Relational Drift]
-    C --> D[Operational Drift]
-    D --> E[Admissibility Collapse]
-
-    style A fill:#1e3a8a,stroke:#60a5fa,color:#ffffff
-    style E fill:#7f1d1d,stroke:#f87171,color:#ffffff
+flowchart LR
+    A["Checked-in fixture"] --> B["Original operational state"]
+    B --> C["Reconstructed replay state"]
+    C --> D["Contract validator"]
+    D --> E["Admissibility scorer"]
+    E --> F["Failure labels"]
+    E --> G["Committed artifacts"]
+    G --> H["CI gates"]
+    F --> H
 ```
 
 ---
 
-## Current deterministic results
-
-| Profile | Replay Consistency | Evidence Survival | Operational Drift | Failure Labels |
-|---|---:|---:|---:|---|
-| `CONSERVATIVE` | `0.895833` | `0.916667` | `0.104167` | `EVIDENCE_LOSS` |
-| `BALANCED` | `0.250000` | `0.166667` | `0.750000` | `EVIDENCE_LOSS`, `CONSTRAINT_DRIFT` |
-| `AGGRESSIVE` | `0.125000` | `0.083333` | `0.875000` | `EVIDENCE_LOSS`, `CONSTRAINT_DRIFT`, `BLOCKER_DETACHMENT` |
+## Current fixture-bound signal
 
-Values are fixture-bound and CI-validated against committed replay artifacts.
+- Three manifest-registered operational fixture families.
+- Standard levels: `baseline`, `mild`, `moderate`, `severe`.
+- Deterministic evaluation mode.
+- Exact rational scoring.
+- Reproducible artifacts.
+- No LLM judges or external APIs.
 
-### Additional replay baselines
+These are internal fixture-bound results, not external benchmark claims, production-readiness claims, or solved-memory claims.
 
 | Signal | Current fixture-bound result |
-|---|---:|
+| --- | ---: |
 | Agent trace replay consistency | `1.000000` |
-| Agent avg compression | `1.773954x` |
-| Agent operational drift | `0.000000` |
 | Paper replay consistency | `0.791667` |
-| Paper avg compression | `1.347063x` |
-
-Interpretation: the profile comparison shows monotonic degradation under increasing compression pressure. These results are internal fixture-bound observations, not external benchmark, production-readiness, or solved-memory claims.
-
----
-
-## Failure taxonomy
-
-Comptextv7 classifies replay degradation into deterministic failure classes.
-
-| Failure Type | Meaning |
-|---|---|
-| `EVIDENCE_LOSS` | Critical evidence disappeared |
-| `HIGH_CRITICAL_EVIDENCE_LOSS` | High-priority evidence degraded |
-| `CONSTRAINT_DRIFT` | Operational constraints degraded |
-| `BLOCKER_DETACHMENT` | Blocking dependencies became orphaned |
-| `RELATIONAL_DRIFT` | Dependency graph fragmentation |
-| `TEMPORAL_DRIFT` | Replay ordering degraded |
-
-No probabilistic scoring.
-
-No hidden heuristics.
-
-Every failure is reproducible.
+| `CONSERVATIVE` replay consistency | `0.895833` |
+| `BALANCED` replay consistency | `0.250000` |
+| `AGGRESSIVE` replay consistency | `0.125000` |
+| Paper avg compression | `1.347063` |
+| Agent avg compression | `1.773954` |
+| Agent operational drift | `0.000000` |
 
 ---
 
-## What makes Comptextv7 different
+## Artifact evidence pipeline
 
-| Traditional systems | Comptextv7 |
-|---|---|
-| Semantic similarity | Operational admissibility |
-| LLM judges | Deterministic validation |
-| Embedding recall | Invariant preservation |
-| Memory retrieval | Replay degradation analysis |
-| Conversational continuity | Operational continuity |
-| Black-box scoring | Auditable metrics |
+```mermaid
+flowchart LR
+    A["fixtures/manifest.json"] --> B["Fixture families"]
+    B --> C["DegradationCurveGenerator"]
+    B --> D["AdmissibilityScorer"]
+    C --> E["multi_family_admissibility_curves.svg"]
+    D --> F["layered_admissibility_results.json"]
+    D --> G["multi_family_admissibility_results.json"]
+    F --> H["Reproducibility tests"]
+    G --> H
+    E --> I["Progression tests"]
+    H --> J["GitHub Actions"]
+    I --> J
+```
 
 ---
 
-## What Comptextv7 is not
-
-Comptextv7 is not:
-
-- an agent runtime
-- a vector memory system
-- a RAG framework
-- a semantic evaluator
-- an orchestration engine
-- an LLM-judge benchmark
-- a universal memory layer
-
-It is deterministic replay evaluation infrastructure for operational state degradation analysis.
+## Minimal deterministic example
+
+```json
+{
+  "original_operational_state": {
+    "policy_steps": ["identify_owner", "collect_evidence", "execute_recovery"],
+    "causal_dependencies": [["alert", "triage"], ["triage", "recovery"]],
+    "recovery_paths": ["ack -> mitigation_runbook"]
+  },
+  "reconstructed_state": {
+    "policy_steps": ["collect_evidence", "identify_owner", "execute_recovery"],
+    "causal_dependencies": [["alert", "recovery"]],
+    "recovery_paths": []
+  },
+  "deterministic_validation_result": {
+    "admissible": false,
+    "failure_labels": [
+      "POLICY_ORDER_BROKEN",
+      "CAUSAL_DEPENDENCY_LOSS",
+      "RECOVERY_PATH_INVALID",
+      "INVARIANT_VIOLATION"
+    ]
+  }
+}
+```
 
 ---
 
-## Research positioning
-
-Current long-context benchmarks mainly focus on:
-
-- conversational memory
-- semantic retrieval
-- QA recall
-- context-window scaling
-
-Comptextv7 focuses on:
-
-- operational admissibility
-- invariant preservation
-- replay integrity
-- deterministic degradation analysis
-- relational continuity
-- execution-faithful reconstruction
+## Proof artifacts
 
-This positions Comptextv7 closer to replayable execution systems, event-sourced orchestration, execution lineage validation, operational semantics, and deterministic governance infrastructure.
-
-For conservative scope boundaries and benchmark interpretation, see [Research Positioning](docs/research_positioning.md), [Iterative Replay Degradation](docs/iterative_replay_degradation.md), the [Benchmark Explanation](docs/BENCHMARK_EXPLANATION.md), and the committed [iterative replay degradation summary](artifacts/iterative_replay_degradation_results.summary.md).
+| Artifact | Purpose |
+| --- | --- |
+| `artifacts/layered_admissibility_results.json` | Layered admissibility outputs. |
+| `artifacts/multi_family_admissibility_results.json` | Multi-family deterministic aggregates. |
+| `artifacts/multi_family_admissibility_curves.svg` | Deterministic degradation curve rendering. |
+| `docs/benchmarks/multi_family_admissibility_benchmark.md` | Benchmark method and interpretation boundaries. |
+| `docs/failure_taxonomy.md` | Failure label documentation. |
 
 ---
 
-## Benchmark family
-
-### Paper Replay Benchmark
-
-- Validates whether dense technical paper summaries preserve entities, metrics, limitations, and section structure after deterministic replay compression.
-- Artifact: [`artifacts/paper_replay_results.json`](artifacts/paper_replay_results.json).
-- Method: [`docs/benchmarks/paper_replay.md`](docs/benchmarks/paper_replay.md).
-- Current avg compression: `1.347063x`.
-- Current replay consistency: `0.791667`.
-
-### Agent Trace Replay Benchmark
-
-- Validates whether multi-step agent workflows preserve active tasks, constraints, dependencies, tool sequences, unresolved blockers, deployment requirements, and recovery actions.
-- Artifact: [`artifacts/agent_trace_replay_results.json`](artifacts/agent_trace_replay_results.json).
-- Method: [`docs/benchmarks/agent_trace_replay.md`](docs/benchmarks/agent_trace_replay.md).
-- Current avg compression: `1.773954x`.
-- Current replay consistency: `1.000000`.
-- Operational drift: `0.000000`.
-
-### Multi-Family Operational Admissibility Benchmark
+## Verify locally
 
-- Validates deterministic multi-family operational admissibility with manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks.
-- Method: [`docs/benchmarks/multi_family_admissibility_benchmark.md`](docs/benchmarks/multi_family_admissibility_benchmark.md).
-
-### Iterative Replay Degradation Prototype
-
-- Validates how checked-in paper and agent-trace fixtures degrade across bounded repeated compact/replay cycles.
-- Method: [`docs/iterative_replay_degradation.md`](docs/iterative_replay_degradation.md).
-- Profile comparison: fixture-bound aggregates for collapse rate, replay consistency, operational drift, evidence survival, and deterministic failure labels.
-- Sensitivity analysis: bounded variations of `max_context_units`, `max_families`, `max_bursts`, `replay_window_seconds`, `replay_cycles`, and `compression_budget_scale`.
+```bash
+npm install --no-save --no-package-lock
+npm run check
+pytest tests/test_failure_taxonomy.py -q
+pytest tests/test_multi_family_admissibility_artifact.py -q
+pytest tests/test_multi_family_svg_renderer.py -q
+```
 
 ---
 
-## Integrity model
+## Benchmark families
 
-- no LLM judging
-- no embeddings
-- no vector databases
-- no external APIs
-- artifact-backed JSON + CI checks
-- deterministic hashing foundation: [`docs/deterministic_hashing.md`](docs/deterministic_hashing.md)
-- audit-friendly and CI reproducible
+- `coding_workflow_pr_review`
+- `incident_response_page_triage`
+- `cross_domain_operational_dependency_workflow`
 
-Foundational deterministic components:
-
-- `ReferenceIndex` and `EventLogArtifactAdapter`: track context references and deterministically fingerprint event payloads.
-- `ReplayArtifactWriter v1-alpha.1`: generates deterministic standalone JSON artifacts for verifiable snapshots.
+```mermaid
+flowchart LR
+    A["coding_workflow_pr_review"] --> L1["baseline"]
+    A --> L2["mild"]
+    A --> L3["moderate"]
+    A --> L4["severe"]
+    B["incident_response_page_triage"] --> L1
+    B --> L2
+    B --> L3
+    B --> L4
+    C["cross_domain_operational_dependency_workflow"] --> L1
+    C --> L2
+    C --> L3
+    C --> L4
+    L1 --> M["manifest registration"]
+    L2 --> M
+    L3 --> M
+    L4 --> M
+    M --> N["multi-family artifact"]
+    N --> O["deterministic SVG"]
+```
 
 ---
 
-## Quick start
+## Failure labels
+
+Primary registered labels used across deterministic admissibility validation:
+
+- `POLICY_ORDER_BROKEN`: required policy step order was not preserved.
+- `TOOL_ORDER_VIOLATION`: replayed tool sequence violated required order.
+- `CAUSAL_DEPENDENCY_LOSS`: required causal edges were not preserved.
+- `DEPENDENCY_CHAIN_BREAK`: required dependency chain broke.
+- `RECOVERY_PATH_INVALID`: recovery reachability contract failed.
+- `RECOVERY_PATH_LOSS`: required recovery route was not preserved.
+- `INVARIANT_VIOLATION`: declared invariant failed.
+- `EVIDENCE_LOSS`: required evidence did not survive replay.
+- `EVIDENCE_SURVIVAL_LOSS`: expected evidence units were not preserved.
+- `HIGH_CRITICAL_EVIDENCE_LOSS`: high-criticality evidence was lost.
+- `CONSTRAINT_DRIFT`: constraint preservation drifted.
+- `BLOCKER_DETACHMENT`: blocker attachment was lost.
+- `GOVERNANCE_DRIFT`: governance constraint drifted.
+- `ARTIFACT_INTEGRITY_VIOLATION`: artifact integrity drifted.
+- `REPLAY_NON_REPRODUCIBLE`: deterministic replay was not reproducible.
 
-Install the Python test dependency set:
-
-```bash
-python -m pip install -e '.[test]'
+```mermaid
+flowchart LR
+    O1["POLICY_ORDER_BROKEN"] --> C1["ordering"]
+    O2["TOOL_ORDER_VIOLATION"] --> C1
+    D1["CAUSAL_DEPENDENCY_LOSS"] --> C2["causality/dependency"]
+    D2["DEPENDENCY_CHAIN_BREAK"] --> C2
+    R1["RECOVERY_PATH_INVALID"] --> C3["recovery/reachability"]
+    R2["RECOVERY_PATH_LOSS"] --> C3
+    I1["INVARIANT_VIOLATION"] --> C4["invariant/no-orphan"]
+    E1["EVIDENCE_LOSS"] --> C5["evidence/criticality"]
+    E2["EVIDENCE_SURVIVAL_LOSS"] --> C5
+    E3["HIGH_CRITICAL_EVIDENCE_LOSS"] --> C5
+    E4["CONSTRAINT_DRIFT"] --> C5
+    E5["BLOCKER_DETACHMENT"] --> C5
+    E6["GOVERNANCE_DRIFT"] --> C5
+    A1["ARTIFACT_INTEGRITY_VIOLATION"] --> C6["artifact/reproducibility"]
+    A2["REPLAY_NON_REPRODUCIBLE"] --> C6
 ```
 
-Run the full reviewer check:
+---
 
-```bash
-npm run check
-```
+## How this differs from adjacent systems
 
-Run core replay tests:
+| System type | Stores state | Compresses context | Orchestrates agents | Deterministically validates replay loss |
+| --- | --- | --- | --- | --- |
+| Workflow runtimes | Sometimes | No | Yes | No |
+| Agent frameworks | Sometimes | Sometimes | Yes | Usually no |
+| Vector memory / RAG | Yes | Retrieval-centric | No | No |
+| Learned prompt compressors | Sometimes | Yes | No | Usually no |
+| LLM-as-judge evaluators | Sometimes | N/A | No | No |
+| CompTextv7 | Yes | Yes | No | Yes |
 
-```bash
-pytest tests/test_multi_family_admissibility_artifact.py -q
-pytest tests/test_failure_taxonomy.py -q
-pytest tests/test_evidence_metrics_adaptive_policy.py -q
-pytest tests/test_paper_replay_bench.py tests/test_agent_trace_replay.py -q
-```
+---
 
-Regenerate deterministic replay artifacts:
+## CI and merge gate
 
-```bash
-python tests/utils/paper_replay_runner.py
-python tests/utils/agent_trace_replay_runner.py
-python scripts/generate_iterative_replay_degradation_artifacts.py
+```mermaid
+flowchart LR
+    A["PR head SHA"] --> B["GitHub Actions"]
+    B --> C["Agent Workflow Checks"]
+    B --> D["hash-companion-validation"]
+    B --> E["CompText V7 Industrial Validation"]
+    C --> F["all success"]
+    D --> F
+    E --> F
+    F --> G["squash merge"]
 ```
 
-Additional validation helpers:
-
-```bash
-python scripts/validate.py replay
-python scripts/validate.py token
-python scripts/validate.py forensic
-python scripts/validate_contracts.py
-python scripts/validate_api_exports.py
-```
+Vercel/Netlify/deployment previews are not merge gates unless explicitly scoped.
 
 ---
 
@@ -351,61 +279,45 @@ python scripts/validate_api_exports.py
 
 ```text
 Comptextv7/
-├── artifacts/                  # committed deterministic replay benchmark JSON
-├── benchmarks/                 # deterministic compression, replay, and audit runners
-├── contracts/                  # machine-readable validation and handoff contracts
-├── dashboard/                  # backend plus React operations console
-├── docs/                       # benchmark, artifact, research, and legacy showcase notes
-├── reports/replay_continuity/  # adversarial continuity metrics and SVG charts
-├── scripts/                    # validation, reporting, and artifact tooling
-├── showcase/app/               # legacy in-repo Vite app; Monaco UI lives in external repo
-├── src/                        # compression, audit, and validation modules
-├── tests/                      # Python regression and replay validation tests
-└── README.md
+├── artifacts/
+├── docs/
+├── fixtures/
+├── reports/
+├── scripts/
+├── tests/
+└── src/
+    ├── core/
+    └── validation/
 ```
 
 ---
 
-## Cloud-first validation
+## Replay-validation roadmap
 
-Comptextv7 is biased toward artifact-backed review rather than local machine trust.
+```mermaid
+flowchart LR
+    A["failure taxonomy"] --> B["cross-domain fixture families"]
+    B --> C["forensic reports"]
+    C --> D["schema stabilization"]
+    D --> E["cross-family comparison"]
+    E --> F["integrity gates"]
+    F --> G["golden corpus"]
+    G --> H["offline import/export"]
+```
 
-| Workflow | Role |
-|---|---|
-| [`ci.yml`](.github/workflows/ci.yml) | Runs deterministic replay, tests, telemetry, and validation gates. |
-| [`agent-checks.yml`](.github/workflows/agent-checks.yml) | Runs repository, report, contract, and dashboard validation. |
-| [`validation_runner.yml`](.github/workflows/validation_runner.yml) | Publishes compact cloud validation result artifacts. |
+- Forensic audit reports with deterministic exports.
+- Artifact schema stabilization.
+- Cross-family degradation comparison.
+- Minimal artifact integrity gates.
+- Golden corpus foundation.
+- Offline import/export schemas only.
 
 ---
 
 ## Limitations
 
-- Metrics are fixture-bound baselines and do not reflect universal real-world correctness.
-- Fixtures are curated and checked in.
-- Structured agent traces currently replay near-losslessly.
-- This is not solved AI memory.
-- This is not production telemetry.
-- This is not an autonomous agent framework.
-- Iterative degradation remains a bounded fixture prototype.
-
----
-
-## Safety boundaries
-
-Do not commit:
-
-- proprietary customer data
-- secrets, API keys, tokens, cookies, or credentials
-- raw production logs
-- unsanitized replay fixtures
-- private deployment credentials or environment dumps
-
-Comptextv7 is a deterministic, synthetic-only research prototype for operational replay persistence and reviewable diagnostic infrastructure.
-
----
-
-## Final principle
-
-> Comptextv7 does not evaluate whether an agent remembers.
->
-> It evaluates whether compressed operational state remains admissible under deterministic replay.
+- Metrics are fixture-bound and internal to checked-in datasets.
+- Fixtures are curated and checked in, not live production traces.
+- This is a deterministic prototype, not a production-readiness claim.
+- This is not a universal AI-memory claim.
+- This does not claim runtime integration or orchestration coverage.

From 75fccb41d095416bc79bba1feecbb362a118f2d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alexander=20K=C3=B6lnberger?=
 <159939812+ProfRandom92@users.noreply.github.com>
Date: Wed, 20 May 2026 01:58:47 -0700
Subject: [PATCH 2/3] Add drift-locked agent replay consistency metric

---
 README.md | 1 +
 1 file changed, 1 insertion(+)
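
The README describes the scoring as exact and rational, and the table touched by this patch prints scores to six decimals. A small sketch, assuming scores are ratios of small integer counts, shows how such values can be kept exact with Python's `fractions.Fraction` and rendered only at output time. The recovered denominators are inferred from the printed decimals and are not stated anywhere in the source; the helper names are hypothetical.

```python
from fractions import Fraction

# Six-decimal values copied from the README table in this series.
printed = {
    "CONSERVATIVE replay consistency": "0.895833",
    "BALANCED replay consistency": "0.250000",
    "AGGRESSIVE replay consistency": "0.125000",
    "Paper replay consistency": "0.791667",
    "Agent replay consistency": "1.000000",
}


def nearest_small_fraction(text: str, max_denominator: int = 100) -> Fraction:
    """Return the closest fraction with a bounded denominator (an inference aid)."""
    return Fraction(text).limit_denominator(max_denominator)


def render(score: Fraction) -> str:
    """Format an exact score the way the table prints it."""
    return f"{float(score):.6f}"


recovered = {name: nearest_small_fraction(value) for name, value in printed.items()}

for name, score in recovered.items():
    # Each inferred fraction must round back to the printed six-decimal value.
    assert render(score) == printed[name]
    print(f"{name}: {score} -> {render(score)}")
```

Keeping the score as a `Fraction` until rendering avoids float comparison noise in reproducibility tests, since two runs either produce the same integer ratio or they do not.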

diff --git a/README.md b/README.md
index b27dc4a..e4722c8 100644
--- a/README.md
+++ b/README.md
@@ -98,6 +98,7 @@ These are internal fixture-bound results, not external benchmark claims, product
 | `AGGRESSIVE` replay consistency | `0.125000` |
 | Paper avg compression | `1.347063` |
 | Agent avg compression | `1.773954` |
+| Agent replay consistency | `1.000000` |
 | Agent operational drift | `0.000000` |
 
 ---

From c1101fceb02dbfdceabf399b0b21cc947f0cea9a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Alexander=20K=C3=B6lnberger?=
 <159939812+ProfRandom92@users.noreply.github.com>
Date: Wed, 20 May 2026 02:01:51 -0700
Subject: [PATCH 3/3] Address README review feedback

---
 README.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index e4722c8..0c5095d 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
   <img src="docs/assets/comptextv7-logo.svg" alt="Comptextv7 logo" width="420" />
 </p>
 
-<h1 align="center">Comptextv7</h1>
+<h1 align="center">CompText V7</h1>
 
 <p align="center">
   <strong>Deterministic replay-survivability validation for compressed operational state in long-horizon AI agents.</strong>
@@ -27,7 +27,7 @@
   · <a href="docs/failure_taxonomy.md">Failure Taxonomy</a>
 </p>
 
-CompTextv7 does not ask whether a compressed summary sounds good. It asks whether the compressed state can still replay the operational facts required to continue the work.
+CompText V7 does not ask whether a compressed summary sounds good. It asks whether the compressed state can still replay the operational facts required to continue the work.
 
 ---
 
@@ -35,11 +35,11 @@ CompTextv7 does not ask whether a compressed summary sounds good. It asks whethe
 
 Long-horizon agents compress prior work into smaller summaries. Those summaries can silently lose blockers, constraints, evidence, dependency order, recovery paths, and tool order.
 
-CompTextv7 treats that as a deterministic replay-validation problem. It checks whether compressed operational state remains admissible after reconstruction using fixture-defined contracts, exact scoring, failure labels, committed artifacts, and CI gates.
+CompText V7 treats that as a deterministic replay-validation problem. It checks whether compressed operational state remains admissible after reconstruction using fixture-defined contracts, exact scoring, failure labels, committed artifacts, and CI gates.
 
 ---
 
-## What CompTextv7 is
+## What CompText V7 is
 
 - Deterministic replay-validation infrastructure for operational state.
 - Fixture-bound and contract-linked.
@@ -47,7 +47,7 @@ CompTextv7 treats that as a deterministic replay-validation problem. It checks w
 - CI-reproducible through repository checks.
 - Focused on operational admissibility, not prose quality.
 
-## What CompTextv7 is not
+## What CompText V7 is not
 
 - Agent framework.
 - Workflow orchestrator.
@@ -165,11 +165,13 @@ flowchart LR
 ## Verify locally
 
 ```bash
+python -m pip install -e '.[test]'
 npm install --no-save --no-package-lock
 npm run check
 pytest tests/test_failure_taxonomy.py -q
 pytest tests/test_multi_family_admissibility_artifact.py -q
 pytest tests/test_multi_family_svg_renderer.py -q
+pytest tests/test_paper_replay_bench.py tests/test_agent_trace_replay.py -q
 ```
 
 ---
@@ -254,7 +256,7 @@ flowchart LR
 | Vector memory / RAG | Yes | Retrieval-centric | No | No |
 | Learned prompt compressors | Sometimes | Yes | No | Usually no |
 | LLM-as-judge evaluators | Sometimes | N/A | No | No |
-| CompTextv7 | Yes | Yes | No | Yes |
+| CompText V7 | Yes | Yes | No | Yes |
 
 ---