ossirytk
diff --git a/.github/workflows/quality_gate.yml b/.github/workflows/quality_gate.yml
Lines changed: 44 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 37 additions & 7 deletions
diff --git a/core/conversation_manager.py b/core/conversation_manager.py
Lines changed: 1 addition & 0 deletions
diff --git a/core/conversation_response_mixin.py b/core/conversation_response_mixin.py
Lines changed: 12 additions & 1 deletion
diff --git a/core/conversation_retrieval_orchestration_mixin.py b/core/conversation_retrieval_orchestration_mixin.py
Lines changed: 11 additions & 8 deletions
diff --git a/docs/RAG_SCRIPTS_GUIDE.md b/docs/RAG_SCRIPTS_GUIDE.md
Lines changed: 61 additions & 16 deletions
diff --git a/docs/configs/00_README.md b/docs/configs/00_README.md
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1,44 @@
+name: Quality Gate
+
+on:
+  push:
+    branches: ["**"]
+  pull_request:
+    branches: ["**"]
+
+jobs:
+  quality-gate:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.13"
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v4
+
+      - name: Install dependencies
+        run: uv sync --dev
+
+      - name: Lint (ruff)
+        run: uv run ruff check .
+
+      - name: Format check (ruff)
+        run: uv run ruff format --check .
+
+      - name: Unit tests
+        run: uv run pytest -q
+
+      - name: Capture baselines (idempotent)
+        run: uv run python -m scripts.conversation.capture_baselines
+
+      - name: Quality gate
+        run: >
+          uv run python -m scripts.quality_gate
+          --seed 42
+          --skip-retrieval
+          --baselines-dir logs/conversation_quality/baselines
@@ -89,31 +89,54 @@ Python requirement is defined in `pyproject.toml` (`>=3.13`).
 
 ## Quick RAG Workflow
 
+Use module-style invocation for the active RAG scripts:
+
+```bash
+uv run python -m scripts.rag.<script_name> ...
+```
+
+This is the preferred form in the docs because it is more reliable for package imports than calling nested script paths directly.
+
 1. Analyze source text and generate metadata:
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py analyze rag_data/shodan.txt -o rag_data/shodan.json --strict
+uv run python -m scripts.rag.analyze_rag_text analyze rag_data/shodan.txt \
+  -o rag_data/shodan.json \
+  --strict \
+  --review-report rag_data/shodan_review.json
 ```
 
 2. Validate metadata:
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py validate rag_data/shodan.json
+uv run python -m scripts.rag.analyze_rag_text validate rag_data/shodan.json
+```
+
+3. Optional quality gates before push:
+
+```bash
+uv run python -m scripts.rag.manage_collections coverage score \
+  --metadata-file rag_data/shodan.json \
+  --source-file rag_data/shodan.txt \
+  --threshold 0.75
+
+uv run python -m scripts.rag.manage_collections lint message-examples --fix
 ```
 
-3. Push text into a collection:
+4. Push lore and message examples into collections:
 
 ```bash
-uv run python scripts/rag/push_rag_data.py rag_data/shodan.txt -c shodan -w
+uv run python -m scripts.rag.push_rag_data rag_data/shodan.txt -c shodan -w
+uv run python -m scripts.rag.push_rag_data rag_data/shodan_message_examples.txt -c shodan_mes -w
 ```
 
-4. Test retrieval quality:
+5. Spot-check retrieval quality:
 
 ```bash
-uv run python scripts/rag/manage_collections.py test shodan -q "SHODAN origin" -k 5
+uv run python -m scripts.rag.manage_collections test shodan -q "SHODAN origin" -k 5
 ```
 
-5. Evaluate retrieval fixtures with summary metrics:
+6. Evaluate retrieval fixtures with summary metrics:
 
 ```bash
 uv run python -m scripts.rag.manage_collections evaluate-fixtures --fixture-file tests/fixtures/retrieval_fixtures.json
@@ -162,6 +185,8 @@ Notes:
 
 - Leading HTML header comments are stripped before chunking.
 - Metadata auto-detection maps `<name>.txt` and `<name>_message_examples.txt` to `<name>.json`.
+- If metadata exists, push runs a source-coverage quality gate before writing.
+- Category threshold flags are informational at push time; change category assignment by regenerating metadata with `analyze_rag_text`.
 
 ### `scripts/rag/manage_collections.py`
 
@@ -173,6 +198,11 @@ Commands:
 - `test`
 - `export`
 - `info`
+- `evaluate-fixtures`
+- `benchmark-rerank`
+- `backfill-embedding-fingerprint`
+- `coverage score`
+- `lint message-examples`
 
 ### Compatibility wrappers
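
Review note: the README hunk above says module-style invocation is more reliable for package imports than calling nested script paths. One common reason is that running a file by path puts the script's own directory first on `sys.path`, while `python -m` puts the current working directory there. The sketch below demonstrates this with a throwaway layout; the `demo_core` and `demo_scripts` package names are hypothetical stand-ins, not this repository's packages, and the actual import failure mode in this repository is an assumption.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def compare_invocations() -> tuple[int, int]:
    """Return (exit code of path-style run, exit code of module-style run)."""
    # Drop variables that would change how sys.path is initialised.
    env = {k: v for k, v in os.environ.items() if k not in ("PYTHONPATH", "PYTHONSAFEPATH")}
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Hypothetical miniature layout: demo_core/ next to demo_scripts/rag/.
        (root / "demo_core").mkdir()
        (root / "demo_core" / "__init__.py").write_text("VALUE = 1\n")
        rag = root / "demo_scripts" / "rag"
        rag.mkdir(parents=True)
        (root / "demo_scripts" / "__init__.py").write_text("")
        (rag / "__init__.py").write_text("")
        # The tool imports a sibling top-level package, as nested scripts often do.
        (rag / "tool.py").write_text("import demo_core\nprint(demo_core.VALUE)\n")

        # Path style: sys.path[0] is demo_scripts/rag, so demo_core is not found.
        by_path = subprocess.run(
            [sys.executable, str(rag / "tool.py")],
            cwd=root, env=env, capture_output=True, check=False,
        )
        # Module style: sys.path[0] is the working directory, so demo_core resolves.
        by_module = subprocess.run(
            [sys.executable, "-m", "demo_scripts.rag.tool"],
            cwd=root, env=env, capture_output=True, check=False,
        )
        return by_path.returncode, by_module.returncode
```

In this layout the path-style run exits non-zero with `ModuleNotFoundError`, while the module-style run succeeds from the same working directory.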
 
 
@@ -124,6 +124,7 @@ def __init__(self) -> None:
         self._last_summary_topic_terms: set[str] = set()
         drift_window = max(1, self.runtime_config.persona_drift_history_window)
         self.persona_drift_history: deque[float] = deque(maxlen=drift_window)
+        self.persona_drift_trace: deque[dict[str, object]] = deque(maxlen=drift_window)
         self.last_persona_drift: dict[str, object] | None = None
         self.persona_drift_scorer = PersonaDriftScorer(
             PersonaAnchor(
 
@@ -40,7 +40,7 @@ def _record_persona_drift(self, response: str) -> None:
         history.append(float(result.drift_score))
         summary = self._persona_drift_summary()
         turn_number = min(len(self.user_message_history), len(self.ai_message_history))
-        self.last_persona_drift = {
+        drift_record = {
             "turn": turn_number,
             "drift_score": float(result.drift_score),
             "persona_fidelity": float(result.persona_fidelity),
@@ -50,6 +50,10 @@ def _record_persona_drift(self, response: str) -> None:
             "has_user_turn_pattern": bool(result.has_user_turn_pattern),
             "rolling_avg": float(summary["avg"]),
         }
+        self.last_persona_drift = drift_record
+        trace = getattr(self, "persona_drift_trace", None)
+        if trace is not None:
+            trace.append(dict(drift_record))
 
         warning_threshold = float(getattr(self.runtime_config, "persona_drift_warning_threshold", 1.0))
         if result.drift_score >= warning_threshold:
@@ -200,6 +204,8 @@ def clear_conversation_state(self) -> None:
         self._last_summary_topic_terms = set()
         if hasattr(self, "persona_drift_history"):
             self.persona_drift_history.clear()
+        if hasattr(self, "persona_drift_trace"):
+            self.persona_drift_trace.clear()
         self.last_persona_drift = None
 
     def export_conversation_state(self) -> dict[str, object]:
@@ -211,6 +217,7 @@ def export_conversation_state(self) -> dict[str, object]:
             "history_summaries": list(self.history_summaries),
             "last_summary_topic_terms": sorted(self._last_summary_topic_terms),
             "persona_drift_history": list(getattr(self, "persona_drift_history", [])),
+            "persona_drift_trace": list(getattr(self, "persona_drift_trace", [])),
             "persona_drift_last": self.last_persona_drift,
             "persona_drift_avg": drift_summary["avg"],
         }
@@ -222,6 +229,7 @@ def import_conversation_state(self, state: dict[str, object]) -> None:
         history_summaries = state.get("history_summaries", [])
         summary_terms = state.get("last_summary_topic_terms", [])
         drift_history = state.get("persona_drift_history", [])
+        drift_trace = state.get("persona_drift_trace", [])
         drift_last = state.get("persona_drift_last")
 
         normalized_user_history = [item for item in user_history if isinstance(item, str)]
@@ -240,6 +248,9 @@ def import_conversation_state(self, state: dict[str, object]) -> None:
         if hasattr(self, "persona_drift_history"):
             normalized_drift = [float(value) for value in drift_history if isinstance(value, int | float)]
             self.persona_drift_history = deque(normalized_drift, maxlen=self.persona_drift_history.maxlen)
+        if hasattr(self, "persona_drift_trace"):
+            normalized_trace = [item for item in drift_trace if isinstance(item, dict)]
+            self.persona_drift_trace = deque(normalized_trace, maxlen=self.persona_drift_trace.maxlen)
         self.last_persona_drift = drift_last if isinstance(drift_last, dict) else None
 
     _STRAY_TOKENS: tuple[str, ...] = ("[/INST]", "<|im_end|>", "</s>", "<|eot_id|>", "<s>", "<|end|>")
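
Review note: the hunks above add a bounded `persona_drift_trace` that stores a copy of each drift record and survives export/import. A minimal standalone sketch of that behavior, assuming a reduced record with only `turn` and `drift_score`. The class `DriftTraceSketch` and its method names are hypothetical stand-ins for the mixin, and the `isinstance(raw, list)` guard is an extra that the diff does not include.

```python
from collections import deque


class DriftTraceSketch:
    """Hypothetical stand-in for the mixin state touched in this diff."""

    def __init__(self, window: int = 3) -> None:
        # Same clamp as the diff: the window is never smaller than 1.
        window = max(1, window)
        self.persona_drift_trace: deque[dict[str, object]] = deque(maxlen=window)
        self.last_persona_drift: dict[str, object] | None = None

    def record(self, turn: int, drift_score: float) -> None:
        drift_record: dict[str, object] = {"turn": turn, "drift_score": float(drift_score)}
        self.last_persona_drift = drift_record
        # Store a shallow copy so later edits to last_persona_drift
        # do not rewrite the trace entry.
        self.persona_drift_trace.append(dict(drift_record))

    def export_state(self) -> dict[str, object]:
        return {
            "persona_drift_trace": list(self.persona_drift_trace),
            "persona_drift_last": self.last_persona_drift,
        }

    def import_state(self, state: dict[str, object]) -> None:
        raw = state.get("persona_drift_trace", [])
        # Extra guard (not in the diff): ignore a non-list payload entirely.
        items = raw if isinstance(raw, list) else []
        # Same filter as the diff: keep only dict entries.
        normalized = [item for item in items if isinstance(item, dict)]
        self.persona_drift_trace = deque(normalized, maxlen=self.persona_drift_trace.maxlen)
        last = state.get("persona_drift_last")
        self.last_persona_drift = last if isinstance(last, dict) else None
```

Because the deque is created with `maxlen`, the oldest records fall off automatically once the window is full, and re-creating it on import with the existing `maxlen` keeps an imported trace within the same bound.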
 
@@ -265,14 +265,17 @@ def _get_vector_context(self, query: str, k: int | None = None, *, include_mes:
                 k=k_mes,
             )
         else:
-            mes_chunks, mes_trace = [], {
-                "mode": "disabled",
-                "filter_path": "none",
-                "candidates": 0,
-                "returned": 0,
-                "queries": 0,
-                "rerank_applied": False,
-            }
+            mes_chunks, mes_trace = (
+                [],
+                {
+                    "mode": "disabled",
+                    "filter_path": "none",
+                    "candidates": 0,
+                    "returned": 0,
+                    "queries": 0,
+                    "rerank_applied": False,
+                },
+            )
         context_chunks = self._filter_context_chunks(context_chunks)
         mes_chunks = self._filter_context_chunks(mes_chunks)
         context_chunks, mes_chunks, cross_removed = self._dedupe_cross_collection_chunks(context_chunks, mes_chunks)
 
@@ -4,6 +4,14 @@ Last verified: 2026-03-12
 
 This guide documents the current CLI behavior for scripts in `scripts/rag/`.
 
+Use module-style invocation for active commands:
+
+```bash
+uv run python -m scripts.rag.<script_name> ...
+```
+
+Top-level wrappers in `scripts/*.py` still exist for compatibility, but module invocation is the preferred form for day-to-day RAG data management.
+
 ## Docs Quick Links
 
 - RAG management docs hub: `docs/rag_management/00_README.md`
@@ -23,7 +31,42 @@ For detailed per-script documentation, see:
 4. `scripts/rag/old_prepare_rag.py` (legacy batch helper)
 5. `scripts/context/fetch_character_context.py`
 
-Top-level wrappers in `scripts/*.py` are kept for compatibility.
+## Canonical RAG Data Management Process
+
+The clearest routine workflow for one character or corpus is:
+
+1. Prepare the source text in `rag_data/<name>.txt`.
+2. Generate metadata with `analyze_rag_text`.
+3. Validate the metadata file.
+4. Optionally run quality gates:
+   - `coverage score` for source-to-metadata coverage
+   - `lint message-examples` when `*_message_examples.txt` exists
+5. Push lore and message examples with `push_rag_data`.
+6. Spot-check retrieval with `manage_collections test`.
+7. Run `evaluate-fixtures` when you need regression metrics.
+
+Example:
+
+```bash
+uv run python -m scripts.rag.analyze_rag_text analyze rag_data/shodan.txt \
+  -o rag_data/shodan.json \
+  --strict \
+  --review-report rag_data/shodan_review.json
+
+uv run python -m scripts.rag.analyze_rag_text validate rag_data/shodan.json
+
+uv run python -m scripts.rag.manage_collections coverage score \
+  --metadata-file rag_data/shodan.json \
+  --source-file rag_data/shodan.txt \
+  --threshold 0.75
+
+uv run python -m scripts.rag.manage_collections lint message-examples --fix
+
+uv run python -m scripts.rag.push_rag_data rag_data/shodan.txt -c shodan -w
+uv run python -m scripts.rag.push_rag_data rag_data/shodan_message_examples.txt -c shodan_mes -w
+
+uv run python -m scripts.rag.manage_collections test shodan -q "SHODAN origin" -k 5
+```
 
 ---
 
@@ -32,7 +75,7 @@ Top-level wrappers in `scripts/*.py` are kept for compatibility.
 ### Analyze
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py analyze rag_data/shodan.txt -v
+uv run python -m scripts.rag.analyze_rag_text analyze rag_data/shodan.txt -v
 ```
 
 Common options:
@@ -48,21 +91,21 @@ Common options:
 ### Validate metadata
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py validate rag_data/shodan.json
+uv run python -m scripts.rag.analyze_rag_text validate rag_data/shodan.json
 ```
 
 ### Scan directory
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py scan rag_data/ --auto-generate
+uv run python -m scripts.rag.analyze_rag_text scan rag_data/ --auto-generate
 ```
 
 ---
 
 ## 2) Push RAG Data to ChromaDB
 
 ```bash
-uv run python scripts/rag/push_rag_data.py rag_data/shodan.txt -c shodan
+uv run python -m scripts.rag.push_rag_data rag_data/shodan.txt -c shodan
 ```
 
 Common options:
@@ -82,8 +125,10 @@ Notes:
 
 - Leading HTML header comments are stripped before chunking.
 - Metadata file auto-detection maps `<name>.txt` (and `<name>_message_examples.txt`) to `<name>.json`.
+- If metadata exists, push runs the coverage quality gate before writing.
 - Metadata enrichment workers use `ProcessPoolExecutor` with `spawn` context to avoid Python 3.13 `fork()` deprecation warnings in multithreaded runs.
 - Collection writes stamp embedding fingerprint metadata and non-overwrite pushes block mixed-model writes.
+- Category threshold flags are logged for visibility, but category assignment itself happens when metadata is generated by `analyze_rag_text`.
 
 ---
 
@@ -92,25 +137,25 @@ Notes:
 ### List
 
 ```bash
-uv run python scripts/rag/manage_collections.py list-collections -v
+uv run python -m scripts.rag.manage_collections list-collections -v
 ```
 
 ### Delete one
 
 ```bash
-uv run python scripts/rag/manage_collections.py delete shodan_old -y
+uv run python -m scripts.rag.manage_collections delete shodan_old -y
 ```
 
 ### Delete multiple
 
 ```bash
-uv run python scripts/rag/manage_collections.py delete-multiple --pattern "test_*" -y
+uv run python -m scripts.rag.manage_collections delete-multiple --pattern "test_*" -y
 ```
 
 ### Test retrieval
 
 ```bash
-uv run python scripts/rag/manage_collections.py test shodan -q "SHODAN origin" -k 5
+uv run python -m scripts.rag.manage_collections test shodan -q "SHODAN origin" -k 5
 ```
 
 Optional embedding overrides:
@@ -121,13 +166,13 @@ Optional embedding overrides:
 ### Export
 
 ```bash
-uv run python scripts/rag/manage_collections.py export shodan -o backups/shodan.json
+uv run python -m scripts.rag.manage_collections export shodan -o backups/shodan.json
 ```
 
 ### Info
 
 ```bash
-uv run python scripts/rag/manage_collections.py info shodan
+uv run python -m scripts.rag.manage_collections info shodan
 ```
 
 ### Evaluate fixtures
@@ -158,7 +203,7 @@ Use this after upgrading to fingerprint enforcement to migrate legacy collection
 ## 4) Fetch and Clean Character Context From Web
 
 ```bash
-uv run python scripts/context/fetch_character_context.py "https://en.wikipedia.org/wiki/Leonardo_da_Vinci" -o rag_data/leonardo_da_vinci.txt
+uv run python -m scripts.context.fetch_character_context "https://en.wikipedia.org/wiki/Leonardo_da_Vinci" -o rag_data/leonardo_da_vinci.txt
 ```
 
 Features:
@@ -173,10 +218,10 @@ Features:
 ## Typical Workflow
 
 ```bash
-uv run python scripts/rag/analyze_rag_text.py analyze rag_data/new_char.txt -o rag_data/new_char.json --strict
-uv run python scripts/rag/analyze_rag_text.py validate rag_data/new_char.json
-uv run python scripts/rag/push_rag_data.py rag_data/new_char.txt -c new_char -w
-uv run python scripts/rag/manage_collections.py test new_char -q "intro prompt" -k 5
+uv run python -m scripts.rag.analyze_rag_text analyze rag_data/new_char.txt -o rag_data/new_char.json --strict
+uv run python -m scripts.rag.analyze_rag_text validate rag_data/new_char.json
+uv run python -m scripts.rag.push_rag_data rag_data/new_char.txt -c new_char -w
+uv run python -m scripts.rag.manage_collections test new_char -q "intro prompt" -k 5
 ```
 
 ## Related Files
 
@@ -12,6 +12,7 @@ This section documents configuration files in `configs/`.
 ## Runtime Loading Behavior
 
 - Runtime requires `configs/config.v2.json`.
+- The repository currently tracks `configs/config.v2.json` directly; no `config.v2.example.json` is shipped.
 - `core/config.py` flattens nested v2 keys into legacy-style runtime keys for internal use.
 - `ConversationManager` and script CLIs consume typed values via `load_conversation_runtime_config` / `load_rag_script_config`.
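
Review note: the context lines above say `core/config.py` flattens nested v2 keys into legacy-style runtime keys. The sketch below shows one way such flattening can work, assuming keys are joined with underscores. The function name `flatten_config`, the separator, and the nested layout in the usage example are all hypothetical; the real mapping in `core/config.py` may differ.

```python
def flatten_config(
    nested: dict[str, object],
    prefix: str = "",
    sep: str = "_",
) -> dict[str, object]:
    """Flatten nested dicts into a single level, joining key paths with `sep`."""
    flat: dict[str, object] = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into sub-sections, carrying the joined path as the prefix.
            flat.update(flatten_config(value, name, sep))
        else:
            flat[name] = value
    return flat
```

For example, under this assumed scheme `flatten_config({"persona": {"drift": {"warning_threshold": 1.0}}})` yields a single `persona_drift_warning_threshold` entry.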