feat(SR-173): Visual Debug Report -- HTML+SVG overlay per step (closes #178) by Delqhi · Pull Request #184 · SIN-CLIs/stealth-runner

Delqhi · 2026-05-13T06:50:14Z

SR-173 -- Visual Debug Report (HTML + SVG-Overlay per Step)

Closes #178.

TL;DR

A per-step HTML+SVG overlay report drawn on top of the action screenshot. It makes the four classes of coordinate-misalignment bugs visible in 5 seconds instead of 15 minutes:

iFrame offset -- the AX tree returns iframe-local coordinates; the click needs page coordinates (frame_offset translation).
DPR mismatch -- the screenshot is in physical px, the AX tree in CSS px; the SVG rect must be scaled by the DPR.
Stale scroll -- snapshot taken at scrollY=300, click issued 200 ms later at scrollY=400.
z-index overlay -- a modal eats the click; the AX tree does not notice, a human sees it immediately.

Each bug class has its own unit test (tests/test_visual_debug.py). 12/12 green on Python 3.13 (0.49 s).

Files

File	Status	Purpose
`survey-cli/survey/observability/visual_debug.py`	NEW	Core module: renderer + dispatcher + geometry primitives + Protocol shims for SR-167/168
`survey-cli/survey/runner_policy.py`	NEW	Central, immutable, env-driven `RunnerPolicy` (`STEALTH_ENV`, `VISUAL_DEBUG_*`)
`survey-cli/survey/observability/__init__.py`	patched	Re-exports of the public API
`survey-cli/survey/safe_executor.py`	patched	Optional, failure-isolated hook after every action
`survey-cli/tests/test_visual_debug.py`	NEW	12 tests: one per bug class + determinism + atomicity + backpressure + E2E
`scripts/build_daily_visual_report.py`	NEW	Daily aggregator + optional Vercel Blob upload
`survey-cli/AGENTS.md`	patched	SR-173 brain-file section (file map, design rationale, NEVER rules, public API, test matrix, operations, roadmap hooks)

State-of-the-art decisions (justified deviations from the briefing)

ThreadPoolExecutor, NOT asyncio.create_task. safe_executor.SurveyFlowExecutor is synchronous (module docstring: "synchronous websocket ... matches LangGraph node execution"), so there is no running event loop. A bounded ThreadPoolExecutor + BoundedSemaphore is the correct primitive: non-blocking submit, drop-on-overflow (NEVER block), atexit-clean. The non-blocking guarantee from issue #178 is not merely met, it is strengthened: asyncio tasks can queue without bound when the loop is saturated, whereas our semaphore caps hard at max_queue.
runner_policy.py is NEW, not an edit. The file did not exist on main.
Protocol shims for VerificationResult (SR-167 / #173) and AttestationResult (SR-168 / #174). These PRs are not on main yet. A runtime_checkable Protocol with an identical field shape lets the code compile and be tested today; once the dependencies merge, it is a 1-line import swap with no runtime change.
Point / Box / ElementRef live in visual_debug.py, not in snapshot.py. YAGNI -- there is currently a single caller. The promote TODO is documented inline.
blake2b sampling, not random.random(). Deterministic per step_id -- retries of the same step yield the same sampling decision; no double counting in dashboards.
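The drop-on-overflow behaviour described in the first decision can be sketched like this. It is an illustration under assumptions, not the dispatcher shipped in this PR: the class name, method names and default sizes are invented here, and only the combination of ThreadPoolExecutor, BoundedSemaphore and a max_queue cap comes from the description.

```python
import logging
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

logger = logging.getLogger(__name__)


class DropOnOverflowDispatcher:
    """Hypothetical sketch: accept at most max_queue in-flight jobs, drop the rest."""

    def __init__(self, max_workers: int = 2, max_queue: int = 8) -> None:
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._slots = threading.BoundedSemaphore(max_queue)
        self.dropped = 0  # only updated by the (single) submitting thread

    def submit(self, job: Callable[[], None]) -> bool:
        # Non-blocking acquire: if no slot is free, drop instead of waiting.
        if not self._slots.acquire(blocking=False):
            self.dropped += 1
            return False

        def run() -> None:
            try:
                job()
            except Exception:
                # Failure isolation: a broken render must never reach the caller.
                logger.exception("visual debug job failed")
            finally:
                self._slots.release()

        try:
            self._pool.submit(run)
        except RuntimeError:
            # Pool already shut down; give the slot back and report the drop.
            self._slots.release()
            return False
        return True

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

The semaphore counts queued and running jobs together, so the number of pending renders can never exceed max_queue regardless of how slow the workers are.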

Performance / Cost

~35 KB per file (JPEG@70 + SVG + JSON).
Prod: 10 % sampling + 100 % on verifier failure.
Expected: ~1,500 renders/day × ~35 KB = ~50 MB/day → ~$0.45/month on Vercel Blob.
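The prod rule (10 % sampling, always render on verifier failure) combined with blake2b-based determinism could look roughly like the sketch below. The function name `should_render` and its parameters are assumptions; only the use of blake2b over step_id and the two rates are taken from the PR.

```python
import hashlib


def should_render(step_id: str, sample_rate: float, verifier_failed: bool) -> bool:
    """Deterministic sampling: the same step_id always gets the same decision."""
    if verifier_failed:
        return True  # failures are always rendered, regardless of the sample rate
    # Hash the step id to 8 bytes and map it to a bucket in [0, 1).
    digest = hashlib.blake2b(step_id.encode("utf-8"), digest_size=8).digest()
    bucket = int.from_bytes(digest, "big") / 2**64
    return bucket < sample_rate
```

Because the decision depends only on step_id, a retried step is either sampled every time or never, which is what keeps dashboard counts free of duplicates.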

Operations

# Build the daily index:
python scripts/build_daily_visual_report.py --date 2026-05-13

# With upload (prints the index URL to stdout):
BLOB_READ_WRITE_TOKEN=... \
    python scripts/build_daily_visual_report.py --date 2026-05-13 --upload

Test run

cd survey-cli && pytest tests/test_visual_debug.py -v
============================== 12 passed in 0.49s ==============================

Compliance

No BANNED methods referenced.
BANNED list documented as a header block in every new source file (AGENTS.md convention).
No additional .md files created -- everything is inline in the sources plus the brain section in survey-cli/AGENTS.md.
No print() debug statements in the hot path (logger only).
Atomic writes via <final>.<uuid>.tmp + os.replace.
Frozen dataclasses (slots=True) -- thread-safe by construction, no locking.
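The atomic-write pattern named above (`<final>.<uuid>.tmp` followed by os.replace) can be sketched as follows. The helper name `atomic_write_text` and the fsync call are assumptions for this example; the PR only states the temp-name scheme and the use of os.replace.

```python
import os
import uuid
from pathlib import Path


def atomic_write_text(final_path: Path, content: str) -> None:
    """Write to <final>.<uuid>.tmp in the same directory, then rename into place."""
    tmp_path = final_path.with_name(f"{final_path.name}.{uuid.uuid4().hex}.tmp")
    try:
        with open(tmp_path, "w", encoding="utf-8") as fh:
            fh.write(content)
            fh.flush()
            os.fsync(fh.fileno())  # assumption: flush to disk before the rename
        # os.replace overwrites the target in one step, so readers see either
        # the old file or the complete new one, never a partial write.
        os.replace(tmp_path, final_path)
    finally:
        # No-op after a successful replace; cleans up if the write failed.
        tmp_path.unlink(missing_ok=True)
```

Keeping the temp file in the same directory as the target matters, since a rename across filesystems is not atomic.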

Roadmap hooks (after merge)

SR-167 merged → VerificationResultLike → VerificationResult (1-line swap, marked with an inline TODO).
SR-168 merged → same for AttestationResultLike.
SR-172 (#172) meta-tracker checkbox.

Implements the per-step Visual Debug Report described in issue #178.

What this PR delivers
=====================

- survey/observability/visual_debug.py (NEW)
  Self-contained HTML+SVG renderer with embedded JPEG screenshot and inline SVG overlay (target bbox + click crosshair). Non-blocking dispatcher backed by a bounded ThreadPoolExecutor (drop-on-overflow, NEVER blocks the LangGraph hot path). Deterministic sampling via blake2b(step_id). Atomic writes via tmp + os.replace.
- survey/runner_policy.py (NEW)
  Central, immutable, env-driven RunnerPolicy. Per-environment presets (prod = 10 % sampling + 100 % on failure; staging/dev = 100 %).
- survey/observability/__init__.py
  Re-exports the new public API.
- survey/safe_executor.py (PATCHED)
  Optional visual_debug_dispatcher + visual_debug_frame_builder kwargs. Failure-isolated hook -- executor stability never depends on the debug pipeline being healthy.
- tests/test_visual_debug.py (NEW)
  12 tests, 12/12 green on Py 3.13. One test per coordinate-misalignment bug class (iframe-offset, DPR-mismatch, scroll-stale, z-index-overlay) plus invariants (determinism, atomicity, backpressure, end-to-end).
- scripts/build_daily_visual_report.py (NEW)
  Daily aggregator: builds index.html with OK/FAIL filters; optional Vercel Blob upload via $BLOB_READ_WRITE_TOKEN.
- survey-cli/AGENTS.md (UPDATED)
  Adds the SR-173 brain-file section (file map, design decisions with rationale, public API, NEVER rules, test matrix, operations, roadmap hooks for SR-167 / SR-168).

State-of-the-art deviations from the briefing (documented inline)
=================================================================

1. ThreadPoolExecutor instead of asyncio.create_task. safe_executor.SurveyFlowExecutor is synchronous (its docstring states: "synchronous websocket ... matches LangGraph node execution"). There is no running event loop. The non-blocking invariant from the briefing is preserved -- and in fact strengthened: BoundedSemaphore guarantees hard drop-on-overflow, whereas asyncio tasks can queue indefinitely.
2. runner_policy.py is created NEW. The briefing said "edit" but the file did not exist on main.
3. Protocol-based shims for VerificationResult (SR-167 / #173) and AttestationResult (SR-168 / #174) -- those PRs aren't on main yet. When they merge, a 1-line import swap completes integration.
4. Point/Box/ElementRef are introduced in visual_debug.py, not in snapshot.py (YAGNI -- single caller today; promote later if a second caller emerges).

Tests
=====

cd survey-cli && pytest tests/test_visual_debug.py -v
12 passed in 0.49s on Python 3.13.

Closes #178 (SR-173).
Related: #172 (SR-172 reliability tracker), #173 (SR-167), #174 (SR-168).

The Protocol-Shim TODOs and AGENTS.md brain-section referenced #173/#174 as the placeholder issue numbers for SR-167/SR-168, but the canonical numbers are:

- SR-167 -> issue #167 (Post-Action Verifier Node)
- SR-168 -> issue #168 (Triple-Channel Attestation)

#173 is actually SR-159 Path Doctrine.

Fix all references in:

- survey/observability/visual_debug.py (Protocol docstrings + TODOs)
- survey/runner_policy.py (related-issues block)
- survey-cli/AGENTS.md (SR-173 brain-section)

No runtime change. Tests still 12/12 green.

…merge)

The merge of PR #184 (Visual Debug) accidentally overwrote the NetworkTuning dataclass and get_network_tuning function that were added in PR #185 (SR-174 Network Gate). This restores those exports so network_gate.py can import them.

CEO-Session 2026-05-13

Delqhi added 2 commits May 13, 2026 11:22

Delqhi force-pushed the feat/sr-173-visual-debug branch from f7e9408 to f29bf43 Compare May 13, 2026 11:22

Delqhi merged commit 0029f76 into main May 13, 2026
7 of 11 checks passed
