perf(bench): hard 20s timeout + diagnostic counters in latency
Worker-side kill switch via threading.Timer + os._exit(137) bounds
calibration wall-clock at timeout * ceil(n/workers) even when PPR
or lazy-greedy blow up on pathological repos (vscode, mui).
Differentiates kill-by-timeout from genuine BrokenProcessPool by
elapsed time so pathological instances are checkpointed as timeouts
instead of triggering infinite retry.
Adds candidate_count, edge_count, scoring_ms, selection_ms,
greedy_iters to LatencyBreakdown so the next calibration run shows
which phase blew up, not just total scoring_selection_ms.
fsync on checkpoint append prevents partial writes after worker kill.
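The mechanisms named in this commit message can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the benchmark's actual code: the function names, the injectable `exit_fn` parameter, and the `slack` tolerance are all hypothetical.

```python
import json
import math
import os
import threading


def arm_kill_switch(timeout_s, exit_fn=os._exit):
    """Start a daemon timer that hard-exits the worker after timeout_s seconds."""
    # os._exit ends the process without running atexit handlers or finalizers.
    timer = threading.Timer(timeout_s, exit_fn, args=(137,))
    timer.daemon = True
    timer.start()
    return timer  # call timer.cancel() when the instance finishes in time


def worst_case_wall_clock(n_instances, workers, timeout_s):
    """Bound from the commit message: timeout * ceil(n / workers)."""
    return timeout_s * math.ceil(n_instances / workers)


def classify_pool_failure(elapsed_s, timeout_s, slack=0.9):
    """Guess whether a dead worker hit the timer or crashed on its own.

    The slack factor is a hypothetical tolerance, not a value from the commit.
    """
    return "timeout" if elapsed_s >= slack * timeout_s else "crash"


def append_checkpoint(path, record):
    """Append one JSON line and force it to disk before returning."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
        fh.flush()
        os.fsync(fh.fileno())
```

One caveat with this design: a `threading.Timer` callback runs in a Python thread, so it can only fire while the interpreter is able to schedule that thread.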
REVIEW_TESTS.md: 115 additions & 4 deletions
@@ -827,11 +827,122 @@ Must-fix this week.
Of the ~70 findings, only **5** are real must-fixes for the paper submission. The rest is manageable debt; pattern-based primitives will close ~30 of them at the cost of 5 small refactors.

---

## Re-run 2026-04-30: Round 1 (incremental)

Eight new R1 sonnet agents ran today against the same scope. Below are the **non-duplicate** findings: issues not already covered by the canonical R1/R2/R3 above. Severity uses the same rubric.

### New 🔴 Critical
**X1: `GitError` is dead code; Rust raises `PyRuntimeError`, MCP catches `GitError`** ✅
- Scenario: User passes a bad revision through MCP. The `try…except GitError` block never fires; `PyRuntimeError` propagates as a generic FastMCP `ToolError` with no `"Try 'HEAD~1..HEAD'…"` hint. `test_mcp.py::test_invalid_diff_range` only checks `pytest.raises(ToolError)` without matching the message, so CI passes even though the branch is dead.
- Why no test catches it: the existing test never asserts on the error message text.
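The mismatch in X1 can be reproduced in isolation. The sketch below uses stand-ins only: `rust_get_diff`, `handle_diff`, and `error_text` are hypothetical names, and the hint text is shortened to the prefix quoted in the finding. PyO3's `PyRuntimeError` surfaces in Python as the built-in `RuntimeError`, which is what the stand-in raises.

```python
class GitError(Exception):
    """Stand-in for the Python-side exception the MCP handler expects."""


def rust_get_diff(rev_range):
    # Stand-in for the extension call; per the finding it raises RuntimeError.
    raise RuntimeError(f"bad revision: {rev_range}")


def handle_diff(rev_range, backend=rust_get_diff):
    try:
        return backend(rev_range)
    except GitError as exc:
        # Dead branch for as long as the backend raises RuntimeError instead.
        raise ValueError(f"{exc}. Try 'HEAD~1..HEAD'") from exc


def error_text(rev_range, backend=rust_get_diff):
    """Return the message a caller would see, so a test can assert on it."""
    try:
        handle_diff(rev_range, backend)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return ""
```

A test that asserts on `error_text(...)` containing the hint fails against `rust_get_diff` and passes once the backend raises `GitError`, which is the regression signal the existing test lacks.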
**X2: `_diffctx` ImportError path is untested; pip install without a wheel yields a raw traceback**
- Scenario: User installs `treemapper` without the compiled extension (wrong Python ABI, source dist). The first call to any diff feature crashes with a raw `ImportError`. There is no graceful fallback message.
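One possible shape for the missing fallback in X2, as a sketch: wrap the import and re-raise with guidance. The helper name `load_extension` and the message wording are assumptions, not the project's API.

```python
import importlib


def load_extension(module_name):
    """Import a compiled extension, replacing a bare ImportError with guidance."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Compiled extension '{module_name}' is not available for this "
            "Python build. Diff features need a prebuilt wheel or a local "
            "build of the extension."
        ) from exc
```

Chaining with `from exc` keeps the original loader error in the traceback, so the ABI or path detail is not lost.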
**X3: PPR push-budget cap silently truncates BFS on large repos** ✅
- File:line: `diffctx/src/ppr.rs:75`: `max_pushes = (n * PPR.push_scale_factor).min(PPR.max_pushes_cap)`; line ~149 re-normalizes the score vector after early termination.
- Scenario: Monorepo (≥20k fragments). The cap fires; propagation halts before mass reaches distant-but-relevant fragments; re-normalization disguises the truncation. The user sees plausible rankings that omit the actually relevant code.
- Why it matters: paper integrity at scale. No yaml case has a repo big enough to hit the cap.
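The effect described in X3 can be modeled with a small forward-push PPR in Python. This is a generic textbook-style sketch, not a port of `ppr.rs`; `forward_push`, its parameters, and the returned counters are hypothetical. It shows how a push cap leaves mass unpropagated while the normalized scores still sum to 1, and how a truncation flag makes that visible.

```python
from collections import deque


def forward_push(adj, seed, alpha=0.15, eps=1e-9, max_pushes=None):
    """Forward-push PPR over adj (node -> neighbour list) with an optional cap."""
    score = {node: 0.0 for node in adj}
    residual = {node: 0.0 for node in adj}
    residual[seed] = 1.0
    queue = deque([seed])
    queued = {seed}
    pushes = 0
    truncated = False
    while queue:
        if max_pushes is not None and pushes >= max_pushes:
            truncated = True  # work remained when the budget ran out
            break
        u = queue.popleft()
        queued.discard(u)
        mass = residual[u]
        residual[u] = 0.0
        pushes += 1
        neighbours = adj[u]
        if not neighbours:
            score[u] += mass  # dangling node keeps all of its mass
            continue
        score[u] += alpha * mass
        share = (1 - alpha) * mass / len(neighbours)
        for v in neighbours:
            residual[v] += share
            if residual[v] > eps and v not in queued:
                queue.append(v)
                queued.add(v)
    total = sum(score.values())
    normalized = {n: s / total for n, s in score.items()} if total > 0 else score
    stats = {
        "pushes": pushes,
        "truncated": truncated,
        "unpushed_mass": sum(residual.values()),
    }
    return normalized, stats
```

On a 10-node chain seeded at one end, a cap of 5 pushes leaves the far end at score 0 while the normalized vector still sums to 1; only the `truncated` flag and `unpushed_mass` reveal the cut.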
**X4: Hybrid 50-file boundary**
- Scenario: A mid-size project sits near the 50-file threshold. Adding or removing a single file flips scoring algorithms; the user sees non-monotonic context changes between runs. No yaml case exercises the boundary (49/50/51).
**X5: `CochangeEdgeBuilder` is structurally dead in 100% of yaml tests** ✅
- File:line: `diffctx/src/edges/history/cochange.rs:112`: `if *count < COCHANGE.min_count { continue }`; `COCHANGE.min_count ≥ 2`. The test harness in `diffctx/tests/yaml_cases.rs` creates a repo with exactly two commits (initial + change), so every co-change pair count is 1, always below the threshold.
- Scenario: A regression in pair counting, log-scale weighting, or the `max_files_per_commit` skip would never be caught. The entire history-edge category has zero coverage.
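A toy model of the threshold in X5, assuming (as the finding states) that each file pair is observed once in the harness history. `cochange_edges` is a hypothetical name, and `min_count=2` is simply the smallest value consistent with the finding.

```python
from collections import Counter
from itertools import combinations


def cochange_edges(commits, min_count=2):
    """Count how often two files change together; keep pairs seen >= min_count."""
    counts = Counter()
    for files in commits:
        for pair in combinations(sorted(set(files)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

With a history in which a pair appears once, the result is empty and the builder contributes nothing; a fixture needs the same pair in at least `min_count` commits before any history edge can exist.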
**X6: `IntervalIndex::overlaps` treats a shared boundary line as overlap** ✅
- File:line: `diffctx/src/interval.rs`: `if end >= frag.start_line() { return true; }` triggers when `end == start_line` (back-to-back fragments sharing a single boundary line, valid in compact Rust/Go/Scala).
- Scenario: A hunk touching the last line of function A causes function B (starting on that same line) to be permanently excluded from selection. Context goes missing silently.
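The boundary case in X6 comes down to one comparison. The two helpers below are hypothetical and model only the `end` versus `start_line` side of the check quoted in the finding.

```python
def overlaps_inclusive(hunk_end, frag_start):
    # Comparison quoted in the finding: a shared boundary line counts as overlap.
    return hunk_end >= frag_start


def overlaps_strict(hunk_end, frag_start):
    # Strict variant: a fragment that starts on the hunk's last line is adjacent.
    return hunk_end > frag_start
```

An adjacency test case would pin the `hunk_end == frag_start` input, since that is the only input on which the two variants disagree.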
**X7: `DIFFCTX_OP_EGO_PER_HOP_DECAY` is a dead env knob** ✅
- File:line: `diffctx/src/config/scoring.rs:22` reads it into `EgoScoringConfig.per_hop_decay`. `diffctx/src/scoring.rs:112` calls `g.ego_graph(core_ids, self.max_depth)`; the decay parameter is **never passed** to `graph.rs:198::ego_graph(...)`.
- Scenario: An operator sets `DIFFCTX_OP_EGO_PER_HOP_DECAY=0.5` for a calibration sweep; observable behavior is unchanged; the data is silently meaningless. Any paper figure that sweeps this knob shows zero variance.
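A dead knob like the one in X7 is detectable with a test that toggles the value and compares outputs. The sketch below assumes a simple decay semantics (a node at hop `h` scores `per_hop_decay ** h`), which may differ from the real scorer; `ego_scores` and `knob_is_live` are hypothetical names.

```python
from collections import deque


def ego_scores(adj, core_ids, max_depth, per_hop_decay=1.0):
    """BFS out to max_depth; a node first reached at hop h scores decay ** h."""
    scores = {c: 1.0 for c in core_ids}
    frontier = deque((c, 0) for c in core_ids)
    while frontier:
        node, hop = frontier.popleft()
        if hop == max_depth:
            continue
        for nxt in adj.get(node, ()):
            if nxt not in scores:
                scores[nxt] = per_hop_decay ** (hop + 1)
                frontier.append((nxt, hop + 1))
    return scores


def knob_is_live(score_fn, adj, core_ids, max_depth):
    """A knob is live only if changing it changes the output."""
    low = score_fn(adj, core_ids, max_depth, 0.5)
    high = score_fn(adj, core_ids, max_depth, 0.9)
    return low != high
```

Wrapping a scorer that ignores its decay argument makes `knob_is_live` return `False`, which is the condition the finding describes.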
**X8: Directory symlinks are silently dropped (Python tree mode)** ✅
- File:line: `src/treemapper/tree.py:172`: `if entry.is_symlink() or not entry.exists(): logger.debug(...); return None`. All symlinks (including directory symlinks the user explicitly placed inside the repo) are skipped without warning.
- Scenario: User keeps `vendor/` or `shared/` as a symlinked dir. `treemapper .` silently omits all of it. No warning is emitted; the existing test only verifies that a *file* symlink is absent.
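A sketch of how the entry check in X8 could tell the cases apart and warn on the one users are likely to care about. `classify_entry`, the logger name, and the category strings are assumptions for illustration; whether to skip or follow a directory symlink is a separate policy decision.

```python
import logging
from pathlib import Path

logger = logging.getLogger("tree_sketch")


def classify_entry(entry: Path) -> str:
    """Classify a path so directory symlinks can be reported, not just dropped."""
    if entry.is_symlink():
        if not entry.exists():  # exists() follows the link
            return "broken-symlink"
        if entry.is_dir():  # is_dir() follows the link too
            logger.warning("skipping directory symlink: %s", entry)
            return "dir-symlink"
        return "file-symlink"
    if not entry.exists():
        return "missing"
    return "dir" if entry.is_dir() else "file"
```

A matching test fixture needs a real directory plus a symlink pointing at it, which is the case the finding says is currently unverified.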
### New 🟡 Warning (selected)

**X9: UTF-16-LE/BE and UTF-8-with-BOM files**: `tree.py:231-247` only tests CP1251 fallback; UTF-16 files become `<unreadable content: not utf-8>` if `charset-normalizer` isn't installed. (`tests/test_basic.py::test_unicode_content_and_encoding_errors`).

**X10: NFC vs NFD path round-trip on macOS**: HFS+/APFS returns NFD from `iterdir()`; PyYAML serializes as-is; downstream NFC lookups silently fail. Untested.
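The lookup failure in X10 is easy to reproduce without a Mac, because NFC and NFD spellings of the same name are different Python strings. The helper name `normalize_path_key` is hypothetical; `unicodedata.normalize` is the standard library call.

```python
import unicodedata


def normalize_path_key(path: str) -> str:
    """Normalize a path to NFC so lookups agree regardless of filesystem form."""
    return unicodedata.normalize("NFC", path)


NFC_NAME = "caf\u00e9.py"   # precomposed e-acute
NFD_NAME = "cafe\u0301.py"  # 'e' followed by a combining acute accent
```

The two constants render identically but compare unequal, so a dictionary keyed by one form misses a lookup by the other unless both sides are normalized first.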
**X11: `LexicalEdgeBuilder` zero-edge fallback when all changed identifiers are short**: `diffctx/src/edges/similarity/lexical.rs:96-104` drops identifiers shorter than `query_min_identifier_length` (=3). A Go diff using `i`, `ok`, `err`, `db` produces zero lexical edges. The algorithm's recovery behavior is untested.

**X12: `SiblingEdgeBuilder` breaks on backslash paths**: `diffctx/src/edges/structural/sibling.rs:20-27` uses `Path::new().parent()` without normalizing separators; `src\utils.rs` has no Unix parent → both files bucket under `""`, so no sibling edges are produced. No yaml case uses backslash paths.
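A Python analogue of the bucketing problem in X12, using `PurePosixPath` to mimic Unix path parsing. Note the analogue is not exact: per the finding the Rust code buckets such paths under `""`, whereas `pathlib` reports the parent as `"."`. `sibling_buckets` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def sibling_buckets(paths, normalize=True):
    """Group paths by parent directory, optionally normalizing backslashes."""
    buckets = defaultdict(list)
    for raw in paths:
        candidate = raw.replace("\\", "/") if normalize else raw
        buckets[str(PurePosixPath(candidate).parent)].append(raw)
    return dict(buckets)
```

Blind replacement would also rewrite a backslash that is a legitimate character in a POSIX file name, so a real fix may want to normalize only paths known to come from a Windows-style source.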
**X13: R extractor silently drops `.Rmd`, `.qmd`, `.rnw` files**: `diffctx/src/edges/semantic/r_lang.rs:13-16`: `is_r_file` only matches `.r` and `.rmd`. Quarto and Sweave notebooks produce no edges.

**X14: `ScoringMode::Ppr`, `Ego`, `Bm25` have zero integration coverage**: every yaml test invocation in `diffctx/tests/yaml_cases.rs:240` and `diffctx/src/test_harness.rs:126` hardcodes `ScoringMode::Hybrid`. The three other modes differ in discovery and filtering paths and are never exercised end-to-end.

**X15: Pure `git mv` (rename, no content change) → empty output**: `diffctx/src/git.rs::parse_diff` only collects hunks from `@@` lines. A bare `git mv` produces no `@@` lines → empty hunks → empty seeds → PPR emits nothing.

**X16: `SAFE_RANGE_RE` accepts leading-dot ranges (`..origin/main`)** ✅: `diffctx/src/git.rs:31` regex `^[a-zA-Z0-9_.^~/@{}\-]+(\.\.\.?[a-zA-Z0-9_.^~/@{}\-]*)?$`. Because `.` is inside the character class, `..origin/main` passes validation; git then rejects it at subprocess time with a raw `fatal: ambiguous argument` rather than the designed `InvalidRange` error.
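The pattern quoted in X16 can be checked directly with Python's `re`, which treats this particular pattern the same way. The explicit leading-dot guard below is one possible tightening, written as a plain string check rather than a lookahead; `is_safe_range` is a hypothetical name.

```python
import re

# Pattern copied from the finding.
SAFE_RANGE_RE = re.compile(
    r"^[a-zA-Z0-9_.^~/@{}\-]+(\.\.\.?[a-zA-Z0-9_.^~/@{}\-]*)?$"
)


def is_safe_range(rev_range: str) -> bool:
    """Reject leading-dot input before applying the character whitelist."""
    if rev_range.startswith("."):
        return False
    return SAFE_RANGE_RE.fullmatch(rev_range) is not None
```

`SAFE_RANGE_RE` alone matches `..origin/main` because the first character class consumes the dots; the guard closes that gap while still accepting `HEAD~1..HEAD` and `main...feature`.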
**X17: Merge-commit combined-diff `@@@` headers silently ignored**: `diffctx/src/git.rs::HUNK_RE` matches `^@@ -` only; `@@@` (the combined-diff header for a two-parent merge) is dropped. Files modified only inside a merge commit produce no hunk seeds.
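A header pattern that accepts both ordinary and combined-diff hunks might look like the following. It is a sketch, not the project's `HUNK_RE`: the regex, `parse_hunk_header`, and the returned dictionary shape are assumptions. In git's combined format the header carries one `-start,len` range per parent and one more `@` than there are parents.

```python
import re

# (@@+) captures the marker; one "-start[,len]" per parent; then "+start[,len]".
HUNK_HEADER_RE = re.compile(
    r"^(@@+) ((?:-\d+(?:,\d+)? )+)\+(\d+)(?:,(\d+))? \1"
)


def parse_hunk_header(line):
    """Return parent count and new-side range, or None if not a hunk header."""
    m = HUNK_HEADER_RE.match(line)
    if m is None:
        return None
    old_ranges = m.group(2).split()
    new_len = int(m.group(4)) if m.group(4) is not None else 1
    return {
        "parents": len(old_ranges),
        "new_start": int(m.group(3)),
        "new_len": new_len,
    }
```

The sketch does not verify that the number of `@` characters equals the parent count plus one; a stricter parser could add that check.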
**X18: YAML literal block strips trailing whitespace**: `src/treemapper/writer.py:76-81` emits `|2`-indented blocks; PyYAML strips trailing spaces from each line on `safe_load`. Markdown line-break (`text  \n`) and indented-docstring constructs silently mutate on round-trip. Distinct from O1 (which is about line-level newlines).

**X19: `count_tokens` returns `u32`, casts via `as u32`**: `diffctx/src/tokenizer.rs:16`: silent overflow possible on very large inputs. `diffctx/src/pybridge.rs:335` forwards the same `u32`.
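The difference between a truncating and a checked conversion in X19 can be modeled in Python, where integers are unbounded. Both helpers are hypothetical; in Rust the checked form would typically be written with `u32::try_from`.

```python
U32_MAX = 2**32 - 1


def wrapping_u32(n: int) -> int:
    """Model of a truncating integer cast: keep only the low 32 bits."""
    return n & U32_MAX


def checked_u32(n: int) -> int:
    """Checked conversion: fail loudly instead of wrapping."""
    if not 0 <= n <= U32_MAX:
        raise OverflowError(f"{n} does not fit in u32")
    return n
```

With the truncating form a count of `2**32 + 5` is reported as `5`, which a token budget would happily accept.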
**X21: `to_yaml`/`to_json` on `DiffContextResult` silently return empty string on serde failure** (`diffctx/src/pybridge.rs:103,108`: `unwrap_or_default()`). MCP clients receive `""` with no error.

**X22: `forward_blend=0.0` env value silently inverts ranking direction** (`diffctx/src/config/limits.rs:142`; `ppr.rs:146`). Clamped to `[0,1]` but no degeneracy guard at the endpoints.

**X23: `.scss` parsed with CSS grammar; `$var:…` produces ERROR-dominated tree** (`diffctx/src/parsers/tree_sitter_strategy.rs:407-416`). Only one SCSS yaml case exists and it passes via raw-anchor matching, not symbol extraction.
### False Positive from this re-run

- **F22** ("zero yaml test cases exist"): verified false. `find tests/cases/diff -name '*.yaml' | wc -l` = **2723**. The agent misread the harness path. Discard.

---

## Re-run 2026-04-30: Updated Verdict

The 2026-04-28 verdict's top 5 (E1, R2-T1, R2-P1, E2, M2) were re-verified against current source; all are still real. The re-run surfaces eight additional 🔴 worth pulling forward:

### Updated must-fix-this-week (additions to prior list)
- **X3 (PPR push cap silent truncation)**: same paper-integrity class as E1; clamp + emit a `truncated_at_pushes` counter.
- **X4 (Hybrid 50-file boundary)**: add yaml cases at 49/50/51 candidate files asserting algorithmic determinism near the boundary.
- **X5 (CochangeEdgeBuilder dead in 100% of yaml tests)**: add at least one yaml case with ≥2 commits per file pair so the history-edge category has at least some coverage.
- **X6 (IntervalIndex shared-boundary overlap)**: fix to `end > start_line` (strict) and add an adjacency yaml case.
- **X7 (`EGO_PER_HOP_DECAY` dead knob)**: plumb through to `graph.ego_graph` or remove from config; add a determinism test that toggles the knob.
- **X1 (GitError dead code)**: register `GitError` via `create_exception!` in `pybridge.rs` so the MCP `except GitError` actually fires; add an MCP test asserting the helpful message text.
- **X8 (directory symlinks silently dropped)**: emit `tracing::warn` and add option `--follow-symlinks`; add a test case with a directory symlink in the tree.
- **X14 (`ScoringMode::Ppr/Ego/Bm25` no integration coverage)**: parameterize at least one yaml case across all four modes to confirm non-Hybrid paths produce non-empty output.
- ✅ X1 verified at `src/treemapper/mcp/server.py:52` + `diffctx/src/pybridge.rs:224,287`.
- ✅ X3 verified at `diffctx/src/ppr.rs:75`.
- ✅ X5 verified at `diffctx/src/edges/history/cochange.rs:112` (`min_count` filter); harness at `diffctx/tests/yaml_cases.rs` creates exactly two commits.