Skip to content

Commit 34e95dc

Browse files
fix: review inherits committed baseline depth for the PR diff
Review mode hardcoded the analysis depth to 2 when depth_level was unset, ignoring the committed .codeboarding/analysis.json's own depth_level. When the committed baseline was deeper (e.g. depth 4), validate-base rejected it as "deeper than expected", so the action regenerated a shallower depth-2 base, diffed head(2) vs that, and wrote base_commit_found=false / base_commit_sha=null into the PR artifact. The webview then fell back to diffing the depth-2 head against the committed depth-4 base, reporting hundreds of phantom "deleted" components for sub-trees that only differ by analysis depth. Review now inherits the committed baseline's depth_level (mirroring sync's run_analyze) via a new stdlib-only `baseline-depth` subcommand invoked before the engine is installed, so head and base are analyzed at the same depth, the committed baseline is reused, and the artifact carries a real base_commit_sha the webview diffs against. The accepted depth ceiling is raised 3 -> 4 (the engine has no depth cap) so a committed depth-4 baseline is a first-class value. - engine_adapter.py: best-effort engine imports so metadata-only subcommands run without the engine installed; add baseline_depth() + `baseline-depth` command; widen _supported_depth and argparse choices to include 4. - action.yml: resolve_depth inherits the committed baseline depth in review mode (sync unchanged); depth guard + input doc widened to 1-4. - README: depth_level doc widened to 1-4. - tests: baseline-depth coverage incl. engine-absent subprocess run; depth-4 accepted across base/head/analyze/validate-base; depth-5 now rejected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent 26be7bc commit 34e95dc

5 files changed

Lines changed: 212 additions & 24 deletions

File tree

README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -265,7 +265,7 @@ Review mode does not need `contents: write`: PR-specific generated files are sto
265265
| `github_token` | both | `${{ github.token }}` | Token for GitHub API calls; in review mode it posts or updates the PR comment. |
266266
| `push_token` | sync | `${{ github.token }}` | Token used for sync-mode pushes to `target_branch`. The workflow token can push when the workflow grants `permissions: contents: write`. Separate from `github_token` so commenting can use a GitHub App token while the push uses the workflow token. |
267267
| `codeboarding_version` | both | `0.12.3` | CodeBoarding PyPI package version used as the analysis engine. Pin for reproducibility. |
268-
| `depth_level` | both | empty (`2` for cold starts) | Analysis depth, 1 to 3, used for first analysis and `force_full` rebuilds. Once `.codeboarding/analysis.json` exists, its `metadata.depth_level` is the source of truth for incremental analysis and fallback-full recovery. |
268+
| `depth_level` | both | empty (`2` for cold starts) | Analysis depth, 1 to 4, used for first analysis and `force_full` rebuilds. Once `.codeboarding/analysis.json` exists, its `metadata.depth_level` is the source of truth: sync runs incremental at the baseline depth, and review analyzes the PR head at the committed baseline depth so the diff is apples-to-apples. |
269269
| `render_depth` | review | `1` | Display depth for the PR diagram. Keep `1` for a clean top-level view. |
270270
| `diagram_direction` | review | `LR` | Mermaid direction: `LR`, `TD`, `TB`, `RL`, or `BT`. |
271271
| `changed_only` | review | `false` | Render only changed components and incident edges. |

action.yml

Lines changed: 33 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -36,7 +36,7 @@ inputs:
3636
required: false
3737
default: '0.12.3'
3838
depth_level:
39-
description: 'Analysis depth (1-3) for cold-start or force_full rebuilds. Once .codeboarding/analysis.json exists, its metadata.depth_level is the source of truth for incremental analysis and fallback-full recovery. Empty (default): 2 for cold starts.'
39+
description: 'Analysis depth (1-4) for cold-start or force_full rebuilds. Once .codeboarding/analysis.json exists, its metadata.depth_level is the source of truth: sync runs incremental at the baseline depth, and review analyzes the PR head at the committed baseline depth so the diff is apples-to-apples. Empty (default): 2 for cold starts.'
4040
required: false
4141
default: ''
4242
agent_model:
@@ -210,8 +210,8 @@ runs:
210210
*) echo "::error::mode must be 'review' or 'sync' (got '$MODE')."; exit 1 ;;
211211
esac
212212
case "$DEPTH" in
213-
''|1|2|3) ;;
214-
*) echo "::error::depth_level must be 1, 2, or 3 (empty = default cold-start depth 2)."; exit 1 ;;
213+
''|1|2|3|4) ;;
214+
*) echo "::error::depth_level must be 1, 2, 3, or 4 (empty = default cold-start depth 2)."; exit 1 ;;
215215
esac
216216
echo "mode=$MODE" >> "$GITHUB_OUTPUT"
217217
@@ -472,8 +472,13 @@ runs:
472472
id: resolve_depth
473473
if: steps.guard.outputs.skip != 'true'
474474
shell: bash
475+
working-directory: target-repo
475476
env:
476477
INPUT_DEPTH: ${{ inputs.depth_level }}
478+
MODE: ${{ steps.guard.outputs.mode }}
479+
BASE_SHA: ${{ steps.guard.outputs.base_sha }}
480+
EMPTY_BASE: ${{ steps.guard.outputs.empty_base }}
481+
ACTION_PATH: ${{ github.action_path }}
477482
run: |
478483
set -euo pipefail
479484
# Explicit input controls cold-start/force_full rebuilds. Existing
@@ -483,6 +488,31 @@ runs:
483488
echo "Using explicit depth_level=$INPUT_DEPTH."
484489
exit 0
485490
fi
491+
# Review against a committed baseline: analyze the PR head at the SAME
492+
# depth the committed .codeboarding/analysis.json was generated with, so
493+
# head and base diff apples-to-apples. Defaulting to a shallower depth
494+
# would make validate-base reject the deeper baseline, force the action to
495+
# regenerate a shallower base, and (because the artifact then carries no
496+
# committed base SHA) leave the webview diffing the deeper committed base
497+
# against the shallower head — reporting phantom "deleted" components.
498+
# The engine is not installed yet at this step, so baseline-depth runs on
499+
# the runner's stdlib python3 (it only parses the committed JSON).
500+
if [ "$MODE" = "review" ] && [ "$EMPTY_BASE" != "true" ] && [ -n "$BASE_SHA" ]; then
501+
BASE_ANALYSIS="$(mktemp)"
502+
if git show "${BASE_SHA}:.codeboarding/analysis.json" > "$BASE_ANALYSIS" 2>/dev/null; then
503+
INHERITED="$(python3 "$ACTION_PATH/scripts/engine_adapter.py" baseline-depth --analysis "$BASE_ANALYSIS" | sed -n 's/^depth_level=//p')"
504+
rm -f "$BASE_ANALYSIS"
505+
if [ -n "$INHERITED" ]; then
506+
echo "depth=$INHERITED" >> "$GITHUB_OUTPUT"
507+
echo "Inheriting committed baseline depth_level=$INHERITED for the PR-head analysis."
508+
exit 0
509+
fi
510+
echo "Committed baseline has no usable depth_level; using default cold-start depth."
511+
else
512+
rm -f "$BASE_ANALYSIS"
513+
echo "No committed baseline at ${BASE_SHA}; using default cold-start depth."
514+
fi
515+
fi
486516
DEPTH=2
487517
echo "depth=$DEPTH" >> "$GITHUB_OUTPUT"
488518
echo "Using default cold-start depth_level=$DEPTH."

scripts/engine_adapter.py

Lines changed: 56 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,12 @@
11
"""CLI adapter between the action and the CodeBoarding analysis ENGINE.
22
33
No analysis logic lives here. The engine is the published ``codeboarding`` PyPI
4-
package installed by the action and imported lazily inside each function
5-
(``codeboarding_workflows`` etc.); this module just turns the action's shell
6-
steps into typed, tested calls into it. The lazy imports mean this file imports
7-
fine without the package present — the tests stub those modules and assert we
8-
call the engine with the right args.
4+
package installed by the action (``codeboarding_workflows`` etc.); this module
5+
just turns the action's shell steps into typed, tested calls into it. The engine
6+
imports are best-effort at module load, so this file imports fine without the
7+
package present — the metadata-only subcommands (``baseline-info``,
8+
``baseline-depth``, ``validate-base``) run with the stdlib alone, and the tests
9+
stub the engine modules to assert we call the engine with the right args.
910
1011
Subcommands (all paths/refs come in as argv, never interpolated into source):
1112
@@ -15,6 +16,7 @@
1516
health --artifact-dir D --repo P --name N --issues-out FILE
1617
validate-base --analysis F --expected-sha SHA [--expected-depth K]
1718
baseline-info --analysis F
19+
baseline-depth --analysis F
1820
analyze --repo P --out D --name N --run-id ID --source-sha SHA --depth K [--force-full]
1921
render --analysis F --out D --repo-name N --repo-ref R [--format .md]
2022
concat --docs-dir D --out F
@@ -60,12 +62,23 @@
6062
import shutil
6163
from pathlib import Path
6264

63-
from codeboarding_workflows.analysis import BaselineUnavailableError, run_full, run_incremental
64-
from codeboarding_workflows.rendering import render_docs
65-
from diagram_analysis.exceptions import IncrementalCacheMissingError
66-
from static_analyzer import get_static_analysis
67-
from static_analyzer.analysis_cache import StaticAnalysisCache
68-
from static_analyzer.cluster_helpers import build_all_cluster_results
65+
# The engine packages are imported best-effort so the metadata-only subcommands
66+
# (``baseline-info``, ``baseline-depth``, ``validate-base``) run with the stdlib
67+
# alone — they parse a committed analysis.json and never touch the engine. The
68+
# action invokes them BEFORE the engine package is pip-installed (e.g. while
69+
# resolving the review depth), so a hard import here would break that step. The
70+
# analysis subcommands that DO need the engine fail loudly when these are None.
71+
try:
72+
from codeboarding_workflows.analysis import BaselineUnavailableError, run_full, run_incremental
73+
from codeboarding_workflows.rendering import render_docs
74+
from diagram_analysis.exceptions import IncrementalCacheMissingError
75+
from static_analyzer import get_static_analysis
76+
from static_analyzer.analysis_cache import StaticAnalysisCache
77+
from static_analyzer.cluster_helpers import build_all_cluster_results
78+
except Exception: # engine package not installed (metadata-only subcommands don't need it)
79+
BaselineUnavailableError = IncrementalCacheMissingError = _MissingEngine = type("_MissingEngine", (Exception,), {})
80+
run_full = run_incremental = render_docs = None
81+
get_static_analysis = StaticAnalysisCache = build_all_cluster_results = None
6982

7083
try:
7184
from health.models import Severity
@@ -151,7 +164,7 @@ def _metadata_depth(metadata: dict) -> int | None:
151164

152165
def _supported_depth(metadata: dict) -> int | None:
153166
depth = _metadata_depth(metadata)
154-
return depth if depth in range(1, 4) else None
167+
return depth if depth in range(1, 5) else None
155168

156169

157170
def _analysis_depth_or_default(output_dir: Path, default_depth: int = _DEFAULT_DEPTH) -> int:
@@ -180,6 +193,21 @@ def baseline_info(analysis_path: Path) -> str:
180193
return commit if _SHA_RE.match(commit) else ""
181194

182195

196+
def baseline_depth(analysis_path: Path) -> int | None:
197+
"""Return the committed baseline's metadata.depth_level when present and a
198+
supported value (1-4), else None. Review mode uses this (via the
199+
``baseline-depth`` subcommand) to analyze the PR head at the SAME depth the
200+
committed baseline was generated with, so head and base diff apples-to-apples
201+
instead of defaulting to a shallower depth and reporting phantom changes.
202+
Parsing + the supported-range guard live here so the action shell never reads
203+
the JSON inline (mirrors ``baseline_info``).
204+
"""
205+
metadata = _load_metadata(analysis_path)
206+
if not isinstance(metadata, dict):
207+
return None
208+
return _supported_depth(metadata)
209+
210+
183211
def validate_base_analysis(
184212
analysis_path: Path, expected_sha: str, expected_depth: int | None = None
185213
) -> tuple[bool, str]:
@@ -198,6 +226,12 @@ def validate_base_analysis(
198226
expands persists depth_level 1 — rejecting that would force a full
199227
regeneration on every PR without ever converging. A missing or
200228
unparseable depth_level is accepted — legacy baselines predate the field.
229+
230+
Review now derives ``expected_depth`` from the committed baseline's own
231+
depth_level (via the ``baseline-depth`` subcommand), so the deeper-than-expected
232+
rejection no longer fires for the normal case — head and base are analyzed at
233+
the same depth. The rejection remains a safety net for an explicit
234+
``depth_level`` input that is shallower than the committed baseline.
201235
"""
202236
try:
203237
data = json.loads(analysis_path.read_text(encoding="utf-8"))
@@ -523,7 +557,7 @@ def main(argv=None) -> int:
523557
b = sub.add_parser("base")
524558
for a in ("--repo", "--out", "--name", "--run-id", "--source-sha"):
525559
b.add_argument(a, required=True)
526-
b.add_argument("--depth", required=True, type=int, choices=range(1, 4))
560+
b.add_argument("--depth", required=True, type=int, choices=range(1, 5))
527561

528562
s = sub.add_parser("seed")
529563
for a in ("--repo", "--out", "--source-sha"):
@@ -532,7 +566,7 @@ def main(argv=None) -> int:
532566
h = sub.add_parser("head")
533567
for a in ("--repo", "--out", "--name", "--run-id", "--base-ref", "--target-ref", "--source-sha"):
534568
h.add_argument(a, required=True)
535-
h.add_argument("--depth", required=True, type=int, choices=range(1, 4))
569+
h.add_argument("--depth", required=True, type=int, choices=range(1, 5))
536570
h.add_argument("--force-full", action="store_true", help="Run a full PR-head analysis instead of incremental.")
537571

538572
hc = sub.add_parser("health")
@@ -542,15 +576,18 @@ def main(argv=None) -> int:
542576
vb = sub.add_parser("validate-base")
543577
vb.add_argument("--analysis", required=True)
544578
vb.add_argument("--expected-sha", required=True)
545-
vb.add_argument("--expected-depth", type=int, choices=range(1, 4))
579+
vb.add_argument("--expected-depth", type=int, choices=range(1, 5))
546580

547581
bi = sub.add_parser("baseline-info")
548582
bi.add_argument("--analysis", required=True)
549583

584+
bd = sub.add_parser("baseline-depth")
585+
bd.add_argument("--analysis", required=True)
586+
550587
an = sub.add_parser("analyze")
551588
for a in ("--repo", "--out", "--name", "--run-id", "--source-sha"):
552589
an.add_argument(a, required=True)
553-
an.add_argument("--depth", required=True, type=int, choices=range(1, 4))
590+
an.add_argument("--depth", required=True, type=int, choices=range(1, 5))
554591
an.add_argument("--force-full", action="store_true", help="Ignore any committed baseline and run a full analysis.")
555592

556593
rn = sub.add_parser("render")
@@ -591,6 +628,9 @@ def main(argv=None) -> int:
591628
return 0 if ok else 1
592629
elif args.cmd == "baseline-info":
593630
print(f"commit_hash={baseline_info(Path(args.analysis))}")
631+
elif args.cmd == "baseline-depth":
632+
depth = baseline_depth(Path(args.analysis))
633+
print(f"depth_level={depth if depth is not None else ''}")
594634
elif args.cmd == "analyze":
595635
run_analyze(args.repo, args.out, args.name, args.run_id, args.source_sha, args.depth, args.force_full)
596636
elif args.cmd == "render":

tests/test_engine_adapter.py

Lines changed: 54 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -189,7 +189,7 @@ def test_main_sets_github_action_source(self):
189189
self.assertEqual(os.environ["CODEBOARDING_SOURCE"], "github_action")
190190

191191
def test_main_rejects_invalid_depth(self):
192-
for depth in ("0", "4", "x"):
192+
for depth in ("0", "5", "x"):
193193
with self.subTest(depth=depth):
194194
with redirect_stderr(StringIO()):
195195
with self.assertRaises(SystemExit):
@@ -211,6 +211,30 @@ def test_main_rejects_invalid_depth(self):
211211
]
212212
)
213213

214+
def test_main_accepts_depth_four(self):
215+
# The action's accepted depth ceiling is 4 so a committed depth-4 baseline
216+
# is a first-class value review can inherit (the engine has no depth cap).
217+
rf = _Rec()
218+
self._install(run_full=rf)
219+
engine_adapter.main(
220+
[
221+
"base",
222+
"--repo",
223+
"/repo",
224+
"--out",
225+
"/out",
226+
"--name",
227+
"myrepo",
228+
"--run-id",
229+
"rid-base",
230+
"--depth",
231+
"4",
232+
"--source-sha",
233+
"abc123",
234+
]
235+
)
236+
self.assertEqual(rf.calls[0]["depth_level"], 4)
237+
214238
def test_head_uses_incremental(self):
215239
ri, rf = _Rec(), _Rec()
216240
self._install(run_full=rf, run_incremental=ri)
@@ -497,6 +521,25 @@ def test_validate_base_without_expected_depth_ignores_depth(self):
497521
self.assertTrue(ok)
498522
self.assertIn("matches", message)
499523

524+
def test_validate_base_accepts_depth_four_baseline(self):
525+
# The core fix: review inherits the committed baseline's depth, so a
526+
# depth-4 baseline validated at --expected-depth 4 is accepted (reused,
527+
# not regenerated). Validated at a shallower expected depth it is still
528+
# rejected (an explicit shallower depth_level input).
529+
with tempfile.TemporaryDirectory() as tmp:
530+
path = Path(tmp) / "analysis.json"
531+
path.write_text(
532+
json.dumps({"metadata": {"commit_hash": "abc123", "depth_level": 4}}),
533+
encoding="utf-8",
534+
)
535+
536+
ok_same, _ = engine_adapter.validate_base_analysis(path, "abc123", expected_depth=4)
537+
ok_shallower, message = engine_adapter.validate_base_analysis(path, "abc123", expected_depth=2)
538+
539+
self.assertTrue(ok_same)
540+
self.assertFalse(ok_shallower)
541+
self.assertIn("deeper", message)
542+
500543
def test_main_validate_base_expected_depth_exit_codes(self):
501544
# patch.dict: main() setdefaults CODEBOARDING_SOURCE; don't leak it.
502545
with patch.dict(os.environ), tempfile.TemporaryDirectory() as tmp:
@@ -518,10 +561,18 @@ def test_main_validate_base_expected_depth_exit_codes(self):
518561
),
519562
1,
520563
)
564+
# depth 4 is now an accepted value (against a depth-2 baseline a
565+
# shallower-or-equal expected depth passes the depth check).
566+
self.assertEqual(
567+
engine_adapter.main(
568+
["validate-base", "--analysis", str(path), "--expected-sha", "abc123", "--expected-depth", "4"]
569+
),
570+
0,
571+
)
521572
with redirect_stderr(StringIO()):
522-
with self.assertRaises(SystemExit): # depth outside 1-3 rejected by argparse
573+
with self.assertRaises(SystemExit): # depth outside 1-4 rejected by argparse
523574
engine_adapter.main(
524-
["validate-base", "--analysis", str(path), "--expected-sha", "abc123", "--expected-depth", "4"]
575+
["validate-base", "--analysis", str(path), "--expected-sha", "abc123", "--expected-depth", "5"]
525576
)
526577

527578

0 commit comments

Comments
 (0)