feat(sequence-diagram): carry AppMap labels on diagram actions#2369
Conversation
Each function/RPC/query action now records the AppMap labels applied to its callee code object (sorted, omitted when empty), so consumers can read labels straight from the exported sequence diagram instead of re-joining against the raw AppMap classMap. Labels are not folded into `digest`/`subtreeDigest`, so a pure re-label is not treated as a behavioral change — labels are a classification signal, not a behavioral one. The JSON formatter serializes the whole diagram, so the new field flows into `--format json` output with no formatter change. Motivated by the gold-traces behavioral-diff work, which classifies the severity of a detected change by whether the changed node is security-labeled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extract the value-aware truncation out of the MCP tree renderer
(treeRender.ts `truncateStructValue`) into a shared lib
(src/lib/truncateStructValue.ts), and add an `appmap trim <files..>`
command that uses it to shrink AppMaps by truncating captured
parameter/return/receiver/message/exception values in place (or to
--output-dir), tunable via --max-length.
Trimming only touches captured `value` strings — call structure and
every stable property are untouched — so a trimmed AppMap is
behaviorally identical (its sequence-diagram digest is unchanged) but
much smaller. This is what the gold-traces engine uses to keep committed
baselines lean.
parseStruct also learns two shapes it previously flat-capped: Node
util.inspect `ClassName { ... }` (whitespace after the type name) and
multi-line bodies. Backward-compatible — the MCP renderer's 26 tests are
unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
AppMap Behavioral Review — sequence-diagram labels +
|
| Feature | Covered by | Status |
|---|---|---|
buildDiagram emits nodes; labelsOf runs |
Sequence_diagram/labels/are_reported… (buildDiagram ×49, labelsOf ×10) |
✅ helper executes |
| labels carried onto a diagram action (the headline branch) | — | ❌ uncovered — the recording is of the jest test process and contains 0 labeled code objects, so labelsOf returns undefined on all 10 calls; the "labels present" branch never runs, and labels are digest-excluded by design (Feature 2) so no compare would ever flag it |
appmap trim / trimAppMap |
no trace | ❌ uncovered — no gold trace records trim |
truncateStructValue + parser enhancements (Feature 4) |
no trace | ❌ uncovered — no head trace exercises truncateStructValue/truncate/renderTreeLineForMcp (grep over all 17 traces: 0 hits) |
resolveAppmapPath (label annotation) |
MCP_handler/get_call_tree… (resolveAppmapPath ×2) |
✅ path executes (no code change to verify) |
| SQL query rendering (unchanged) | query-find, query-tree, Sequence_diagram/SQL_query |
— baseline only |
| HTTP request rendering (unchanged) | OpenAPI, query-endpoints, Sequence_diagram/HTTP_server_request |
— baseline only |
Commands to close the gaps (record from the package dir so the ts-jest preset applies — see repo CLAUDE.md):
# Value-truncation + MCP renderer (the enhanced parser paths):
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# trim command end-to-end:
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/trim # (add a trim.spec.ts first — see Tests to Synthesize)
# labels actually carried: record a fixture that HAS labeled code objects, not the bare harness run.Suggested Labels
Functions that changed but carry no AppMap label — labeling them (via appmap-label) lets
the next review interpret them:
command.perform— trim.ts:80handler— theappmap trimcommand entry point; labeling the handler lets a future compare recognize it as a command invocation.deserialize.safe— trim.ts:95JSON.parse(before)inhandler—trimdeserializes arbitrary on-disk AppMap JSON; a label marks the trust boundary.
Note on Feature 5: resolveAppmapPath was given a // @label security.path-resolution
comment, but the MCP_handler gold trace shows the recording emits no labeled code
objects (0 labeled objects in the classMap). Confirm the Node recorder actually harvests
// @label line comments for TS (labels are usually applied via appmap.yml/config, not
line comments) — otherwise this annotation is inert and the label won't appear in traces.
See 🟢 LOW below.
Behavioral Drift
None can be computed — there is no committed baseline to diff against (main has 0 gold
traces). The 17 head traces are the new baseline. Normally a changed entry here would be
real because the compare digest excludes volatile data (elapsed time, object ids,
parameter/return values) by construction; that machinery simply had no base/ side to run on
this time.
What the head traces confirm about the changes: buildDiagram/labelsOf execute in the
sequence-diagram labels trace, and resolveAppmapPath executes in the MCP trace — so those
paths are wired and non-crashing. The value-truncation library and the trim command execute
in no trace at all. Two subsystems (SQL rendering, HTTP rendering) have baseline traces but
were not touched by the diff, so future compares will hold them fixed.
One design fact matters for every future run: the labels feature is intentionally invisible to
the behavioral diff. Node.labels is excluded from digest/subtreeDigest
(types.ts:47), so a regression in labelsOf
(wrong labels, dropped labels) will never move a trace digest. Coverage of this feature must
come from unit assertions, not from gold-trace drift — record that expectation.
Unintended Side Effects
No behavioral compare was possible, so nothing can be observed changing outside scope. One
static residue is worth a confirm:
| Changed trace | Out-of-scope change | In the diff? | Assessment |
|---|---|---|---|
| (none — no base to diff) | truncateStructValue parser now matches ClassName {…} (space), multi-line Ruby bodies, and whitespace-only bodies — reached by the MCP tree/find renderers, not just the new trim command |
yes — in 87824eb4, but folded into a commit whose headline is "add trim command" |
🟡 concerning-to-confirm — the MCP renderer's output for those value shapes changes, and no gold trace covers it, so the change is unverified either way |
The parser enhancement is deliberate (the comments spell out each one), so it is intended, not a
stray edit — but it lands in the agent-facing MCP renderer with zero trace coverage. Confirm the
intended blast radius by recording treeRender.spec.ts (above) so the next review can hold the
MCP value-truncation output fixed.
Suggestions
🟡 MEDIUM — Value-truncation refactor changes MCP renderer output but is untraced
File: truncateStructValue.ts Context: parseStruct / splitTopLevelFields, lines 51–68
The extraction from treeRender.ts is not a pure move: parseStruct's type-name group gained
\s* (so ClassName { … } is now parsed struct-aware instead of flat-truncated), the Ruby branch
gained [\s\S]+ (multi-line bodies), and splitTopLevelFields switched to body.trim().length.
These functions feed the dense MCP tree/find renderers that an LLM agent reads. No gold trace
exercises truncateStructValue, so this output change is unverified in the baseline.
Risk: low correctness risk (the changes are additive parser reach), but the MCP surface an
agent consumes changed with no behavioral guard — a future regression here would pass the gold-trace
gate silently.
Recommended remediation: add a gold trace over the MCP value-truncation path so the digest
guards it:
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# then add an entry to packages/cli/gold_traces/manifest.yaml for it and re-bless.🟡 MEDIUM — appmap trim and trimAppMap have no test or gold trace
File: trim.ts Context: whole command, lines 1–102
A brand-new command that overwrites AppMap files in place by default
(trim.ts:97) ships with no unit test and no gold trace.
trimAppMap mutates the parsed object in place and the loop writes each file before processing the
next, so a malformed input mid-batch (JSON.parse throws) aborts the run after already
overwriting earlier files — a partial, irreversible edit with no dry-run.
Risk: data-destructive on bad input; the in-place default means a user who points trim at a
directory of recordings and hits one corrupt file is left with a half-trimmed set.
Recommended remediation: add trim.spec.ts asserting (a) values shrink while structure/digest
is preserved, and (b) a malformed file is skipped, not fatal. Consider validating/parsing all inputs
before writing any output. Regression test:
it('trims value strings but preserves the sequence-diagram digest', () => {
const before = JSON.parse(readFileSync(fixture, 'utf8'));
const after = trimAppMap(JSON.parse(readFileSync(fixture, 'utf8')));
expect(digestOf(after)).toEqual(digestOf(before)); // structure unchanged
expect(JSON.stringify(after).length).toBeLessThan(JSON.stringify(before).length);
});🟢 LOW — trim --max-length only bounds flat strings, not struct field values
File: trim.ts:85 Context: handler
--max-length is wired only into budget.flatCap; perValueCap (48) and idCap (12) stay fixed.
So trim --max-length 10 still emits struct field values up to 48 chars — the flag's name implies a
global cap it doesn't enforce.
Risk: cosmetic/expectation mismatch; no correctness impact.
Recommended remediation: either document that --max-length caps only non-struct strings, or
scale perValueCap/idCap proportionally to maxLength.
🟢 LOW — security.path-resolution label may be inert (line-comment annotation)
File: recordingId.ts:37 Context: resolveAppmapPath
The label is declared as a // @label line comment, but the MCP_handler gold trace that exercises
resolveAppmapPath shows no labeled code objects. Confirm the Node recorder harvests line-comment
labels for TypeScript; if not, apply the label via appmap.yml functions: so it actually rides on
future traces (and so a future review can reason about the auth/path-resolution boundary).
🟢 INFO — Sequence-diagram labels feature is correct and intended
labelsOf sorts for stable output and omits the field when empty; Node.labels is correctly kept
out of the digest so re-labeling isn't a false-positive behavioral change. Wiring is confirmed
(buildDiagram ×49 / labelsOf ×10 in the labels trace). The only gap is that no committed trace
contains labeled objects, so the carried-labels branch is unexercised (see Coverage Matrix).
Tests to Synthesize
| Target | Test name | Priority |
|---|---|---|
truncateStructValue (MCP renderer path) |
gold trace via treeRender.spec.ts recording |
Medium |
appmap trim / trimAppMap |
trim.spec.ts — value shrinks, digest preserved, malformed input skipped |
Medium |
buildDiagram labels branch |
unit assert: a callee with codeObject.labels yields sorted labels on its node; empty ⇒ field omitted |
Medium |
trim --max-length |
assert flag bounds output as documented | Low |
SQL Pass
No SQL was added or changed by this diff. The query-side changes (resolveAppmapPath,
treeRender, truncateStructValue) touch rendering/formatting of already-fetched values, not
query construction — no new predicate, no dynamic SQL, no concatenation, no injection surface. The
query-find/query-tree/SQL_query gold traces exist as a baseline but show no query drift (and
none is expected). resolveAppmapPath still resolves via the existing labeled-display-name /
basename logic unchanged. Nothing to flag.
HTTP Pass
No request-handling code changed. buildDiagram now attaches labels to ServerRPC/ClientRPC
nodes, but that is diagram metadata, not request processing — no change to auth gating, routing,
input handling, or transport. The OpenAPI, query-endpoints, and HTTP_server_request traces
are baseline-only. Nothing to flag.
Summary
| Severity | Findings | Action required |
|---|---|---|
| 🔴 High | 0 | — |
| 🟡 Medium | 2 | Add gold-trace / unit coverage for the value-truncation refactor and the new trim command before relying on the baseline to guard them |
| 🟢 Low | 3 (2 LOW + 1 INFO) | Clarify --max-length scope; verify the security.path-resolution label is harvested; note the labels feature is digest-invisible by design |
Not merge-blocking. No high-severity or destructive-drift finding, and — critically — no
behavioral drift could be computed because main carries no gold traces; this run establishes
the baseline rather than diffing against one. The single most important follow-up is coverage: the
value-truncation library (now feeding the agent-facing MCP renderer) and the file-mutating trim
command both ship with zero gold-trace coverage, and the headline labels feature is intentionally
excluded from the digest — so today's gold traces would not catch a regression in any of the three.
Record treeRender.spec.ts and add a trim.spec.ts, then re-bless, so the next review has real
behavioral signal to diff.
…labels - Update the sequenceDiagram RPC integration fixture for the new action labels. - Add handler-level fingerprint tests (a real-fixture Fingerprinter test and a FingerprintDirectoryCommand.execute() test). - Annotate two security-relevant boundaries with @Label (security.deserialization on pruneWithFilter, security.path-resolution on resolveAppmapPath) for behavioral review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
e91f159 to
44cf188
Compare
Addresses the behavioral review of the trim command: - Skip a malformed/unreadable file instead of aborting the batch, so one bad file no longer stops the rest or leaves earlier files half-done; fail only when every input fails. - Write each output atomically via writeFileAtomic (temp file in the same dir, then rename) — reusing the fingerprint indexer's helper — so an interrupted write never leaves a partial AppMap in place. - --max-length now caps struct field and id-shaped values too (lowered to min(default, maxLength)), not just flat strings; the default (120) is unchanged, so committed baselines don't churn. Adds handler tests: a malformed file is skipped (good file still trimmed in place, bad file untouched), and an all-failed batch throws. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
What
Each diagram action (function call, HTTP server/client RPC, SQL query) now records the AppMap labels applied to its callee code object, as a sorted
labels?: string[]field — omitted when empty.Why
Labels live only in the raw AppMap
classMapand were dropped on sequence-diagram export. Consumers that wanted them (notably the gold-traces behavioral-diff work, which raises the severity of a detected change when the changed node is security-labeled) had to re-join the diagram against the raw AppMap. This makes the exported diagram self-contained: one source of truth for the analysis.Design notes
labelsis not folded intodigest/subtreeDigest(those derive frombuildStableHash/stableProperties). A pure re-label is therefore not a behavioral change — labels are a classification signal, not a behavioral one.formatter/json.tsserializes the whole diagram, solabelsflows into--format jsonautomatically. Text/PlantUML are unchanged (could annotate labeled calls later if desired).calleecode object.Testing
yarn test— 40 passed (3 new: label reported on a labeled action,undefinedon an unlabeled one, and labels excluded from the digest).yarn lint— 0 errors.sequence-diagram --format json:🤖 Generated with Claude Code