Skip to content

feat(sequence-diagram): carry AppMap labels on diagram actions#2369

Merged
kgilpin merged 4 commits into
mainfrom
feat/sequence-diagram-labels
Jul 3, 2026
Merged

feat(sequence-diagram): carry AppMap labels on diagram actions#2369
kgilpin merged 4 commits into
mainfrom
feat/sequence-diagram-labels

Conversation

@kgilpin

@kgilpin kgilpin commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

What

Each diagram action (function call, HTTP server/client RPC, SQL query) now records the AppMap labels applied to its callee code object, as a sorted labels?: string[] field — omitted when empty.

// buildDiagram.ts, every callee-based branch
labels: labelsOf(callee),   // [...event.codeObject.labels].sort(), or undefined

Why

Labels live only in the raw AppMap classMap and were dropped on sequence-diagram export. Consumers that wanted them (notably the gold-traces behavioral-diff work, which raises the severity of a detected change when the changed node is security-labeled) had to re-join the diagram against the raw AppMap. This makes the exported diagram self-contained: one source of truth for the analysis.

Design notes

  • Not in the digest. labels is not folded into digest/subtreeDigest (those derive from buildStableHash/stableProperties). A pure re-label is therefore not a behavioral change — labels are a classification signal, not a behavioral one.
  • No formatter change. formatter/json.ts serializes the whole diagram, so labels flows into --format json automatically. Text/PlantUML are unchanged (could annotate labeled calls later if desired).
  • All message types. Applied to ServerRPC / ClientRPC / Query / FunctionCall — anything with a callee code object.

Testing

  • yarn test — 40 passed (3 new: label reported on a labeled action, undefined on an unlabeled one, and labels excluded from the digest).
  • yarn lint — 0 errors.
  • End-to-end against a real Python (Strawberry/SQLAlchemy) AppMap: security-labeled resolvers surface their labels in sequence-diagram --format json:
    _claim_player  => ["security.authorization","security.join_code"]
    _game_by_code  => ["security.authorization","security.join_code"]
    

🤖 Generated with Claude Code

kgilpin and others added 2 commits June 27, 2026 10:06
Each function/RPC/query action now records the AppMap labels applied to its
callee code object (sorted, omitted when empty), so consumers can read labels
straight from the exported sequence diagram instead of re-joining against the
raw AppMap classMap.

Labels are not folded into `digest`/`subtreeDigest`, so a pure re-label is not
treated as a behavioral change — labels are a classification signal, not a
behavioral one. The JSON formatter serializes the whole diagram, so the new
field flows into `--format json` output with no formatter change.

Motivated by the gold-traces behavioral-diff work, which classifies the
severity of a detected change by whether the changed node is security-labeled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Extract the value-aware truncation out of the MCP tree renderer
(treeRender.ts `truncateStructValue`) into a shared lib
(src/lib/truncateStructValue.ts), and add an `appmap trim <files..>`
command that uses it to shrink AppMaps by truncating captured
parameter/return/receiver/message/exception values in place (or to
--output-dir), tunable via --max-length.

Trimming only touches captured `value` strings — call structure and
every stable property are untouched — so a trimmed AppMap is
behaviorally identical (its sequence-diagram digest is unchanged) but
much smaller. This is what the gold-traces engine uses to keep committed
baselines lean.

parseStruct also learns two shapes it previously flat-capped: Node
util.inspect `ClassName { ... }` (whitespace after the type name) and
multi-line bodies. Backward-compatible — the MCP renderer's 26 tests are
unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

AppMap Behavioral Review — sequence-diagram labels + appmap trim

Revisions: 6defff5e vs main (78d4f20b)
Date: 2026-07-02
Commits reviewed:

  • 0d99566f feat(sequence-diagram): carry AppMap labels on diagram actions
  • 87824eb4 feat(cli): add appmap trim command and shared value-truncation lib
  • b305a8a3 / 8d9ee218 chore(gold-traces): establish & update the behavioral baseline
  • (a1d20de8, 45942513, 90b9ada5, 9bc207c6, afdd443d are CI-workflow only — out of behavioral scope)

⚠️ First-ever baseline — no behavioral compare was possible.
The baseline revision main (78d4f20b) has zero committed gold traces
(git ls-tree -r main | grep gold_traces/.*\.appmap\.json → 0). The gold-trace
baseline is introduced on this branch (commit 8d9ee218), so there is no base/
side to diff against and appmap compare cannot run. Per the skill, this is a valid
"so-far" baseline report: the 17 head traces below establish the baseline that
future runs will diff against. Findings here come from the source diff + inspection
of the head traces
, not from a behavioral delta. No behavioral drift can be — or is —
asserted.


Feature List

  1. Sequence-diagram actions carry AppMap labels. buildDiagram now sets
    labels: labelsOf(callee) on every emitted node (ServerRPC, ClientRPC, Query,
    FunctionCall). labelsOf reads event.codeObject?.labels, sorts for stable output,
    and returns undefined when empty so the field is omitted
    (buildDiagram.ts:27).
  2. Node.labels type field, deliberately excluded from the digest. types.ts adds
    labels?: string[] with an explicit contract that it is not part of
    digest/subtreeDigest — "a pure re-label is not a behavioral change"
    (types.ts:47).
  3. New appmap trim command. trimAppMap walks events[] and truncates every
    captured value string (parameters, message, receiver, return_value,
    exceptions) in place, leaving call structure/code objects/SQL untouched; registered
    in the CLI (trim.ts:1,
    cli.ts:150).
  4. Shared truncateStructValue library, extracted from the MCP tree renderer. The
    struct-aware, per-field value truncator moves from cmds/query/lib/treeRender.ts into
    lib/truncateStructValue.ts so both the MCP renderer and trim share it; treeRender
    re-exports it for back-compat (truncateStructValue.ts:1,
    treeRender.ts:13). The move is not
    byte-identical
    — the parser gained: multi-line Ruby-repr bodies ((.+)([\s\S]+)),
    ClassName {…} with a space before the brace ([\w.$]*[\w.$]*\s*), and whitespace-only
    bodies (body.lengthbody.trim().length). These change MCP renderer output for those shapes.
  5. security.path-resolution label annotation on resolveAppmapPath. A
    // @label security.path-resolution comment only — no code change to the function
    (recordingId.ts:37).

Intended scope (git diff --name-only main..6defff5e, app code): packages/sequence-diagram/src/{buildDiagram,types}.ts, packages/cli/src/{cli.ts, cmds/trim/trim.ts, lib/truncateStructValue.ts, cmds/query/lib/{recordingId,treeRender}.ts}. Both feature commits are additive/refactor; neither claims to preserve behavior in a subsystem it doesn't touch.


Coverage Matrix

Feature Covered by Status
buildDiagram emits nodes; labelsOf runs Sequence_diagram/labels/are_reported… (buildDiagram ×49, labelsOf ×10) ✅ helper executes
labels carried onto a diagram action (the headline branch) uncovered — the recording is of the jest test process and contains 0 labeled code objects, so labelsOf returns undefined on all 10 calls; the "labels present" branch never runs, and labels are digest-excluded by design (Feature 2) so no compare would ever flag it
appmap trim / trimAppMap no trace uncovered — no gold trace records trim
truncateStructValue + parser enhancements (Feature 4) no trace uncovered — no head trace exercises truncateStructValue/truncate/renderTreeLineForMcp (grep over all 17 traces: 0 hits)
resolveAppmapPath (label annotation) MCP_handler/get_call_tree… (resolveAppmapPath ×2) ✅ path executes (no code change to verify)
SQL query rendering (unchanged) query-find, query-tree, Sequence_diagram/SQL_query — baseline only
HTTP request rendering (unchanged) OpenAPI, query-endpoints, Sequence_diagram/HTTP_server_request — baseline only

Commands to close the gaps (record from the package dir so the ts-jest preset applies — see repo CLAUDE.md):

# Value-truncation + MCP renderer (the enhanced parser paths):
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# trim command end-to-end:
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/trim   # (add a trim.spec.ts first — see Tests to Synthesize)
# labels actually carried: record a fixture that HAS labeled code objects, not the bare harness run.

Suggested Labels

Functions that changed but carry no AppMap label — labeling them (via appmap-label) lets
the next review interpret them:

  • command.performtrim.ts:80 handler — the appmap trim command entry point; labeling the handler lets a future compare recognize it as a command invocation.
  • deserialize.safetrim.ts:95 JSON.parse(before) in handlertrim deserializes arbitrary on-disk AppMap JSON; a label marks the trust boundary.

Note on Feature 5: resolveAppmapPath was given a // @label security.path-resolution
comment, but the MCP_handler gold trace shows the recording emits no labeled code
objects (0 labeled objects in the classMap). Confirm the Node recorder actually harvests
// @label line comments for TS (labels are usually applied via appmap.yml/config, not
line comments) — otherwise this annotation is inert and the label won't appear in traces.
See 🟢 LOW below.


Behavioral Drift

None can be computed — there is no committed baseline to diff against (main has 0 gold
traces). The 17 head traces are the new baseline. Normally a changed entry here would be
real because the compare digest excludes volatile data (elapsed time, object ids,
parameter/return values) by construction; that machinery simply had no base/ side to run on
this time.

What the head traces confirm about the changes: buildDiagram/labelsOf execute in the
sequence-diagram labels trace, and resolveAppmapPath executes in the MCP trace — so those
paths are wired and non-crashing. The value-truncation library and the trim command execute
in no trace at all. Two subsystems (SQL rendering, HTTP rendering) have baseline traces but
were not touched by the diff, so future compares will hold them fixed.

One design fact matters for every future run: the labels feature is intentionally invisible to
the behavioral diff.
Node.labels is excluded from digest/subtreeDigest
(types.ts:47), so a regression in labelsOf
(wrong labels, dropped labels) will never move a trace digest. Coverage of this feature must
come from unit assertions, not from gold-trace drift — record that expectation.


Unintended Side Effects

No behavioral compare was possible, so nothing can be observed changing outside scope. One
static residue is worth a confirm:

Changed trace Out-of-scope change In the diff? Assessment
(none — no base to diff) truncateStructValue parser now matches ClassName {…} (space), multi-line Ruby bodies, and whitespace-only bodies — reached by the MCP tree/find renderers, not just the new trim command yes — in 87824eb4, but folded into a commit whose headline is "add trim command" 🟡 concerning-to-confirm — the MCP renderer's output for those value shapes changes, and no gold trace covers it, so the change is unverified either way

The parser enhancement is deliberate (the comments spell out each one), so it is intended, not a
stray edit — but it lands in the agent-facing MCP renderer with zero trace coverage. Confirm the
intended blast radius by recording treeRender.spec.ts (above) so the next review can hold the
MCP value-truncation output fixed.


Suggestions

🟡 MEDIUM — Value-truncation refactor changes MCP renderer output but is untraced

File: truncateStructValue.ts Context: parseStruct / splitTopLevelFields, lines 51–68

The extraction from treeRender.ts is not a pure move: parseStruct's type-name group gained
\s* (so ClassName { … } is now parsed struct-aware instead of flat-truncated), the Ruby branch
gained [\s\S]+ (multi-line bodies), and splitTopLevelFields switched to body.trim().length.
These functions feed the dense MCP tree/find renderers that an LLM agent reads. No gold trace
exercises truncateStructValue, so this output change is unverified in the baseline.

Risk: low correctness risk (the changes are additive parser reach), but the MCP surface an
agent consumes changed with no behavioral guard — a future regression here would pass the gold-trace
gate silently.

Recommended remediation: add a gold trace over the MCP value-truncation path so the digest
guards it:

cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# then add an entry to packages/cli/gold_traces/manifest.yaml for it and re-bless.

🟡 MEDIUM — appmap trim and trimAppMap have no test or gold trace

File: trim.ts Context: whole command, lines 1–102

A brand-new command that overwrites AppMap files in place by default
(trim.ts:97) ships with no unit test and no gold trace.
trimAppMap mutates the parsed object in place and the loop writes each file before processing the
next, so a malformed input mid-batch (JSON.parse throws) aborts the run after already
overwriting earlier files — a partial, irreversible edit with no dry-run.

Risk: data-destructive on bad input; the in-place default means a user who points trim at a
directory of recordings and hits one corrupt file is left with a half-trimmed set.

Recommended remediation: add trim.spec.ts asserting (a) values shrink while structure/digest
is preserved, and (b) a malformed file is skipped, not fatal. Consider validating/parsing all inputs
before writing any output. Regression test:

it('trims value strings but preserves the sequence-diagram digest', () => {
  const before = JSON.parse(readFileSync(fixture, 'utf8'));
  const after = trimAppMap(JSON.parse(readFileSync(fixture, 'utf8')));
  expect(digestOf(after)).toEqual(digestOf(before));       // structure unchanged
  expect(JSON.stringify(after).length).toBeLessThan(JSON.stringify(before).length);
});

🟢 LOW — trim --max-length only bounds flat strings, not struct field values

File: trim.ts:85 Context: handler

--max-length is wired only into budget.flatCap; perValueCap (48) and idCap (12) stay fixed.
So trim --max-length 10 still emits struct field values up to 48 chars — the flag's name implies a
global cap it doesn't enforce.

Risk: cosmetic/expectation mismatch; no correctness impact.

Recommended remediation: either document that --max-length caps only non-struct strings, or
scale perValueCap/idCap proportionally to maxLength.

🟢 LOW — security.path-resolution label may be inert (line-comment annotation)

File: recordingId.ts:37 Context: resolveAppmapPath

The label is declared as a // @label line comment, but the MCP_handler gold trace that exercises
resolveAppmapPath shows no labeled code objects. Confirm the Node recorder harvests line-comment
labels for TypeScript; if not, apply the label via appmap.yml functions: so it actually rides on
future traces (and so a future review can reason about the auth/path-resolution boundary).

🟢 INFO — Sequence-diagram labels feature is correct and intended

labelsOf sorts for stable output and omits the field when empty; Node.labels is correctly kept
out of the digest so re-labeling isn't a false-positive behavioral change. Wiring is confirmed
(buildDiagram ×49 / labelsOf ×10 in the labels trace). The only gap is that no committed trace
contains labeled objects, so the carried-labels branch is unexercised (see Coverage Matrix).


Tests to Synthesize

Target Test name Priority
truncateStructValue (MCP renderer path) gold trace via treeRender.spec.ts recording Medium
appmap trim / trimAppMap trim.spec.ts — value shrinks, digest preserved, malformed input skipped Medium
buildDiagram labels branch unit assert: a callee with codeObject.labels yields sorted labels on its node; empty ⇒ field omitted Medium
trim --max-length assert flag bounds output as documented Low

SQL Pass

No SQL was added or changed by this diff. The query-side changes (resolveAppmapPath,
treeRender, truncateStructValue) touch rendering/formatting of already-fetched values, not
query construction — no new predicate, no dynamic SQL, no concatenation, no injection surface. The
query-find/query-tree/SQL_query gold traces exist as a baseline but show no query drift (and
none is expected). resolveAppmapPath still resolves via the existing labeled-display-name /
basename logic unchanged. Nothing to flag.

HTTP Pass

No request-handling code changed. buildDiagram now attaches labels to ServerRPC/ClientRPC
nodes, but that is diagram metadata, not request processing — no change to auth gating, routing,
input handling, or transport. The OpenAPI, query-endpoints, and HTTP_server_request traces
are baseline-only. Nothing to flag.


Summary

Severity Findings Action required
🔴 High 0
🟡 Medium 2 Add gold-trace / unit coverage for the value-truncation refactor and the new trim command before relying on the baseline to guard them
🟢 Low 3 (2 LOW + 1 INFO) Clarify --max-length scope; verify the security.path-resolution label is harvested; note the labels feature is digest-invisible by design

Not merge-blocking. No high-severity or destructive-drift finding, and — critically — no
behavioral drift could be computed
because main carries no gold traces; this run establishes
the baseline rather than diffing against one. The single most important follow-up is coverage: the
value-truncation library (now feeding the agent-facing MCP renderer) and the file-mutating trim
command both ship with zero gold-trace coverage, and the headline labels feature is intentionally
excluded from the digest — so today's gold traces would not catch a regression in any of the three.
Record treeRender.spec.ts and add a trim.spec.ts, then re-bless, so the next review has real
behavioral signal to diff.

…labels

- Update the sequenceDiagram RPC integration fixture for the new action
  labels.
- Add handler-level fingerprint tests (a real-fixture Fingerprinter test
  and a FingerprintDirectoryCommand.execute() test).
- Annotate two security-relevant boundaries with @Label
  (security.deserialization on pruneWithFilter, security.path-resolution
  on resolveAppmapPath) for behavioral review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Addresses the behavioral review of the trim command:

- Skip a malformed/unreadable file instead of aborting the batch, so one
  bad file no longer stops the rest or leaves earlier files half-done;
  fail only when every input fails.
- Write each output atomically via writeFileAtomic (temp file in the same
  dir, then rename) — reusing the fingerprint indexer's helper — so an
  interrupted write never leaves a partial AppMap in place.
- --max-length now caps struct field and id-shaped values too (lowered to
  min(default, maxLength)), not just flat strings; the default (120) is
  unchanged, so committed baselines don't churn.

Adds handler tests: a malformed file is skipped (good file still trimmed
in place, bad file untouched), and an all-failed batch throws.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@kgilpin kgilpin merged commit e24a475 into main Jul 3, 2026
26 checks passed
@kgilpin kgilpin deleted the feat/sequence-diagram-labels branch July 3, 2026 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants