feat(sequence-diagram): carry AppMap labels on diagram actions by kgilpin · Pull Request #2369 · getappmap/appmap-js

kgilpin · 2026-06-27T14:07:34Z

What

Each diagram action (function call, HTTP server/client RPC, SQL query) now records the AppMap labels applied to its callee code object, as a sorted labels?: string[] field — omitted when empty.

// buildDiagram.ts, every callee-based branch
labels: labelsOf(callee),   // [...event.codeObject.labels].sort(), or undefined

Why

Labels live only in the raw AppMap classMap and were dropped on sequence-diagram export. Consumers that wanted them (notably the gold-traces behavioral-diff work, which raises the severity of a detected change when the changed node is security-labeled) had to re-join the diagram against the raw AppMap. This makes the exported diagram self-contained: one source of truth for the analysis.

Design notes

Not in the digest. labels is not folded into digest/subtreeDigest (those derive from buildStableHash/stableProperties). A pure re-label is therefore not a behavioral change — labels are a classification signal, not a behavioral one.
No formatter change. formatter/json.ts serializes the whole diagram, so labels flows into --format json automatically. Text/PlantUML are unchanged (could annotate labeled calls later if desired).
All message types. Applied to ServerRPC / ClientRPC / Query / FunctionCall — anything with a callee code object.

Testing

yarn test — 40 passed (3 new: label reported on a labeled action, undefined on an unlabeled one, and labels excluded from the digest).
yarn lint — 0 errors.
End-to-end against a real Python (Strawberry/SQLAlchemy) AppMap: security-labeled resolvers surface their labels in sequence-diagram --format json:
```
_claim_player  => ["security.authorization","security.join_code"]
_game_by_code  => ["security.authorization","security.join_code"]
```

🤖 Generated with Claude Code

Each function/RPC/query action now records the AppMap labels applied to its callee code object (sorted, omitted when empty), so consumers can read labels straight from the exported sequence diagram instead of re-joining against the raw AppMap classMap. Labels are not folded into `digest`/`subtreeDigest`, so a pure re-label is not treated as a behavioral change — labels are a classification signal, not a behavioral one. The JSON formatter serializes the whole diagram, so the new field flows into `--format json` output with no formatter change. Motivated by the gold-traces behavioral-diff work, which classifies the severity of a detected change by whether the changed node is security-labeled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Extract the value-aware truncation out of the MCP tree renderer (treeRender.ts `truncateStructValue`) into a shared lib (src/lib/truncateStructValue.ts), and add an `appmap trim <files..>` command that uses it to shrink AppMaps by truncating captured parameter/return/receiver/message/exception values in place (or to --output-dir), tunable via --max-length. Trimming only touches captured `value` strings — call structure and every stable property are untouched — so a trimmed AppMap is behaviorally identical (its sequence-diagram digest is unchanged) but much smaller. This is what the gold-traces engine uses to keep committed baselines lean. parseStruct also learns two shapes it previously flat-capped: Node util.inspect `ClassName { ... }` (whitespace after the type name) and multi-line bodies. Backward-compatible — the MCP renderer's 26 tests are unchanged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-07-02T15:56:54Z

AppMap Behavioral Review — sequence-diagram labels + `appmap trim`

Revisions: 6defff5e vs main (78d4f20b)
Date: 2026-07-02
Commits reviewed:

0d99566f feat(sequence-diagram): carry AppMap labels on diagram actions
87824eb4 feat(cli): add appmap trim command and shared value-truncation lib
b305a8a3 / 8d9ee218 chore(gold-traces): establish & update the behavioral baseline
(a1d20de8, 45942513, 90b9ada5, 9bc207c6, afdd443d are CI-workflow only — out of behavioral scope)

⚠️ First-ever baseline — no behavioral compare was possible.
The baseline revision main (78d4f20b) has zero committed gold traces
(git ls-tree -r main | grep gold_traces/.*\.appmap\.json → 0). The gold-trace
baseline is introduced on this branch (commit 8d9ee218), so there is no base/
side to diff against and appmap compare cannot run. Per the skill, this is a valid
"so-far" baseline report: the 17 head traces below establish the baseline that
future runs will diff against. Findings here come from the source diff + inspection
of the head traces, not from a behavioral delta. No behavioral drift can be — or is —
asserted.

Feature List

Sequence-diagram actions carry AppMap labels. buildDiagram now sets
labels: labelsOf(callee) on every emitted node (ServerRPC, ClientRPC, Query,
FunctionCall). labelsOf reads event.codeObject?.labels, sorts for stable output,
and returns undefined when empty so the field is omitted
(buildDiagram.ts:27).
Node.labels type field, deliberately excluded from the digest. types.ts adds
labels?: string[] with an explicit contract that it is not part of
digest/subtreeDigest — "a pure re-label is not a behavioral change"
(types.ts:47).
New appmap trim command. trimAppMap walks events[] and truncates every
captured value string (parameters, message, receiver, return_value,
exceptions) in place, leaving call structure/code objects/SQL untouched; registered
in the CLI (trim.ts:1,
cli.ts:150).
Shared truncateStructValue library, extracted from the MCP tree renderer. The
struct-aware, per-field value truncator moves from cmds/query/lib/treeRender.ts into
lib/truncateStructValue.ts so both the MCP renderer and trim share it; treeRender
re-exports it for back-compat (truncateStructValue.ts:1,
treeRender.ts:13). The move is not
byte-identical — the parser gained: multi-line Ruby-repr bodies ((.+)→([\s\S]+)),
ClassName {…} with a space before the brace ([\w.$]*→[\w.$]*\s*), and whitespace-only
bodies (body.length→body.trim().length). These change MCP renderer output for those shapes.
security.path-resolution label annotation on resolveAppmapPath. A
// @label security.path-resolution comment only — no code change to the function
(recordingId.ts:37).

Intended scope (git diff --name-only main..6defff5e, app code): packages/sequence-diagram/src/{buildDiagram,types}.ts, packages/cli/src/{cli.ts, cmds/trim/trim.ts, lib/truncateStructValue.ts, cmds/query/lib/{recordingId,treeRender}.ts}. Both feature commits are additive/refactor; neither claims to preserve behavior in a subsystem it doesn't touch.

Coverage Matrix

Feature	Covered by	Status
`buildDiagram` emits nodes; `labelsOf` runs	`Sequence_diagram/labels/are_reported…` (buildDiagram ×49, labelsOf ×10)	✅ helper executes
labels carried onto a diagram action (the headline branch)	—	❌ uncovered — the recording is of the jest test process and contains 0 labeled code objects, so `labelsOf` returns `undefined` on all 10 calls; the "labels present" branch never runs, and labels are digest-excluded by design (Feature 2) so no compare would ever flag it
`appmap trim` / `trimAppMap`	no trace	❌ uncovered — no gold trace records `trim`
`truncateStructValue` + parser enhancements (Feature 4)	no trace	❌ uncovered — no head trace exercises `truncateStructValue`/`truncate`/`renderTreeLineForMcp` (grep over all 17 traces: 0 hits)
`resolveAppmapPath` (label annotation)	`MCP_handler/get_call_tree…` (resolveAppmapPath ×2)	✅ path executes (no code change to verify)
SQL query rendering (unchanged)	`query-find`, `query-tree`, `Sequence_diagram/SQL_query`	— baseline only
HTTP request rendering (unchanged)	`OpenAPI`, `query-endpoints`, `Sequence_diagram/HTTP_server_request`	— baseline only

Commands to close the gaps (record from the package dir so the ts-jest preset applies — see repo CLAUDE.md):

# Value-truncation + MCP renderer (the enhanced parser paths):
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# trim command end-to-end:
cd packages/cli && npx appmap-node npx jest tests/unit/cmds/trim   # (add a trim.spec.ts first — see Tests to Synthesize)
# labels actually carried: record a fixture that HAS labeled code objects, not the bare harness run.

Suggested Labels

Functions that changed but carry no AppMap label — labeling them (via appmap-label) lets
the next review interpret them:

command.perform — trim.ts:80 handler — the appmap trim command entry point; labeling the handler lets a future compare recognize it as a command invocation.
deserialize.safe — trim.ts:95 JSON.parse(before) in handler — trim deserializes arbitrary on-disk AppMap JSON; a label marks the trust boundary.

Note on Feature 5: resolveAppmapPath was given a // @label security.path-resolution
comment, but the MCP_handler gold trace shows the recording emits no labeled code
objects (0 labeled objects in the classMap). Confirm the Node recorder actually harvests
// @label line comments for TS (labels are usually applied via appmap.yml/config, not
line comments) — otherwise this annotation is inert and the label won't appear in traces.
See 🟢 LOW below.

Behavioral Drift

None can be computed — there is no committed baseline to diff against (main has 0 gold
traces). The 17 head traces are the new baseline. Normally a changed entry here would be
real because the compare digest excludes volatile data (elapsed time, object ids,
parameter/return values) by construction; that machinery simply had no base/ side to run on
this time.

What the head traces confirm about the changes: buildDiagram/labelsOf execute in the
sequence-diagram labels trace, and resolveAppmapPath executes in the MCP trace — so those
paths are wired and non-crashing. The value-truncation library and the trim command execute
in no trace at all. Two subsystems (SQL rendering, HTTP rendering) have baseline traces but
were not touched by the diff, so future compares will hold them fixed.

One design fact matters for every future run: the labels feature is intentionally invisible to
the behavioral diff. Node.labels is excluded from digest/subtreeDigest
(types.ts:47), so a regression in labelsOf
(wrong labels, dropped labels) will never move a trace digest. Coverage of this feature must
come from unit assertions, not from gold-trace drift — record that expectation.

Unintended Side Effects

No behavioral compare was possible, so nothing can be observed changing outside scope. One
static residue is worth a confirm:

Changed trace	Out-of-scope change	In the diff?	Assessment
(none — no base to diff)	`truncateStructValue` parser now matches `ClassName {…}` (space), multi-line Ruby bodies, and whitespace-only bodies — reached by the MCP tree/find renderers, not just the new `trim` command	yes — in `87824eb4`, but folded into a commit whose headline is "add `trim` command"	🟡 concerning-to-confirm — the MCP renderer's output for those value shapes changes, and no gold trace covers it, so the change is unverified either way

The parser enhancement is deliberate (the comments spell out each one), so it is intended, not a
stray edit — but it lands in the agent-facing MCP renderer with zero trace coverage. Confirm the
intended blast radius by recording treeRender.spec.ts (above) so the next review can hold the
MCP value-truncation output fixed.

Suggestions

🟡 MEDIUM — Value-truncation refactor changes MCP renderer output but is untraced

File: truncateStructValue.ts Context: parseStruct / splitTopLevelFields, lines 51–68

The extraction from treeRender.ts is not a pure move: parseStruct's type-name group gained
\s* (so ClassName { … } is now parsed struct-aware instead of flat-truncated), the Ruby branch
gained [\s\S]+ (multi-line bodies), and splitTopLevelFields switched to body.trim().length.
These functions feed the dense MCP tree/find renderers that an LLM agent reads. No gold trace
exercises truncateStructValue, so this output change is unverified in the baseline.

Risk: low correctness risk (the changes are additive parser reach), but the MCP surface an
agent consumes changed with no behavioral guard — a future regression here would pass the gold-trace
gate silently.

Recommended remediation: add a gold trace over the MCP value-truncation path so the digest
guards it:

cd packages/cli && npx appmap-node npx jest tests/unit/cmds/query/lib/treeRender.spec.ts
# then add an entry to packages/cli/gold_traces/manifest.yaml for it and re-bless.

🟡 MEDIUM — `appmap trim` and `trimAppMap` have no test or gold trace

File: trim.ts Context: whole command, lines 1–102

A brand-new command that overwrites AppMap files in place by default
(trim.ts:97) ships with no unit test and no gold trace.
trimAppMap mutates the parsed object in place and the loop writes each file before processing the
next, so a malformed input mid-batch (JSON.parse throws) aborts the run after already
overwriting earlier files — a partial, irreversible edit with no dry-run.

Risk: data-destructive on bad input; the in-place default means a user who points trim at a
directory of recordings and hits one corrupt file is left with a half-trimmed set.

Recommended remediation: add trim.spec.ts asserting (a) values shrink while structure/digest
is preserved, and (b) a malformed file is skipped, not fatal. Consider validating/parsing all inputs
before writing any output. Regression test:

it('trims value strings but preserves the sequence-diagram digest', () => {
  const before = JSON.parse(readFileSync(fixture, 'utf8'));
  const after = trimAppMap(JSON.parse(readFileSync(fixture, 'utf8')));
  expect(digestOf(after)).toEqual(digestOf(before));       // structure unchanged
  expect(JSON.stringify(after).length).toBeLessThan(JSON.stringify(before).length);
});

🟢 LOW — `trim --max-length` only bounds flat strings, not struct field values

File: trim.ts:85 Context: handler

--max-length is wired only into budget.flatCap; perValueCap (48) and idCap (12) stay fixed.
So trim --max-length 10 still emits struct field values up to 48 chars — the flag's name implies a
global cap it doesn't enforce.

Risk: cosmetic/expectation mismatch; no correctness impact.

Recommended remediation: either document that --max-length caps only non-struct strings, or
scale perValueCap/idCap proportionally to maxLength.

🟢 LOW — `security.path-resolution` label may be inert (line-comment annotation)

File: recordingId.ts:37 Context: resolveAppmapPath

The label is declared as a // @label line comment, but the MCP_handler gold trace that exercises
resolveAppmapPath shows no labeled code objects. Confirm the Node recorder harvests line-comment
labels for TypeScript; if not, apply the label via appmap.yml functions: so it actually rides on
future traces (and so a future review can reason about the auth/path-resolution boundary).

🟢 INFO — Sequence-diagram labels feature is correct and intended

labelsOf sorts for stable output and omits the field when empty; Node.labels is correctly kept
out of the digest so re-labeling isn't a false-positive behavioral change. Wiring is confirmed
(buildDiagram ×49 / labelsOf ×10 in the labels trace). The only gap is that no committed trace
contains labeled objects, so the carried-labels branch is unexercised (see Coverage Matrix).

Tests to Synthesize

Target	Test name	Priority
`truncateStructValue` (MCP renderer path)	gold trace via `treeRender.spec.ts` recording	Medium
`appmap trim` / `trimAppMap`	`trim.spec.ts` — value shrinks, digest preserved, malformed input skipped	Medium
`buildDiagram` labels branch	unit assert: a callee with `codeObject.labels` yields sorted `labels` on its node; empty ⇒ field omitted	Medium
`trim --max-length`	assert flag bounds output as documented	Low

SQL Pass

No SQL was added or changed by this diff. The query-side changes (resolveAppmapPath,
treeRender, truncateStructValue) touch rendering/formatting of already-fetched values, not
query construction — no new predicate, no dynamic SQL, no concatenation, no injection surface. The
query-find/query-tree/SQL_query gold traces exist as a baseline but show no query drift (and
none is expected). resolveAppmapPath still resolves via the existing labeled-display-name /
basename logic unchanged. Nothing to flag.

HTTP Pass

No request-handling code changed. buildDiagram now attaches labels to ServerRPC/ClientRPC
nodes, but that is diagram metadata, not request processing — no change to auth gating, routing,
input handling, or transport. The OpenAPI, query-endpoints, and HTTP_server_request traces
are baseline-only. Nothing to flag.

Summary

Severity	Findings	Action required
🔴 High	0	—
🟡 Medium	2	Add gold-trace / unit coverage for the value-truncation refactor and the new `trim` command before relying on the baseline to guard them
🟢 Low	3 (2 LOW + 1 INFO)	Clarify `--max-length` scope; verify the `security.path-resolution` label is harvested; note the labels feature is digest-invisible by design

Not merge-blocking. No high-severity or destructive-drift finding, and — critically — no
behavioral drift could be computed because main carries no gold traces; this run establishes
the baseline rather than diffing against one. The single most important follow-up is coverage: the
value-truncation library (now feeding the agent-facing MCP renderer) and the file-mutating trim
command both ship with zero gold-trace coverage, and the headline labels feature is intentionally
excluded from the digest — so today's gold traces would not catch a regression in any of the three.
Record treeRender.spec.ts and add a trim.spec.ts, then re-bless, so the next review has real
behavioral signal to diff.

@Label

…labels - Update the sequenceDiagram RPC integration fixture for the new action labels. - Add handler-level fingerprint tests (a real-fixture Fingerprinter test and a FingerprintDirectoryCommand.execute() test). - Annotate two security-relevant boundaries with @Label (security.deserialization on pruneWithFilter, security.path-resolution on resolveAppmapPath) for behavioral review. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Addresses the behavioral review of the trim command: - Skip a malformed/unreadable file instead of aborting the batch, so one bad file no longer stops the rest or leaves earlier files half-done; fail only when every input fails. - Write each output atomically via writeFileAtomic (temp file in the same dir, then rename) — reusing the fingerprint indexer's helper — so an interrupted write never leaves a partial AppMap in place. - --max-length now caps struct field and id-shaped values too (lowered to min(default, maxLength)), not just flat strings; the default (120) is unchanged, so committed baselines don't churn. Adds handler tests: a malformed file is skipped (good file still trimmed in place, bad file untouched), and an all-failed batch throws. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

kgilpin and others added 2 commits June 27, 2026 10:06

Copilot started work on behalf of kgilpin July 1, 2026 19:20 View session

Copilot finished work on behalf of kgilpin July 1, 2026 19:37

kgilpin force-pushed the feat/sequence-diagram-labels branch from e91f159 to 44cf188 Compare July 2, 2026 16:25

kgilpin mentioned this pull request Jul 2, 2026

chore(gold-traces): AppMap gold-trace baselines + behavioral review workflow #2371

Merged

kgilpin assigned dividedmind Jul 2, 2026

kgilpin merged commit e24a475 into main Jul 3, 2026
26 checks passed

kgilpin deleted the feat/sequence-diagram-labels branch July 3, 2026 19:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(sequence-diagram): carry AppMap labels on diagram actions#2369

feat(sequence-diagram): carry AppMap labels on diagram actions#2369
kgilpin merged 4 commits into
mainfrom
feat/sequence-diagram-labels

kgilpin commented Jun 27, 2026

Uh oh!

github-actions Bot commented Jul 2, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

kgilpin commented Jun 27, 2026

What

Why

Design notes

Testing

Uh oh!

github-actions Bot commented Jul 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

AppMap Behavioral Review — sequence-diagram labels + appmap trim

Feature List

Coverage Matrix

Suggested Labels

Behavioral Drift

Unintended Side Effects

Suggestions

🟡 MEDIUM — Value-truncation refactor changes MCP renderer output but is untraced

🟡 MEDIUM — appmap trim and trimAppMap have no test or gold trace

🟢 LOW — trim --max-length only bounds flat strings, not struct field values

🟢 LOW — security.path-resolution label may be inert (line-comment annotation)

🟢 INFO — Sequence-diagram labels feature is correct and intended

Tests to Synthesize

SQL Pass

HTTP Pass

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented Jul 2, 2026 •

edited

Loading

AppMap Behavioral Review — sequence-diagram labels + `appmap trim`

🟡 MEDIUM — `appmap trim` and `trimAppMap` have no test or gold trace

🟢 LOW — `trim --max-length` only bounds flat strings, not struct field values

🟢 LOW — `security.path-resolution` label may be inert (line-comment annotation)