feat(observability): inject solution into outbound AWS SDK User-Agent (#319) #338

Draft
scottschreckengaust wants to merge 8 commits into main from feat/319-sdk-user-agent-solution-tracking

Conversation

@scottschreckengaust

@scottschreckengaust scottschreckengaust commented Jun 12, 2026

Contributor

Implements #319. This PR body is the plan of record (no plan file committed); all 7 tasks are complete — each landed as its own commit. The first commit is an empty start-signal.

Verification: agent quality gate (1033 tests, 76.5% coverage, ruff, ty) ✅ · mise //cdk:test (2033 tests incl. new wire-capture + template assertions) ✅ · full CLI build (335 tests) ✅ · CDK↔CLI type-sync + constants-drift checks ✅ · CI: CodeQL ×3, security scans, PR lint, build (agentcore) all green. Local cdk synth fails on this workstation for environmental reasons only (synth requires ec2:DescribeAvailabilityZones, not granted to the local role; main also fails locally) — CI's build job performs the real synth and passes. The integ smoke failure is pre-existing on main (known issue, unrelated).

Closes #319

What this builds

Every outbound AWS API call from ABCA carries two solution-attribution User-Agent segments:

app/uksb-wt64nei4u6/{STACKNAME}             ← sanitized stack name, clipped to ≤34 chars (value ≤50)
md/uksb-wt64nei4u6#{IDENTIFIER}[#{TRACE}]   ← component label + optional per-request trace handle

Both are emitted via the raw/verbatim user-agent path (botocore user_agent_extra / JS customUserAgent) — not the sanitizing app-id config field, which would mangle the / separator (/ is not in the SDK app-id charset; verified against installed botocore 1.43.9 useragent.py and @aws-sdk/middleware-user-agent). Static parts are baked at client construction; #{TRACE} is appended per-request by a botocore before-send handler / JS middleware so cached clients and pooled connections are never reissued.

Verified SDK mechanics this design relies on

  • JS v3 customUserAgent pair encoding (middleware-user-agent/dist-cjs/index.js): a pair ['app/uksb-wt64nei4u6/STACK'] survives intact — the name is split on /, each part escaped (all our chars are token-safe), and rejoined with /. A pair ['md/uksb-wt64nei4u6', 'api'] renders md/uksb-wt64nei4u6#api (# is legal in values, not names). customUserAgent lands in both user-agent and x-amz-user-agent (node runtime), always at the end of each header — which is what lets the trace middleware do a safe suffix-append.
  • The UA middleware is named getUserAgentMiddleware, step build, priority low; our trace middleware is added with addRelativeTo(..., { relation: 'after', toMiddleware: 'getUserAgentMiddleware' }).
  • DynamoDBDocumentClient.from(inner) shares the inner client's middlewareStack and config — instrumenting the inner DynamoDBClient covers the document client.
  • botocore renders user_agent_extra last (_build_extra is the final component in to_string()), so a before-send handler can append #{TRACE} to the end of request.headers['User-Agent'] and it lands exactly on the md/ segment. before-send handlers can also return a stubbed AWSResponse to short-circuit the HTTP send — that's the wire-capture test mechanism (no network, no moto).

Identity & trace mapping

| Surface | {IDENTIFIER} | source | {TRACE} handle |
|---|---|---|---|
| Python agent runtime | agent | hardcoded | task id (ULID), set once via configure_session() |
| REST API Lambdas (task-api) | api | ABCA_COMPONENT env | per-invocation ulid() requestId handlers already mint |
| Webhook ingest + Slack/Linear/GitHub integrations | webhook | ABCA_COMPONENT env | requestId where minted, else absent |
| Orchestrator, strategies, reconcilers, stream consumers | orchestr | ABCA_COMPONENT env | event.task_id (orchestrate-task); else absent |
| bgagent CLI | cli | hardcoded | process.pid, set once at startup |

{IDENTIFIER} comes from ABCA_COMPONENT env (not per-call-site parameters) because shared modules (shared/orchestrator.ts, shared/create-task-core.ts, …) construct module-level clients and are bundled into multiple Lambdas — the env var gives each Lambda its own label with one source of truth in CDK.

{STACKNAME} comes from a new ABCA_STACK_NAME env var (Lambdas, agent container) / optional stack_name CLI config field. When absent (local dev, unconfigured CLI), the app/ segment is omitted entirely — never a placeholder.

Trace ambient context: module-level variable (with the JS/Lambda single-concurrent-invocation guarantee, and matching the agent's existing aws_session._tags module-state pattern) rather than contextvar/AsyncLocalStorage. server.py:300 already documents why ContextVar is a per-thread trap in the agent (task runs on a spawned thread). The issue allows any ambient mechanism.
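To illustrate the trade-off described above, here is a minimal stdlib-only sketch (not the agent's code; every name in it is hypothetical). It sets a handle in the calling thread, then reads it back from a spawned thread through both a ContextVar and a plain module-level variable. On default CPython builds a new thread usually starts with an empty context, so the ContextVar read tends to fall back to its default, while the module-level value is visible everywhere.

```python
import contextvars
import threading

# Hypothetical names for illustration only.
_trace_var = contextvars.ContextVar("trace", default=None)
_trace_module = None


def set_both(handle):
    """Store the handle in both the ContextVar and the module-level variable."""
    global _trace_module
    _trace_module = handle
    _trace_var.set(handle)


def read_in_spawned_thread():
    """Read both stores from a freshly spawned thread and report what it saw."""
    seen = {}

    def worker():
        # The thread runs in its own context; whether it inherits the
        # caller's ContextVar values depends on the interpreter build.
        seen["contextvar"] = _trace_var.get()
        # Module globals are shared by every thread in the process.
        seen["module"] = _trace_module

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen


if __name__ == "__main__":
    set_both("task-123")
    print(read_in_spawned_thread())
```

The module-level approach only stays safe under the single-concurrent-invocation assumption the paragraph above relies on.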

Solution-ID constant placement: per-surface literal constant (agent/src/ua.py, cdk/src/handlers/shared/ua.ts, cli/src/ua.ts) with cross-referencing comments — NOT contracts/constants.json. That contract's schema and its drift checker (scripts/check-constants-sync.ts) are numeric-only (min/max/default invariants policed in agent/src/policy.py); the existing repo precedent for uksb-wt64nei4u6 is a literal (cdk/src/main.ts:41). Each surface's tests assert the exact emitted string, which is the de-facto drift guard.

Shared sanitization spec (implemented identically in Python and TS, with identical test vectors)

ALLOWED = A–Z a–z 0–9 ! $ % & ' * + - . ^ _ ` | ~          (UA token chars; note '/' and '#' are NOT allowed)
sanitize(v): replace every char not in ALLOWED (incl. any non-ASCII byte) with '-'
stackname:   sanitize FIRST, then clip to first 34 chars   (16-char 'uksb-wt64nei4u6/' + 34 = 50)
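The spec above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the contents of agent/src/ua.py: clip_stack_name and the constant MAX_STACK_CHARS are hypothetical names, and the sketch replaces per Unicode code point rather than per byte.

```python
import re

SOLUTION_ID = "uksb-wt64nei4u6"
MAX_STACK_CHARS = 34  # hypothetical constant name; 16 + 34 = 50

# Matches every character outside the ALLOWED token set, which covers
# '/', '#', whitespace, and any non-ASCII code point.
_DISALLOWED = re.compile(r"[^A-Za-z0-9!$%&'*+\-.^_`|~]")


def sanitize_ua_value(raw: str) -> str:
    """Replace each disallowed character with '-'."""
    return _DISALLOWED.sub("-", raw)


def clip_stack_name(raw: str) -> str:
    """Sanitize first, then keep the first 34 characters."""
    return sanitize_ua_value(raw)[:MAX_STACK_CHARS]


if __name__ == "__main__":
    value = f"{SOLUTION_ID}/{clip_stack_name('my/stack#prod')}"
    print(value, len(value))
```

With these definitions the app/ value (solution id, slash, clipped stack name) can never exceed 50 characters, which is the budget the spec states.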

Execution plan

Task 1 — Python: agent/src/ua.py + tests (TDD)

Create: agent/src/ua.py, agent/tests/test_ua.py

  • SOLUTION_ID = "uksb-wt64nei4u6", COMPONENT = "agent", STACK_NAME_ENV = "ABCA_STACK_NAME".
  • sanitize_ua_value(raw: str) -> str per the shared spec.
  • static_user_agent_extra() -> str → app/uksb-wt64nei4u6/{stack(≤34)} md/uksb-wt64nei4u6#agent, dropping the app/ segment when ABCA_STACK_NAME is unset/empty.
  • set_trace(handle: str | None) / get_trace() — lock-guarded module state; sanitized on read.
  • register_trace_appender(events) — registers a before-send.* handler that appends #{trace} to request.headers['User-Agent'] when a trace is set (no-op otherwise; never reconstructs the client). Works for both client.meta.events and botocore-session-level registration (sessions propagate to all derived clients and resources).
  • client_config() -> botocore.config.Config → Config(user_agent_extra=static_user_agent_extra()) for direct-client call sites.
  • Tests (sanitize vectors incl. /, #, non-ASCII → -; 50-char budget; clip-after-sanitize ordering; trace-absent ⇒ no trailing #).
  • Wire-capture test: real botocore client (fake creds, region pinned), register_trace_appender + a register_last('before-send.*', …) stub returning a canned AWSResponse. Two calls on the same client under different set_trace() values assert different #{TRACE} suffixes, intact app/.../ slash, and intact md/...#agent — straight from the captured User-Agent header.
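A rough stdlib-only sketch of the Task 1 pieces, assuming the behavior listed above; it is not the committed ua.py. The handler takes any object with a dict-like headers attribute, so it can be exercised without botocore. It splices the trace onto the md/ segment rather than the header end, following the Task 2 commit note that other components can render after the session-level extra. The name append_trace is hypothetical.

```python
import os
import re
import threading
from types import SimpleNamespace
from typing import Optional

SOLUTION_ID = "uksb-wt64nei4u6"
COMPONENT = "agent"
_DISALLOWED = re.compile(r"[^A-Za-z0-9!$%&'*+\-.^_`|~]")
_MD_SEGMENT = f"md/{SOLUTION_ID}#{COMPONENT}"
# Matches the md/ segment only while it carries no trace suffix yet.
_MD_UNTRACED = re.compile(re.escape(_MD_SEGMENT) + r"(?=\s|$)")

_lock = threading.Lock()
_trace: Optional[str] = None


def static_user_agent_extra() -> str:
    """Build the static segments; omit app/ when no stack name is set."""
    stack = os.environ.get("ABCA_STACK_NAME", "")
    parts = []
    if stack:
        parts.append(f"app/{SOLUTION_ID}/{_DISALLOWED.sub('-', stack)[:34]}")
    parts.append(_MD_SEGMENT)
    return " ".join(parts)


def set_trace(handle: Optional[str]) -> None:
    """Store the ambient trace handle (lock-guarded module state)."""
    global _trace
    with _lock:
        _trace = handle


def get_trace() -> Optional[str]:
    """Return the sanitized trace handle, or None when unset."""
    with _lock:
        raw = _trace
    return _DISALLOWED.sub("-", raw) if raw else None


def append_trace(request, **_kwargs) -> None:
    """Hypothetical before-send style handler: splice #{trace} onto md/."""
    trace = get_trace()
    if not trace:
        return  # trace absent: leave the header untouched, no trailing '#'
    ua = request.headers.get("User-Agent", "")
    request.headers["User-Agent"] = _MD_UNTRACED.sub(
        lambda m: f"{m.group(0)}#{trace}", ua, count=1
    )


if __name__ == "__main__":
    req = SimpleNamespace(headers={"User-Agent": f"Boto3/x {_MD_SEGMENT} Botocore/x"})
    set_trace("01JABC/evil")
    append_trace(req)
    print(req.headers["User-Agent"])
```

Because the static part never changes and only the per-request header is edited, the same client object can be reused across trace values.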

Task 2 — Python: wire into aws_session.py and the 8 bypass sites

Modify: agent/src/aws_session.py, agent/src/config.py:46,110, agent/src/shell.py:67, agent/src/telemetry.py:62,170, agent/src/server.py:157,184, agent/src/memory.py:43; tests in agent/tests/test_aws_session.py

  • _build_scoped_session() (aws_session.py:160): set botocore_session.user_agent_extra = ua.static_user_agent_extra() and ua.register_trace_appender(...) on the botocore session we already construct. Same for the plain-session path (aws_session.py:213) by constructing the botocore session explicitly. Covers tenant_client/tenant_resource (progress_writer, task_state, nudge_reader, attachments, deliverers, telemetry S3) and the singleton's pool is untouched across trace changes.
  • Unscoped tenant_client/tenant_resource gap (aws_session.py:240/250): these bypass the session and call boto3.client(...) directly (kept that way deliberately for test transparency) — apply config=ua.client_config() + ua.register_trace_appender(client.meta.events) there too, merging with any caller-supplied config kwarg.
  • New aws_session.platform_client(service, **kwargs) → boto3.client(service, config=ua.client_config(), **kwargs) + ua.register_trace_appender(client.meta.events). Migrate the 8 direct boto3.client(...) sites (logs ×5, secretsmanager ×2, bedrock-agentcore ×1) to it. The ambient STS client at aws_session.py:142 gets config=ua.client_config() inline (its events come from the default session). Existing tests patch boto3.client globally; MagicMocks tolerate the extra config kwarg and the events registration.
  • configure_session() (aws_session.py:81) additionally calls ua.set_trace(task_id) — the agent's trace is live for the whole task without touching pipeline.py call sites beyond what already exists (pipeline.py:634).
  • reset_session_cache() also clears the trace (test hygiene).
  • Run mise //agent:quality (lint, typecheck, tests). Commit.

Task 3 — JS handlers: cdk/src/handlers/shared/ua.ts + wire tests (TDD)

Create: cdk/src/handlers/shared/ua.ts, cdk/test/handlers/shared/ua.test.ts

  • Constants + sanitizeUaValue() mirroring Task 1 (same test vectors).
  • abcaUserAgent(): { customUserAgent: UserAgentPair[] } → [['app/uksb-wt64nei4u6/' + stack]] (only when ABCA_STACK_NAME set) + [['md/uksb-wt64nei4u6', component]] with component from ABCA_COMPONENT (default api). Spread into client config: new DynamoDBClient({ ...abcaUserAgent() }).
  • setAbcaTrace(handle?: string) module state; withAbcaTrace<T>(client: T): T — adds the suffix-append middleware after getUserAgentMiddleware, mutating both user-agent and x-amz-user-agent only when they end with the static md/ segment. Defensive no-op when middlewareStack is absent — required because ~40 existing test files mock client constructors as jest.fn(() => ({})) and module-level instrumentation must not crash under those mocks (real clients always have a stack; this guard is test-environment-only and documented as such).
  • Wire-capture test: real DynamoDBClient with a stub requestHandler recording request.headers and returning a canned HttpResponse. Asserts: both segments present and intact in the emitted header (literal / survived — i.e. NOT routed through the app-id field); two sends on the same client under different setAbcaTrace() values emit different suffixes; trace-absent ⇒ exactly md/uksb-wt64nei4u6#api with no trailing #; sanitization of a hostile trace value. Commit.

Task 4 — JS handlers: migrate all client sites + trace at handler entries

Modify: all new XClient( sites under cdk/src/handlers/ (≈60, enumerated during exploration — every one currently passes {} or { region }), plus the ~20 handler entry points that already mint const requestId = ulid().

  • Mechanical per-site change: new DynamoDBClient({}) → new DynamoDBClient(abcaUserAgent()), wrapped in withAbcaTrace(...) at the outermost cached client (for DocumentClient: instrument the inner client; shared stack confirmed). Lazy singletons (strategies/ecs-strategy.ts:28, strategies/agentcore-strategy.ts:26, shared/memory.ts:149, confirm-uploads.ts:700) are instrumented inside their getClient()s.
  • setAbcaTrace(requestId) immediately after each existing ulid() mint; setAbcaTrace(event.task_id) at orchestrate-task.ts durable-handler entry. Stream consumers (fanout-task-events, approval-metrics-publisher) carry no trace (optional by design).
  • No behavior change beyond UA; existing handler tests stay green (constructor mocks receive the new config arg; withAbcaTrace no-ops on bare-object mocks).
  • Run mise //cdk:test. Commit (possibly split: shared modules / top-level handlers).

Task 5 — CDK: thread ABCA_STACK_NAME + ABCA_COMPONENT env vars

Modify: cdk/src/constructs/task-api.ts:442 (commonEnv → adds both, component api) and :931 (webhookEnv → webhook; note webhookEnv does not spread commonEnv), cdk/src/constructs/task-orchestrator.ts:221 (orchestr), consumer constructs (concurrency-reconciler:71, stranded-task-reconciler:107, pending-upload-cleanup:90 → orchestr; fanout-consumer uses this.fn.addEnvironment(...) at :168ff, so add both vars the same way; approval-metrics-publisher-consumer:115 currently has no environment: block, so add one), integrations (slack-integration:188,220,252, linear-integration:192,250,289, github-screenshot-integration:159,228 → webhook), cdk/src/stacks/agent.ts:297 (runtime env) + cdk/src/constructs/ecs-agent-cluster.ts:124 (container env) → ABCA_STACK_NAME only (agent hardcodes its component).

  • Stack name via Stack.of(this).stackName (concrete — stackName is set explicitly in main.ts:34).
  • Template assertions in cdk/test/constructs/task-api.test.ts and task-orchestrator.test.ts (each surface emits its expected component label — acceptance criterion).
  • mise //cdk:test + mise //cdk:synth. Commit.

Task 6 — CLI

Create: cli/src/ua.ts (CLI-local mirror of the tiny helper — cli cannot import from cdk; same pattern as the types.ts mirror), cli/test/ua.test.ts
Modify: cli/src/auth.ts:37,103, cli/src/commands/{slack,github,linear,admin}.ts client sites, cli/src/bin/bgagent.ts (one setAbcaTrace(String(process.pid)) at startup), cli/src/types.ts (CliConfig.stack_name?: string), cli/src/commands/configure.ts (--stack-name flag + bundle passthrough)

  • Component hardcoded cli; stack name from loadConfig().stack_name (omit app/ when unset).
  • Wire-capture test mirroring Task 3 + auth.test.ts assertion that the Cognito client constructor receives customUserAgent.
  • Risk check (type-sync drift hook), resolved: CliConfig is already in CLI_ONLY_ALLOWLIST (scripts/check-types-sync.ts:156), so adding stack_name? requires no CDK mirror.
  • cd cli && mise run build. Commit.

Task 7 — Docs + full verification

  • AGENTS.md “Common mistakes”: new bullet — every new AWS SDK client must go through the ABCA UA helpers (agent/src/ua.py · cdk/src/handlers/shared/ua.ts · cli/src/ua.ts); naked boto3.client(...) / new XClient({}) silently drops solution attribution (addresses @krokoko’s comment). Root-level file — no Starlight sync needed.
  • Header comments in each ua.ts/ua.py stating the wire format and pointing at feat(observability): inject solution into outbound AWS SDK User-Agent #319.
  • mise run build (agent quality → cdk → cli → docs), mise run security:sast. Sweep the acceptance-criteria checklist below; mark PR ready.

Acceptance-criteria map

| Criterion (#319) | Where satisfied |
|---|---|
| Static baked once, raw path (not app-id field) | Tasks 1–4, 6; wire tests assert the / survives |
| {STACKNAME} sanitize-then-clip ≤34, value ≤50; test w/ /, #, non-ASCII | Tasks 1, 3 sanitize tests |
| Stable per-component {IDENTIFIER}; per-surface test | ABCA_COMPONENT env + Task 5 template assertions + wire tests |
| {TRACE} per-request, never baked, optional; same-client different-trace test; no trailing #; sanitized | Tasks 1–4, 6 wire tests |
| Outgoing-header inspection at wire layer, both segments, trace-present & -absent | botocore before-send short-circuit capture; JS stub requestHandler capture |
| Agent centralized in aws_session.py/config.py; singleton pool preserved | Task 2 |
| All JS handler clients covered via shared helper | Tasks 3–4 |
| CLI Cognito client covered | Task 6 |
| {STACKNAME} threaded via env / container env / CLI config, not hard-coded | Tasks 5–6 |
| CDK compile/synth/test + agent tests pass; no functional change | Tasks 2, 4, 5, 7 |
| Docs instruction for future SDK calls (issue comment) | Task 7 |

Out of scope (per issue)

Dashboards/querying of attributed usage (#215/#245); the deploy-time uksb stack-description token (#292, done).

🤖 Generated with Claude Code

scottschreckengaust added a commit that referenced this pull request Jun 12, 2026
app/uksb-wt64nei4u6/{STACKNAME} + md/uksb-wt64nei4u6#agent[#{TRACE}],
emitted via botocore's verbatim user_agent_extra path (the sanitizing
app-id field would mangle the '/' separator). Static part baked at
construction; optional #{TRACE} appended per-request by a before-send
handler so cached clients keep their connection pool. Wire-capture
tests assert the emitted header via a short-circuiting before-send stub.

Task 1 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@scottschreckengaust scottschreckengaust force-pushed the feat/319-sdk-user-agent-solution-tracking branch from a8425e7 to d126f1c Compare June 12, 2026 23:36
scottschreckengaust added a commit that referenced this pull request Jun 12, 2026
…ites

Session-level user_agent_extra on both the scoped (refreshable) and
plain singleton sessions covers every tenant_client/tenant_resource
caller; new platform_client() helper carries the UA + trace appender
for the ambient-chain call sites (logs x5, secretsmanager x2,
bedrock-agentcore x1) that bypass the session by design.
configure_session() doubles the task id as the UA trace handle.

The trace appender splices #{TRACE} onto the md/ segment (not the
header end) because boto3 renders 'Botocore/x.y.z' after the
session-level extra.

Task 2 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
scottschreckengaust added a commit that referenced this pull request Jun 13, 2026
…racking

abcaUserAgent() bakes app/uksb-wt64nei4u6/{STACKNAME} (single-element
customUserAgent pair — the name path preserves '/') and
['md/uksb-wt64nei4u6', component] (the '#' comes from the SDK's own
name#value join; a '#' inside a name would be escaped to '-').
Component label from ABCA_COMPONENT env since shared modules bundle
into multiple Lambdas. withAbcaTrace() adds a middleware after
getUserAgentMiddleware that splices #{TRACE} onto the md/ segment in
both UA headers per-request; no-ops on bare-object constructor mocks.

Wire-capture tests run the full middleware stack against a stub
requestHandler and assert the emitted headers for trace-present,
trace-absent, and hostile-input cases.

Task 3 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
scottschreckengaust added a commit that referenced this pull request Jun 13, 2026
… invocation

All ~60 client instantiations now pass abcaUserAgent() and are wrapped
withAbcaTrace() (document clients instrument the inner DynamoDBClient —
shared middleware stack). Handler entry points set the trace to their
freshly minted ulid() request id; orchestrate-task uses the task id.
Stream consumers (fanout, approval-metrics) carry no trace by design.

No behavior change beyond the UA headers; all 2031 existing tests pass
unmodified (withAbcaTrace no-ops on bare-object constructor mocks).

Task 4 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
scottschreckengaust added a commit that referenced this pull request Jun 13, 2026
…face

Component labels: 'api' (task-api commonEnv), 'webhook' (webhookEnv +
slack/linear/github-screenshot integrations), 'orchestr' (orchestrator,
reconcilers, pending-upload cleanup, fanout + approval-metrics stream
consumers). Agent compute (AgentCore runtime env, ECS container env)
gets ABCA_STACK_NAME only — agent/src/ua.py hardcodes its label.
approval-metrics-publisher previously had no environment block; one is
added. Template assertions verify each surface's label (#319
acceptance criterion).

Task 5 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
scottschreckengaust added a commit that referenced this pull request Jun 13, 2026
CLI-local ua.ts mirror (same convention as types.ts — CLI can't import
from CDK). Component 'cli', stack name from new optional stack_name
config field (configure --stack-name), trace = process pid set once at
startup. All Cognito/SecretsManager/CloudFormation/DynamoDB client
sites migrated. Wire-capture test mirrors the CDK one via a real
Cognito client + stub requestHandler; auth.test.ts asserts the
constructor receives the customUserAgent.

Task 6 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
scottschreckengaust added a commit that referenced this pull request Jun 13, 2026
AGENTS.md 'Common mistakes' bullet directing agent/handler/CLI code to
the per-surface ua helpers, per review request on #319.

Task 7 of PR #338 plan. Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@scottschreckengaust scottschreckengaust marked this pull request as ready for review June 13, 2026 01:05
@scottschreckengaust scottschreckengaust requested a review from a team as a code owner June 13, 2026 01:05

@scottschreckengaust scottschreckengaust left a comment

Contributor Author

Reviewing

Comment thread agent/src/config.py
  from datetime import datetime, timedelta

- import boto3
+ import boto3 # noqa: F401 — availability probe; client built via platform_client

Contributor Author

Is import boto3 kept here, with the unused-import warning suppressed via "noqa", only to ensure the package is available? Why not let the failure surface at platform_client instead?

@scottschreckengaust scottschreckengaust changed the title feat(observability): inject solution tracking into outbound AWS SDK User-Agent (#319) feat(observability): inject solution into outbound AWS SDK User-Agent (#319) Jun 13, 2026
@scottschreckengaust

Contributor Author

Recommend a near-total simplification — verified against the installed botocore 1.43.9 + JS v3

Both SDKs honor AWS_SDK_UA_APP_ID natively (botocore configprovider.py maps it to user_agent_appid; JS v3 NODE_APP_ID_CONFIG_OPTIONS.environmentVariableSelector). Verified renders:

| AWS_SDK_UA_APP_ID | emitted app/ segment |
|---|---|
| uksb-wt64nei4u6 | app/uksb-wt64nei4u6 |
| uksb-wt64nei4u6#orchestr | app/uksb-wt64nei4u6#orchestr (# survives; 24 ≤ 50) |
| "" (empty) | (none; free opt-out) |
| unset | (none) |

So app/uksb-wt64nei4u6 — optionally carrying the component as #agent — is emitted with zero code. The ua.py / ua.ts ×2 modules (482 src + 624 test lines), the raw customUserAgent/user_agent_extra path, the per-request before-send/middleware trace appender, the module-level trace state, and the ~60 client-site rewrites all exist only to carry the two self-imposed extras: the /{STACKNAME} slash form and the per-request #{TRACE}.

Prescription — collapse to native config

  1. One CDK context param sdkUaAppId, default uksb-wt64nei4u6 (or per-surface uksb-wt64nei4u6#<component>), matching the existing tryGetContext('stackName') pattern in main.ts.
  2. Thread it as the AWS_SDK_UA_APP_ID env var to every Lambda + AgentCore runtime + ECS container — reusing the exact env-threading this PR already built for ABCA_STACK_NAME/ABCA_COMPONENT. botocore and JS pick it up automatically; the agent needs no aws_session change, the CLI just exports it.
  3. Delete the three ua modules, the trace middleware/before-send handler, the module trace state, and the abcaUserAgent()/withAbcaTrace() edits at the ~60 client sites.
  4. Customer opt-out is then free: set the context/env to "".

Net: ~1.5k lines / 83 files → env-var threading + one param + docs/tests.

Cost of the simplification (honest trade-offs)

Full verified SDK snippets + the kept-vs-dropped breakdown are in #319. The research and implementation here are high-quality and correct — this is a scope call, not a defect. The /{STACKNAME} + #{TRACE} requirements in #319 are what force the raw path; relax those two (they're ours) and the feature becomes native config.

@scottschreckengaust

Contributor Author

Correction — you're right, md/ stays. It survives as a static, baked-once segment, not the machinery. Verified together on a real STS wire request (botocore 1.43.9):

... app/uksb-wt64nei4u6 ... md/uksb-wt64nei4u6#agent
  • app/uksb-wt64nei4u6 ← native AWS_SDK_UA_APP_ID env (zero code)
  • md/uksb-wt64nei4u6#agent ← static user_agent_extra (botocore) / customUserAgent (JS), baked once at construction

Three tiers — keep the first two, drop the third:

| Tier | Segment | Mechanism | Verdict |
|---|---|---|---|
| 1 | app/uksb-wt64nei4u6[#stack] | native AWS_SDK_UA_APP_ID env | free; delete all app/ code |
| 2 | md/uksb-wt64nei4u6#{component} | static user_agent_extra/customUserAgent (one line; session-level in agent, per-client spread in JS) | keep; small |
| 3 | #{TRACE} per-request | before-send handler / JS middleware + module trace state + ~60 withAbcaTrace wraps | drop; the bulk |

So what actually deletes is the trace plane (per-request appender, set_trace/getAbcaTrace, configure_session trace wiring, all withAbcaTrace() calls) and the raw-path app/ handling + sanitize-then-clip (native env replaces it). ua.py/ua.ts collapse to a ~15-line static md/ helper; client sites keep one static spread arg, no middleware.

Full kept-vs-dropped detail in #319.

@scottschreckengaust

Contributor Author

Confirmed: # separator works and removes the last reason for the raw path.

Verified both SDKs: AWS_SDK_UA_APP_ID=uksb-wt64nei4u6#backgroundagent-dev → app/uksb-wt64nei4u6#backgroundagent-dev (botocore allows # via allow_hash=True; JS UA_VALUE_ESCAPE_REGEX includes #). The / was the only char that forced app/ onto the raw customUserAgent/user_agent_extra path; switch to # and app/ rides the native AWS_SDK_UA_APP_ID env field with zero code.

Final shape:

  • app/uksb-wt64nei4u6#{stack} — native env (tier 1, free)
  • md/uksb-wt64nei4u6#{component} — static user_agent_extra/customUserAgent, baked once (tier 2, ~15 lines)
  • no {TRACE} (tier 3 dropped)
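The final shape above can be sketched as two small string builders. This is a hedged illustration, not code from either PR: app_id_env_value, md_segment, and APP_ID_BUDGET are hypothetical names, and the sketch only composes the strings. Per the verification in this thread, the SDKs themselves are what turn the env value into the emitted app/ segment.

```python
SOLUTION_ID = "uksb-wt64nei4u6"
APP_ID_BUDGET = 50  # the value budget cited in this thread (value ≤ 50)


def app_id_env_value(stack: str) -> str:
    """Value to export as AWS_SDK_UA_APP_ID (tier 1, native env)."""
    return f"{SOLUTION_ID}#{stack}"


def md_segment(component: str) -> str:
    """Static md/ segment baked once at client construction (tier 2)."""
    return f"md/{SOLUTION_ID}#{component}"


def within_budget(value: str) -> bool:
    """Check the app-id value against the 50-character budget."""
    return len(value) <= APP_ID_BUDGET


if __name__ == "__main__":
    value = app_id_env_value("backgroundagent-dev")
    print(f"AWS_SDK_UA_APP_ID={value}", within_budget(value))
    print(md_segment("agent"))
```

Setting the env value to an empty string corresponds to the opt-out case discussed earlier in the thread.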

Comment rewrite required: every ua.* docstring says "NOT the sanitizing app-id field, whose charset excludes /" — with # that rationale inverts: the app-id field is now the right mechanism for app/. Full breakdown + criteria deltas posted in #319. Want me to push the simplification onto this branch (delete the trace plane + raw-path app/, keep static md/, rewrite comments), or land it as a fresh PR?

@scottschreckengaust

Contributor Author

Companion draft posted: #345 — the simplified native-app-id alternative to this PR, per the discussion above (use # separator → app/ rides the SDK-native AWS_SDK_UA_APP_ID env, dropping the raw-path machinery and the per-request trace plane; static md/#{component} kept). Both are intentionally in draft so reviewers can compare side by side and choose one. This PR (#338) is the full raw-path implementation; #345 is the minimal-footprint option.

scottschreckengaust and others added 8 commits June 13, 2026 20:23
Empty start-signal commit. Execution plan lives in the draft PR body;
implementation commits follow.

Part of #319

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@scottschreckengaust scottschreckengaust force-pushed the feat/319-sdk-user-agent-solution-tracking branch from 18ea957 to 686931b Compare June 13, 2026 20:48
Development

Successfully merging this pull request may close these issues.

feat(observability): inject solution into outbound AWS SDK User-Agent

1 participant