From 3b8cf13d362e105a1d35ace087f05ca655efde0e Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Fri, 12 Jun 2026 16:33:37 +0200 Subject: [PATCH 1/6] docs: ADR-0027 feature-usage bitmask in the User-Agent MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add an ADR, design spec, and per-language bit registry for a lightweight feature-usage signal: a 64-bit mask, emitted as a `(feat=vN.<hex>)` User-Agent comment, stamped per request on first-party (Azure/Foundry) clients only. - docs/decisions/0027-feature-usage-bitmask-user-agent.md — ADR (options-first, with Limitations, Open Questions, and v1->v2 migration) - docs/specs/002-feature-usage-telemetry.md — design spec + implementation plan - docs/specs/feature-usage-bit-registry.md — per-language bit tables + governance Granularity is per package with core broken out per feature (each orchestration pattern and built-in context/history provider). Registries are per language (decoder selects by the language already in the UA). OpenTelemetry emission is deferred (privacy). Docs only; no code changes. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../0027-feature-usage-bitmask-user-agent.md | 281 ++++++++++++++ docs/specs/002-feature-usage-telemetry.md | 356 ++++++++++++++++++ docs/specs/feature-usage-bit-registry.md | 210 +++++++++++ 3 files changed, 847 insertions(+) create mode 100644 docs/decisions/0027-feature-usage-bitmask-user-agent.md create mode 100644 docs/specs/002-feature-usage-telemetry.md create mode 100644 docs/specs/feature-usage-bit-registry.md diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md new file mode 100644 index 00000000000..d9ebff92d6c --- /dev/null +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -0,0 +1,281 @@ +--- +status: proposed +contact: eavanvalkenburg +date: 2026-06-12 +deciders: eavanvalkenburg +consulted: +informed: +--- + +# Feature-usage bitmask in the User-Agent + +## Context and Problem Statement + +We can see which Agent Framework packages are installed and that *some* framework +call happened (via the existing `agent-framework-python/{version}` User-Agent), +but we have no usage-based signal about **which features are actually exercised** +at runtime, nor which are used *together* (e.g. workflows + MCP + Foundry). How +can we collect a lightweight, privacy-respecting signal of feature usage for the +traffic we can actually read, without standing up new event pipelines? + +The detailed mechanism is in [SPEC-002](../specs/002-feature-usage-telemetry.md); +the per-language bit lists are in +[`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json). + +## Decision Drivers + +- **Transparency** — openly documented, human-decodable, user-controllable. No + hidden or obfuscated telemetry. +- **First-party scope / no third-party leakage** — emit only to Azure/Foundry + endpoints (the telemetry we can ingest); never leak a feature fingerprint into + third-party logs we cannot read. 
+- **Live signal** — reflect features exercised *so far*, re-evaluated per request, + not frozen at client construction. +- **Low cost / few moving parts** — reuse telemetry already in the request path; + near-zero runtime overhead; as little machinery as the job needs. +- **Privacy** — encode only coarse boolean feature usage; no identifiers, + arguments, prompts, or payloads. + +## Considered Options + +The options below are grouped by the decisions that matter: the **transport**, +the **granularity**, and the **registry sharing model**. + +### Transport + +#### A. User-Agent token, first-party only, per request (chosen) + +Stamp a `(feat=...)` comment onto the UA, but only on Azure/Foundry clients, and +re-evaluate it per request. + +- Good, reuses telemetry already sent to the one backend we can read. +- Good, per-request stamping reflects the live mask (not frozen at construction). +- Good, first-party scoping means no fingerprint leaks to third-party providers. +- Good, maps onto .NET's existing per-request UA pipeline policies unchanged. +- Bad, no signal for traffic that never hits a first-party endpoint (accepted — + we couldn't read it anyway). + +#### B. User-Agent token on all clients + +- Good, simplest to wire (one static header). +- Bad, sends a deployment fingerprint to OpenAI/Anthropic/AWS/Google logs we + cannot read — privacy leak for zero benefit. +- Bad, baked into static `default_headers`, so it freezes at client construction + and reports a near-empty mask. + +#### C. OpenTelemetry span/resource attribute + +- Good, precise per-call usage; no UA change. +- Bad (**privacy — the main reason to hold it**), a span attribute broadcasts the + feature-combination fingerprint into the user's **general** telemetry pipeline, + which is typically exported to third-party APM vendors (Datadog, Honeycomb, …). 
+ That re-introduces exactly the fingerprint leakage the first-party-only UA + scoping (A) was chosen to avoid — just into a different set of third parties. +- Bad (secondary), also a cardinality footgun (a growing, combinatorial value + must never become a metric dimension). +- Neutral, for the team's own goal it reaches us only if the user exports to + Azure Monitor and we query it. +- **Deferred, not rejected.** The version prefix lets us add it later **if** the + privacy review blesses a broadly-emitted mask (or a scoped/redacted variant) + and a concrete query needs the per-call precision. + +#### D. Bespoke usage events + +- Good, richest detail and flexibility. +- Bad, new data flow and cost; larger privacy surface; heavy to build and review; + overkill for a coarse "which features" signal. + +#### E. Install/import-time signal only (status quo-ish) + +- Good, zero new runtime work. +- Bad, measures installation, not usage; cannot capture feature combinations — + does not solve the problem. + +### Granularity + +#### F. Per package, with core broken out per feature/provider (chosen) + +- Good, ~50 bits (Python) / ~40 (.NET) fit a **64-bit** mask, which keeps .NET's + accumulator lock-free (`Interlocked.Or`) and the registry hand-maintainable. +- Good, matches the actual questions ("which orchestration / which built-in + provider / which package?") — each orchestration pattern and each built-in + context/history provider gets its own bit, since they serve different purposes. +- Neutral, cannot distinguish sub-features *within* a provider package (e.g. + openai chat vs embeddings) until a bit is promoted. + +#### G. Per construct (one bit per instantiable type) + +- Good, finest detail. +- Bad, ~96 bits forces a 128-bit mask, which forfeits .NET's lock-free + `Interlocked.Or` (needs a lock / `UInt128`). +- Bad, ~96 call sites across two SDKs; the sheer count pushes toward code + generation and extra tests — machinery to manage machinery. 
+- Bad, precision nobody's decision actually needs. + +### Registry sharing model + +#### H. Per-language bit lists (chosen) + +Each SDK owns an independent list; the decoder picks the list using the language +already present in the UA product token. + +- Good, **no cross-language coordination**: each SDK numbers and evolves its + features independently; adding a Python feature never touches .NET numbering. +- Good, no null placeholders for one-SDK features, no "same bit, same meaning" + rule, no SDK-aware decode caveats. +- Good, decoding is trivial: language (from UA) + version -> list -> AND. +- Neutral, two small lists to maintain instead of one (but they were going to + diverge anyway — the packages differ). + +#### I. Single shared cross-language registry + +- Good, one list, one number space. +- Bad, forces synchronized numbering and null placeholders for features that + exist in only one SDK, plus SDK-aware decode rules. +- Bad, the synchronization is pure accidental complexity — **the language is + already in the User-Agent**, so sharing the number space buys nothing. + +### Registry maintenance + +#### J. Hand-written enum + parity test (chosen) + +- Good, ~40 members that change a few times a year; a 10-line test (enum vs JSON + list) is enough. +- Good, no build step, no generator to own. + +#### K. Code-generate the enums from the registry + +- Bad, a generator + drift test + schema test to maintain a short list of + integer constants; justified only by the per-construct bit count we rejected. + +### Representation (how the mask is rendered as text) + +All examples below encode the same mask — bits 0, 2, 16, 22, 27 set +(agent + workflow + sequential-orchestration + foundry.chat_client + openai, in +the Python v1 list) = decimal `138477573`. + +#### L. Decimal — `feat=v1.138477573` + +- Good, human-familiar; trivial to parse. +- Neutral, no visual alignment to bit/nibble boundaries; slightly longer than hex + for large masks. No advantage over hex. 
+ +#### M. Hex (chosen) — `feat=v1.8410005` + +- Good, compact (≤16 chars for a 64-bit mask). +- Good, decodes with one stdlib call in every language (`int(x, 16)` / + `Convert.ToUInt64(x, 16)`); nibble boundaries are eyeball-able. +- Good, lowercase, no `0x` prefix, no leading zeros — unambiguous and stable. + +#### N. Binary / bit-list — `feat=v1.1000010000010000000000000101` or `feat=v1.0,2,16,22,27` + +- Good, most directly human-readable ("which bits"). +- Bad, longest form in the UA; the bit-list needs delimiter handling and grows + with the number of set bits. + +#### O. Alphabet / base-N (e.g. Crockford base32 `feat=v1.442005`, base62 `feat=v1.9n2lf`) + +- Good, shortest representation. +- Bad, needs a custom alphabet + decode table on both ends; base62 is + case-sensitive (fragile through case-normalizing intermediaries); not + eyeball-able. Premature optimization for a value that is already ≤16 chars in + hex. + +## Decision Outcome + +Chosen: **a per-request, first-party-only User-Agent `(feat=...)` token (A), +with per-package granularity (F), per-language bit lists (H), hand-written enums +kept honest by a parity test (J), rendered as lowercase hex (M).** + +This is the smallest design that answers the question. A 64-bit mask accumulates +from universal `mark_feature_used()` calls; the token is stamped per request only +on Azure/Foundry clients (live, no third-party leak); each SDK owns an +independent bit list selected by the language already in the UA; the mask is +rendered as hex (`feat=v1.8410005`). OTel (C) is deferred — mainly because a +broadly-emitted span attribute would leak the fingerprint into the user's general +telemetry, against the first-party-only stance — but left open behind the version +prefix. Per-construct granularity (G), a shared registry (I), codegen (K), and the +decimal/binary/base-N representations (L, N, O) are rejected as complexity or +length the problem does not require. 
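
As a quick check of the representation examples above, the following minimal sketch rebuilds the mask from the example bit indexes and renders it in the decimal, hex, and binary forms. The helper name `render_mask` is hypothetical and is not part of the proposed API; it only illustrates the arithmetic.

```python
def render_mask(bits: list[int]) -> dict[str, str]:
    """Render a set of bit indexes in three candidate text forms (illustrative only)."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit  # same OR-accumulation the ADR describes
    return {
        "decimal": f"feat=v1.{mask}",
        "hex": f"feat=v1.{mask:x}",  # lowercase, no 0x prefix, no leading zeros
        "binary": f"feat=v1.{mask:b}",
    }


# Bits from the ADR example: agent, workflow, sequential-orchestration,
# foundry.chat_client, openai (Python v1 list).
example = render_mask([0, 2, 16, 22, 27])
print(example["hex"])
```

The hex form is the one the decision selects; the other two are shown only to confirm that all three examples in the Representation section describe the same value.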
+ +### Consequences + +- Good, adds usage signal at near-zero cost, no new data flow, few moving parts. +- Good, transparent (public registry, human-decodable token) and disabled by the + existing User-Agent opt-out. +- Good, first-party-only + per-request emission gives a live mask and no + third-party fingerprint leak. +- Good, 64-bit keeps .NET lock-free; per-language lists remove all cross-language + sync; hand-written enums avoid a codegen toolchain. +- Neutral, the token's reach equals first-party traffic; broader per-call signal + (OTel) can be added later if needed. +- Bad, each feature must add a `mark_feature_used()` call, and first-party clients + need a per-request hook (small, mirrors existing patterns). + +## Registry versioning and migration (v1 → v2) + +The token carries a **per-language** version (`feat=v1.`); a version bump is +independent for Python and .NET. + +- **Additive growth stays on v1 — no bump.** Allocating a new feature to a + reserved/unused bit is backward-compatible: an older decoder simply sees an + unknown high bit and ignores it. Normal package growth never needs a new + version. +- **A bump (v2) is required only for breaking changes:** renumbering or + re-partitioning existing bits, changing the *meaning* of an already-assigned + bit, or widening beyond 64-bit. Within a version a bit is **never** reused or + reassigned — that invariant is what lets old decoders stay correct. +- **Mixed-version coexistence is the norm.** A fleet runs many SDK releases at + once, so `v1` and `v2` tokens appear simultaneously for a long time (old SDKs + keep emitting `v1`). The decoder keeps **every** published `(language, + version)` table and selects by the token's version; the `v1` table is retained + indefinitely for historical decode. 
+- **Unknown version → do not guess.** A decoder without the `vN` table must + record "unknown registry version" rather than decode against an older table — + bit meanings may differ across versions, so mis-attribution is worse than + no data. +- **Producing v2:** publish the v2 table alongside v1 in the registry doc, bump + that SDK's `FeatureBit` enum + version constant; the SDK emits `v2` from the + release it ships in. Prefer staying on v1 (additive) and reserving a clean v2 + for an eventual deliberate re-partition. + +## Limitations + +| Limitation | Caused by (choice) | Why we accepted it | +| --- | --- | --- | +| **No signal for self-hosted or third-party-only traffic.** If a process never calls Azure/Foundry, we see nothing. | First-party-only emission (A) | We can't read third-party logs anyway, and must not leak a fingerprint into them. Reach traded for privacy. | +| **No OTel / per-call signal in v1.** | OTel deferred (C) — primarily on **privacy** grounds | A broadly-emitted span attribute would push the fingerprint into the user's general telemetry / third-party APM vendors, undoing the first-party-only scoping. Left open to add later if a compelling reason emerges. | +| **Mask reflects "usage so far," not the whole session.** Early requests carry fewer bits than later ones. | Process-global accumulator + per-request stamping | Honest and still useful; the team aggregates across requests. The per-request design is what makes it *grow* rather than freeze. | +| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Single global accumulator (simplicity) | Per-call attribution is what the deferred OTel span path would add; not needed for portfolio-level questions. | +| **Coarse granularity.** Can't distinguish sub-features (e.g. openai chat vs embeddings, which shell tool). 
| Per-package granularity (F) + 64-bit (keeps .NET lock-free) | Matches the actual questions; finer bits can be promoted later behind the version prefix. | +| **Fingerprinting risk is reduced, not eliminated.** A feature-combination mask is still a deployment signature, and it transits intermediaries (proxies/CDNs) even when first-party-scoped. | Emitting any feature-combination value | Scope + opt-out + coarse granularity mitigate it; residual risk is the subject of the privacy review below. | + +## Open Questions (for decider discussion) + +These are unresolved and should be decided before/at approval: + +1. **Privacy / telemetry-acceptance review (blocking).** Is a coarse, + first-party-only, opt-out-able feature-combination mask acceptable telemetry? + Even scoped, it transits intermediaries and is a deployment fingerprint. This + is a **release precondition**. Possible outcomes that would change the design: + require a dedicated opt-out flag (Q2), coarser granularity, hashing, or + explicit opt-in. +2. **Dedicated opt-out flag?** v1 reuses `AGENT_FRAMEWORK_USER_AGENT_DISABLED` + (mask dies with the whole UA). Do we add a mask-only flag now (keep base UA, + drop the fingerprint), or wait until asked / until the privacy review requires + it? +3. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a + span attribute broadcasts the fingerprint into the user's general telemetry + and onward to third-party APM vendors, contradicting the first-party-only + stance. It also carries a metric-cardinality hazard. Would the privacy review + allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when + to revisit. 
+ +## More Information + +- Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md) +- Per-language bit lists: [`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json) +- Encoding / opt-out / governance prose: [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md) +- Existing accumulator pattern: `python/packages/core/agent_framework/_telemetry.py` +- .NET emission policies: `dotnet/src/Microsoft.Agents.AI.Foundry/AgentFrameworkUserAgentPolicy.cs`, + `dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/HostedAgentUserAgentPolicy.cs` diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md new file mode 100644 index 00000000000..e2b52cdbfcf --- /dev/null +++ b/docs/specs/002-feature-usage-telemetry.md @@ -0,0 +1,356 @@ +--- +status: proposed +contact: eavanvalkenburg +date: 2026-06-12 +deciders: eavanvalkenburg +consulted: +informed: +--- + +# Feature-usage telemetry via an accumulating bitmask + +> Companion design for [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md). +> The per-language bit tables, encoding, opt-out, and governance live in +> [feature-usage-bit-registry.md](feature-usage-bit-registry.md). Each SDK's +> hand-written `FeatureBit` enum is the source of truth for that language. + +## What is the goal of this feature? + +Give the Agent Framework team a lightweight signal about **which framework +features are actually exercised** at runtime (not merely installed), so we can +prioritise investment based on real usage. We emit a single small number — a +*feature mask* — on the User-Agent that already goes out with each request. + +**Reach is deliberately bounded.** The mask accumulates from *all* feature usage, +but the `feat=` token is only stamped on requests to **first-party (Azure / +Foundry) endpoints** — the only backends whose telemetry the team can ingest. 
We +do **not** send the token to third-party providers (OpenAI direct, Anthropic, +Bedrock, Gemini, Ollama, Mistral); doing so would leak a deployment fingerprint +into logs we cannot read (see [Emission](#emission)). + +**Granularity is per package**, with core broken out per feature: one bit per +orchestration pattern (sequential / concurrent / group-chat / magentic / handoff) +and **one bit per built-in context/history provider** (memory, skills, +file-access, compaction, todo, agent-mode, background-agents, in-memory/file +history) — because those serve different purposes and we want to know which are +used. See the [registry](feature-usage-bit-registry.md). The question is "are +people using workflows / which orchestration / which providers / MCP / Foundry +memory / Redis?", not which exact subclass. It still fits a 64-bit mask, keeps +the .NET accumulator lock-free, and keeps the registry small enough to +hand-maintain. Finer detail can be earned later via the version prefix. + +Success metric: within one release after rollout, ≥80% of first-party (Foundry) +requests carry a **non-empty** feature token whose mask reflects features marked +**after** client construction (i.e. the token is live, not frozen — see the +per-request requirement below). Secondary: ability to break down first-party +traffic by feature combination (e.g. "% of Foundry traffic that also uses +workflows"). + +This is done **transparently**: the bit registry is public and the emitted value +is human-decodable, and the existing User-Agent opt-out disables it. + +## What is the problem being solved? + +Today we only know which packages are *installed* (from package telemetry) or +that *some* Agent Framework call happened (the existing +`agent-framework-python/{version}` User-Agent). We have no usage-based signal +about feature combinations, and no way to tell that, say, a process uses +workflows + MCP + Foundry together. 
Collecting this through bespoke events would +add cost and new data flows; folding a tiny accumulating integer into telemetry +we already send is far cheaper and easier to reason about for privacy. + +## Mechanism + +### Process-global accumulator in `core` + +The accumulator and its helpers live in the existing +`agent_framework/_telemetry.py` (alongside `get_user_agent()` / +`prepend_agent_framework_to_user_agent()`), so the User-Agent machinery stays in +one module. It owns a process-global 64-bit accumulator. The existing +`AGENT_FRAMEWORK_USER_AGENT_DISABLED` flag (`IS_TELEMETRY_ENABLED` in that module) +already gates the whole User-Agent contribution, so it gates the mask too — no +new env var: + +```python +# agent_framework/_telemetry.py (same module as get_user_agent) +# IS_TELEMETRY_ENABLED already defined here (AGENT_FRAMEWORK_USER_AGENT_DISABLED) +import threading + +REGISTRY_VERSION = 1 + +_feature_mask = 0 +_feature_mask_lock = threading.Lock() + + +def mark_feature_used(bit: int) -> None: + """OR a feature bit into the process-global mask. + + Called the first time a feature is exercised. Cheap and idempotent; + a no-op when the User-Agent contribution is disabled. + """ + global _feature_mask + if not IS_TELEMETRY_ENABLED: + return + with _feature_mask_lock: + _feature_mask |= 1 << bit + + +def get_feature_token() -> str | None: + """Return ``vN.<hex>`` for the accumulated mask, or None.""" + if not IS_TELEMETRY_ENABLED or _feature_mask == 0: + return None + return f"v{REGISTRY_VERSION}.{_feature_mask:x}" +``` + +- **Per package/feature, usage-based:** `mark_feature_used()` is called the first + time a feature is genuinely exercised — at construction of a representative + type (e.g. `Agent`, an `MCPTool`, a provider, a Foundry surface), never at + import time. The mask grows over the process lifetime. +- **No import cycles:** the call lives in each package's own module, so `core` + never imports optional packages. 
Each package references its bit via the shared + `FeatureBit` IntEnum exported from `core`. + +### Bit constants + +`core` exports a hand-written `FeatureBit` IntEnum (defined in `_telemetry.py` +alongside the accumulator). **The enum is the source of truth** for Python; the +Python table in [feature-usage-bit-registry.md](feature-usage-bit-registry.md) is +its published contract, kept aligned in the same PR (see +[Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)). Each package imports +its named member and marks it where the feature is first exercised: + +```python +# agent_framework_foundry/_chat_client.py +from agent_framework import FeatureBit, mark_feature_used + +class RawFoundryChatClient(...): # base client; FoundryChatClient builds on it + def __init__(self, ...): + mark_feature_used(FeatureBit.FOUNDRY_CHAT_CLIENT) # bit 22 in v1 + ... +``` + +Mark in the **`Raw*` base client** (e.g. `RawFoundryChatClient`) so every path +that constructs a Foundry chat client — including the higher-level +`FoundryChatClient` — sets the bit exactly once. + +Using the shared enum (not literals) keeps `core` free of optional-package +imports while guaranteeing the bit values match the registry. For reference, in +v1 `FoundryChatClient` → bit 22, `FoundryAgent` → bit 23, Foundry memory → bit 24. + +## Emission + +**One path in v1: the User-Agent `feat=` token, stamped per request on +first-party (Azure/Foundry) clients only.** + +Marking (`mark_feature_used`) is **universal** — every feature sets its bit +regardless of provider. Only **emission** is scoped. A user who never calls a +first-party endpoint emits no token; this is the honest, intended behaviour (no +third-party leakage, no signal we couldn't read anyway). + +The base User-Agent (`agent-framework-python/{version}` plus any hosting prefix) +is unchanged and still set once via `default_headers` on **every** client. +`get_user_agent()` stays base-only (no `feat=`). 
The `feat=` token is **separate**, +added **only** by Azure/Foundry-based clients, and **re-evaluated on each +request** so it reflects the mask accumulated so far. A helper stamps it: + +```python +# agent_framework/_telemetry.py +def apply_feature_token(user_agent: str) -> str: + """Append/refresh the live ``(feat=vN.<hex>)`` comment on a UA string. + + Re-reads the current mask on every call, so newly accumulated bits are + reflected immediately. Idempotent: replaces an existing ``(feat=...)`` + comment rather than appending a second. + """ + token = get_feature_token() # None when disabled or mask == 0 + base = _strip_feature_comment(user_agent) + return f"{base} (feat={token})" if token else base +``` + +Because `default_headers` are static, first-party clients install a +**per-request hook** that calls `apply_feature_token()` on each outgoing request: + +- **httpx-based clients** (`AzureOpenAI*` via the `openai` SDK): construct the + underlying client with + `http_client=httpx.AsyncClient(event_hooks={"request": [_stamp_feat_hook]})`, + where the hook mutates `request.headers["User-Agent"]`. Gate on the existing + `use_azure` signal in `agent_framework_openai/_shared.py` so generic OpenAI + clients never get the hook. +- **azure-core pipeline clients** (`AIProjectClient`, `SearchClient`, + `CosmosClient`, …): add a tiny `SansIOHTTPPolicy` whose `on_request` calls + `apply_feature_token()` on `request.http_request.headers["User-Agent"]`. This + mirrors .NET's per-request `PipelinePolicy` exactly. + +This fixes the frozen-at-construction problem: the token is materialised at +**send time**, not client-init time, so it carries features constructed after the +client. It also confines the token to first-party endpoints. 
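
The stamper above relies on `_strip_feature_comment()`, which the spec does not define. The following is a minimal sketch of how the strip-then-append behaviour could work, assuming a simple regular expression is enough to find the comment. The regex, the `_state` dictionary (a stand-in for the process-global accumulator), and the simplified `get_feature_token()` are all assumptions made for illustration and are not taken from the actual `_telemetry.py`.

```python
import re
from typing import Optional

# Assumed pattern: a "(feat=...)" comment plus any whitespace before it.
_FEAT_COMMENT = re.compile(r"\s*\(feat=[^)]*\)")

# Stand-in for the process-global accumulator described in the spec.
_state = {"mask": 0}


def get_feature_token() -> Optional[str]:
    """Simplified token builder: registry version 1, lowercase hex mask."""
    return f"v1.{_state['mask']:x}" if _state["mask"] else None


def _strip_feature_comment(user_agent: str) -> str:
    """Remove any existing (feat=...) comment so stamping stays idempotent."""
    return _FEAT_COMMENT.sub("", user_agent).strip()


def apply_feature_token(user_agent: str) -> str:
    """Append the live token, replacing a stale one if present."""
    token = get_feature_token()
    base = _strip_feature_comment(user_agent)
    return f"{base} (feat={token})" if token else base


base_ua = "agent-framework-python/1.2.3"
empty = apply_feature_token(base_ua)   # mask is 0, so nothing is added
_state["mask"] |= 1 << 0               # e.g. an Agent is constructed
first = apply_feature_token(base_ua)
_state["mask"] |= 1 << 22              # e.g. a Foundry chat client is constructed
second = apply_feature_token(first)    # re-stamping replaces the old comment
print(second)
```

Re-stamping an already stamped string is the case a per-request hook is likely to hit when a header value is reused across requests, which is why the strip step runs before every append.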
+ +Encoding uses the RFC 7231 **comment** form `(feat=v1.<hex>)` (metadata, not a +product token), placed after the agent-framework product token, e.g.: + +```text +foundry-hosting/agent-framework-python/1.2.3 (feat=v1.2a) +``` + +### OpenTelemetry — not in v1 + +An OTel span attribute carrying the same value was considered but **deferred — +primarily for privacy, not complexity**. Unlike the first-party-only UA token, a +span attribute broadcasts the feature-combination fingerprint into the user's +**general** telemetry pipeline, which is commonly exported to third-party APM +vendors (Datadog, Honeycomb, …) — re-introducing exactly the leakage the +first-party scoping was chosen to avoid. (It also carries a cardinality footgun: +a monotonically-growing, combinatorial value must never become a metric +dimension.) The version prefix leaves the door open to add it later **if** the +privacy review blesses a broadly-emitted or scoped/redacted variant; v1 ships the +UA path only. See [ADR-0027 → option C](../decisions/0027-feature-usage-bitmask-user-agent.md#considered-options). + +## API Changes + +New public surface in `agent-framework-core` (exported from +`agent_framework`): + +- `mark_feature_used(bit: int) -> None` +- `get_feature_token() -> str | None` — returns `vN.<hex>` or `None`. +- `apply_feature_token(user_agent: str) -> str` — live, idempotent UA stamper + used by first-party per-request hooks. +- `FeatureBit` (IntEnum) — hand-written source of truth for the Python bit list + (see [Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)). + +No new env var: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` disables the +mask along with the rest of the User-Agent contribution. + +Behavioural change to existing API: + +- `get_user_agent()` / `prepend_agent_framework_to_user_agent()` are + **unchanged** — they keep returning the base UA with no `feat=` token. The + token is added only by first-party per-request hooks via + `apply_feature_token()`. 
+ +No breaking changes: when the mask is empty or disabled, or for any +non-first-party client, output is byte-for-byte identical to today. + +## Opt-out + +The mask is part of the User-Agent contribution, so the existing flag covers it — +no new env var in v1: + +| Env var | Effect | +| --- | --- | +| `AGENT_FRAMEWORK_USER_AGENT_DISABLED` | disables the **entire** AF User-Agent contribution, mask included | + +(If a privacy review later requires keeping the base UA while dropping only the +mask, a dedicated flag can be added then — not built speculatively now.) + +## E2E example + +```python +from agent_framework import Agent +from agent_framework_foundry import FoundryChatClient +from agent_framework_openai import OpenAIChatClient + +# First-party (Foundry) client: per-request hook stamps the live feat token. +agent = Agent(client=FoundryChatClient(...), instructions="...") +# Agent use marks bit 0; FoundryChatClient marks bit 22 +await agent.run("Hello") +# Outgoing request to Foundry carries: +# User-Agent: agent-framework-python/1.2.3 (feat=v1.<hex>) + +# Third-party client: NO feat token is added (no first-party hook). +other = Agent(client=OpenAIChatClient(...), instructions="...") +await other.run("Hi") +# Outgoing request to OpenAI carries only: +# User-Agent: agent-framework-python/1.2.3 +``` + +Disabling the User-Agent contribution (mask included): + +```bash +AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py +``` + +## .NET mapping + +- `core` has a hand-written `FeatureBit` enum (`: ulong`) — the **source of + truth** for the .NET bit list, matching the .NET table in the registry doc — + plus `FeatureUsage.MarkUsed(FeatureBit)` (universal marking, as in Python). +- 64-bit width means the accumulator is **lock-free**: + `Interlocked.Or(ref _mask, (long)bit)`. No lock, no `UInt128`, no split-long. +- **Emission is per-request and first-party-scoped**, matching Python. 
The + existing `AgentFrameworkUserAgentPolicy` / `HostedAgentUserAgentPolicy` + pipeline policies already run per request — extend them to append/refresh the + `(feat=...)` comment, and register the feat-stamping policy **only on + Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party + `IChatClient`s. +- Same `vN.<hex>` comment format in both SDKs; the bit *meanings* come from each + language's own table, so a decoder selects the table by language before + decoding. (.NET's policy was already per-request, so there is no Python/.NET + timing asymmetry.) + +## Keeping the bitmap in sync + +The **`FeatureBit` enum in each SDK is the source of truth** for that language. +[feature-usage-bit-registry.md](feature-usage-bit-registry.md) holds the matching +**published table per language** — the contract a decoder reads. There is +deliberately **no shared numbering** and **no machine-readable registry file**: a +Python bit and a .NET bit with the same index need not mean the same thing, and +each SDK adds features without coordinating with the other. + +Adding a feature is one PR: add the `FeatureBit` enum member, add the matching +row in that language's table, and mark it at the call site. Review keeps the enum +and table aligned (≈40 entries, changing a few times a year — not worth a +generator or a generated-file drift test). If a programmatic decoder is built +later, export that language's table to JSON for it then. + +### Decoding + +``` +UA: agent-framework-python/1.2.3 (feat=v1.2a) + │ │ └ hex mask + │ └ version + └ language → pick the Python table (version 1) +``` + +Read language → pick the table; read `vN` → pick that version; `AND` the hex mask +against each bit. Unknown high bits (from a newer SDK than the decoder's copy of +the table) are ignored. + +## Implementation plan (post-approval) + +1. 
**Core accumulator + enum** — in `agent_framework/_telemetry.py` add the + 64-bit mask, lock, `mark_feature_used`, `get_feature_token`, + `apply_feature_token`, and the hand-written `FeatureBit` IntEnum (source of + truth, matching the Python table in the registry doc); `get_user_agent()` + stays base-only. Unit tests for the live/idempotent stamper. +2. **First-party per-request hooks** — add the httpx `event_hooks` request hook + (gated on `use_azure` in `agent_framework_openai/_shared.py`) and the + azure-core `SansIOHTTPPolicy` (for `AIProjectClient`/`SearchClient`/Cosmos). + Verify against a real Foundry call that the UA carries a **non-empty, + post-construction** mask. **Do not** add hooks to third-party clients. +3. **Mark feature usage** — call `mark_feature_used(FeatureBit.X)` once per + feature, the first time it is exercised: at the **`Raw*` base client/entry + point** per package (e.g. `RawFoundryChatClient`) so every higher-level + wrapper inherits the marking, and in the `__init__` of **each** core + construct that owns a bit — including every built-in context/history provider + (memory, skills, file-access, compaction, todo, agent-mode, background-agents, + in-memory/file history) and each orchestration builder. Marking is universal; + emission stays first-party-only. +4. **.NET parity** — hand-written `FeatureBit : ulong` enum (source of truth for + the .NET table); `FeatureUsage.MarkUsed` with lock-free `Interlocked.Or`; + extend the existing per-request UA policy to stamp `(feat=...)` **only on + Azure/Foundry clients**. The .NET enum is **independent** of Python's. +5. **Docs & tests** — update package `AGENTS.md`/skills; tests for the UA opt-out, + first-party scoping, and the live (non-frozen) UA. 
+ +## Limitations & open questions + +The decision-level limitations and unresolved trade-offs — privacy review +(blocking), reach, per-process (not per-call) attribution, coarse granularity, +fingerprinting residue, and the dedicated-opt-out / OTel questions — are owned by +the ADR. See **[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)** +and **[Open Questions](../decisions/0027-feature-usage-bitmask-user-agent.md#open-questions-for-decider-discussion)**. +This spec is the implementation reference; it does not re-litigate those choices. + +Implementation-only note: + +- **Per-request hook overhead is negligible** (a flag check, a lock-free read of + the mask, and a string concat per first-party request), but benchmark the hot + path once if a high-QPS Foundry scenario is in scope. diff --git a/docs/specs/feature-usage-bit-registry.md b/docs/specs/feature-usage-bit-registry.md new file mode 100644 index 00000000000..ce70d669280 --- /dev/null +++ b/docs/specs/feature-usage-bit-registry.md @@ -0,0 +1,210 @@ +# Feature-usage bit registry (per-language) + +> **Status:** draft, accompanies [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md) +> and [SPEC-002](002-feature-usage-telemetry.md). +> **Version:** `1` per language · **Width:** 64-bit + +This document is the human-readable registry for the feature-usage mask. The +**source of truth for each SDK is its own hand-written `FeatureBit` enum**; the +tables below are the published contract a decoder (or a human) uses to turn a +mask back into feature names. Keep the enum and the matching table in sync in the +same PR — review is the check; there is no generated artifact. + +This telemetry is intentionally **transparent**: this registry is public, the +emitted value is human-decodable, and the existing User-Agent opt-out disables it. 
 + +## What is collected + +A single 64-bit integer (the *feature mask*) describing **which Agent Framework +features were exercised** in a process — not which packages are installed. +**Granularity is per package**, with core broken out per feature — each agent, +workflow engine, MCP, orchestration pattern, and **each individual built-in +context/history provider** gets its own bit, because they serve different +purposes and we want to know which are used. A feature sets its bit the first +time it is genuinely used; the SDK ORs the bits together and emits the value. + +No identifiers, arguments, prompts, payloads, or user data are encoded — only the +coarse boolean "this feature was used" per registered bit. + +## Per-language, not shared + +The two tables below are **independent**. Bit indexes are **not** shared across +languages — Python bit 13 and .NET bit 13 do not mean the same thing. This is +deliberate: the User-Agent product token already names the language +(`agent-framework-python` vs `agent-framework-dotnet`), so a decoder selects the +right table from the UA and decodes against it. Each SDK numbers and evolves its +features independently — no cross-language synchronization, no null placeholders, +no "same bit, same meaning" rule. + +## Encoding + +- **Width:** 64-bit unsigned integer per language. +- **Versioning:** the emission carries the version so a decoder knows the bit + mapping in effect (version is per language). +- **User-Agent:** the mask is an RFC 7231 **comment** (metadata, not a product + token), placed after the agent-framework product token: + + ```text + agent-framework-python/1.2.3 (feat=v1.<hex>) + ``` + + where `<hex>` is lowercase hex, no leading zeros, no `0x` prefix. Example + for bits 0, 1, 5 set (`0b100011 = 0x23`): + + ```text + agent-framework-python/1.2.3 (feat=v1.23) + ``` + +- **Decoding:** read the **language** from the product token, pick that table; + read `vN`, pick that version; `AND` the hex mask against each bit.
Unknown high + bits (newer SDK than the decoder's copy) are ignored. + +## Emission scope (where the mask is sent) + +- **Marking is universal:** every feature sets its bit the first time it is used, + regardless of provider. +- **User-Agent `(feat=...)` comment — first-party only, per request.** Stamped + only on requests to **Azure / Foundry** endpoints (the telemetry the team can + ingest), re-evaluated **per request** so it reflects the live mask. It is + **never** sent to third-party providers — a feature fingerprint must not leak + into logs we cannot read. See [SPEC-002](002-feature-usage-telemetry.md#emission). +- **OpenTelemetry: not in v1.** Deferred primarily for privacy (a span attribute + would broadcast the fingerprint into the user's general telemetry / third-party + APM vendors). Left open behind the version prefix; see + [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md#considered-options). + +## Bit table — Python (`agent-framework-python`, version 1) + +Layout: core feature + provider bits 0–15 (contiguous, with room to grow), +orchestration patterns 16–21, provider/integration packages from 22. 
+ +| Bit | Id | Feature | Marked at (representative) | +| --- | --- | --- | --- | +| 0 | `core.agent` | Agent | `agent_framework.Agent` | +| 1 | `core.harness_agent` | Harness agent | `agent_framework.create_harness_agent` | +| 2 | `core.workflow` | Workflow engine (custom graphs) | `agent_framework.WorkflowBuilder` | +| 3 | `core.mcp` | MCP tool (any transport) | `agent_framework.MCPStdioTool` | +| 4 | `core.tool_approval` | Tool-approval harness | `agent_framework.ToolApprovalMiddleware` | +| 5 | `core.memory_provider` | Memory context provider | `agent_framework.MemoryContextProvider` | +| 6 | `core.skills_provider` | Skills provider | `agent_framework.SkillsProvider` | +| 7 | `core.file_access_provider` | File-access provider | `agent_framework.FileAccessProvider` | +| 8 | `core.compaction_provider` | Context compaction provider | `agent_framework.CompactionProvider` | +| 9 | `core.todo_provider` | Todo provider | `agent_framework.TodoProvider` | +| 10 | `core.agent_mode_provider` | Agent-mode provider | `agent_framework.AgentModeProvider` | +| 11 | `core.background_agents_provider` | Background-agents provider | `agent_framework.BackgroundAgentsProvider` | +| 12 | `core.in_memory_history_provider` | In-memory history provider | `agent_framework.InMemoryHistoryProvider` | +| 13 | `core.file_history_provider` | File history provider | `agent_framework.FileHistoryProvider` | +| 14–15 | _reserved_ | growth | — | +| 16 | `orchestration.sequential` | Sequential orchestration | `agent_framework_orchestrations.SequentialBuilder` | +| 17 | `orchestration.concurrent` | Concurrent orchestration | `agent_framework_orchestrations.ConcurrentBuilder` | +| 18 | `orchestration.group_chat` | Group-chat orchestration | `agent_framework_orchestrations.GroupChatBuilder` | +| 19 | `orchestration.magentic` | Magentic orchestration | `agent_framework_orchestrations.MagenticBuilder` | +| 20 | `orchestration.handoff` | Handoff orchestration | 
`agent_framework_orchestrations.HandoffBuilder` | +| 21 | _reserved_ | growth | — | +| 22 | `foundry.chat_client` | Foundry chat client | `agent_framework_foundry` `RawFoundryChatClient` | +| 23 | `foundry.agent` | Foundry agent | `agent_framework_foundry.FoundryAgent` | +| 24 | `foundry.memory` | Foundry memory provider | `agent_framework_foundry.FoundryMemoryProvider` | +| 25 | `foundry_local` | Foundry Local client | `agent_framework_foundry_local.FoundryLocalClient` | +| 26 | `foundry_hosting` | Foundry hosting layer | `agent_framework_foundry_hosting` | +| 27 | `openai` | OpenAI clients | `agent_framework_openai` | +| 28 | `anthropic` | Anthropic clients | `agent_framework_anthropic` | +| 29 | `bedrock` | AWS Bedrock clients | `agent_framework_bedrock` | +| 30 | `gemini` | Gemini chat client | `agent_framework_gemini` | +| 31 | `mistral` | Mistral embedding client | `agent_framework_mistral` | +| 32 | `ollama` | Ollama clients | `agent_framework_ollama` | +| 33 | `claude` | Claude Agent SDK agent | `agent_framework_claude` | +| 34 | `copilotstudio` | Copilot Studio agent | `agent_framework_copilotstudio` | +| 35 | `github_copilot` | GitHub Copilot agent | `agent_framework_github_copilot` | +| 36 | `azure_ai_search` | Azure AI Search context provider | `agent_framework_azure_ai_search` | +| 37 | `azure_cosmos` | Azure Cosmos history / checkpoint store | `agent_framework_azure_cosmos` | +| 38 | `azure_contentunderstanding` | Azure Content Understanding context provider | `agent_framework_azure_contentunderstanding` | +| 39 | `redis` | Redis context / history provider | `agent_framework_redis` | +| 40 | `mem0` | Mem0 memory provider | `agent_framework_mem0` | +| 41 | `purview` | Purview client | `agent_framework_purview` | +| 42 | `a2a` | A2A agent / executor | `agent_framework_a2a` | +| 43 | `ag_ui` | AG-UI chat client / agent | `agent_framework_ag_ui` | +| 44 | `chatkit` | ChatKit integration | `agent_framework_chatkit` | +| 45 | `devui` | DevUI served | 
`agent_framework_devui` | +| 46 | `declarative` | Declarative agent / workflow | `agent_framework_declarative` | +| 47 | `durabletask` | Durable task runtime | `agent_framework_durabletask` | +| 48 | `azurefunctions` | Azure Functions agent host | `agent_framework_azurefunctions` | +| 49 | `tools` | Shell tools | `agent_framework_tools.shell` | +| 50 | `monty` | Monty CodeAct provider | `agent_framework_monty` | +| 51 | `hyperlight` | Hyperlight CodeAct provider | `agent_framework_hyperlight` | +| 52–63 | _reserved_ | future packages | — | + +## Bit table — .NET (`agent-framework-dotnet`, version 1) + +| Bit | Id | Feature | Marked at (representative) | +| --- | --- | --- | --- | +| 0 | `core.agent` | Agent | `Microsoft.Agents.AI.ChatClientAgent` | +| 1 | `core.harness_agent` | Harness agent | `Microsoft.Agents.AI.HarnessAgent` | +| 2 | `core.workflow` | Workflow engine (custom graphs) | `Microsoft.Agents.AI.Workflows.WorkflowBuilder` | +| 3 | `core.tool_approval` | Tool-approval agent | `Microsoft.Agents.AI.ToolApprovalAgent` | +| 4 | `core.chat_history_memory_provider` | Chat-history memory provider | `Microsoft.Agents.AI.ChatHistoryMemoryProvider` | +| 5 | `core.file_memory_provider` | File memory provider | `Microsoft.Agents.AI.FileMemoryProvider` | +| 6 | `core.text_search_provider` | Text-search provider | `Microsoft.Agents.AI.TextSearchProvider` | +| 7 | `core.file_access_provider` | File-access provider | `Microsoft.Agents.AI.FileAccessProvider` | +| 8 | `core.skills_provider` | Skills provider | `Microsoft.Agents.AI.AgentSkillsProviderBuilder` | +| 9 | `core.compaction_provider` | Context compaction provider | `Microsoft.Agents.AI.Compaction.CompactionProvider` | +| 10 | `core.todo_provider` | Todo provider | `Microsoft.Agents.AI.TodoProvider` | +| 11 | `core.agent_mode_provider` | Agent-mode provider | `Microsoft.Agents.AI.AgentModeProvider` | +| 12 | `core.background_agents_provider` | Background-agents provider | 
`Microsoft.Agents.AI.BackgroundAgentsProvider` | +| 13 | `core.in_memory_history_provider` | In-memory history provider | `Microsoft.Agents.AI.InMemoryChatHistoryProvider` | +| 14–15 | _reserved_ | growth | — | +| 16 | `orchestration.sequential` | Sequential orchestration | `Microsoft.Agents.AI.Workflows.SequentialWorkflowBuilder` | +| 17 | `orchestration.concurrent` | Concurrent orchestration | `Microsoft.Agents.AI.Workflows.ConcurrentWorkflowBuilder` | +| 18 | `orchestration.group_chat` | Group-chat orchestration | `Microsoft.Agents.AI.Workflows.GroupChatWorkflowBuilder` | +| 19 | `orchestration.magentic` | Magentic orchestration | `Microsoft.Agents.AI.Workflows.MagenticWorkflowBuilder` | +| 20 | `orchestration.handoff` | Handoff orchestration | `Microsoft.Agents.AI.Workflows.HandoffWorkflowBuilder` | +| 21 | _reserved_ | growth | — | +| 22 | `foundry.chat_client` | Foundry chat client | `Microsoft.Agents.AI.Foundry.FoundryChatClient` | +| 23 | `foundry.agent` | Foundry agent | `Microsoft.Agents.AI.Foundry.FoundryAgent` | +| 24 | `foundry.memory` | Foundry memory provider | `Microsoft.Agents.AI.Foundry.FoundryMemoryProvider` | +| 25 | `foundry_hosting` | Foundry hosting layer | `Microsoft.Agents.AI.Foundry.Hosting` | +| 26 | `openai` | OpenAI integration | `Microsoft.Agents.AI.OpenAI` | +| 27 | `anthropic` | Anthropic integration | `Microsoft.Agents.AI.Anthropic` | +| 28 | `copilotstudio` | Copilot Studio agent | `Microsoft.Agents.AI.CopilotStudio.CopilotStudioAgent` | +| 29 | `github_copilot` | GitHub Copilot agent | `Microsoft.Agents.AI.GitHub.Copilot.GitHubCopilotAgent` | +| 30 | `azure_cosmos` | Cosmos history / checkpoint store | `Microsoft.Agents.AI.CosmosChatHistoryProvider` | +| 31 | `valkey` | Valkey chat-history provider | `Microsoft.Agents.AI.Valkey.ValkeyChatHistoryProvider` | +| 32 | `mem0` | Mem0 memory provider | `Microsoft.Agents.AI.Mem0.Mem0Provider` | +| 33 | `purview` | Purview integration | `Microsoft.Agents.AI.Purview` | +| 34 | `a2a` | A2A 
agent | `Microsoft.Agents.AI.A2A.A2AAgent` | +| 35 | `ag_ui` | AG-UI chat client | `Microsoft.Agents.AI.AGUI.AGUIChatClient` | +| 36 | `devui` | DevUI served | `Microsoft.Agents.AI.DevUI` | +| 37 | `declarative` | Declarative agent factory | `Microsoft.Agents.AI.ChatClientPromptAgentFactory` | +| 38 | `durabletask` | Durable task runtime | `Microsoft.Agents.AI.DurableTask` | +| 39 | `azurefunctions` | Azure Functions agent host | `Microsoft.Agents.AI.Hosting.AzureFunctions` | +| 40 | `tools` | Shell tools | `Microsoft.Agents.AI.Tools.Shell.ShellExecutor` | +| 41 | `hyperlight` | Hyperlight CodeAct provider | `Microsoft.Agents.AI.Hyperlight.HyperlightCodeActProvider` | +| 42 | `hosting` | Generic AF hosting | `Microsoft.Agents.AI.Hosting` | +| 43–63 | _reserved_ | future packages | — | + +## Opt-out + +The mask is part of the User-Agent contribution, so the existing flag covers it — +no dedicated flag in v1: + +- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the entire Agent + Framework User-Agent contribution (mask included). + +(If a privacy review later requires keeping the base UA while dropping only the +mask, a dedicated flag can be added then.) + +## Governance + +1. One bit per package/feature, **numbered independently per language**, in the + table for that language. New bits are added by editing this file in a reviewed + PR; bits are never reused within a `(language, version)`. +2. The **`FeatureBit` enum in each SDK is the source of truth**; the matching + table here is the published contract. Add the enum member and the table row in + the same PR — review keeps them aligned (no generated artifact). +3. Adding a feature: add the enum member, add the table row, mark it at the call + site (the `Raw*` base / entry point so wrappers inherit it). +4. Widening beyond 64-bit or re-partitioning bumps that language's version; old + decoders keep working because the version prefix disambiguates the mapping. 
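
To make the decoding rule and the role of the version prefix concrete, here is a minimal decoder sketch. It assumes the comment directly follows the agent-framework product token, as described under Encoding. `TABLES` is a hypothetical three-row excerpt of the two published tables, and `decode_features` is an illustrative name, not an API that ships in either SDK.

```python
from __future__ import annotations

import re

# Hypothetical excerpt of the published tables, keyed by (language, version).
# Bit 5 is deliberately included because it differs between the two languages.
TABLES: dict[tuple[str, int], dict[int, str]] = {
    ("python", 1): {0: "core.agent", 1: "core.harness_agent", 5: "core.memory_provider"},
    ("dotnet", 1): {0: "core.agent", 1: "core.harness_agent", 5: "core.file_memory_provider"},
}

# Product token, then the RFC 7231 comment carrying version and hex mask.
_UA = re.compile(
    r"agent-framework-(?P<lang>[a-z]+)/\S+ \(feat=v(?P<ver>\d+)\.(?P<mask>[0-9a-f]+)\)"
)


def decode_features(user_agent: str) -> list[str]:
    """Return the feature ids whose bits are set, using the table for that language/version."""
    match = _UA.search(user_agent)
    if match is None:
        return []  # no mask emitted (third-party client, opt-out, or empty mask)
    table = TABLES.get((match["lang"], int(match["ver"])))
    if table is None:
        return []  # unknown language or version: do not guess at the mapping
    mask = int(match["mask"], 16)
    # Bits missing from the table (e.g. from a newer SDK) are simply skipped.
    return [name for bit, name in sorted(table.items()) if mask & (1 << bit)]
```

With this excerpt, the same mask `23` decodes to `core.memory_provider` for Python but `core.file_memory_provider` for .NET at bit 5, which is why the language must be read from the product token before the table is chosen.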
+ +> **No machine-readable registry file ships today.** Nothing consumes one at +> runtime (each SDK owns its enum). If/when a programmatic decoder is built, this +> table is the contract to export to JSON for it then. From e5cc3d0abc2ee66c507dbdcd132c4f4902fa2cf0 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Fri, 12 Jun 2026 16:37:56 +0200 Subject: [PATCH 2/6] docs: fix dead links to removed registry JSON in ADR-0027 The registry JSON was consolidated into feature-usage-bit-registry.md; point the ADR's two remaining links at the markdown instead of the deleted file (fixes markdown-link-check 404s). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions/0027-feature-usage-bitmask-user-agent.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md index d9ebff92d6c..b806478ea57 100644 --- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -19,8 +19,8 @@ can we collect a lightweight, privacy-respecting signal of feature usage for the traffic we can actually read, without standing up new event pipelines? The detailed mechanism is in [SPEC-002](../specs/002-feature-usage-telemetry.md); -the per-language bit lists are in -[`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json). +the per-language bit tables are in +[feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md). 
## Decision Drivers @@ -274,8 +274,7 @@ These are unresolved and should be decided before/at approval: ## More Information - Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md) -- Per-language bit lists: [`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json) -- Encoding / opt-out / governance prose: [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md) +- Per-language bit tables, encoding, opt-out, governance: [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md) - Existing accumulator pattern: `python/packages/core/agent_framework/_telemetry.py` - .NET emission policies: `dotnet/src/Microsoft.Agents.AI.Foundry/AgentFrameworkUserAgentPolicy.cs`, `dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/HostedAgentUserAgentPolicy.cs` From 676d8845795c864bcd23032e30655824adbe4e10 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Fri, 12 Jun 2026 16:42:26 +0200 Subject: [PATCH 3/6] =?UTF-8?q?docs:=20address=20review=20=E2=80=94=20drop?= =?UTF-8?q?=20JSON-parity=20wording,=20clarify=20per-language=20decode?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - ADR option J: the parity test compares the enum against the per-language table in the registry doc, not a (now-removed) JSON file. - Spec .NET mapping: the wire format is shared, but the mask is decoded per-language (select the table via the UA product token) — fixes the "decoded numbers mean the same thing in both SDKs" wording that conflicted with the per-language, non-synchronized bit indexes. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions/0027-feature-usage-bitmask-user-agent.md | 4 ++-- docs/specs/002-feature-usage-telemetry.md | 8 +++++--- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md index b806478ea57..948fd98362a 100644 --- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -139,8 +139,8 @@ already present in the UA product token. #### J. Hand-written enum + parity test (chosen) -- Good, ~40 members that change a few times a year; a 10-line test (enum vs JSON - list) is enough. +- Good, ~40 members that change a few times a year; a 10-line test (the enum vs + the per-language table in the registry doc) is enough. - Good, no build step, no generator to own. #### K. Code-generate the enums from the registry diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md index e2b52cdbfcf..68bfbba90ae 100644 --- a/docs/specs/002-feature-usage-telemetry.md +++ b/docs/specs/002-feature-usage-telemetry.md @@ -281,9 +281,11 @@ AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py `(feat=...)` comment, and register the feat-stamping policy **only on Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party `IChatClient`s. -- Same `v.` comment format ⇒ decoded numbers mean the same thing in - both SDKs. (.NET's policy was already per-request, so there is no Python/.NET - timing asymmetry.) +- Same **wire format** (`v.` comment, hex encoding) in both SDKs — + but the **mask is decoded per language**: indexes are not shared, so a decoder + must read the language from the UA product token and select that language's + table before decoding. (.NET's policy was already per-request, so there is no + Python/.NET timing asymmetry.) 
## Keeping the bitmap in sync From c945b33a137eb091988a69785e87a104993fefc3 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 15 Jun 2026 17:48:33 +0200 Subject: [PATCH 4/6] docs: add dedicated mask-only opt-out env var (AGENT_FRAMEWORK_FEATURE_MASK_DISABLED) Re-introduce a dedicated opt-out that disables only the feature mask while keeping the base agent-framework-/{version} User-Agent, alongside the existing AGENT_FRAMEWORK_USER_AGENT_DISABLED (whole UA). Updates the spec accumulator gate, API surface, opt-out table and examples; the registry opt-out section; and the ADR (decision outcome, consequences, open questions -> decided). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../0027-feature-usage-bitmask-user-agent.md | 41 ++++++---- docs/specs/002-feature-usage-telemetry.md | 74 ++++++++++++------- docs/specs/feature-usage-bit-registry.md | 15 ++-- 3 files changed, 83 insertions(+), 47 deletions(-) diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md index 948fd98362a..7c1094f0d5b 100644 --- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -191,18 +191,22 @@ This is the smallest design that answers the question. A 64-bit mask accumulates from universal `mark_feature_used()` calls; the token is stamped per request only on Azure/Foundry clients (live, no third-party leak); each SDK owns an independent bit list selected by the language already in the UA; the mask is -rendered as hex (`feat=v1.8410005`). OTel (C) is deferred — mainly because a -broadly-emitted span attribute would leak the fingerprint into the user's general -telemetry, against the first-party-only stance — but left open behind the version -prefix. 
Per-construct granularity (G), a shared registry (I), codegen (K), and the -decimal/binary/base-N representations (L, N, O) are rejected as complexity or -length the problem does not require. +rendered as hex (`feat=v1.8410005`). **Two opt-out env vars are provided:** a +dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only the mask while +keeping the base SDK identity/version User-Agent, and the existing +`AGENT_FRAMEWORK_USER_AGENT_DISABLED` that drops the whole contribution. OTel (C) +is deferred — mainly because a broadly-emitted span attribute would leak the +fingerprint into the user's general telemetry, against the first-party-only +stance — but left open behind the version prefix. Per-construct granularity (G), +a shared registry (I), codegen (K), and the decimal/binary/base-N representations +(L, N, O) are rejected as complexity or length the problem does not require. ### Consequences - Good, adds usage signal at near-zero cost, no new data flow, few moving parts. -- Good, transparent (public registry, human-decodable token) and disabled by the - existing User-Agent opt-out. +- Good, transparent (public registry, human-decodable token) and disabled by + **two** opt-out env vars: a dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` + (mask only) and the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (whole UA). - Good, first-party-only + per-request emission gives a live mask and no third-party fingerprint leak. - Good, 64-bit keeps .NET lock-free; per-language lists remove all cross-language @@ -257,20 +261,25 @@ These are unresolved and should be decided before/at approval: 1. **Privacy / telemetry-acceptance review (blocking).** Is a coarse, first-party-only, opt-out-able feature-combination mask acceptable telemetry? Even scoped, it transits intermediaries and is a deployment fingerprint. This - is a **release precondition**. 
Possible outcomes that would change the design: - require a dedicated opt-out flag (Q2), coarser granularity, hashing, or - explicit opt-in. -2. **Dedicated opt-out flag?** v1 reuses `AGENT_FRAMEWORK_USER_AGENT_DISABLED` - (mask dies with the whole UA). Do we add a mask-only flag now (keep base UA, - drop the fingerprint), or wait until asked / until the privacy review requires - it? -3. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a + is a **release precondition**. Possible outcomes that would further change the + design: coarser granularity, hashing, or explicit opt-in (a dedicated mask-only + opt-out flag is already included — see below). +2. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a span attribute broadcasts the fingerprint into the user's general telemetry and onward to third-party APM vendors, contradicting the first-party-only stance. It also carries a metric-cardinality hazard. Would the privacy review allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when to revisit. +### Decided + +- **Dedicated opt-out flag — included.** In addition to the existing + `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (drops the whole UA), v1 ships + `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`, which drops **only** the feature mask + while keeping the base SDK identity/version User-Agent. This lets a + privacy-conscious user withhold the usage signal without losing the + support/compat value of the SDK-version header. + ## More Information - Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md) diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md index 68bfbba90ae..c09d48e635f 100644 --- a/docs/specs/002-feature-usage-telemetry.md +++ b/docs/specs/002-feature-usage-telemetry.md @@ -46,8 +46,10 @@ per-request requirement below). Secondary: ability to break down first-party traffic by feature combination (e.g. 
"% of Foundry traffic that also uses workflows"). -This is done **transparently**: the bit registry is public and the emitted value -is human-decodable, and the existing User-Agent opt-out disables it. +This is done **transparently**: the bit registry is public, the emitted value is +human-decodable, and two env vars disable it — a dedicated +`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` (mask only) and the existing +`AGENT_FRAMEWORK_USER_AGENT_DISABLED` (whole User-Agent). ## What is the problem being solved? @@ -66,29 +68,38 @@ we already send is far cheaper and easier to reason about for privacy. The accumulator and its helpers live in the existing `agent_framework/_telemetry.py` (alongside `get_user_agent()` / `prepend_agent_framework_to_user_agent()`), so the User-Agent machinery stays in -one module. It owns a process-global 64-bit accumulator. The existing -`AGENT_FRAMEWORK_USER_AGENT_DISABLED` flag (`IS_TELEMETRY_ENABLED` in that module) -already gates the whole User-Agent contribution, so it gates the mask too — no -new env var: +one module. It owns a process-global 64-bit accumulator. 
Two env vars can disable +it: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (which drops the whole +User-Agent contribution, mask included), and a **dedicated** +`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops **only** the feature mask while +keeping the base `agent-framework-python/{version}` User-Agent: ```python # agent_framework/_telemetry.py (same module as get_user_agent) # IS_TELEMETRY_ENABLED already defined here (AGENT_FRAMEWORK_USER_AGENT_DISABLED) +FEATURE_MASK_DISABLED_ENV_VAR = "AGENT_FRAMEWORK_FEATURE_MASK_DISABLED" REGISTRY_VERSION = 1 _feature_mask = 0 _feature_mask_lock = threading.Lock() +def _feature_mask_enabled() -> bool: + """Mask is on unless the UA is disabled or the dedicated flag is set.""" + if not IS_TELEMETRY_ENABLED: + return False + return os.environ.get(FEATURE_MASK_DISABLED_ENV_VAR, "false").lower() not in ("true", "1") + + def mark_feature_used(bit: int) -> None: """OR a feature bit into the process-global mask. Called the first time a feature is exercised. Cheap and idempotent; - a no-op when the User-Agent contribution is disabled. + a no-op when the feature mask is disabled. """ global _feature_mask - if not IS_TELEMETRY_ENABLED: + if not _feature_mask_enabled(): return with _feature_mask_lock: _feature_mask |= 1 << bit @@ -96,7 +107,7 @@ def mark_feature_used(bit: int) -> None: def get_feature_token() -> str | None: """Return ``v.`` for the accumulated mask, or None.""" - if not IS_TELEMETRY_ENABLED or _feature_mask == 0: + if not _feature_mask_enabled() or _feature_mask == 0: return None return f"v{REGISTRY_VERSION}.{_feature_mask:x}" ``` @@ -215,9 +226,10 @@ New public surface in `agent-framework-core` (exported from used by first-party per-request hooks. - `FeatureBit` (IntEnum) — hand-written source of truth for the Python bit list (see [Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)). 
+- `FEATURE_MASK_DISABLED_ENV_VAR` constant — the dedicated mask-only opt-out env + var name (`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`). -No new env var: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` disables the -mask along with the rest of the User-Agent contribution. +Two independent opt-outs gate the mask; see [Opt-out](#opt-out). Behavioural change to existing API: @@ -231,15 +243,17 @@ first-party client, output is byte-for-byte identical to today. ## Opt-out -The mask is part of the User-Agent contribution, so the existing flag covers it — -no new env var in v1: +Two independent env vars, so users can drop just the mask or the whole UA: | Env var | Effect | | --- | --- | +| `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` | disables **only** the feature mask; the base `agent-framework-python/{version}` User-Agent is still sent | | `AGENT_FRAMEWORK_USER_AGENT_DISABLED` | disables the **entire** AF User-Agent contribution, mask included | -(If a privacy review later requires keeping the base UA while dropping only the -mask, a dedicated flag can be added then — not built speculatively now.) +Both accept `true`/`1` (case-insensitive). The dedicated flag lets a +privacy-conscious user keep contributing the SDK identity/version (useful for +support and compat triage) while withholding the feature-usage signal. The mask +is also disabled implicitly whenever the whole User-Agent is. ## E2E example @@ -262,7 +276,14 @@ await other.run("Hi") # User-Agent: agent-framework-python/1.2.3 ``` -Disabling the User-Agent contribution (mask included): +Drop only the feature mask (keep the base User-Agent): + +```bash +AGENT_FRAMEWORK_FEATURE_MASK_DISABLED=true python app.py +# Foundry request User-Agent: agent-framework-python/1.2.3 (no (feat=...) 
comment) +``` + +Drop the entire User-Agent contribution (mask included): ```bash AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py @@ -281,11 +302,12 @@ AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py `(feat=...)` comment, and register the feat-stamping policy **only on Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party `IChatClient`s. -- Same **wire format** (`v.` comment, hex encoding) in both SDKs — - but the **mask is decoded per language**: indexes are not shared, so a decoder - must read the language from the UA product token and select that language's - table before decoding. (.NET's policy was already per-request, so there is no - Python/.NET timing asymmetry.) +- Same **wire format** (`v.` comment, hex encoding) and the same + two opt-out env vars (`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`, + `AGENT_FRAMEWORK_USER_AGENT_DISABLED`) in both SDKs — but the **mask is decoded + per language**: indexes are not shared, so a decoder must read the language from + the UA product token and select that language's table before decoding. (.NET's + policy was already per-request, so there is no Python/.NET timing asymmetry.) ## Keeping the bitmap in sync @@ -339,15 +361,17 @@ the table) are ignored. the .NET table); `FeatureUsage.MarkUsed` with lock-free `Interlocked.Or`; extend the existing per-request UA policy to stamp `(feat=...)` **only on Azure/Foundry clients**. The .NET enum is **independent** of Python's. -5. **Docs & tests** — update package `AGENTS.md`/skills; tests for the UA opt-out, - first-party scoping, and the live (non-frozen) UA. +5. **Docs & tests** — update package `AGENTS.md`/skills; tests for **both** + opt-out env vars (mask-only and whole-UA), first-party scoping, and the live + (non-frozen) UA. 
## Limitations & open questions The decision-level limitations and unresolved trade-offs — privacy review (blocking), reach, per-process (not per-call) attribution, coarse granularity, -fingerprinting residue, and the dedicated-opt-out / OTel questions — are owned by -the ADR. See **[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)** +fingerprinting residue, and the OTel question — are owned by the ADR (the +dedicated mask-only opt-out is now decided and included). See +**[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)** and **[Open Questions](../decisions/0027-feature-usage-bitmask-user-agent.md#open-questions-for-decider-discussion)**. This spec is the implementation reference; it does not re-litigate those choices. diff --git a/docs/specs/feature-usage-bit-registry.md b/docs/specs/feature-usage-bit-registry.md index ce70d669280..6c486ea7586 100644 --- a/docs/specs/feature-usage-bit-registry.md +++ b/docs/specs/feature-usage-bit-registry.md @@ -11,7 +11,8 @@ mask back into feature names. Keep the enum and the matching table in sync in th same PR — review is the check; there is no generated artifact. This telemetry is intentionally **transparent**: this registry is public, the -emitted value is human-decodable, and the existing User-Agent opt-out disables it. +emitted value is human-decodable, and two env vars disable it (mask-only or the +whole User-Agent — see [Opt-out](#opt-out)). ## What is collected @@ -183,14 +184,16 @@ orchestration patterns 16–21, provider/integration packages from 22. 
## Opt-out -The mask is part of the User-Agent contribution, so the existing flag covers it — -no dedicated flag in v1: +Two independent environment variables disable the mask: -- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the entire Agent +- `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED=true|1` — drops **only** the feature + mask; the base `agent-framework-/{version}` User-Agent is still sent. +- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the **entire** Agent Framework User-Agent contribution (mask included). -(If a privacy review later requires keeping the base UA while dropping only the -mask, a dedicated flag can be added then.) +The dedicated flag lets a privacy-conscious user keep contributing SDK +identity/version (useful for support and compatibility triage) while withholding +the feature-usage signal. ## Governance From e075a78625d738d281a1b79fa4aa4d3d5955cb64 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 15 Jun 2026 17:52:22 +0200 Subject: [PATCH 5/6] docs: add prior-art comparison (AWS botocore m/, Stainless, Azure, etc.) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add a Prior art section to ADR-0027 surveying how comparable SDKs encode identity/usage in the User-Agent or sidecar headers, with citations: - AWS botocore `m/` feature-code list — the direct analog (per-request, usage-based feature flags in the UA); contrasts short-code set vs our hex bitmask. - OpenAI/Anthropic Stainless `X-Stainless-*` headers (static identity). - Azure azure-core UserAgentPolicy + AZURE_TELEMETRY_DISABLED. - Google x-goog-api-client; LangSmith version token + tracing opt-in. Also add an Open Question on honoring the cross-tool DO_NOT_TRACK convention. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../0027-feature-usage-bitmask-user-agent.md | 47 +++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md index 7c1094f0d5b..08fb3bd7c8e 100644 --- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -216,6 +216,47 @@ a shared registry (I), codegen (K), and the decimal/binary/base-N representation - Bad, each feature must add a `mark_feature_used()` call, and first-party clients need a per-request hook (small, mirrors existing patterns). +## Prior art + +SDK telemetry-in-the-User-Agent is well-established; this design is closest to +AWS's, and conventional in the rest. Summary of what comparable SDKs do: + +| SDK | What's in the UA / headers | Usage-based? | Opt-out | Closest to ours? | +| --- | --- | --- | --- | --- | +| **AWS botocore** | structured UA with an `m/` token: a per-request set of **short feature codes** for features actually exercised (`WAITER`→`B`, `PAGINATOR`→`C`, retry mode, checksums, credential source, …) | **Yes** — registered at call time via `register_feature_id`, contextvar-scoped per request | `AWS_SDK_UA_APP_ID` sets app id (no opt-out for `m/`) | **Yes — direct analog** | +| **OpenAI / Anthropic** (Stainless) | sidecar `X-Stainless-*` headers: lang, package version, OS, arch, runtime, runtime version; plus per-request `x-stainless-retry-count`, `x-stainless-read-timeout` | Mostly static identity (retry/timeout are per-request) | none | No (static identity) | +| **Azure SDK** (`azure-core`) | `User-Agent: azsdk-python-{pkg}/{ver} Python/{pyver} ({platform})` | No | `AZURE_TELEMETRY_DISABLED` (tracing spans only, **not** the UA) | No | +| **Google API core** | `x-goog-api-client: gl-python/… grpc/… gax/… gapic/…` | No | none | No | +| **LangSmith** | `User-Agent: 
langsmith-py/{ver}`; usage lives in trace payloads | No (header) | opt-in via `LANGSMITH_TRACING_V2`/`LANGCHAIN_TRACING_V2`; `…HIDE_INPUTS/OUTPUTS` | No | + +Takeaways that shaped (or validate) our choices: + +- **AWS `m/` is the precedent for usage-based feature flags in a first-party + User-Agent.** It validates the core idea. Its key *difference* is the encoding: + AWS uses a **comma-separated set of 1–2 char short codes** (open-ended, no bit + coordination, but variable length), whereas we use a fixed-width **hex + bitmask** (compact, bounded, decode-by-AND, but needs per-language bit + allocation). We keep the bitmask for boundedness and trivial AND-decoding; + AWS's short-code set is recorded as a viable alternative if bit-position + coordination ever becomes painful (it would also drop the 64-bit ceiling). +- **First-party-only emission** is stricter than any of the above; the closest in + spirit is Stainless headers, which only reach the owning API. We make the + hostname/endpoint allowlist explicit (Azure/Foundry only). +- **Opt-out naming.** `AZURE_TELEMETRY_DISABLED` is the family precedent for our + `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool + the cross-tool `DO_NOT_TRACK` convention (honored by e.g. + HuggingFace Hub) is worth considering — see Open Questions. 
+ +Sources: botocore [`useragent.py`](https://github.com/boto/botocore/blob/develop/botocore/useragent.py) +(`_USERAGENT_FEATURE_MAPPINGS`, `register_feature_id`, `_build_feature_metadata`); +openai-python [`_base_client.py` `platform_headers()`](https://github.com/openai/openai-python/blob/main/src/openai/_base_client.py); +anthropic-sdk-python [`_base_client.py`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/_base_client.py); +azure-core [`_universal.py` `UserAgentPolicy`](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/azure/core/pipeline/policies/_universal.py); +google-api-core [`client_info.py`](https://github.com/googleapis/python-api-core/blob/main/google/api_core/client_info.py); +langsmith-sdk [`client.py`](https://github.com/langchain-ai/langsmith-sdk/blob/main/python/langsmith/client.py) / +[`utils.py`](https://github.com/langchain-ai/langsmith-sdk/blob/main/python/langsmith/utils.py); +huggingface_hub [`constants.py`](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/constants.py). + ## Registry versioning and migration (v1 → v2) The token carries a **per-language** version (`feat=v1.`); a version bump is @@ -270,6 +311,12 @@ These are unresolved and should be decided before/at approval: stance. It also carries a metric-cardinality hazard. Would the privacy review allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when to revisit. +3. **Honor the cross-tool `DO_NOT_TRACK` convention?** Several ecosystems treat + `DO_NOT_TRACK=1` as a universal telemetry opt-out (HuggingFace Hub honors it; + see [Prior art](#prior-art)). Should our opt-out also respect `DO_NOT_TRACK` + (in addition to the two `AGENT_FRAMEWORK_*` flags)? Cheap to add and + community-friendly, but it widens the opt-out surface and needs a clear + precedence rule. Recommend yes; confirm with the deciders. 
### Decided From 2ef56b8d89673cbef0666dfdf94642947631aecf Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Mon, 15 Jun 2026 19:07:43 +0200 Subject: [PATCH 6/6] docs: fold in botocore lessons; record accumulation-scope decision MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit botocore's m/ feature list scopes features to a per-request contextvars set that resets between calls — clean per-call attribution, but it assumes every feature lives inside a service request. That holds for an SDK natively bound to its own services; it does not for us, where many features (agent/workflow/provider construction, session setup) are not bound to any request. - ADR: add Accumulation scope options — P (process-global monotonic, chosen) vs Q (botocore per-request set, rejected) with the request-binding rationale; reference P in the decision; reframe the "no per-call attribution" limitation as a deliberate scope choice. - ADR Prior art: bitmask gives bounded token size for free (vs botocore's 1024-byte cap + truncation); mechanism is private, wire format is the contract; fix a duplicated phrase. - Spec: note the mask is process-global, monotonic, never reset (intentional, lock/Interlocked.Or-safe), the token is safe-by-construction (no sanitization), and the helpers are private API. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../0027-feature-usage-bitmask-user-agent.md | 93 +++++++++++++++---- docs/specs/002-feature-usage-telemetry.md | 14 +++ 2 files changed, 90 insertions(+), 17 deletions(-) diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md index 08fb3bd7c8e..c34769222eb 100644 --- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md +++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md @@ -91,6 +91,41 @@ re-evaluate it per request. 
- Bad, measures installation, not usage; cannot capture feature combinations — does not solve the problem. +### Accumulation scope + +#### P. Process-global, monotonic mask (chosen) + +A single mask per process; bits are OR-ed in as features are first used and never +cleared. The token reflects "what this process has used so far." + +- Good, fits our **mixed feature lifecycle**: many features are *not* bound to an + outbound request — an `Agent`, a workflow/orchestration, or a context/history + provider is constructed once and lives for the session/process. A process-wide + mask is the only scope that can represent them at all. +- Good, trivial and cheap: one OR under a lock (Python) / `Interlocked.Or` + (.NET); no per-request state plumbing. +- Neutral, coarser than per-call — early requests carry fewer bits than later + ones, and the token says "this process used X", not "this call used X". + +#### Q. Per-request set, reset between calls (botocore's model — rejected) + +AWS botocore scopes its `m/` feature codes to a `contextvars` set that is reset +between requests, giving exact per-call attribution (and it deliberately no-ops +when called outside a request context to avoid features bleeding across requests). +See [Prior art](#prior-art). + +- Good, exact per-call attribution directly in the User-Agent. +- Bad, **assumes every feature is exercised inside a single service request** — + true for botocore (an SDK natively bound to AWS service calls), but *not* for + us. Our features split into request-scoped ones (a chat call, an MCP tool + invocation) and decidedly non-request ones (agent/workflow/provider + construction, session setup). The latter have no request to attach to, so a + per-request set would simply miss them. +- Bad, needs `contextvars` propagation through every async/threaded path and a + reset discipline; the bleed-guard botocore documents is the warning sign. 
+- Note, per-call attribution for the request-scoped subset is better served by + the deferred OTel span path (option C) than by reshaping the UA token. + ### Granularity #### F. Per package, with core broken out per feature/provider (chosen) @@ -184,22 +219,27 @@ the Python v1 list) = decimal `138477573`. ## Decision Outcome Chosen: **a per-request, first-party-only User-Agent `(feat=...)` token (A), -with per-package granularity (F), per-language bit lists (H), hand-written enums -kept honest by a parity test (J), rendered as lowercase hex (M).** - -This is the smallest design that answers the question. A 64-bit mask accumulates -from universal `mark_feature_used()` calls; the token is stamped per request only -on Azure/Foundry clients (live, no third-party leak); each SDK owns an -independent bit list selected by the language already in the UA; the mask is -rendered as hex (`feat=v1.8410005`). **Two opt-out env vars are provided:** a -dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only the mask while -keeping the base SDK identity/version User-Agent, and the existing +with a process-global monotonic accumulator (P), per-package granularity (F), +per-language bit lists (H), hand-written enums kept honest by a parity test (J), +rendered as lowercase hex (M).** + +This is the smallest design that answers the question. A 64-bit +**process-global, monotonic** mask accumulates from universal +`mark_feature_used()` calls (so it spans construction-time and session-scoped +features that aren't bound to any request — the per-request set model (Q) can't); +the token is **stamped per request** only on Azure/Foundry clients, so it reflects +the live mask without freezing at construction (live, no third-party leak); each +SDK owns an independent bit list selected by the language already in the UA; the +mask is rendered as hex (`feat=v1.8410005`). 
**Two opt-out env vars are +provided:** a dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only +the mask while keeping the base SDK identity/version User-Agent, and the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` that drops the whole contribution. OTel (C) is deferred — mainly because a broadly-emitted span attribute would leak the fingerprint into the user's general telemetry, against the first-party-only -stance — but left open behind the version prefix. Per-construct granularity (G), -a shared registry (I), codegen (K), and the decimal/binary/base-N representations -(L, N, O) are rejected as complexity or length the problem does not require. +stance — but left open behind the version prefix. Per-request scoping (Q), +per-construct granularity (G), a shared registry (I), codegen (K), and the +decimal/binary/base-N representations (L, N, O) are rejected as complexity or +length the problem does not require. ### Consequences @@ -239,13 +279,32 @@ Takeaways that shaped (or validate) our choices: allocation). We keep the bitmask for boundedness and trivial AND-decoding; AWS's short-code set is recorded as a viable alternative if bit-position coordination ever becomes painful (it would also drop the 64-bit ceiling). +- **A fixed-width bitmask gives bounded token size for free.** botocore must cap + the `m/` component at 1024 bytes and truncate at delimiter boundaries (with a + fallback log) precisely *because* its short-code set is unbounded. Our 64-bit + hex is ≤16 chars by construction — no size cap, no truncation logic. +- **Scope is where we diverge most — and deliberately.** botocore collects + features into a per-request `contextvars` set that is **reset between + requests**, and no-ops outside a request context to prevent cross-request + bleed. That works because every botocore feature is exercised *inside* an AWS + service request. 
We are more general: some features are request-scoped (a chat + call, an MCP tool invocation) but many are **not bound to any request** + (agent / workflow / provider construction, session setup). So we use a + **process-global, monotonic** mask (option P), which is the only scope that can + represent the non-request features. Our mask therefore intentionally "bleeds" + (accumulates) for the life of the process — the opposite of botocore's reset — + and that is the intended semantic, not the bug botocore guards against. +- **The mechanism is private; the wire format is the contract.** botocore marks + its whole user-agent module private and "subject to abrupt breaking changes." + Same for us: the Python/.NET helpers are internal, and only the emitted token + + the per-language registry tables are the stable, decodable contract. - **First-party-only emission** is stricter than any of the above; the closest in spirit is Stainless headers, which only reach the owning API. We make the hostname/endpoint allowlist explicit (Azure/Foundry only). - **Opt-out naming.** `AZURE_TELEMETRY_DISABLED` is the family precedent for our - `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool - the cross-tool `DO_NOT_TRACK` convention (honored by e.g. - HuggingFace Hub) is worth considering — see Open Questions. + `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool `DO_NOT_TRACK` + convention (honored by e.g. HuggingFace Hub) is worth considering — see Open + Questions. Sources: botocore [`useragent.py`](https://github.com/boto/botocore/blob/develop/botocore/useragent.py) (`_USERAGENT_FEATURE_MAPPINGS`, `register_feature_id`, `_build_feature_metadata`); @@ -291,7 +350,7 @@ independent for Python and .NET. | **No signal for self-hosted or third-party-only traffic.** If a process never calls Azure/Foundry, we see nothing. | First-party-only emission (A) | We can't read third-party logs anyway, and must not leak a fingerprint into them. Reach traded for privacy. 
| | **No OTel / per-call signal in v1.** | OTel deferred (C) — primarily on **privacy** grounds | A broadly-emitted span attribute would push the fingerprint into the user's general telemetry / third-party APM vendors, undoing the first-party-only scoping. Left open to add later if there is a compelling reason to add. | | **Mask reflects "usage so far," not the whole session.** Early requests carry fewer bits than later ones. | Process-global accumulator + per-request stamping | Honest and still useful; the team aggregates across requests. The per-request design is what makes it *grow* rather than freeze. | -| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Single global accumulator (simplicity) | Per-call attribution is what the deferred OTel span path would add; not needed for portfolio-level questions. | +| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Process-global monotonic scope (P) | A deliberate choice, not a transport limit: botocore *does* per-call attribution in the UA via a per-request `contextvars` set (Q), but that assumes every feature lives inside a service request. Many of ours don't (agent/workflow/provider construction, session setup), so process-global is the only scope that captures them. Per-call detail for the request-scoped subset is left to the deferred OTel path. | | **Coarse granularity.** Can't distinguish sub-features (e.g. openai chat vs embeddings, which shell tool). | Per-package granularity (F) + 64-bit (keeps .NET lock-free) | Matches the actual questions; finer bits can be promoted later behind the version prefix. | | **Fingerprinting risk is reduced, not eliminated.** A feature-combination mask is still a deployment signature, and it transits intermediaries (proxies/CDNs) even when first-party-scoped. 
| Emitting any feature-combination value | Scope + opt-out + coarse granularity mitigate it; residual risk is the subject of the privacy review below. | diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md index c09d48e635f..a164d26e9ec 100644 --- a/docs/specs/002-feature-usage-telemetry.md +++ b/docs/specs/002-feature-usage-telemetry.md @@ -116,6 +116,20 @@ def get_feature_token() -> str | None: time a feature is genuinely exercised — at construction of a representative type (e.g. `Agent`, an `MCPTool`, a provider, a Foundry surface), never at import time. The mask grows over the process lifetime. +- **Process-global and monotonic — intentionally never reset.** Unlike a + per-request scheme (e.g. botocore's `contextvars` feature set that resets + between calls), our mask spans the whole process because many features are not + bound to any request — an agent, workflow/orchestration, or context/history + provider is constructed once and used across the session. The single global + mask is the only scope that can represent them, and its monotonic "usage so + far" growth is the intended semantic, not a bleed bug. Concurrency-safe via the + module lock (Python) / `Interlocked.Or` (.NET). +- **Token is safe by construction.** The emitted value is `v{int}.{hex}` — + characters limited to `[0-9a-fv.]` — so no header-injection sanitization is + required (contrast botocore, which must sanitize arbitrary component strings). +- **Private API.** `mark_feature_used`, `get_feature_token`, `apply_feature_token` + and the mask itself are internal helpers; only the emitted token and the + per-language registry tables are the stable, decodable contract. - **No import cycles:** the call lives in each package's own module, so `core` never imports optional packages. Each package references its bit via the shared `FeatureBit` IntEnum exported from `core`.
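
The accumulator and token rules in the spec bullets above (process-global, monotonic, lock-guarded, `v{int}.{hex}` output, two opt-out env vars) can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the `FeatureBit` members and their indexes are hypothetical, `REGISTRY_VERSION = 1` is assumed, `decode_feature_token` is an illustrative helper that the spec does not name, and returning `None` for an empty mask is a guess at the intended behaviour.

```python
from __future__ import annotations

import os
import threading
from enum import IntEnum


class FeatureBit(IntEnum):
    """Hypothetical bit indexes; the real ones live in the per-language registry."""

    AGENT = 0
    MCP = 2
    WORKFLOWS = 16


REGISTRY_VERSION = 1  # assumed per-language registry version
_mask = 0  # process-global, only ever grows
_lock = threading.Lock()


def _env_flag(name: str) -> bool:
    # Both opt-out variables accept true/1, case-insensitive.
    return os.environ.get(name, "").strip().lower() in {"true", "1"}


def mark_feature_used(bit: FeatureBit) -> None:
    """OR the feature's bit into the global mask; bits are never cleared."""
    global _mask
    with _lock:
        _mask |= 1 << bit


def get_feature_token() -> str | None:
    """Return the live token, or None when opted out or nothing is marked yet."""
    if _env_flag("AGENT_FRAMEWORK_USER_AGENT_DISABLED") or _env_flag(
        "AGENT_FRAMEWORK_FEATURE_MASK_DISABLED"
    ):
        return None
    with _lock:
        mask = _mask
    if mask == 0:
        return None
    # Lowercase hex keeps the token within the [0-9a-fv.] character set.
    return f"v{REGISTRY_VERSION}.{mask:x}"


def decode_feature_token(token: str) -> list[str]:
    """Illustrative decoder: AND each registered bit against the hex mask."""
    _version, _, hex_mask = token.partition(".")
    mask = int(hex_mask, 16)
    return [bit.name for bit in FeatureBit if mask & (1 << bit)]
```

Because `get_feature_token()` reads the mask at call time, a per-request policy that calls it on each outbound Azure/Foundry request sees features marked after the client was constructed, which is the "live, non-frozen" behaviour the spec asks the tests to cover.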