From 3b8cf13d362e105a1d35ace087f05ca655efde0e Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Fri, 12 Jun 2026 16:33:37 +0200
Subject: [PATCH 1/6] docs: ADR-0027 feature-usage bitmask in the User-Agent
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add an ADR, design spec, and per-language bit registry for a lightweight
feature-usage signal: a 64-bit mask, emitted as a `(feat=vN.<hex>)` User-Agent
comment, stamped per request on first-party (Azure/Foundry) clients only.

- docs/decisions/0027-feature-usage-bitmask-user-agent.md — ADR (options-first,
  with Limitations, Open Questions, and v1->v2 migration)
- docs/specs/002-feature-usage-telemetry.md — design spec + implementation plan
- docs/specs/feature-usage-bit-registry.md — per-language bit tables + governance

Granularity is per package with core broken out per feature (each orchestration
pattern and built-in context/history provider). Registries are per language
(decoder selects by the language already in the UA). OpenTelemetry emission is
deferred (privacy). Docs only; no code changes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../0027-feature-usage-bitmask-user-agent.md  | 281 ++++++++++++++
 docs/specs/002-feature-usage-telemetry.md     | 356 ++++++++++++++++++
 docs/specs/feature-usage-bit-registry.md      | 210 +++++++++++
 3 files changed, 847 insertions(+)
 create mode 100644 docs/decisions/0027-feature-usage-bitmask-user-agent.md
 create mode 100644 docs/specs/002-feature-usage-telemetry.md
 create mode 100644 docs/specs/feature-usage-bit-registry.md
diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
new file mode 100644
index 00000000000..d9ebff92d6c
--- /dev/null
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -0,0 +1,281 @@
+---
+status: proposed
+contact: eavanvalkenburg
+date: 2026-06-12
+deciders: eavanvalkenburg
+consulted:
+informed:
+---
+
+# Feature-usage bitmask in the User-Agent
+
+## Context and Problem Statement
+
+We can see which Agent Framework packages are installed and that *some* framework
+call happened (via the existing `agent-framework-python/{version}` User-Agent),
+but we have no usage-based signal about **which features are actually exercised**
+at runtime, nor which are used *together* (e.g. workflows + MCP + Foundry). How
+can we collect a lightweight, privacy-respecting signal of feature usage for the
+traffic we can actually read, without standing up new event pipelines?
+
+The detailed mechanism is in [SPEC-002](../specs/002-feature-usage-telemetry.md);
+the per-language bit lists are in
+[feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md).
+
+## Decision Drivers
+
+- **Transparency** — openly documented, human-decodable, user-controllable. No
+  hidden or obfuscated telemetry.
+- **First-party scope / no third-party leakage** — emit only to Azure/Foundry
+  endpoints (the telemetry we can ingest); never leak a feature fingerprint into
+  third-party logs we cannot read.
+- **Live signal** — reflect features exercised *so far*, re-evaluated per request,
+  not frozen at client construction.
+- **Low cost / few moving parts** — reuse telemetry already in the request path;
+  near-zero runtime overhead; as little machinery as the job needs.
+- **Privacy** — encode only coarse boolean feature usage; no identifiers,
+  arguments, prompts, or payloads.
+
+## Considered Options
+
+The options below are grouped by the decisions that matter: the **transport**,
+the **granularity**, and the **registry sharing model**.
+
+### Transport
+
+#### A. User-Agent token, first-party only, per request (chosen)
+
+Stamp a `(feat=...)` comment onto the UA, but only on Azure/Foundry clients, and
+re-evaluate it per request.
+
+- Good, reuses telemetry already sent to the one backend we can read.
+- Good, per-request stamping reflects the live mask (not frozen at construction).
+- Good, first-party scoping means no fingerprint leaks to third-party providers.
+- Good, maps onto .NET's existing per-request UA pipeline policies unchanged.
+- Bad, no signal for traffic that never hits a first-party endpoint (accepted —
+  we couldn't read it anyway).
+
+#### B. User-Agent token on all clients
+
+- Good, simplest to wire (one static header).
+- Bad, sends a deployment fingerprint to OpenAI/Anthropic/AWS/Google logs we
+  cannot read — privacy leak for zero benefit.
+- Bad, baked into static `default_headers`, so it freezes at client construction
+  and reports a near-empty mask.
+
+#### C. OpenTelemetry span/resource attribute
+
+- Good, precise per-call usage; no UA change.
+- Bad (**privacy — the main reason to hold it**), a span attribute broadcasts the
+  feature-combination fingerprint into the user's **general** telemetry pipeline,
+  which is typically exported to third-party APM vendors (Datadog, Honeycomb, …).
+  That re-introduces exactly the fingerprint leakage the first-party-only UA
+  scoping (A) was chosen to avoid — just into a different set of third parties.
+- Bad (secondary), also a cardinality footgun (a growing, combinatorial value
+  must never become a metric dimension).
+- Neutral, for the team's own goal it reaches us only if the user exports to
+  Azure Monitor and we query it.
+- **Deferred, not rejected.** The version prefix lets us add it later **if** the
+  privacy review blesses a broadly-emitted mask (or a scoped/redacted variant)
+  and a concrete query needs the per-call precision.
+
+#### D. Bespoke usage events
+
+- Good, richest detail and flexibility.
+- Bad, new data flow and cost; larger privacy surface; heavy to build and review;
+  overkill for a coarse "which features" signal.
+
+#### E. Install/import-time signal only (status quo-ish)
+
+- Good, zero new runtime work.
+- Bad, measures installation, not usage; cannot capture feature combinations —
+  does not solve the problem.
+
+### Granularity
+
+#### F. Per package, with core broken out per feature/provider (chosen)
+
+- Good, ~50 bits (Python) / ~40 (.NET) fit a **64-bit** mask, which keeps .NET's
+  accumulator lock-free (`Interlocked.Or`) and the registry hand-maintainable.
+- Good, matches the actual questions ("which orchestration / which built-in
+  provider / which package?") — each orchestration pattern and each built-in
+  context/history provider gets its own bit, since they serve different purposes.
+- Neutral, cannot distinguish sub-features *within* a provider package (e.g.
+  openai chat vs embeddings) until a bit is promoted.
+
+#### G. Per construct (one bit per instantiable type)
+
+- Good, finest detail.
+- Bad, ~96 bits forces a 128-bit mask, which forfeits .NET's lock-free
+  `Interlocked.Or` (needs a lock / `UInt128`).
+- Bad, ~96 call sites across two SDKs; the sheer count pushes toward code
+  generation and extra tests — machinery to manage machinery.
+- Bad, precision nobody's decision actually needs.
+
+### Registry sharing model
+
+#### H. Per-language bit lists (chosen)
+
+Each SDK owns an independent list; the decoder picks the list using the language
+already present in the UA product token.
+
+- Good, **no cross-language coordination**: each SDK numbers and evolves its
+  features independently; adding a Python feature never touches .NET numbering.
+- Good, no null placeholders for one-SDK features, no "same bit, same meaning"
+  rule, no SDK-aware decode caveats.
+- Good, decoding is trivial: language (from UA) + version -> list -> AND.
+- Neutral, two small lists to maintain instead of one (but they were going to
+  diverge anyway — the packages differ).
+
+#### I. Single shared cross-language registry
+
+- Good, one list, one number space.
+- Bad, forces synchronized numbering and null placeholders for features that
+  exist in only one SDK, plus SDK-aware decode rules.
+- Bad, the synchronization is pure accidental complexity — **the language is
+  already in the User-Agent**, so sharing the number space buys nothing.
+
+### Registry maintenance
+
+#### J. Hand-written enum + parity test (chosen)
+
+- Good, ~40 members that change a few times a year; a 10-line test (enum vs the
+  published registry table) is enough.
+- Good, no build step, no generator to own.
+
+#### K. Code-generate the enums from the registry
+
+- Bad, a generator + drift test + schema test to maintain a short list of
+  integer constants; justified only by the per-construct bit count we rejected.
+
+### Representation (how the mask is rendered as text)
+
+All examples below encode the same mask — bits 0, 2, 16, 22, 27 set
+(agent + workflow + sequential-orchestration + foundry.chat_client + openai, in
+the Python v1 list) = decimal `138477573`.
+
+#### L. Decimal — `feat=v1.138477573`
+
+- Good, human-familiar; trivial to parse.
+- Neutral, no visual alignment to bit/nibble boundaries; slightly longer than hex
+  for large masks. No advantage over hex.
+
+#### M. Hex (chosen) — `feat=v1.8410005`
+
+- Good, compact (≤16 chars for a 64-bit mask).
+- Good, decodes with one stdlib call in every language (`int(x, 16)` /
+  `Convert.ToUInt64(x, 16)`); nibble boundaries are eyeball-able.
+- Good, lowercase, no `0x` prefix, no leading zeros — unambiguous and stable.
+
+#### N. Binary / bit-list — `feat=v1.1000010000010000000000000101` or `feat=v1.0,2,16,22,27`
+
+- Good, most directly human-readable ("which bits").
+- Bad, longest form in the UA; the bit-list needs delimiter handling and grows
+  with the number of set bits.
+
+#### O. Alphabet / base-N (e.g. Crockford base32 `feat=v1.442005`, base62 `feat=v1.9n2lf`)
+
+- Good, shortest representation.
+- Bad, needs a custom alphabet + decode table on both ends; base62 is
+  case-sensitive (fragile through case-normalizing intermediaries); not
+  eyeball-able. Premature optimization for a value that is already ≤16 chars in
+  hex.
+
+## Decision Outcome
+
+Chosen: **a per-request, first-party-only User-Agent `(feat=...)` token (A),
+with per-package granularity (F), per-language bit lists (H), hand-written enums
+kept honest by a parity test (J), rendered as lowercase hex (M).**
+
+This is the smallest design that answers the question. A 64-bit mask accumulates
+from universal `mark_feature_used()` calls; the token is stamped per request only
+on Azure/Foundry clients (live, no third-party leak); each SDK owns an
+independent bit list selected by the language already in the UA; the mask is
+rendered as hex (`feat=v1.8410005`). OTel (C) is deferred — mainly because a
+broadly-emitted span attribute would leak the fingerprint into the user's general
+telemetry, against the first-party-only stance — but left open behind the version
+prefix. Per-construct granularity (G), a shared registry (I), codegen (K), and the
+decimal/binary/base-N representations (L, N, O) are rejected as complexity or
+length the problem does not require.
+
+### Consequences
+
+- Good, adds usage signal at near-zero cost, no new data flow, few moving parts.
+- Good, transparent (public registry, human-decodable token) and disabled by the
+  existing User-Agent opt-out.
+- Good, first-party-only + per-request emission gives a live mask and no
+  third-party fingerprint leak.
+- Good, 64-bit keeps .NET lock-free; per-language lists remove all cross-language
+  sync; hand-written enums avoid a codegen toolchain.
+- Neutral, the token's reach equals first-party traffic; broader per-call signal
+  (OTel) can be added later if needed.
+- Bad, each feature must add a `mark_feature_used()` call, and first-party clients
+  need a per-request hook (small, mirrors existing patterns).
+
+## Registry versioning and migration (v1 → v2)
+
+The token carries a **per-language** version (`feat=v1.<hex>`); a version bump is
+independent for Python and .NET.
+
+- **Additive growth stays on v1 — no bump.** Allocating a new feature to a
+  reserved/unused bit is backward-compatible: an older decoder simply sees an
+  unknown high bit and ignores it. Normal package growth never needs a new
+  version.
+- **A bump (v2) is required only for breaking changes:** renumbering or
+  re-partitioning existing bits, changing the *meaning* of an already-assigned
+  bit, or widening beyond 64-bit. Within a version a bit is **never** reused or
+  reassigned — that invariant is what lets old decoders stay correct.
+- **Mixed-version coexistence is the norm.** A fleet runs many SDK releases at
+  once, so `v1` and `v2` tokens appear simultaneously for a long time (old SDKs
+  keep emitting `v1`). The decoder keeps **every** published `(language,
+  version)` table and selects by the token's version; the `v1` table is retained
+  indefinitely for historical decode.
+- **Unknown version → do not guess.** A decoder without the `vN` table must
+  record "unknown registry version" rather than decode against an older table —
+  bit meanings may differ across versions, so mis-attribution is worse than
+  no data.
+- **Producing v2:** publish the v2 table alongside v1 in the registry doc, bump
+  that SDK's `FeatureBit` enum + version constant; the SDK emits `v2` from the
+  release it ships in. Prefer staying on v1 (additive) and reserving a clean v2
+  for an eventual deliberate re-partition.
+
+## Limitations
+
+| Limitation | Caused by (choice) | Why we accepted it |
+| --- | --- | --- |
+| **No signal for self-hosted or third-party-only traffic.** If a process never calls Azure/Foundry, we see nothing. | First-party-only emission (A) | We can't read third-party logs anyway, and must not leak a fingerprint into them. Reach traded for privacy. |
+| **No OTel / per-call signal in v1.** | OTel deferred (C), primarily on **privacy** grounds | A broadly-emitted span attribute would push the fingerprint into the user's general telemetry / third-party APM vendors, undoing the first-party-only scoping. Left open to add later if a compelling reason emerges. |
+| **Mask reflects "usage so far," not the whole session.** Early requests carry fewer bits than later ones. | Process-global accumulator + per-request stamping | Honest and still useful; the team aggregates across requests. The per-request design is what makes it *grow* rather than freeze. |
+| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Single global accumulator (simplicity) | Per-call attribution is what the deferred OTel span path would add; not needed for portfolio-level questions. |
+| **Coarse granularity.** Can't distinguish sub-features (e.g. openai chat vs embeddings, which shell tool). | Per-package granularity (F) + 64-bit (keeps .NET lock-free) | Matches the actual questions; finer bits can be promoted later behind the version prefix. |
+| **Fingerprinting risk is reduced, not eliminated.** A feature-combination mask is still a deployment signature, and it transits intermediaries (proxies/CDNs) even when first-party-scoped. | Emitting any feature-combination value | Scope + opt-out + coarse granularity mitigate it; residual risk is the subject of the privacy review below. |
+
+## Open Questions (for decider discussion)
+
+These are unresolved and should be decided before/at approval:
+
+1. **Privacy / telemetry-acceptance review (blocking).** Is a coarse,
+   first-party-only, opt-out-able feature-combination mask acceptable telemetry?
+   Even scoped, it transits intermediaries and is a deployment fingerprint. This
+   is a **release precondition**. Possible outcomes that would change the design:
+   require a dedicated opt-out flag (Q2), coarser granularity, hashing, or
+   explicit opt-in.
+2. **Dedicated opt-out flag?** v1 reuses `AGENT_FRAMEWORK_USER_AGENT_DISABLED`
+   (mask dies with the whole UA). Do we add a mask-only flag now (keep base UA,
+   drop the fingerprint), or wait until asked / until the privacy review requires
+   it?
+3. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a
+   span attribute broadcasts the fingerprint into the user's general telemetry
+   and onward to third-party APM vendors, contradicting the first-party-only
+   stance. It also carries a metric-cardinality hazard. Would the privacy review
+   allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when
+   to revisit.
+
+## More Information
+
+- Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md)
+- Per-language bit lists, encoding, opt-out, and governance:
+  [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md)
+- Existing accumulator pattern: `python/packages/core/agent_framework/_telemetry.py`
+- .NET emission policies: `dotnet/src/Microsoft.Agents.AI.Foundry/AgentFrameworkUserAgentPolicy.cs`,
+  `dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/HostedAgentUserAgentPolicy.cs`
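The hex rendering (option M) and the decode path (language and version select a table, then AND per bit) can be sketched as follows. This is a minimal illustration, not the registry's decoder: `TABLES`, `encode_token`, and `decode_user_agent` are hypothetical names, the table holds only the five bits from the ADR's worked example, and the `agent-framework-<language>` product-token pattern is assumed to hold for SDKs other than Python.

```python
from __future__ import annotations

import re

# Hypothetical decode tables keyed by (language, version). Only the five bits
# from the ADR's worked example are listed; the real registry has more rows.
TABLES = {
    ("python", 1): {
        0: "agent",
        2: "workflow",
        16: "sequential-orchestration",
        22: "foundry.chat_client",
        27: "openai",
    },
}

# Assumed UA shape: "...agent-framework-<language>/<version> (feat=v<N>.<hex>)".
_UA_RE = re.compile(
    r"agent-framework-(?P<lang>[a-z]+)/\S+.*?"
    r"\(feat=v(?P<ver>\d+)\.(?P<mask>[0-9a-f]{1,16})\)"
)


def encode_token(version: int, bits: list[int]) -> str:
    """Render bit indices as v<version>.<hex>: lowercase, no 0x, no leading zeros."""
    mask = 0
    for bit in bits:
        if not 0 <= bit < 64:
            raise ValueError(f"bit {bit} does not fit a 64-bit mask")
        mask |= 1 << bit
    return f"v{version}.{mask:x}"


def decode_user_agent(user_agent: str) -> list[str] | None:
    """Return the known feature names in the UA, or None if it has no feat token."""
    match = _UA_RE.search(user_agent)
    if match is None:
        return None
    key = (match["lang"], int(match["ver"]))
    table = TABLES.get(key)
    if table is None:
        # Unknown registry version: report it rather than guess with an older table.
        raise LookupError(f"unknown registry version {key}")
    mask = int(match["mask"], 16)
    # Bits set in the mask but absent from the table (newer SDK) are ignored.
    return [name for bit, name in sorted(table.items()) if mask & (1 << bit)]
```

With the worked example's bits, `encode_token(1, [0, 2, 16, 22, 27])` gives `v1.8410005`, and decoding a User-Agent that carries that token returns the five feature names. An unlisted `(language, version)` pair raises instead of falling back to an older table, in line with the ADR's "do not guess" rule.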
diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md
new file mode 100644
index 00000000000..e2b52cdbfcf
--- /dev/null
+++ b/docs/specs/002-feature-usage-telemetry.md
@@ -0,0 +1,356 @@
+---
+status: proposed
+contact: eavanvalkenburg
+date: 2026-06-12
+deciders: eavanvalkenburg
+consulted:
+informed:
+---
+
+# Feature-usage telemetry via an accumulating bitmask
+
+> Companion design for [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md).
+> The per-language bit tables, encoding, opt-out, and governance live in
+> [feature-usage-bit-registry.md](feature-usage-bit-registry.md). Each SDK's
+> hand-written `FeatureBit` enum is the source of truth for that language.
+
+## What is the goal of this feature?
+
+Give the Agent Framework team a lightweight signal about **which framework
+features are actually exercised** at runtime (not merely installed), so we can
+prioritise investment based on real usage. We emit a single small number — a
+*feature mask* — on the User-Agent that already goes out with each request.
+
+**Reach is deliberately bounded.** The mask accumulates from *all* feature usage,
+but the `feat=` token is only stamped on requests to **first-party (Azure /
+Foundry) endpoints** — the only backends whose telemetry the team can ingest. We
+do **not** send the token to third-party providers (OpenAI direct, Anthropic,
+Bedrock, Gemini, Ollama, Mistral); doing so would leak a deployment fingerprint
+into logs we cannot read (see [Emission](#emission)).
+
+**Granularity is per package**, with core broken out per feature: one bit per
+orchestration pattern (sequential / concurrent / group-chat / magentic / handoff)
+and **one bit per built-in context/history provider** (memory, skills,
+file-access, compaction, todo, agent-mode, background-agents, in-memory/file
+history) — because those serve different purposes and we want to know which are
+used. See the [registry](feature-usage-bit-registry.md). The question is "are
+people using workflows / which orchestration / which providers / MCP / Foundry
+memory / Redis?", not which exact subclass. It still fits a 64-bit mask, keeps
+the .NET accumulator lock-free, and keeps the registry small enough to
+hand-maintain. Finer detail can be earned later via the version prefix.
+
+Success metric: within one release after rollout, ≥80% of first-party (Foundry)
+requests carry a **non-empty** feature token whose mask reflects features marked
+**after** client construction (i.e. the token is live, not frozen — see the
+per-request requirement below). Secondary: ability to break down first-party
+traffic by feature combination (e.g. "% of Foundry traffic that also uses
+workflows").
+
+This is done **transparently**: the bit registry is public and the emitted value
+is human-decodable, and the existing User-Agent opt-out disables it.
+
+## What is the problem being solved?
+
+Today we only know which packages are *installed* (from package telemetry) or
+that *some* Agent Framework call happened (the existing
+`agent-framework-python/{version}` User-Agent). We have no usage-based signal
+about feature combinations, and no way to tell that, say, a process uses
+workflows + MCP + Foundry together. Collecting this through bespoke events would
+add cost and new data flows; folding a tiny accumulating integer into telemetry
+we already send is far cheaper and easier to reason about for privacy.
+
+## Mechanism
+
+### Process-global accumulator in `core`
+
+The accumulator and its helpers live in the existing
+`agent_framework/_telemetry.py` (alongside `get_user_agent()` /
+`prepend_agent_framework_to_user_agent()`), so the User-Agent machinery stays in
+one module. It owns a process-global 64-bit accumulator. The existing
+`AGENT_FRAMEWORK_USER_AGENT_DISABLED` flag (`IS_TELEMETRY_ENABLED` in that module)
+already gates the whole User-Agent contribution, so it gates the mask too — no
+new env var:
+
+```python
+# agent_framework/_telemetry.py (same module as get_user_agent)
+# IS_TELEMETRY_ENABLED already defined here (AGENT_FRAMEWORK_USER_AGENT_DISABLED)
+
+REGISTRY_VERSION = 1
+
+_feature_mask = 0
+_feature_mask_lock = threading.Lock()
+
+
+def mark_feature_used(bit: int) -> None:
+    """OR a feature bit into the process-global mask.
+
+    Called the first time a feature is exercised. Cheap and idempotent;
+    a no-op when the User-Agent contribution is disabled.
+    """
+    global _feature_mask
+    if not IS_TELEMETRY_ENABLED:
+        return
+    with _feature_mask_lock:
+        _feature_mask |= 1 << bit
+
+
+def get_feature_token() -> str | None:
+    """Return ``v<version>.<hex_mask>`` for the accumulated mask, or None."""
+    if not IS_TELEMETRY_ENABLED or _feature_mask == 0:
+        return None
+    return f"v{REGISTRY_VERSION}.{_feature_mask:x}"
+```
+
+- **Per package/feature, usage-based:** `mark_feature_used()` is called the first
+  time a feature is genuinely exercised — at construction of a representative
+  type (e.g. `Agent`, an `MCPTool`, a provider, a Foundry surface), never at
+  import time. The mask grows over the process lifetime.
+- **No import cycles:** the call lives in each package's own module, so `core`
+  never imports optional packages. Each package references its bit via the shared
+  `FeatureBit` IntEnum exported from `core`.
+
+### Bit constants
+
+`core` exports a hand-written `FeatureBit` IntEnum (defined in `_telemetry.py`
+alongside the accumulator). **The enum is the source of truth** for Python; the
+Python table in [feature-usage-bit-registry.md](feature-usage-bit-registry.md) is
+its published contract, kept aligned in the same PR (see
+[Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)). Each package imports
+its named member and marks it where the feature is first exercised:
+
+```python
+# agent_framework_foundry/_chat_client.py
+from agent_framework import FeatureBit, mark_feature_used
+
+class RawFoundryChatClient(...):  # base client; FoundryChatClient builds on it
+    def __init__(self, ...):
+        mark_feature_used(FeatureBit.FOUNDRY_CHAT_CLIENT)  # bit 22 in v1
+        ...
+```
+
+Mark in the **`Raw*` base client** (e.g. `RawFoundryChatClient`) so every path
+that constructs a Foundry chat client — including the higher-level
+`FoundryChatClient` — sets the bit exactly once.
+
+Using the shared enum (not literals) keeps `core` free of optional-package
+imports while guaranteeing the bit values match the registry. For reference, in
+v1 `FoundryChatClient` → bit 22, `FoundryAgent` → bit 23, Foundry memory → bit 24.
+
+## Emission
+
+**One path in v1: the User-Agent `feat=` token, stamped per request on
+first-party (Azure/Foundry) clients only.**
+
+Marking (`mark_feature_used`) is **universal** — every feature sets its bit
+regardless of provider. Only **emission** is scoped. A user who never calls a
+first-party endpoint emits no token; this is the honest, intended behaviour (no
+third-party leakage, no signal we couldn't read anyway).
+
+The base User-Agent (`agent-framework-python/{version}` plus any hosting prefix)
+is unchanged and still set once via `default_headers` on **every** client.
+`get_user_agent()` stays base-only (no `feat=`). The `feat=` token is **separate**,
+added **only** by Azure/Foundry-based clients, and **re-evaluated on each
+request** so it reflects the mask accumulated so far. A helper stamps it:
+
+```python
+# agent_framework/_telemetry.py
+def apply_feature_token(user_agent: str) -> str:
+    """Append/refresh the live ``(feat=v<ver>.<hex>)`` comment on a UA string.
+
+    Re-reads the current mask on every call, so newly accumulated bits are
+    reflected immediately. Idempotent: replaces an existing ``(feat=...)``
+    comment rather than appending a second.
+    """
+    token = get_feature_token()  # None when disabled or mask == 0
+    base = _strip_feature_comment(user_agent)
+    return f"{base} (feat={token})" if token else base
+```
+
+Because `default_headers` are static, first-party clients install a
+**per-request hook** that calls `apply_feature_token()` on each outgoing request:
+
+- **httpx-based clients** (`AzureOpenAI*` via the `openai` SDK): construct the
+  underlying client with
+  `http_client=httpx.AsyncClient(event_hooks={"request": [_stamp_feat_hook]})`,
+  where the hook mutates `request.headers["User-Agent"]`. Gate on the existing
+  `use_azure` signal in `agent_framework_openai/_shared.py` so generic OpenAI
+  clients never get the hook.
+- **azure-core pipeline clients** (`AIProjectClient`, `SearchClient`,
+  `CosmosClient`, …): add a tiny `SansIOHTTPPolicy` whose `on_request` calls
+  `apply_feature_token()` on `request.http_request.headers["User-Agent"]`. This
+  mirrors .NET's per-request `PipelinePolicy` exactly.
+
+This fixes the frozen-at-construction problem: the token is materialised at
+**send time**, not client-init time, so it carries features constructed after the
+client. It also confines the token to first-party endpoints.
+
+Encoding uses the HTTP **comment** form `(feat=v1.<hex>)` (RFC 9110, formerly RFC 7231;
+metadata, not a product token), placed after the agent-framework product token, e.g.:
+
+```text
+foundry-hosting/agent-framework-python/1.2.3 (feat=v1.2a)
+```
+
+### OpenTelemetry — not in v1
+
+An OTel span attribute carrying the same value was considered but **deferred —
+primarily for privacy, not complexity**. Unlike the first-party-only UA token, a
+span attribute broadcasts the feature-combination fingerprint into the user's
+**general** telemetry pipeline, which is commonly exported to third-party APM
+vendors (Datadog, Honeycomb, …) — re-introducing exactly the leakage the
+first-party scoping was chosen to avoid. (It also carries a cardinality footgun:
+a monotonically-growing, combinatorial value must never become a metric
+dimension.) The version prefix leaves the door open to add it later **if** the
+privacy review blesses a broadly-emitted or scoped/redacted variant; v1 ships the
+UA path only. See [ADR-0027 → option C](../decisions/0027-feature-usage-bitmask-user-agent.md#considered-options).
+
+## API Changes
+
+New public surface in `agent-framework-core` (exported from
+`agent_framework`):
+
+- `mark_feature_used(bit: int) -> None`
+- `get_feature_token() -> str | None` — returns `v<ver>.<hex>` or `None`.
+- `apply_feature_token(user_agent: str) -> str` — live, idempotent UA stamper
+  used by first-party per-request hooks.
+- `FeatureBit` (IntEnum) — hand-written source of truth for the Python bit list
+  (see [Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)).
+
+No new env var: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` disables the
+mask along with the rest of the User-Agent contribution.
+
+Behavioural change to existing API:
+
+- `get_user_agent()` / `prepend_agent_framework_to_user_agent()` are
+  **unchanged** — they keep returning the base UA with no `feat=` token. The
+  token is added only by first-party per-request hooks via
+  `apply_feature_token()`.
+
+No breaking changes: when the mask is empty or disabled, or for any
+non-first-party client, output is byte-for-byte identical to today.
+
+## Opt-out
+
+The mask is part of the User-Agent contribution, so the existing flag covers it —
+no new env var in v1:
+
+| Env var | Effect |
+| --- | --- |
+| `AGENT_FRAMEWORK_USER_AGENT_DISABLED` | disables the **entire** AF User-Agent contribution, mask included |
+
+(If a privacy review later requires keeping the base UA while dropping only the
+mask, a dedicated flag can be added then — not built speculatively now.)
+
+## E2E example
+
+```python
+from agent_framework import Agent
+from agent_framework_foundry import FoundryChatClient
+from agent_framework_openai import OpenAIChatClient
+
+# First-party (Foundry) client: per-request hook stamps the live feat token.
+agent = Agent(client=FoundryChatClient(...), instructions="...")
+# Agent use marks bit 0; FoundryChatClient marks bit 22
+await agent.run("Hello")
+# Outgoing request to Foundry carries:
+#   User-Agent: agent-framework-python/1.2.3 (feat=v1.<mask-at-send-time>)
+
+# Third-party client: NO feat token is added (no first-party hook).
+other = Agent(client=OpenAIChatClient(...), instructions="...")
+await other.run("Hi")
+# Outgoing request to OpenAI carries only:
+#   User-Agent: agent-framework-python/1.2.3
+```
+
+Disabling the User-Agent contribution (mask included):
+
+```bash
+AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py
+```
+
+## .NET mapping
+
+- `core` has a hand-written `FeatureBit` enum (`: ulong`) — the **source of
+  truth** for the .NET bit list, matching the .NET table in the registry doc —
+  plus `FeatureUsage.MarkUsed(FeatureBit)` (universal marking, as in Python).
+- 64-bit width means the accumulator is **lock-free**:
+  `Interlocked.Or(ref _mask, (long)bit)`. No lock, no `UInt128`, no split-long.
+- **Emission is per-request and first-party-scoped**, matching Python. The
+  existing `AgentFrameworkUserAgentPolicy` / `HostedAgentUserAgentPolicy`
+  pipeline policies already run per request — extend them to append/refresh the
+  `(feat=...)` comment, and register the feat-stamping policy **only on
+  Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party
+  `IChatClient`s.
+- Same `v<version>.<hex>` comment format in both SDKs; bit meanings come from each
+  language's own table, so equal indices need not match. (.NET's policy was
+  already per-request, so there is no Python/.NET timing asymmetry.)
+
+## Keeping the bitmap in sync
+
+The **`FeatureBit` enum in each SDK is the source of truth** for that language.
+[feature-usage-bit-registry.md](feature-usage-bit-registry.md) holds the matching
+**published table per language** — the contract a decoder reads. There is
+deliberately **no shared numbering** and **no machine-readable registry file**: a
+Python bit and a .NET bit with the same index need not mean the same thing, and
+each SDK adds features without coordinating with the other.
+
+Adding a feature is one PR: add the `FeatureBit` enum member, add the matching
+row in that language's table, and mark it at the call site. Review keeps the enum
+and table aligned (≈40 entries, changing a few times a year — not worth a
+generator or a generated-file drift test). If a programmatic decoder is built
+later, export that language's table to JSON for it then.
+
+### Decoding
+
+```text
+UA: agent-framework-python/1.2.3 (feat=v1.2a)
+        │                          │   └ hex mask
+        │                          └ version
+        └ language → pick the Python table (version 1)
+```
+
+Read language → pick the table; read `vN` → pick that version; `AND` the hex mask
+against each bit. Unknown high bits (from a newer SDK than the decoder's copy of
+the table) are ignored.
+
+## Implementation plan (post-approval)
+
+1. **Core accumulator + enum** — in `agent_framework/_telemetry.py` add the
+   64-bit mask, lock, `mark_feature_used`, `get_feature_token`,
+   `apply_feature_token`, and the hand-written `FeatureBit` IntEnum (source of
+   truth, matching the Python table in the registry doc); `get_user_agent()`
+   stays base-only. Unit tests for the live/idempotent stamper.
+2. **First-party per-request hooks** — add the httpx `event_hooks` request hook
+   (gated on `use_azure` in `agent_framework_openai/_shared.py`) and the
+   azure-core `SansIOHTTPPolicy` (for `AIProjectClient`/`SearchClient`/Cosmos).
+   Verify against a real Foundry call that the UA carries a **non-empty,
+   post-construction** mask. **Do not** add hooks to third-party clients.
+3. **Mark feature usage** — call `mark_feature_used(FeatureBit.X)` once per
+   feature, the first time it is exercised: at the **`Raw*` base client/entry
+   point** per package (e.g. `RawFoundryChatClient`) so every higher-level
+   wrapper inherits the marking, and in the `__init__` of **each** core
+   construct that owns a bit — including every built-in context/history provider
+   (memory, skills, file-access, compaction, todo, agent-mode, background-agents,
+   in-memory/file history) and each orchestration builder. Marking is universal;
+   emission stays first-party-only.
+4. **.NET parity** — hand-written `FeatureBit : ulong` enum (source of truth for
+   the .NET table); `FeatureUsage.MarkUsed` with lock-free `Interlocked.Or`;
+   extend the existing per-request UA policy to stamp `(feat=...)` **only on
+   Azure/Foundry clients**. The .NET enum is **independent** of Python's.
+5. **Docs & tests** — update package `AGENTS.md`/skills; tests for the UA opt-out,
+   first-party scoping, and the live (non-frozen) UA.
+
+## Limitations & open questions
+
+The decision-level limitations and unresolved trade-offs — privacy review
+(blocking), reach, per-process (not per-call) attribution, coarse granularity,
+fingerprinting residue, and the dedicated-opt-out / OTel questions — are owned by
+the ADR. See **[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)**
+and **[Open Questions](../decisions/0027-feature-usage-bitmask-user-agent.md#open-questions-for-decider-discussion)**.
+This spec is the implementation reference; it does not re-litigate those choices.
+
+Implementation-only note:
+
+- **Per-request hook overhead is negligible** (a flag check, a lock-free read of
+  the mask, and a string concat per first-party request), but benchmark the hot
+  path once if a high-QPS Foundry scenario is in scope.
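The hook cost noted above can be sized with a rough micro-benchmark. The sketch below imitates the stamping step (flag check, read of the mask, string concat) and times it with `timeit`. The names `stamp_user_agent`, `_enabled`, and `_mask` are hypothetical stand-ins, not the SPEC-002 API, and the regex-based refresh is only one assumed way to keep the stamp idempotent. It appends the comment at the end of the string, which matches the spec's examples only when the agent-framework token is the last one in the header.

```python
import re
import timeit

REGISTRY_VERSION = 1
# Matches a previously stamped comment so it can be refreshed, not duplicated.
_FEAT_COMMENT = re.compile(r" \(feat=v\d+\.[0-9a-f]+\)")

_enabled = True   # hypothetical stand-in for the opt-out gate
_mask = 0x23      # hypothetical stand-in for the accumulated mask (bits 0, 1, 5)


def stamp_user_agent(user_agent: str) -> str:
    """Append or refresh the (feat=...) comment on a User-Agent string."""
    base = _FEAT_COMMENT.sub("", user_agent)  # drop any stale stamp first
    if not _enabled or _mask == 0:
        return base
    return f"{base} (feat=v{REGISTRY_VERSION}.{_mask:x})"


if __name__ == "__main__":
    ua = "agent-framework-python/1.2.3"
    print(stamp_user_agent(ua))
    runs = 100_000
    per_call = timeit.timeit(lambda: stamp_user_agent(ua), number=runs) / runs
    print(f"{per_call * 1e9:.0f} ns per stamp")
```

The timing is machine-dependent, so it is only useful as an order-of-magnitude check against the request latency of the scenario in question.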
diff --git a/docs/specs/feature-usage-bit-registry.md b/docs/specs/feature-usage-bit-registry.md
new file mode 100644
index 00000000000..ce70d669280
--- /dev/null
+++ b/docs/specs/feature-usage-bit-registry.md
@@ -0,0 +1,210 @@
+# Feature-usage bit registry (per-language)
+
+> **Status:** draft, accompanies [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md)
+> and [SPEC-002](002-feature-usage-telemetry.md).
+> **Version:** `1` per language · **Width:** 64-bit
+
+This document is the human-readable registry for the feature-usage mask. The
+**source of truth for each SDK is its own hand-written `FeatureBit` enum**; the
+tables below are the published contract a decoder (or a human) uses to turn a
+mask back into feature names. Keep the enum and the matching table in sync in the
+same PR — review is the check; there is no generated artifact.
+
+This telemetry is intentionally **transparent**: this registry is public, the
+emitted value is human-decodable, and the existing User-Agent opt-out disables it.
+
+## What is collected
+
+A single 64-bit integer (the *feature mask*) describing **which Agent Framework
+features were exercised** in a process — not which packages are installed.
+**Granularity is per package**, with core broken out per feature — each agent,
+workflow engine, MCP, orchestration pattern, and **each individual built-in
+context/history provider** gets its own bit, because they serve different
+purposes and we want to know which are used. A feature sets its bit the first
+time it is genuinely used; the SDK ORs the bits together and emits the value.
+
+No identifiers, arguments, prompts, payloads, or user data are encoded — only the
+coarse boolean "this feature was used" per registered bit.
+
+## Per-language, not shared
+
+The two tables below are **independent**. Bit indexes are **not** shared across
+languages — Python bit 13 and .NET bit 13 do not mean the same thing. This is
+deliberate: the User-Agent product token already names the language
+(`agent-framework-python` vs `agent-framework-dotnet`), so a decoder selects the
+right table from the UA and decodes against it. Each SDK numbers and evolves its
+features independently — no cross-language synchronization, no null placeholders,
+no "same bit, same meaning" rule.
+
+## Encoding
+
+- **Width:** 64-bit unsigned integer per language.
+- **Versioning:** the emission carries the version so a decoder knows the bit
+  mapping in effect (version is per language).
+- **User-Agent:** the mask is an HTTP **comment** as defined in RFC 9110, which
+  obsoletes RFC 7231 (metadata, not a product token), placed after the agent-framework product token:
+
+  ```text
+  agent-framework-python/1.2.3 (feat=v1.<hex_mask>)
+  ```
+
+  where `<hex_mask>` is lowercase hex, no leading zeros, no `0x` prefix. Example
+  for bits 0, 1, 5 set (`0b100011 = 0x23`):
+
+  ```text
+  agent-framework-python/1.2.3 (feat=v1.23)
+  ```
+
+- **Decoding:** read the **language** from the product token, pick that table;
+  read `vN`, pick that version; test each table bit `n` with `mask & (1 << n)`.
+  Unknown bits (newer SDK than the decoder's copy) are ignored.
+
+## Emission scope (where the mask is sent)
+
+- **Marking is universal:** every feature sets its bit the first time it is used,
+  regardless of provider.
+- **User-Agent `(feat=...)` comment — first-party only, per request.** Stamped
+  only on requests to **Azure / Foundry** endpoints (the telemetry the team can
+  ingest), re-evaluated **per request** so it reflects the live mask. It is
+  **never** sent to third-party providers — a feature fingerprint must not leak
+  into logs we cannot read. See [SPEC-002](002-feature-usage-telemetry.md#emission).
+- **OpenTelemetry: not in v1.** Deferred primarily for privacy (a span attribute
+  would broadcast the fingerprint into the user's general telemetry / third-party
+  APM vendors). Left open behind the version prefix; see
+  [ADR-0027](../decisions/0027-feature-usage-bitmask-user-agent.md#considered-options).
+
+## Bit table — Python (`agent-framework-python`, version 1)
+
+Layout: core feature + provider bits 0–15 (contiguous, with room to grow),
+orchestration patterns 16–21, provider/integration packages from 22.
+
+| Bit | Id | Feature | Marked at (representative) |
+| --- | --- | --- | --- |
+| 0 | `core.agent` | Agent | `agent_framework.Agent` |
+| 1 | `core.harness_agent` | Harness agent | `agent_framework.create_harness_agent` |
+| 2 | `core.workflow` | Workflow engine (custom graphs) | `agent_framework.WorkflowBuilder` |
+| 3 | `core.mcp` | MCP tool (any transport) | `agent_framework.MCPStdioTool` |
+| 4 | `core.tool_approval` | Tool-approval harness | `agent_framework.ToolApprovalMiddleware` |
+| 5 | `core.memory_provider` | Memory context provider | `agent_framework.MemoryContextProvider` |
+| 6 | `core.skills_provider` | Skills provider | `agent_framework.SkillsProvider` |
+| 7 | `core.file_access_provider` | File-access provider | `agent_framework.FileAccessProvider` |
+| 8 | `core.compaction_provider` | Context compaction provider | `agent_framework.CompactionProvider` |
+| 9 | `core.todo_provider` | Todo provider | `agent_framework.TodoProvider` |
+| 10 | `core.agent_mode_provider` | Agent-mode provider | `agent_framework.AgentModeProvider` |
+| 11 | `core.background_agents_provider` | Background-agents provider | `agent_framework.BackgroundAgentsProvider` |
+| 12 | `core.in_memory_history_provider` | In-memory history provider | `agent_framework.InMemoryHistoryProvider` |
+| 13 | `core.file_history_provider` | File history provider | `agent_framework.FileHistoryProvider` |
+| 14–15 | _reserved_ | growth | — |
+| 16 | `orchestration.sequential` | Sequential orchestration | `agent_framework_orchestrations.SequentialBuilder` |
+| 17 | `orchestration.concurrent` | Concurrent orchestration | `agent_framework_orchestrations.ConcurrentBuilder` |
+| 18 | `orchestration.group_chat` | Group-chat orchestration | `agent_framework_orchestrations.GroupChatBuilder` |
+| 19 | `orchestration.magentic` | Magentic orchestration | `agent_framework_orchestrations.MagenticBuilder` |
+| 20 | `orchestration.handoff` | Handoff orchestration | `agent_framework_orchestrations.HandoffBuilder` |
+| 21 | _reserved_ | growth | — |
+| 22 | `foundry.chat_client` | Foundry chat client | `agent_framework_foundry` `RawFoundryChatClient` |
+| 23 | `foundry.agent` | Foundry agent | `agent_framework_foundry.FoundryAgent` |
+| 24 | `foundry.memory` | Foundry memory provider | `agent_framework_foundry.FoundryMemoryProvider` |
+| 25 | `foundry_local` | Foundry Local client | `agent_framework_foundry_local.FoundryLocalClient` |
+| 26 | `foundry_hosting` | Foundry hosting layer | `agent_framework_foundry_hosting` |
+| 27 | `openai` | OpenAI clients | `agent_framework_openai` |
+| 28 | `anthropic` | Anthropic clients | `agent_framework_anthropic` |
+| 29 | `bedrock` | AWS Bedrock clients | `agent_framework_bedrock` |
+| 30 | `gemini` | Gemini chat client | `agent_framework_gemini` |
+| 31 | `mistral` | Mistral embedding client | `agent_framework_mistral` |
+| 32 | `ollama` | Ollama clients | `agent_framework_ollama` |
+| 33 | `claude` | Claude Agent SDK agent | `agent_framework_claude` |
+| 34 | `copilotstudio` | Copilot Studio agent | `agent_framework_copilotstudio` |
+| 35 | `github_copilot` | GitHub Copilot agent | `agent_framework_github_copilot` |
+| 36 | `azure_ai_search` | Azure AI Search context provider | `agent_framework_azure_ai_search` |
+| 37 | `azure_cosmos` | Azure Cosmos history / checkpoint store | `agent_framework_azure_cosmos` |
+| 38 | `azure_contentunderstanding` | Azure Content Understanding context provider | `agent_framework_azure_contentunderstanding` |
+| 39 | `redis` | Redis context / history provider | `agent_framework_redis` |
+| 40 | `mem0` | Mem0 memory provider | `agent_framework_mem0` |
+| 41 | `purview` | Purview client | `agent_framework_purview` |
+| 42 | `a2a` | A2A agent / executor | `agent_framework_a2a` |
+| 43 | `ag_ui` | AG-UI chat client / agent | `agent_framework_ag_ui` |
+| 44 | `chatkit` | ChatKit integration | `agent_framework_chatkit` |
+| 45 | `devui` | DevUI served | `agent_framework_devui` |
+| 46 | `declarative` | Declarative agent / workflow | `agent_framework_declarative` |
+| 47 | `durabletask` | Durable task runtime | `agent_framework_durabletask` |
+| 48 | `azurefunctions` | Azure Functions agent host | `agent_framework_azurefunctions` |
+| 49 | `tools` | Shell tools | `agent_framework_tools.shell` |
+| 50 | `monty` | Monty CodeAct provider | `agent_framework_monty` |
+| 51 | `hyperlight` | Hyperlight CodeAct provider | `agent_framework_hyperlight` |
+| 52–63 | _reserved_ | future packages | — |
+
+## Bit table — .NET (`agent-framework-dotnet`, version 1)
+
+| Bit | Id | Feature | Marked at (representative) |
+| --- | --- | --- | --- |
+| 0 | `core.agent` | Agent | `Microsoft.Agents.AI.ChatClientAgent` |
+| 1 | `core.harness_agent` | Harness agent | `Microsoft.Agents.AI.HarnessAgent` |
+| 2 | `core.workflow` | Workflow engine (custom graphs) | `Microsoft.Agents.AI.Workflows.WorkflowBuilder` |
+| 3 | `core.tool_approval` | Tool-approval agent | `Microsoft.Agents.AI.ToolApprovalAgent` |
+| 4 | `core.chat_history_memory_provider` | Chat-history memory provider | `Microsoft.Agents.AI.ChatHistoryMemoryProvider` |
+| 5 | `core.file_memory_provider` | File memory provider | `Microsoft.Agents.AI.FileMemoryProvider` |
+| 6 | `core.text_search_provider` | Text-search provider | `Microsoft.Agents.AI.TextSearchProvider` |
+| 7 | `core.file_access_provider` | File-access provider | `Microsoft.Agents.AI.FileAccessProvider` |
+| 8 | `core.skills_provider` | Skills provider | `Microsoft.Agents.AI.AgentSkillsProviderBuilder` |
+| 9 | `core.compaction_provider` | Context compaction provider | `Microsoft.Agents.AI.Compaction.CompactionProvider` |
+| 10 | `core.todo_provider` | Todo provider | `Microsoft.Agents.AI.TodoProvider` |
+| 11 | `core.agent_mode_provider` | Agent-mode provider | `Microsoft.Agents.AI.AgentModeProvider` |
+| 12 | `core.background_agents_provider` | Background-agents provider | `Microsoft.Agents.AI.BackgroundAgentsProvider` |
+| 13 | `core.in_memory_history_provider` | In-memory history provider | `Microsoft.Agents.AI.InMemoryChatHistoryProvider` |
+| 14–15 | _reserved_ | growth | — |
+| 16 | `orchestration.sequential` | Sequential orchestration | `Microsoft.Agents.AI.Workflows.SequentialWorkflowBuilder` |
+| 17 | `orchestration.concurrent` | Concurrent orchestration | `Microsoft.Agents.AI.Workflows.ConcurrentWorkflowBuilder` |
+| 18 | `orchestration.group_chat` | Group-chat orchestration | `Microsoft.Agents.AI.Workflows.GroupChatWorkflowBuilder` |
+| 19 | `orchestration.magentic` | Magentic orchestration | `Microsoft.Agents.AI.Workflows.MagenticWorkflowBuilder` |
+| 20 | `orchestration.handoff` | Handoff orchestration | `Microsoft.Agents.AI.Workflows.HandoffWorkflowBuilder` |
+| 21 | _reserved_ | growth | — |
+| 22 | `foundry.chat_client` | Foundry chat client | `Microsoft.Agents.AI.Foundry.FoundryChatClient` |
+| 23 | `foundry.agent` | Foundry agent | `Microsoft.Agents.AI.Foundry.FoundryAgent` |
+| 24 | `foundry.memory` | Foundry memory provider | `Microsoft.Agents.AI.Foundry.FoundryMemoryProvider` |
+| 25 | `foundry_hosting` | Foundry hosting layer | `Microsoft.Agents.AI.Foundry.Hosting` |
+| 26 | `openai` | OpenAI integration | `Microsoft.Agents.AI.OpenAI` |
+| 27 | `anthropic` | Anthropic integration | `Microsoft.Agents.AI.Anthropic` |
+| 28 | `copilotstudio` | Copilot Studio agent | `Microsoft.Agents.AI.CopilotStudio.CopilotStudioAgent` |
+| 29 | `github_copilot` | GitHub Copilot agent | `Microsoft.Agents.AI.GitHub.Copilot.GitHubCopilotAgent` |
+| 30 | `azure_cosmos` | Cosmos history / checkpoint store | `Microsoft.Agents.AI.CosmosChatHistoryProvider` |
+| 31 | `valkey` | Valkey chat-history provider | `Microsoft.Agents.AI.Valkey.ValkeyChatHistoryProvider` |
+| 32 | `mem0` | Mem0 memory provider | `Microsoft.Agents.AI.Mem0.Mem0Provider` |
+| 33 | `purview` | Purview integration | `Microsoft.Agents.AI.Purview` |
+| 34 | `a2a` | A2A agent | `Microsoft.Agents.AI.A2A.A2AAgent` |
+| 35 | `ag_ui` | AG-UI chat client | `Microsoft.Agents.AI.AGUI.AGUIChatClient` |
+| 36 | `devui` | DevUI served | `Microsoft.Agents.AI.DevUI` |
+| 37 | `declarative` | Declarative agent factory | `Microsoft.Agents.AI.ChatClientPromptAgentFactory` |
+| 38 | `durabletask` | Durable task runtime | `Microsoft.Agents.AI.DurableTask` |
+| 39 | `azurefunctions` | Azure Functions agent host | `Microsoft.Agents.AI.Hosting.AzureFunctions` |
+| 40 | `tools` | Shell tools | `Microsoft.Agents.AI.Tools.Shell.ShellExecutor` |
+| 41 | `hyperlight` | Hyperlight CodeAct provider | `Microsoft.Agents.AI.Hyperlight.HyperlightCodeActProvider` |
+| 42 | `hosting` | Generic AF hosting | `Microsoft.Agents.AI.Hosting` |
+| 43–63 | _reserved_ | future packages | — |
+
+## Opt-out
+
+The mask is part of the User-Agent contribution, so the existing flag covers it —
+no dedicated flag in v1:
+
+- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the entire Agent
+  Framework User-Agent contribution (mask included).
+
+(If a privacy review later requires keeping the base UA while dropping only the
+mask, a dedicated flag can be added then.)
+
+## Governance
+
+1. One bit per package/feature, **numbered independently per language**, in the
+   table for that language. New bits are added by editing this file in a reviewed
+   PR; bits are never reused within a `(language, version)`.
+2. The **`FeatureBit` enum in each SDK is the source of truth**; the matching
+   table here is the published contract. Add the enum member and the table row in
+   the same PR — review keeps them aligned (no generated artifact).
+3. Adding a feature: add the enum member, add the table row, mark it at the call
+   site (the `Raw*` base / entry point so wrappers inherit it).
+4. Widening beyond 64-bit or re-partitioning bumps that language's version; old
+   decoders keep working because the version prefix disambiguates the mapping.
+
+> **No machine-readable registry file ships today.** Nothing consumes one at
+> runtime (each SDK owns its enum). If/when a programmatic decoder is built, this
+> table is the contract to export to JSON for it then.
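Should such a decoder be built, its core could look roughly like the sketch below. `PYTHON_V1` is a hypothetical, hand-copied excerpt of the Python table above (a few rows only), and `decode_mask` is an illustrative name, not an SDK function.

```python
# Hypothetical excerpt of the Python v1 table: bit index -> id.
PYTHON_V1 = {
    0: "core.agent",
    1: "core.harness_agent",
    2: "core.workflow",
    3: "core.mcp",
    5: "core.memory_provider",
    22: "foundry.chat_client",
}


def decode_mask(hex_mask: str, table: dict[int, str]) -> list[str]:
    """Return the ids of every registered bit set in a lowercase-hex mask."""
    mask = int(hex_mask, 16)
    # Bits absent from the table (newer SDK, reserved ranges) are skipped.
    return [name for bit, name in sorted(table.items()) if mask & (1 << bit)]


print(decode_mask("23", PYTHON_V1))
# ['core.agent', 'core.harness_agent', 'core.memory_provider']
```

Iterating over the table rather than over the 64 bit positions is what makes unknown bits fall away without special handling.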

From e5cc3d0abc2ee66c507dbdcd132c4f4902fa2cf0 Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Fri, 12 Jun 2026 16:37:56 +0200
Subject: [PATCH 2/6] docs: fix dead links to removed registry JSON in ADR-0027

The registry JSON was consolidated into feature-usage-bit-registry.md; point
the ADR's two remaining links at the markdown instead of the deleted file
(fixes markdown-link-check 404s).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/decisions/0027-feature-usage-bitmask-user-agent.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
index d9ebff92d6c..b806478ea57 100644
--- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -19,8 +19,8 @@ can we collect a lightweight, privacy-respecting signal of feature usage for the
 traffic we can actually read, without standing up new event pipelines?
 
 The detailed mechanism is in [SPEC-002](../specs/002-feature-usage-telemetry.md);
-the per-language bit lists are in
-[`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json).
+the per-language bit tables are in
+[feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md).
 
 ## Decision Drivers
 
@@ -274,8 +274,7 @@ These are unresolved and should be decided before/at approval:
 ## More Information
 
 - Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md)
-- Per-language bit lists: [`docs/feature-usage-bit-registry.json`](../feature-usage-bit-registry.json)
-- Encoding / opt-out / governance prose: [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md)
+- Per-language bit tables, encoding, opt-out, governance: [feature-usage-bit-registry.md](../specs/feature-usage-bit-registry.md)
 - Existing accumulator pattern: `python/packages/core/agent_framework/_telemetry.py`
 - .NET emission policies: `dotnet/src/Microsoft.Agents.AI.Foundry/AgentFrameworkUserAgentPolicy.cs`,
   `dotnet/src/Microsoft.Agents.AI.Foundry.Hosting/HostedAgentUserAgentPolicy.cs`

From 676d8845795c864bcd23032e30655824adbe4e10 Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Fri, 12 Jun 2026 16:42:26 +0200
Subject: [PATCH 3/6] =?UTF-8?q?docs:=20address=20review=20=E2=80=94=20drop?=
 =?UTF-8?q?=20JSON-parity=20wording,=20clarify=20per-language=20decode?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- ADR option J: the parity test compares the enum against the per-language table
  in the registry doc, not a (now-removed) JSON file.
- Spec .NET mapping: the wire format is shared, but the mask is decoded
  per-language (select the table via the UA product token) — fixes the
  "decoded numbers mean the same thing in both SDKs" wording that conflicted
  with the per-language, non-synchronized bit indexes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/decisions/0027-feature-usage-bitmask-user-agent.md | 4 ++--
 docs/specs/002-feature-usage-telemetry.md               | 8 +++++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
index b806478ea57..948fd98362a 100644
--- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -139,8 +139,8 @@ already present in the UA product token.
 
 #### J. Hand-written enum + parity test (chosen)
 
-- Good, ~40 members that change a few times a year; a 10-line test (enum vs JSON
-  list) is enough.
+- Good, ~40 members that change a few times a year; a 10-line test (the enum vs
+  the per-language table in the registry doc) is enough.
 - Good, no build step, no generator to own.
 
 #### K. Code-generate the enums from the registry
diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md
index e2b52cdbfcf..68bfbba90ae 100644
--- a/docs/specs/002-feature-usage-telemetry.md
+++ b/docs/specs/002-feature-usage-telemetry.md
@@ -281,9 +281,11 @@ AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py
   `(feat=...)` comment, and register the feat-stamping policy **only on
   Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party
   `IChatClient`s.
-- Same `v<version>.<hex>` comment format ⇒ decoded numbers mean the same thing in
-  both SDKs. (.NET's policy was already per-request, so there is no Python/.NET
-  timing asymmetry.)
+- Same **wire format** (`v<version>.<hex>` comment, hex encoding) in both SDKs —
+  but the **mask is decoded per language**: indexes are not shared, so a decoder
+  must read the language from the UA product token and select that language's
+  table before decoding. (.NET's policy was already per-request, so there is no
+  Python/.NET timing asymmetry.)
 
 ## Keeping the bitmap in sync
 
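The per-language decode step described in the hunk above could start with a parse like the sketch below. The regex and the `parse_feature_comment` name are assumptions for illustration; the only inputs taken from the spec are the `agent-framework-<language>/<version>` product token and the `(feat=v<version>.<hex>)` comment.

```python
import re

_UA_PATTERN = re.compile(
    r"agent-framework-(?P<lang>[a-z]+)/\S+"                # product token names the language
    r"(?: \(feat=v(?P<ver>\d+)\.(?P<mask>[0-9a-f]+)\))?"   # optional feature comment
)


def parse_feature_comment(user_agent: str):
    """Return (language, registry_version, mask), or None if no mask is present."""
    match = _UA_PATTERN.search(user_agent)
    if match is None or match.group("mask") is None:
        return None
    return match.group("lang"), int(match.group("ver")), int(match.group("mask"), 16)


print(parse_feature_comment("agent-framework-python/1.2.3 (feat=v1.2a)"))
# ('python', 1, 42)
print(parse_feature_comment("agent-framework-dotnet/1.0.0"))
# None
```

The returned language and version together select which table the integer mask is tested against.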

From c945b33a137eb091988a69785e87a104993fefc3 Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Mon, 15 Jun 2026 17:48:33 +0200
Subject: [PATCH 4/6] docs: add dedicated mask-only opt-out env var
 (AGENT_FRAMEWORK_FEATURE_MASK_DISABLED)

Re-introduce a dedicated opt-out that disables only the feature mask while keeping
the base agent-framework-<lang>/{version} User-Agent, alongside the existing
AGENT_FRAMEWORK_USER_AGENT_DISABLED (whole UA). Updates the spec accumulator gate,
API surface, opt-out table and examples; the registry opt-out section; and the
ADR (decision outcome, consequences, open questions -> decided).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../0027-feature-usage-bitmask-user-agent.md  | 41 ++++++----
 docs/specs/002-feature-usage-telemetry.md     | 74 ++++++++++++-------
 docs/specs/feature-usage-bit-registry.md      | 15 ++--
 3 files changed, 83 insertions(+), 47 deletions(-)

diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
index 948fd98362a..7c1094f0d5b 100644
--- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -191,18 +191,22 @@ This is the smallest design that answers the question. A 64-bit mask accumulates
 from universal `mark_feature_used()` calls; the token is stamped per request only
 on Azure/Foundry clients (live, no third-party leak); each SDK owns an
 independent bit list selected by the language already in the UA; the mask is
-rendered as hex (`feat=v1.8410005`). OTel (C) is deferred — mainly because a
-broadly-emitted span attribute would leak the fingerprint into the user's general
-telemetry, against the first-party-only stance — but left open behind the version
-prefix. Per-construct granularity (G), a shared registry (I), codegen (K), and the
-decimal/binary/base-N representations (L, N, O) are rejected as complexity or
-length the problem does not require.
+rendered as hex (`feat=v1.8410005`). **Two opt-out env vars are provided:** a
+dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only the mask while
+keeping the base SDK identity/version User-Agent, and the existing
+`AGENT_FRAMEWORK_USER_AGENT_DISABLED` that drops the whole contribution. OTel (C)
+is deferred — mainly because a broadly-emitted span attribute would leak the
+fingerprint into the user's general telemetry, against the first-party-only
+stance — but left open behind the version prefix. Per-construct granularity (G),
+a shared registry (I), codegen (K), and the decimal/binary/base-N representations
+(L, N, O) are rejected as complexity or length the problem does not require.
 
 ### Consequences
 
 - Good, adds usage signal at near-zero cost, no new data flow, few moving parts.
-- Good, transparent (public registry, human-decodable token) and disabled by the
-  existing User-Agent opt-out.
+- Good, transparent (public registry, human-decodable token) and disabled by
+  **two** opt-out env vars: a dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`
+  (mask only) and the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (whole UA).
 - Good, first-party-only + per-request emission gives a live mask and no
   third-party fingerprint leak.
 - Good, 64-bit keeps .NET lock-free; per-language lists remove all cross-language
@@ -257,20 +261,25 @@ These are unresolved and should be decided before/at approval:
 1. **Privacy / telemetry-acceptance review (blocking).** Is a coarse,
    first-party-only, opt-out-able feature-combination mask acceptable telemetry?
    Even scoped, it transits intermediaries and is a deployment fingerprint. This
-   is a **release precondition**. Possible outcomes that would change the design:
-   require a dedicated opt-out flag (Q2), coarser granularity, hashing, or
-   explicit opt-in.
-2. **Dedicated opt-out flag?** v1 reuses `AGENT_FRAMEWORK_USER_AGENT_DISABLED`
-   (mask dies with the whole UA). Do we add a mask-only flag now (keep base UA,
-   drop the fingerprint), or wait until asked / until the privacy review requires
-   it?
-3. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a
+   is a **release precondition**. Possible outcomes that would further change the
+   design: coarser granularity, hashing, or explicit opt-in (a dedicated mask-only
+   opt-out flag is already included — see below).
+2. **When (if ever) to add the OTel path?** Held back mainly for **privacy**: a
    span attribute broadcasts the fingerprint into the user's general telemetry
    and onward to third-party APM vendors, contradicting the first-party-only
    stance. It also carries a metric-cardinality hazard. Would the privacy review
    allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when
    to revisit.
 
+### Decided
+
+- **Dedicated opt-out flag — included.** In addition to the existing
+  `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (drops the whole UA), v1 ships
+  `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`, which drops **only** the feature mask
+  while keeping the base SDK identity/version User-Agent. This lets a
+  privacy-conscious user withhold the usage signal without losing the
+  support/compat value of the SDK-version header.
+
 ## More Information
 
 - Mechanism & API: [SPEC-002](../specs/002-feature-usage-telemetry.md)
diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md
index 68bfbba90ae..c09d48e635f 100644
--- a/docs/specs/002-feature-usage-telemetry.md
+++ b/docs/specs/002-feature-usage-telemetry.md
@@ -46,8 +46,10 @@ per-request requirement below). Secondary: ability to break down first-party
 traffic by feature combination (e.g. "% of Foundry traffic that also uses
 workflows").
 
-This is done **transparently**: the bit registry is public and the emitted value
-is human-decodable, and the existing User-Agent opt-out disables it.
+This is done **transparently**: the bit registry is public, the emitted value is
+human-decodable, and two env vars disable it — a dedicated
+`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` (mask only) and the existing
+`AGENT_FRAMEWORK_USER_AGENT_DISABLED` (whole User-Agent).
 
 ## What is the problem being solved?
 
@@ -66,29 +68,38 @@ we already send is far cheaper and easier to reason about for privacy.
 The accumulator and its helpers live in the existing
 `agent_framework/_telemetry.py` (alongside `get_user_agent()` /
 `prepend_agent_framework_to_user_agent()`), so the User-Agent machinery stays in
-one module. It owns a process-global 64-bit accumulator. The existing
-`AGENT_FRAMEWORK_USER_AGENT_DISABLED` flag (`IS_TELEMETRY_ENABLED` in that module)
-already gates the whole User-Agent contribution, so it gates the mask too — no
-new env var:
+one module. It owns a process-global 64-bit accumulator. Two env vars can disable
+it: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` (which drops the whole
+User-Agent contribution, mask included), and a **dedicated**
+`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops **only** the feature mask while
+keeping the base `agent-framework-python/{version}` User-Agent:
 
 ```python
 # agent_framework/_telemetry.py (same module as get_user_agent)
 # IS_TELEMETRY_ENABLED already defined here (AGENT_FRAMEWORK_USER_AGENT_DISABLED)
 
+FEATURE_MASK_DISABLED_ENV_VAR = "AGENT_FRAMEWORK_FEATURE_MASK_DISABLED"
 REGISTRY_VERSION = 1
 
 _feature_mask = 0
 _feature_mask_lock = threading.Lock()
 
 
+def _feature_mask_enabled() -> bool:
+    """Mask is on unless the UA is disabled or the dedicated flag is set."""
+    if not IS_TELEMETRY_ENABLED:
+        return False
+    return os.environ.get(FEATURE_MASK_DISABLED_ENV_VAR, "false").lower() not in ("true", "1")
+
+
 def mark_feature_used(bit: int) -> None:
     """OR a feature bit into the process-global mask.
 
     Called the first time a feature is exercised. Cheap and idempotent;
-    a no-op when the User-Agent contribution is disabled.
+    a no-op when the feature mask is disabled.
     """
     global _feature_mask
-    if not IS_TELEMETRY_ENABLED:
+    if not _feature_mask_enabled():
         return
     with _feature_mask_lock:
         _feature_mask |= 1 << bit
@@ -96,7 +107,7 @@ def mark_feature_used(bit: int) -> None:
 
 def get_feature_token() -> str | None:
     """Return ``v<version>.<hex_mask>`` for the accumulated mask, or None."""
-    if not IS_TELEMETRY_ENABLED or _feature_mask == 0:
+    if not _feature_mask_enabled() or _feature_mask == 0:
         return None
     return f"v{REGISTRY_VERSION}.{_feature_mask:x}"
 ```
@@ -215,9 +226,10 @@ New public surface in `agent-framework-core` (exported from
   used by first-party per-request hooks.
 - `FeatureBit` (IntEnum) — hand-written source of truth for the Python bit list
   (see [Keeping the bitmap in sync](#keeping-the-bitmap-in-sync)).
+- `FEATURE_MASK_DISABLED_ENV_VAR` constant — the dedicated mask-only opt-out env
+  var name (`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`).
 
-No new env var: the existing `AGENT_FRAMEWORK_USER_AGENT_DISABLED` disables the
-mask along with the rest of the User-Agent contribution.
+Two independent opt-outs gate the mask; see [Opt-out](#opt-out).
 
 Behavioural change to existing API:
 
@@ -231,15 +243,17 @@ first-party client, output is byte-for-byte identical to today.
 
 ## Opt-out
 
-The mask is part of the User-Agent contribution, so the existing flag covers it —
-no new env var in v1:
+Two independent env vars, so users can drop just the mask or the whole UA:
 
 | Env var | Effect |
 | --- | --- |
+| `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` | disables **only** the feature mask; the base `agent-framework-python/{version}` User-Agent is still sent |
 | `AGENT_FRAMEWORK_USER_AGENT_DISABLED` | disables the **entire** AF User-Agent contribution, mask included |
 
-(If a privacy review later requires keeping the base UA while dropping only the
-mask, a dedicated flag can be added then — not built speculatively now.)
+Both accept `true`/`1` (case-insensitive). The dedicated flag lets a
+privacy-conscious user keep contributing the SDK identity/version (useful for
+support and compat triage) while withholding the feature-usage signal. The mask
+is also disabled implicitly whenever the whole User-Agent is.
 
 ## E2E example
 
@@ -262,7 +276,14 @@ await other.run("Hi")
 #   User-Agent: agent-framework-python/1.2.3
 ```
 
-Disabling the User-Agent contribution (mask included):
+Drop only the feature mask (keep the base User-Agent):
+
+```bash
+AGENT_FRAMEWORK_FEATURE_MASK_DISABLED=true python app.py
+# Foundry request User-Agent: agent-framework-python/1.2.3   (no (feat=...) comment)
+```
+
+Drop the entire User-Agent contribution (mask included):
 
 ```bash
 AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py
@@ -281,11 +302,12 @@ AGENT_FRAMEWORK_USER_AGENT_DISABLED=true python app.py
   `(feat=...)` comment, and register the feat-stamping policy **only on
   Azure/Foundry clients** (e.g. `FoundryChatClient`), not on third-party
   `IChatClient`s.
-- Same **wire format** (`v<version>.<hex>` comment, hex encoding) in both SDKs —
-  but the **mask is decoded per language**: indexes are not shared, so a decoder
-  must read the language from the UA product token and select that language's
-  table before decoding. (.NET's policy was already per-request, so there is no
-  Python/.NET timing asymmetry.)
+- Same **wire format** (`v<version>.<hex>` comment, hex encoding) and the same
+  two opt-out env vars (`AGENT_FRAMEWORK_FEATURE_MASK_DISABLED`,
+  `AGENT_FRAMEWORK_USER_AGENT_DISABLED`) in both SDKs — but the **mask is decoded
+  per language**: indexes are not shared, so a decoder must read the language from
+  the UA product token and select that language's table before decoding. (.NET's
+  policy was already per-request, so there is no Python/.NET timing asymmetry.)
 
 ## Keeping the bitmap in sync
 
@@ -339,15 +361,17 @@ the table) are ignored.
    the .NET table); `FeatureUsage.MarkUsed` with lock-free `Interlocked.Or`;
    extend the existing per-request UA policy to stamp `(feat=...)` **only on
    Azure/Foundry clients**. The .NET enum is **independent** of Python's.
-5. **Docs & tests** — update package `AGENTS.md`/skills; tests for the UA opt-out,
-   first-party scoping, and the live (non-frozen) UA.
+5. **Docs & tests** — update package `AGENTS.md`/skills; tests for **both**
+   opt-out env vars (mask-only and whole-UA), first-party scoping, and the live
+   (non-frozen) UA.
 
 ## Limitations & open questions
 
 The decision-level limitations and unresolved trade-offs — privacy review
 (blocking), reach, per-process (not per-call) attribution, coarse granularity,
-fingerprinting residue, and the dedicated-opt-out / OTel questions — are owned by
-the ADR. See **[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)**
+fingerprinting residue, and the OTel question — are owned by the ADR (the
+dedicated mask-only opt-out is now decided and included). See
+**[ADR-0027 → Limitations](../decisions/0027-feature-usage-bitmask-user-agent.md#limitations)**
 and **[Open Questions](../decisions/0027-feature-usage-bitmask-user-agent.md#open-questions-for-decider-discussion)**.
 This spec is the implementation reference; it does not re-litigate those choices.
 
diff --git a/docs/specs/feature-usage-bit-registry.md b/docs/specs/feature-usage-bit-registry.md
index ce70d669280..6c486ea7586 100644
--- a/docs/specs/feature-usage-bit-registry.md
+++ b/docs/specs/feature-usage-bit-registry.md
@@ -11,7 +11,8 @@ mask back into feature names. Keep the enum and the matching table in sync in th
 same PR — review is the check; there is no generated artifact.
 
 This telemetry is intentionally **transparent**: this registry is public, the
-emitted value is human-decodable, and the existing User-Agent opt-out disables it.
+emitted value is human-decodable, and two env vars disable it (mask-only or the
+whole User-Agent — see [Opt-out](#opt-out)).
 
 ## What is collected
 
@@ -183,14 +184,16 @@ orchestration patterns 16–21, provider/integration packages from 22.
 
 ## Opt-out
 
-The mask is part of the User-Agent contribution, so the existing flag covers it —
-no dedicated flag in v1:
+Two independent environment variables disable the mask:
 
-- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the entire Agent
+- `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED=true|1` — drops **only** the feature
+  mask; the base `agent-framework-<lang>/{version}` User-Agent is still sent.
+- `AGENT_FRAMEWORK_USER_AGENT_DISABLED=true|1` — suppresses the **entire** Agent
   Framework User-Agent contribution (mask included).
 
-(If a privacy review later requires keeping the base UA while dropping only the
-mask, a dedicated flag can be added then.)
+The dedicated flag lets a privacy-conscious user keep contributing SDK
+identity/version (useful for support and compatibility triage) while withholding
+the feature-usage signal.
 
 ## Governance
 

From e075a78625d738d281a1b79fa4aa4d3d5955cb64 Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Mon, 15 Jun 2026 17:52:22 +0200
Subject: [PATCH 5/6] docs: add prior-art comparison (AWS botocore m/,
 Stainless, Azure, etc.)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add a Prior art section to ADR-0027 surveying how comparable SDKs encode
identity/usage in the User-Agent or sidecar headers, with citations:

- AWS botocore `m/` feature-code list — the direct analog (per-request,
  usage-based feature flags in the UA); contrasts short-code set vs our hex
  bitmask.
- OpenAI/Anthropic Stainless `X-Stainless-*` headers (static identity).
- Azure azure-core UserAgentPolicy + AZURE_TELEMETRY_DISABLED.
- Google x-goog-api-client; LangSmith version token + tracing opt-in.

Also add an Open Question on honoring the cross-tool DO_NOT_TRACK convention.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../0027-feature-usage-bitmask-user-agent.md  | 47 +++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
index 7c1094f0d5b..08fb3bd7c8e 100644
--- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -216,6 +216,47 @@ a shared registry (I), codegen (K), and the decimal/binary/base-N representation
 - Bad, each feature must add a `mark_feature_used()` call, and first-party clients
   need a per-request hook (small, mirrors existing patterns).
 
+## Prior art
+
+SDK telemetry in the User-Agent is well established; this design is closest to
+AWS's and otherwise conventional. Summary of what comparable SDKs do:
+
+| SDK | What's in the UA / headers | Usage-based? | Opt-out | Closest to ours? |
+| --- | --- | --- | --- | --- |
+| **AWS botocore** | structured UA with an `m/` token: a per-request set of **short feature codes** for features actually exercised (`WAITER`→`B`, `PAGINATOR`→`C`, retry mode, checksums, credential source, …) | **Yes** — registered at call time via `register_feature_id`, contextvar-scoped per request | `AWS_SDK_UA_APP_ID` sets app id (no opt-out for `m/`) | **Yes — direct analog** |
+| **OpenAI / Anthropic** (Stainless) | sidecar `X-Stainless-*` headers: lang, package version, OS, arch, runtime, runtime version; plus per-request `x-stainless-retry-count`, `x-stainless-read-timeout` | Mostly static identity (retry/timeout are per-request) | none | No (static identity) |
+| **Azure SDK** (`azure-core`) | `User-Agent: azsdk-python-{pkg}/{ver} Python/{pyver} ({platform})` | No | `AZURE_TELEMETRY_DISABLED` (tracing spans only, **not** the UA) | No |
+| **Google API core** | `x-goog-api-client: gl-python/… grpc/… gax/… gapic/…` | No | none | No |
+| **LangSmith** | `User-Agent: langsmith-py/{ver}`; usage lives in trace payloads | No (header) | opt-in via `LANGSMITH_TRACING_V2`/`LANGCHAIN_TRACING_V2`; `…HIDE_INPUTS/OUTPUTS` | No |
+
+Takeaways that shaped (or validate) our choices:
+
+- **AWS `m/` is the precedent for usage-based feature flags in a first-party
+  User-Agent.** It validates the core idea. Its key *difference* is the encoding:
+  AWS uses a **comma-separated set of 1–2 char short codes** (open-ended, no bit
+  coordination, but variable length), whereas we use a fixed-width **hex
+  bitmask** (compact, bounded, decode-by-AND, but needs per-language bit
+  allocation). We keep the bitmask for boundedness and trivial AND-decoding;
+  AWS's short-code set is recorded as a viable alternative if bit-position
+  coordination ever becomes painful (it would also drop the 64-bit ceiling).
+- **First-party-only emission** is stricter than any of the above; the closest in
+  spirit is Stainless headers, which only reach the owning API. We make the
+  hostname/endpoint allowlist explicit (Azure/Foundry only).
+- **Opt-out naming.** `AZURE_TELEMETRY_DISABLED` is the family precedent for our
+  `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool
+  the cross-tool `DO_NOT_TRACK` convention (honored by e.g.
+  HuggingFace Hub) is worth considering — see Open Questions.
+
+Sources: botocore [`useragent.py`](https://github.com/boto/botocore/blob/develop/botocore/useragent.py)
+(`_USERAGENT_FEATURE_MAPPINGS`, `register_feature_id`, `_build_feature_metadata`);
+openai-python [`_base_client.py` `platform_headers()`](https://github.com/openai/openai-python/blob/main/src/openai/_base_client.py);
+anthropic-sdk-python [`_base_client.py`](https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/_base_client.py);
+azure-core [`_universal.py` `UserAgentPolicy`](https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/azure/core/pipeline/policies/_universal.py);
+google-api-core [`client_info.py`](https://github.com/googleapis/python-api-core/blob/main/google/api_core/client_info.py);
+langsmith-sdk [`client.py`](https://github.com/langchain-ai/langsmith-sdk/blob/main/python/langsmith/client.py) /
+[`utils.py`](https://github.com/langchain-ai/langsmith-sdk/blob/main/python/langsmith/utils.py);
+huggingface_hub [`constants.py`](https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/constants.py).
+
 ## Registry versioning and migration (v1 → v2)
 
 The token carries a **per-language** version (`feat=v1.<hex>`); a version bump is
@@ -270,6 +311,12 @@ These are unresolved and should be decided before/at approval:
    stance. It also carries a metric-cardinality hazard. Would the privacy review
    allow a broadly-emitted mask, a scoped/redacted variant, or none? Decide if/when
    to revisit.
+3. **Honor the cross-tool `DO_NOT_TRACK` convention?** Several ecosystems treat
+   `DO_NOT_TRACK=1` as a universal telemetry opt-out (HuggingFace Hub honors it;
+   see [Prior art](#prior-art)). Should our opt-out also respect `DO_NOT_TRACK`
+   (in addition to the two `AGENT_FRAMEWORK_*` flags)? Cheap to add and
+   community-friendly, but it widens the opt-out surface and needs a clear
+   precedence rule. Recommend yes; confirm with the deciders.
 
 ### Decided
 

From 2ef56b8d89673cbef0666dfdf94642947631aecf Mon Sep 17 00:00:00 2001
From: eavanvalkenburg <github@vanvalkenburg.eu>
Date: Mon, 15 Jun 2026 19:07:43 +0200
Subject: [PATCH 6/6] docs: fold in botocore lessons; record accumulation-scope
 decision
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

botocore's m/ feature list scopes features to a per-request contextvars set that
resets between calls — clean per-call attribution, but it assumes every feature
lives inside a service request. That holds for an SDK natively bound to its own
services; it does not for us, where many features (agent/workflow/provider
construction, session setup) are not bound to any request.

- ADR: add Accumulation scope options — P (process-global monotonic, chosen) vs
  Q (botocore per-request set, rejected) with the request-binding rationale;
  reference P in the decision; reframe the "no per-call attribution" limitation
  as a deliberate scope choice.
- ADR Prior art: bitmask gives bounded token size for free (vs botocore's
  1024-byte cap + truncation); mechanism is private, wire format is the contract;
  fix a duplicated phrase.
- Spec: note the mask is process-global, monotonic, never reset (intentional,
  lock/Interlocked.Or-safe), the token is safe-by-construction (no sanitization),
  and the helpers are private API.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../0027-feature-usage-bitmask-user-agent.md  | 93 +++++++++++++++----
 docs/specs/002-feature-usage-telemetry.md     | 14 +++
 2 files changed, 90 insertions(+), 17 deletions(-)

diff --git a/docs/decisions/0027-feature-usage-bitmask-user-agent.md b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
index 08fb3bd7c8e..c34769222eb 100644
--- a/docs/decisions/0027-feature-usage-bitmask-user-agent.md
+++ b/docs/decisions/0027-feature-usage-bitmask-user-agent.md
@@ -91,6 +91,41 @@ re-evaluate it per request.
 - Bad, measures installation, not usage; cannot capture feature combinations —
   does not solve the problem.
 
+### Accumulation scope
+
+#### P. Process-global, monotonic mask (chosen)
+
+A single mask per process; bits are OR-ed in as features are first used and never
+cleared. The token reflects "what this process has used so far."
+
+- Good, fits our **mixed feature lifecycle**: many features are *not* bound to an
+  outbound request — an `Agent`, a workflow/orchestration, or a context/history
+  provider is constructed once and lives for the session/process. A process-wide
+  mask is the only scope that can represent them at all.
+- Good, trivial and cheap: one OR under a lock (Python) / `Interlocked.Or`
+  (.NET); no per-request state plumbing.
+- Neutral, coarser than per-call — early requests carry fewer bits than later
+  ones, and the token says "this process used X", not "this call used X".
+
+#### Q. Per-request set, reset between calls (botocore's model — rejected)
+
+AWS botocore scopes its `m/` feature codes to a `contextvars` set that is reset
+between requests, giving exact per-call attribution (and it deliberately no-ops
+when called outside a request context to avoid features bleeding across requests).
+See [Prior art](#prior-art).
+
+- Good, exact per-call attribution directly in the User-Agent.
+- Bad, **assumes every feature is exercised inside a single service request** —
+  true for botocore (an SDK natively bound to AWS service calls), but *not* for
+  us. Our features split into request-scoped ones (a chat call, an MCP tool
+  invocation) and decidedly non-request ones (agent/workflow/provider
+  construction, session setup). The latter have no request to attach to, so a
+  per-request set would simply miss them.
+- Bad, needs `contextvars` propagation through every async/threaded path and a
+  reset discipline; that botocore has to document a bleed guard is the warning sign.
+- Note, per-call attribution for the request-scoped subset is better served by
+  the deferred OTel span path (option C) than by reshaping the UA token.
+
 ### Granularity
 
 #### F. Per package, with core broken out per feature/provider (chosen)
@@ -184,22 +219,27 @@ the Python v1 list) = decimal `138477573`.
 ## Decision Outcome
 
 Chosen: **a per-request, first-party-only User-Agent `(feat=...)` token (A),
-with per-package granularity (F), per-language bit lists (H), hand-written enums
-kept honest by a parity test (J), rendered as lowercase hex (M).**
-
-This is the smallest design that answers the question. A 64-bit mask accumulates
-from universal `mark_feature_used()` calls; the token is stamped per request only
-on Azure/Foundry clients (live, no third-party leak); each SDK owns an
-independent bit list selected by the language already in the UA; the mask is
-rendered as hex (`feat=v1.8410005`). **Two opt-out env vars are provided:** a
-dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only the mask while
-keeping the base SDK identity/version User-Agent, and the existing
+with a process-global monotonic accumulator (P), per-package granularity (F),
+per-language bit lists (H), hand-written enums kept honest by a parity test (J),
+rendered as lowercase hex (M).**
+
+This is the smallest design that answers the question. A 64-bit
+**process-global, monotonic** mask accumulates from universal
+`mark_feature_used()` calls (so it spans construction-time and session-scoped
+features that aren't bound to any request — the per-request set model (Q) can't);
+the token is **stamped per request** only on Azure/Foundry clients, so it reflects
+the live mask instead of a construction-time snapshot (no third-party leak); each
+SDK owns an independent bit list selected by the language already in the UA; the
+mask is rendered as hex (`feat=v1.8410005`). **Two opt-out env vars are
+provided:** a dedicated `AGENT_FRAMEWORK_FEATURE_MASK_DISABLED` that drops only
+the mask while keeping the base SDK identity/version User-Agent, and the existing
 `AGENT_FRAMEWORK_USER_AGENT_DISABLED` that drops the whole contribution. OTel (C)
 is deferred — mainly because a broadly-emitted span attribute would leak the
 fingerprint into the user's general telemetry, against the first-party-only
-stance — but left open behind the version prefix. Per-construct granularity (G),
-a shared registry (I), codegen (K), and the decimal/binary/base-N representations
-(L, N, O) are rejected as complexity or length the problem does not require.
+stance — but left open behind the version prefix. Per-request scoping (Q),
+per-construct granularity (G), a shared registry (I), codegen (K), and the
+decimal/binary/base-N representations (L, N, O) are rejected as complexity or
+length the problem does not require.
 
 ### Consequences
 
@@ -239,13 +279,32 @@ Takeaways that shaped (or validate) our choices:
   allocation). We keep the bitmask for boundedness and trivial AND-decoding;
   AWS's short-code set is recorded as a viable alternative if bit-position
   coordination ever becomes painful (it would also drop the 64-bit ceiling).
+- **A fixed-width bitmask gives bounded token size for free.** botocore must cap
+  the `m/` component at 1024 bytes and truncate at delimiter boundaries (with a
+  fallback log) precisely *because* its short-code set is unbounded. Our 64-bit
+  hex is ≤16 chars by construction — no size cap, no truncation logic.
+- **Scope is where we diverge most — and deliberately.** botocore collects
+  features into a per-request `contextvars` set that is **reset between
+  requests**, and no-ops outside a request context to prevent cross-request
+  bleed. That works because every botocore feature is exercised *inside* an AWS
+  service request. We are more general: some features are request-scoped (a chat
+  call, an MCP tool invocation) but many are **not bound to any request**
+  (agent / workflow / provider construction, session setup). So we use a
+  **process-global, monotonic** mask (option P), which is the only scope that can
+  represent the non-request features. Our mask therefore intentionally "bleeds"
+  (accumulates) for the life of the process — the opposite of botocore's reset —
+  and that is the intended semantic, not the bug botocore guards against.
+- **The mechanism is private; the wire format is the contract.** botocore marks
+  its whole user-agent module private and "subject to abrupt breaking changes."
+  Same for us: the Python/.NET helpers are internal, and only the emitted token and
+  the per-language registry tables are the stable, decodable contract.
 - **First-party-only emission** is stricter than any of the above; the closest in
   spirit is Stainless headers, which only reach the owning API. We make the
   hostname/endpoint allowlist explicit (Azure/Foundry only).
 - **Opt-out naming.** `AZURE_TELEMETRY_DISABLED` is the family precedent for our
-  `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool
-  the cross-tool `DO_NOT_TRACK` convention (honored by e.g.
-  HuggingFace Hub) is worth considering — see Open Questions.
+  `AGENT_FRAMEWORK_*_DISABLED` names. Separately, the cross-tool `DO_NOT_TRACK`
+  convention (honored by e.g. HuggingFace Hub) is worth considering — see Open
+  Questions.
 
 Sources: botocore [`useragent.py`](https://github.com/boto/botocore/blob/develop/botocore/useragent.py)
 (`_USERAGENT_FEATURE_MAPPINGS`, `register_feature_id`, `_build_feature_metadata`);
@@ -291,7 +350,7 @@ independent for Python and .NET.
 | **No signal for self-hosted or third-party-only traffic.** If a process never calls Azure/Foundry, we see nothing. | First-party-only emission (A) | We can't read third-party logs anyway, and must not leak a fingerprint into them. Reach traded for privacy. |
 | **No OTel / per-call signal in v1.** | OTel deferred (C) — primarily on **privacy** grounds | A broadly-emitted span attribute would push the fingerprint into the user's general telemetry / third-party APM vendors, undoing the first-party-only scoping. Left open to add later if there is a compelling reason to add. |
 | **Mask reflects "usage so far," not the whole session.** Early requests carry fewer bits than later ones. | Process-global accumulator + per-request stamping | Honest and still useful; the team aggregates across requests. The per-request design is what makes it *grow* rather than freeze. |
-| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Single global accumulator (simplicity) | Per-call attribution is what the deferred OTel span path would add; not needed for portfolio-level questions. |
+| **No per-agent / per-call attribution.** The mask is one process-wide value — "this process used X", not "this agent/call used X". | Process-global monotonic scope (P) | A deliberate choice, not a transport limit: botocore *does* per-call attribution in the UA via a per-request `contextvars` set (Q), but that assumes every feature lives inside a service request. Many of ours don't (agent/workflow/provider construction, session setup), so process-global is the only scope that captures them. Per-call detail for the request-scoped subset is left to the deferred OTel path. |
 | **Coarse granularity.** Can't distinguish sub-features (e.g. openai chat vs embeddings, which shell tool). | Per-package granularity (F) + 64-bit (keeps .NET lock-free) | Matches the actual questions; finer bits can be promoted later behind the version prefix. |
 | **Fingerprinting risk is reduced, not eliminated.** A feature-combination mask is still a deployment signature, and it transits intermediaries (proxies/CDNs) even when first-party-scoped. | Emitting any feature-combination value | Scope + opt-out + coarse granularity mitigate it; residual risk is the subject of the privacy review below. |
 
diff --git a/docs/specs/002-feature-usage-telemetry.md b/docs/specs/002-feature-usage-telemetry.md
index c09d48e635f..a164d26e9ec 100644
--- a/docs/specs/002-feature-usage-telemetry.md
+++ b/docs/specs/002-feature-usage-telemetry.md
@@ -116,6 +116,20 @@ def get_feature_token() -> str | None:
   time a feature is genuinely exercised — at construction of a representative
   type (e.g. `Agent`, an `MCPTool`, a provider, a Foundry surface), never at
   import time. The mask grows over the process lifetime.
+- **Process-global and monotonic — intentionally never reset.** Unlike a
+  per-request scheme (e.g. botocore's `contextvars` feature set that resets
+  between calls), our mask spans the whole process because many features are not
+  bound to any request — an agent, workflow/orchestration, or context/history
+  provider is constructed once and used across the session. The single global
+  mask is the only scope that can represent them, and its monotonic "usage so
+  far" growth is the intended semantic, not a bleed bug. Concurrency-safe via the
+  module lock (Python) / `Interlocked.Or` (.NET).
+- **Token is safe by construction.** The emitted value is `v{int}.{hex}` —
+  characters limited to `[0-9a-fv.]` — so no header-injection sanitization is
+  required (contrast botocore, which must sanitize arbitrary component strings).
+- **Private API.** `mark_feature_used`, `get_feature_token`, `apply_feature_token`
+  and the mask itself are internal helpers; only the emitted token and the
+  per-language registry tables are the stable, decodable contract.
 - **No import cycles:** the call lives in each package's own module, so `core`
   never imports optional packages. Each package references its bit via the shared
   `FeatureBit` IntEnum exported from `core`.