feat: Add token usage telemetry by Vamshi-Microsoft · Pull Request #435 · microsoft/agentic-applications-for-unified-data-foundation-solution-accelerator

Vamshi-Microsoft · 2026-06-19T10:25:54Z

Purpose

This pull request adds telemetry for LLM token usage to the chat API, enabling better tracking of model usage and associated costs. The changes introduce a process-wide telemetry emitter, integrate token usage reporting into chat streaming endpoints, and provide configuration via environment variables for sampling, user ID hashing, and model pricing. The .coveragerc file is also updated to exclude the new telemetry module from coverage reports.

Telemetry infrastructure and configuration:

Introduced a new token_emitter singleton in telemetry.py, which configures a TokenUsageEmitter for process-wide use. This supports environment variable configuration for sample rate (LLM_TOKEN_SAMPLE_RATE), user ID hashing (LLM_TOKEN_USER_ID_HMAC_KEY), and model pricing (LLM_TOKEN_PRICING).
Updated .coveragerc to omit the llm_token_telemetry.py file from coverage reports.
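
To make the three environment variables concrete, here is a minimal sketch of how such configuration might be read. Only the variable names come from this PR. The `TelemetryConfig` container, the `load_config` and `hash_user_id` helpers, the defaults, the JSON format assumed for LLM_TOKEN_PRICING, and the choice of HMAC-SHA256 are all assumptions for illustration and may differ from what telemetry.py actually does.

```python
import hashlib
import hmac
import json
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class TelemetryConfig:
    """Hypothetical container; the PR's emitter may store these differently."""
    sample_rate: float = 1.0
    hmac_key: Optional[bytes] = None
    pricing: dict = field(default_factory=dict)


def load_config(env: Optional[Mapping[str, str]] = None) -> TelemetryConfig:
    """Read telemetry settings from an environment-like mapping."""
    env = os.environ if env is None else env

    # Sample rate: fall back to 1.0 on unparsable input, then clamp to [0, 1].
    try:
        rate = float(env.get("LLM_TOKEN_SAMPLE_RATE", "1.0"))
    except ValueError:
        rate = 1.0
    rate = min(1.0, max(0.0, rate))

    # An unset or empty key means user IDs cannot be hashed.
    key = env.get("LLM_TOKEN_USER_ID_HMAC_KEY") or None

    # Pricing: assumed here to be a JSON object keyed by model name.
    try:
        pricing = json.loads(env.get("LLM_TOKEN_PRICING") or "{}")
    except json.JSONDecodeError:
        pricing = {}
    if not isinstance(pricing, dict):
        pricing = {}

    return TelemetryConfig(rate, key.encode("utf-8") if key else None, pricing)


def hash_user_id(user_id: str, key: bytes) -> str:
    """Return a keyed digest so raw user IDs never reach telemetry (assumed scheme)."""
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, and it avoids depending on when the process environment was populated.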

Integration with chat endpoints:

In chat.py, imported the telemetry emitter and supporting utilities, and integrated token usage telemetry into the stream_openai_text endpoint. Token usage is extracted from responses and emitted after streaming completes, with error handling to avoid breaking the response flow.
In the stream_openai_text_workshop endpoint, added a TokenUsageScope to accumulate token usage from streaming chunks, emitting telemetry after the stream completes. Errors in telemetry emission are logged but do not affect the main response.
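
The accumulate-then-emit pattern described for the two endpoints could look roughly like the sketch below. It is not the PR's TokenUsageScope: `UsageTotals`, `stream_text`, the dict-shaped chunks, and the `emit` callback are hypothetical stand-ins chosen so the example runs on its own.

```python
import asyncio
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class UsageTotals:
    """Hypothetical accumulator standing in for the PR's TokenUsageScope."""
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, usage) -> None:
        # Chunks without usage data are simply skipped.
        if not usage:
            return
        self.input_tokens += int(usage.get("input_tokens") or 0)
        self.output_tokens += int(usage.get("output_tokens") or 0)

    @property
    def has_any(self) -> bool:
        return (self.input_tokens + self.output_tokens) > 0


async def stream_text(chunks, emit):
    """Yield text to the caller while summing token usage on the side."""
    totals = UsageTotals()
    async for chunk in chunks:
        totals.add(chunk.get("usage"))
        if chunk.get("text"):
            yield chunk["text"]

    # Emit once, after the stream has completed. A telemetry failure is
    # logged and swallowed so it cannot affect the response.
    try:
        if totals.has_any:
            emit(totals)
    except Exception:
        logger.debug("Token usage telemetry failed", exc_info=True)


async def collect(gen):
    """Small helper for trying the generator out."""
    return [item async for item in gen]
```

With a fake stream of two chunks, `asyncio.run(collect(stream_text(fake(), emit)))` returns the text pieces in order whether or not `emit` raises.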

Does this introduce a breaking change?

Yes
No

Golden Path Validation

I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

Deployment Validation

I have validated the deployment process successfully and all services are running as expected with this change.

What to Check

Verify that the following are valid

...

Other Information

The extract_usage_from_stream_chunk function only checked messages[*].contents[*].usage_details, but agent-framework-foundry AgentResponseUpdate objects expose contents directly (no wrapping messages list). Usage Content items with usage_details were being missed, causing LLM_Token_Usage_Summary events to never emit in workshop (IS_WORKSHOP=True) mode. Now also checks chunk.contents[*].usage_details directly.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
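
The two chunk shapes described in this fix can be illustrated with a short sketch. It is not the PR's extract_usage_from_stream_chunk: `iter_usage_details` is a hypothetical helper, and the `SimpleNamespace` objects only imitate the attribute layout named above (`messages`, `contents`, `usage_details`).

```python
from types import SimpleNamespace


def iter_usage_details(chunk):
    """Yield every usage_details object found on a chunk, in either shape."""
    # Shape 1: chunk.messages[*].contents[*].usage_details
    for message in getattr(chunk, "messages", None) or []:
        for content in getattr(message, "contents", None) or []:
            details = getattr(content, "usage_details", None)
            if details is not None:
                yield details

    # Shape 2: chunk.contents[*].usage_details (no wrapping messages list)
    for content in getattr(chunk, "contents", None) or []:
        details = getattr(content, "usage_details", None)
        if details is not None:
            yield details


# Stand-in objects that mimic the two layouts.
wrapped = SimpleNamespace(
    messages=[SimpleNamespace(contents=[SimpleNamespace(usage_details={"total": 8})])]
)
direct = SimpleNamespace(contents=[SimpleNamespace(usage_details={"total": 5})])

wrapped_found = list(iter_usage_details(wrapped))
direct_found = list(iter_usage_details(direct))
```

Using `getattr` with a default keeps the helper tolerant of chunks that carry neither attribute, which matters when text-only updates are interleaved with usage updates.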

github-actions · 2026-06-19T10:27:00Z

Coverage Report

File	Stmts	Miss	Cover	Missing
chat.py	456	138	69%	92–93, 150, 209, 215–225, 235–239, 241–249, 251, 253–254, 256, 263, 269–271, 274–275, 279–280, 287–288, 298–302, 304–305, 307–315, 379, 404–405, 408, 412, 417, 453–455, 461, 474–476, 480–481, 484–487, 489, 493, 497–498, 520–521, 523–528, 530–533, 539, 543–551, 560–561, 582, 595, 606–614, 616–617, 619–622, 706–707, 709, 713, 716, 722–727, 732–735
telemetry.py	46	24	47%	49–53, 60, 62–64, 66, 73–86
TOTAL	1796	306	82%

Tests	Skipped	Failures	Errors	Time
489	0 💤	0 ❌	0 🔥	7.629s ⏱️

github-actions · 2026-06-19T10:27:07Z

Unit Test Results

489 tests 489 ✅ 7s ⏱️
1 suites 0 💤
1 files 0 ❌

Results for commit c029bae.

♻️ This comment has been updated with latest results.

Copilot

Pull request overview

This PR introduces LLM token-usage telemetry for the Python chat API by adding a shared, process-wide TokenUsageEmitter, wiring token extraction/emission into streaming chat endpoints, and providing environment-variable configuration for sampling, user ID hashing, and model pricing.

Changes:

Added llm_token_telemetry.py (extraction helpers, emitter, and scope/decorator utilities) plus a telemetry.py singleton (token_emitter) configured via env vars.
Integrated token usage reporting into stream_openai_text and stream_openai_text_workshop.
Updated .coveragerc to omit the new telemetry helper module from coverage.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

File	Description
src/api/python/telemetry.py	Adds a process-wide `token_emitter` singleton configured from environment variables.
src/api/python/llm_token_telemetry.py	Introduces a shared telemetry helper module (usage extraction + standardized event emission).
src/api/python/chat.py	Emits token usage telemetry for chat streaming endpoints (standard + workshop mode).
.coveragerc	Excludes the new telemetry helper module from coverage collection.

- Move telemetry imports after load_dotenv() so .env values apply
- Use AZURE_AI_AGENT_MODEL_DEPLOYMENT_NAME instead of agent name for model labeling
- Accumulate token usage across all tool-call iterations (non-workshop)
- Wrap workshop streaming in try/finally for exception-safe telemetry emission
- Update telemetry.py docstring to document actual import-time side effects
- Downgrade emit_all() log from INFO to DEBUG to avoid PII/volume issues
- Fix double extraction in TokenUsageScope.add()

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
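
The try/finally item in this commit message is worth a small illustration: without it, a stream that raises midway would skip telemetry entirely. The sketch below shows the general pattern only. `with_guaranteed_emit` and its zero-argument `emit` callback are hypothetical names, not code from chat.py.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def with_guaranteed_emit(stream, emit):
    """Re-yield items from `stream`, calling `emit()` however the stream ends."""
    try:
        async for item in stream:
            yield item
    finally:
        # Runs on normal completion and when the stream raises. The emitter's
        # own errors are logged at DEBUG and never replace the original error.
        try:
            emit()
        except Exception:
            logger.debug("Token usage telemetry failed", exc_info=True)


async def failing_stream():
    """Stand-in stream that breaks after its first item."""
    yield "partial"
    raise RuntimeError("upstream failure")


async def consume(emit):
    received = []
    try:
        async for item in with_guaranteed_emit(failing_stream(), emit):
            received.append(item)
    except RuntimeError:
        received.append("error")
    return received
```

The original exception still reaches the caller. The `finally` block only adds the emission step, so error handling elsewhere in the endpoint is unaffected.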

Preserve original behavior where None is passed when env var is unset, rather than empty string which could behave differently on the API side. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

+            agent_name = os.getenv("AGENT_NAME_CHAT", "")
+            model_deployment_name = os.getenv("AZURE_AI_AGENT_MODEL_DEPLOYMENT_NAME", "")
+
            response = await openai_client.responses.create(
                conversation=thread_conversation_id,
                input=query,
                extra_body={"agent_reference": {"name": os.getenv("AGENT_NAME_CHAT"), "type": "agent_reference"}}
            )


                # Submit tool outputs and get next response
                response = await openai_client.responses.create(
                    conversation=thread_conversation_id,
                    input=tool_outputs,
                    extra_body={"agent_reference": {"name": os.getenv("AGENT_NAME_CHAT"), "type": "agent_reference"}}
                )


+        start_ns = time.perf_counter_ns()
+        try:
+            found = extract_usage_from_stream_chunk(source) or extract_usage(source)
+        except Exception as exc:  # belt + braces; extractors are already safe


 [run]
 omit =
    */test_*.py
+    */llm_token_telemetry.py


+            try:
+                if accumulated_usage and accumulated_usage.has_any:
+                    resolved_model = getattr(response, "model", "") or model_deployment_name
+                    token_emitter.emit_all(
+                        agent_name=agent_name,
+                        model_deployment_name=resolved_model,
+                        usage=accumulated_usage,
+                        conversation_id=conversation_id,
+                        user_id=user_id,
+                    )
+            except Exception:
+                logger.debug("Token usage telemetry failed", exc_info=True)


Vamshi-Microsoft and others added 2 commits June 18, 2026 15:19

add token usage telemetry and emitter for LLM interactions

25c2160

Copilot AI review requested due to automatic review settings June 19, 2026 10:25

Vamshi-Microsoft requested review from Avijit-Microsoft, Roopan-Microsoft, aniaroramsft, brittneek, dgp10801, malrose07, nchandhi and toherman-msft as code owners June 19, 2026 10:25

Vamshi-Microsoft temporarily deployed to production June 19, 2026 10:26 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Vamshi-Microsoft June 19, 2026 10:26

Copilot AI reviewed Jun 19, 2026

Vamshi-Microsoft had a problem deploying to production June 19, 2026 11:04 — with GitHub Actions Failure

Vamshi-Microsoft temporarily deployed to production June 19, 2026 11:11 — with GitHub Actions Inactive

Vamshi-Microsoft temporarily deployed to production June 19, 2026 11:13 — with GitHub Actions Inactive

fix: revert extra_body agent_name to original os.getenv() call

c029bae

Copilot AI review requested due to automatic review settings June 19, 2026 12:08

Vamshi-Microsoft temporarily deployed to production June 19, 2026 12:08 — with GitHub Actions Inactive

Copilot started reviewing on behalf of Vamshi-Microsoft June 19, 2026 12:09

Copilot AI reviewed Jun 19, 2026

Vamshi-Microsoft wants to merge 4 commits into dev from psl-tokenMonitoring
