
Adding embedding token metric #248

Open
shuningc wants to merge 4 commits into `main` from `AddingEmbeddingTokenMetrics`

Conversation

@shuningc
Contributor

@shuningc shuningc commented Apr 7, 2026

Summary

This PR adds token usage metrics for embedding operations, bringing parity with LLM invocations. Previously, embeddings only emitted duration metrics; now they also emit gen_ai.client.token.usage with input token counts.

Changes

Core (opentelemetry-util-genai)

util/opentelemetry-util-genai/src/opentelemetry/util/genai/emitters/metrics.py

  • Added _record_token_metrics() call in on_end() for EmbeddingInvocation
  • Added _record_token_metrics() call in on_error() for EmbeddingInvocation
  • Added get_context_metric_attributes() for session context on embedding metrics
  • Passes None for completion_tokens (embeddings don't produce output tokens)
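The recording logic can be pictured with a minimal sketch. Note the type and helper names below are illustrative stand-ins, not the actual `opentelemetry-util-genai` API, and the histogram is faked so the example is self-contained:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EmbeddingInvocation:
    """Illustrative stand-in for the util-genai invocation type."""
    provider: str
    request_model: str
    input_tokens: Optional[int] = None  # completion tokens do not exist for embeddings

@dataclass
class FakeHistogram:
    """Minimal stand-in for an OTel Histogram: collects (value, attributes) pairs."""
    points: list = field(default_factory=list)

    def record(self, value, attributes=None):
        self.points.append((value, dict(attributes or {})))

def record_token_metrics(histogram, invocation):
    """Record input-token usage for an embedding; skip when no count is available."""
    if invocation.input_tokens is None:
        return  # e.g. tiktoken unavailable, so no client-side count
    histogram.record(
        invocation.input_tokens,
        attributes={
            "gen_ai.token.type": "input",
            "gen_ai.provider.name": invocation.provider,
            "gen_ai.operation.name": "embedding",
            "gen_ai.request.model": invocation.request_model,
        },
    )

hist = FakeHistogram()
record_token_metrics(
    hist, EmbeddingInvocation("openai", "text-embedding-ada-002", input_tokens=7)
)
```

Only the `input` token type is ever emitted, which matches the `None`-for-completion-tokens behavior described above.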

util/opentelemetry-util-genai/tests/test_metrics.py

  • Added EmbeddingInvocation import
  • Added _invoke_embedding() helper method
  • Added _invoke_embedding_failure() helper method
  • Added test_embedding_emits_input_token_metric - verifies token metric with correct attributes
  • Added test_embedding_failure_emits_token_metric - verifies metrics on error path

LangChain Instrumentation

instrumentation-genai/opentelemetry-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/__init__.py

  • Added _count_tokens(self, texts, model) method using the tiktoken library
  • Uses model-specific encoding with cl100k_base fallback
  • Modified _start_embedding() to count tokens client-side and populate input_tokens on EmbeddingInvocation

Why client-side counting? LangChain's embed_documents() returns only the embedding vectors—it strips the API response metadata including usage.prompt_tokens. Unlike ChatOpenAI which exposes response.llm_output.usage, there's no way to get server-reported token counts for embeddings through LangChain's API.
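The client-side counting approach can be sketched roughly as follows (the helper name and signature are illustrative, not the exact PR code; the fallback behavior matches the bullets above):

```python
def count_embedding_tokens(texts, model):
    """Best-effort client-side token count for a batch of embedding inputs.

    Returns None when tiktoken is unavailable (or its encoding files
    cannot be loaded), so callers can skip token metrics gracefully.
    """
    try:
        import tiktoken
        try:
            # Prefer the model-specific encoding...
            enc = tiktoken.encoding_for_model(model)
        except KeyError:
            # ...and fall back to cl100k_base for unknown models.
            enc = tiktoken.get_encoding("cl100k_base")
    except Exception:
        return None  # tiktoken missing or encodings not loadable
    return sum(len(enc.encode(text)) for text in texts)

tokens = count_embedding_tokens(
    ["What is the capital of France?"], "text-embedding-ada-002"
)
```

When this returns `None`, `input_tokens` is simply left unset on the `EmbeddingInvocation`, so duration metrics still work without tiktoken.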

instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py

  • New VCR-based integration test for embedding token metrics
  • Validates span attributes (gen_ai.operation.name, gen_ai.request.model, gen_ai.usage.input_tokens)
  • Validates metrics (gen_ai.client.token.usage, gen_ai.client.operation.duration)
  • Outputs full OTLP-style JSON for debugging/verification

instrumentation-genai/opentelemetry-instrumentation-langchain/tests/cassettes/test_langchain_embedding_call.yaml

  • VCR cassette recording for the embedding API call

instrumentation-genai/opentelemetry-instrumentation-langchain/tests/conftest.py

  • Added ignore_hosts: ["openaipublic.blob.core.windows.net"] to VCR config to prevent intercepting tiktoken encoding downloads
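In the `vcr_config` fixture style used by pytest VCR plugins, that change looks roughly like the following sketch (fixture scope is an assumption here):

```python
import pytest

# Hosts VCR should pass through instead of recording/replaying;
# tiktoken fetches its BPE encoding files from this Azure blob host.
VCR_IGNORE_HOSTS = ["openaipublic.blob.core.windows.net"]

@pytest.fixture(scope="module")
def vcr_config():
    return {"ignore_hosts": VCR_IGNORE_HOSTS}
```

Without this, the first test run would try to serve tiktoken's encoding download from the cassette and fail.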

Metrics Emitted

After this change, embedding operations emit:

| Metric | Description | Token Type |
| --- | --- | --- |
| `gen_ai.client.token.usage` | Number of tokens used | input only |
| `gen_ai.client.operation.duration` | Duration in seconds | N/A |

Example metric output:

```json
"metrics": [
    {
        "name": "gen_ai.client.token.usage",
        "description": "Number of input and output tokens used",
        "unit": "{token}",
        "data": {
            "data_points": [
                {
                    "attributes": {
                        "gen_ai.token.type": "input",
                        "gen_ai.provider.name": "openai",
                        "gen_ai.operation.name": "embedding",
                        "gen_ai.request.model": "text-embedding-ada-002"
                    },
                    "start_time_unix_nano": 1775523710327305000,
                    "time_unix_nano": 1775523710327705000,
                    "count": 1,
                    "sum": 7,
                    "min": 7,
                    "max": 7,
                    "exemplars": [],
                    "bucket_counts": [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                    "explicit_bounds": [1, 4, 16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304, 16777216, 67108864]
                }
            ],
            "aggregation_temporality": 2
        }
    }
]
```

Testing

# Run util-genai metric tests
pytest ./util/opentelemetry-util-genai/tests/test_metrics.py -v -k embedding

# Run LangChain embedding test
pytest ./instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py -v -s -p no:deepeval

Dependencies

  • tiktoken (optional): Used for client-side token estimation in LangChain. If not installed, token metrics won't be emitted for embeddings but duration metrics still work.

@shuningc shuningc requested review from a team as code owners April 7, 2026 01:08
Contributor

why are we not making this change in other instrumentations?

Contributor Author

We are using langchain as an example right now. If it looks good, I will add it to other instrumentations.

@pradystar
Contributor

> Dependencies
> tiktoken (optional): Used for client-side token estimation in LangChain. If not installed, token metrics won't be emitted for embeddings but duration metrics still work.

Are you sure langchain does not provide tokens consumed in embedding callbacks? Is this dependency ok to be added in upstream and where is this documented for SDOT? I don't see it in pyproject.toml or requirements.

@shuningc
Contributor Author

shuningc commented Apr 7, 2026

> Dependencies
> tiktoken (optional): Used for client-side token estimation in LangChain. If not installed, token metrics won't be emitted for embeddings but duration metrics still work.
>
> Are you sure langchain does not provide tokens consumed in embedding callbacks? Is this dependency ok to be added in upstream and where is this documented for SDOT? I don't see it in pyproject.toml or requirements.

LangChain doesn't have an embedding callback analogous to LLMInvocation. Tokens are computed internally with tiktoken, but the return value is a list[list[float]] of embedding vectors; no token counts are exposed. https://github.com/langchain-ai/langchain/blob/master/libs/partners/openai/langchain_openai/embeddings/base.py#L578 If the user already has langchain-openai installed, it brings in the tiktoken dependency. I just updated the README.

