Conversation
why are we not making this change in other instrumentations?
We are using langchain as an example right now. If it looks good, I will add it to other instrumentations.
Are you sure langchain does not provide tokens consumed in embedding callbacks? Is this dependency ok to be added upstream, and where is this documented for SDOT? I don't see it in pyproject.toml or requirements.
Langchain doesn't have an embedding callback analogous to LLMInvocation. The token count is computed internally with tiktoken, but the return value for embeddings is a `list[list[float]]`; no token values are exposed. https://github.com/langchain-ai/langchain/blob/master/libs/partners/openai/langchain_openai/embeddings/base.py#L578 If the user already has langchain-openai imported, it brings in the tiktoken dependency. I just updated the README.
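The client-side counting discussed in this thread can be sketched roughly as follows. This is a hypothetical helper, not the PR's actual code: the `cl100k_base` fallback mirrors the PR description, while the whitespace fallback (for when tiktoken or its encoding files are unavailable) is an assumption added here for robustness.

```python
def count_embedding_tokens(texts, model):
    """Count input tokens client-side, since embedding responses
    only contain vectors (list[list[float]]), not token counts."""
    try:
        import tiktoken
        try:
            enc = tiktoken.encoding_for_model(model)
        except KeyError:
            # Unknown model name: fall back to the cl100k_base encoding.
            enc = tiktoken.get_encoding("cl100k_base")
        return sum(len(enc.encode(t)) for t in texts)
    except Exception:
        # tiktoken missing or encoding download failed: rough
        # whitespace approximation so a value can still be emitted.
        return sum(len(t.split()) for t in texts)
```

For example, `count_embedding_tokens(["hello world"], "text-embedding-3-small")` returns a small positive count whichever path is taken.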
Summary
This PR adds token usage metrics for embedding operations, bringing them to parity with LLM invocations. Previously, embeddings emitted only duration metrics; now they also emit `gen_ai.client.token.usage` with `input` token counts.

Changes
Core (opentelemetry-util-genai)
util/opentelemetry-util-genai/src/opentelemetry/util/genai/emitters/metrics.py
- `_record_token_metrics()` call in `on_end()` for `EmbeddingInvocation`
- `_record_token_metrics()` call in `on_error()` for `EmbeddingInvocation`
- `get_context_metric_attributes()` for session context on embedding metrics
- `None` for `completion_tokens` (embeddings don't produce output tokens)

util/opentelemetry-util-genai/tests/test_metrics.py
- `EmbeddingInvocation` import
- `_invoke_embedding()` helper method
- `_invoke_embedding_failure()` helper method
- `test_embedding_emits_input_token_metric` - verifies token metric with correct attributes
- `test_embedding_failure_emits_token_metric` - verifies metrics on error path

LangChain Instrumentation
instrumentation-genai/opentelemetry-instrumentation-langchain/src/opentelemetry/instrumentation/langchain/__init__.py
- `_count_tokens(self, texts, model)` method using the tiktoken library, with a `cl100k_base` fallback
- Updated `_start_embedding()` to count tokens client-side and populate `input_tokens` on `EmbeddingInvocation`

instrumentation-genai/opentelemetry-instrumentation-langchain/tests/test_langchain_embedding.py
- Verifies span attributes (`gen_ai.operation.name`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`)
- Verifies metrics (`gen_ai.client.token.usage`, `gen_ai.client.operation.duration`)

instrumentation-genai/opentelemetry-instrumentation-langchain/tests/cassettes/test_langchain_embedding_call.yaml
instrumentation-genai/opentelemetry-instrumentation-langchain/tests/conftest.py
- Added `ignore_hosts: ["openaipublic.blob.core.windows.net"]` to the VCR config to prevent intercepting tiktoken encoding downloads

Metrics Emitted
After this change, embedding operations emit:
- `gen_ai.client.token.usage` - `input` only
- `gen_ai.client.operation.duration`

Example metric output:
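The original example output is not reproduced here. Purely as an illustration (the model name and value below are made up; the attribute names follow the OpenTelemetry GenAI semantic conventions), a single data point on the token-usage histogram might carry:

```python
# Hypothetical data point on the gen_ai.client.token.usage histogram.
example_data_point = {
    "value": 7,  # made-up input token count
    "attributes": {
        "gen_ai.operation.name": "embedding",
        "gen_ai.request.model": "text-embedding-3-small",  # assumed model
        "gen_ai.token.type": "input",  # embeddings emit input tokens only
    },
}
```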
Testing
Dependencies
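Per the discussion above, tiktoken arrives transitively when langchain-openai is installed rather than as a hard dependency of the instrumentation. A common defensive pattern for that situation looks like the following (a sketch of the general approach, not necessarily the PR's exact code):

```python
# Treat tiktoken as optional: present when langchain-openai is
# installed, otherwise token counting is simply skipped.
try:
    import tiktoken
except ImportError:
    tiktoken = None


def maybe_count_tokens(texts):
    if tiktoken is None:
        return None  # no token metric will be emitted
    try:
        enc = tiktoken.get_encoding("cl100k_base")
        return sum(len(enc.encode(t)) for t in texts)
    except Exception:
        return None  # encoding unavailable (e.g. offline first run)
```

Returning `None` lets the caller omit the `gen_ai.client.token.usage` metric entirely instead of emitting a misleading zero.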