fix(instrumentations): include cache tokens in gen ai.usage.input tokens#1028
Conversation
|
Warning Review limit reached
More reviews will be available in 17 minutes and 56 seconds. Learn how PR review limits work. Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file). ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughCache token span attributes ( ChangesCache Token Span Attribute Support
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/instrumentation-langchain/src/callback_handler.ts (1)
293-356:⚠️ Potential issue | 🟠 Major | ⚡ Quick winEmpty
usage_metadatacurrently suppresses fallback token extraction.
extractUsageMetadataFromGenerationsreturns any object (including{}), andhandleLLMEndthen skipsllmOutput.tokenUsagefallback. This can silently drop all usage token attributes.💡 Suggested fix
private extractUsageMetadataFromGenerations(output: LLMResult): { input_tokens?: number; output_tokens?: number; total_tokens?: number; input_token_details?: { cache_read?: number; cache_creation?: number }; } | null { if (!output.generations) return null; for (const group of output.generations) { if (!group) continue; for (const gen of group) { const message = (gen as any)?.message; const usageMetadata = message?.usage_metadata; if (usageMetadata && typeof usageMetadata === "object") { - return usageMetadata; + const hasTopLevelTokens = + typeof usageMetadata.input_tokens === "number" || + typeof usageMetadata.output_tokens === "number" || + typeof usageMetadata.total_tokens === "number"; + const details = usageMetadata.input_token_details; + const hasCacheDetails = + details && + typeof details === "object" && + (typeof details.cache_read === "number" || + typeof details.cache_creation === "number"); + + if (hasTopLevelTokens || hasCacheDetails) { + return usageMetadata; + } } } } return null; }Also applies to: 585-603
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-langchain/src/callback_handler.ts` around lines 293 - 356, The extractUsageMetadataFromGenerations method returns empty objects which are truthy in JavaScript, causing the code to skip the fallback tokenUsage extraction in the subsequent else if block even when no actual token data was extracted. Modify the condition that checks if (usageMetadata) to verify that the returned object actually contains valid token data (such as checking if input_tokens, output_tokens, or total_tokens properties exist with numeric values) before treating it as a valid result, or alternatively modify extractUsageMetadataFromGenerations to return null or undefined instead of empty objects when no usage metadata is found.
🧹 Nitpick comments (4)
packages/instrumentation-langchain/src/callback_handler.ts (1)
33-34: ⚡ Quick winUse semantic attribute constants from
@traceloop/ai-semantic-conventionsin instrumentation packages.Switching these cache-token attribute constants to
SpanAttributes.*keeps this file aligned with the repository import rule for instrumentation code.As per coding guidelines:
packages/{instrumentation-*,traceloop-sdk}/**/*.{ts,tsx}should import AI/LLM semantic attribute constants from@traceloop/ai-semantic-conventions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-langchain/src/callback_handler.ts` around lines 33 - 34, In the callback_handler.ts file, replace the direct usage of ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS constants with SpanAttributes.ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and SpanAttributes.ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS respectively. Ensure that SpanAttributes is imported from `@traceloop/ai-semantic-conventions` and remove any direct imports of these individual constants to align with the repository's instrumentation code guidelines.Source: Coding guidelines
packages/instrumentation-bedrock/tests/cache-token-fold-in.test.ts (1)
70-83: ⚡ Quick winAdd a zero-value cache-token test case.
Current coverage has positive and absent cache values, but not explicit
0. Adding it will guard against truthy-check regressions.✅ Suggested test addition
+ it("emits cache attributes when cache token values are explicitly zero", () => { + const attrs = setResponseAttrs({ + input_tokens: 100, + output_tokens: 50, + cache_read_input_tokens: 0, + cache_creation_input_tokens: 0, + }); + assert.strictEqual(attrs[ATTR_GEN_AI_USAGE_INPUT_TOKENS], 100); + assert.strictEqual(attrs[SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS], 150); + assert.strictEqual(attrs[ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS], 0); + assert.strictEqual(attrs[ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS], 0); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-bedrock/tests/cache-token-fold-in.test.ts` around lines 70 - 83, Add a new test case in the cache-token-fold-in.test.ts file that explicitly tests the behavior when cache token values are set to zero (0) rather than absent (undefined). Create a test similar to the existing "folds only cache_read when cache_creation is absent" test, but set either cache_read_input_tokens or cache_creation_input_tokens to 0 and verify the folding logic handles the zero value correctly. This will ensure the code does not have truthy-check regressions that might incorrectly treat 0 as absent.packages/instrumentation-langchain/test/cache-token-fold-in.test.ts (1)
169-205: ⚡ Quick winAdd a regression test for empty
usage_metadataobjects.The suite verifies fallback when metadata is absent, but not when it is present as
{}. That edge case is important for protecting fallback behavior.✅ Suggested test addition
+ it("falls back to llmOutput.tokenUsage when usage_metadata is empty", async () => { + const runId = `run-${Math.random()}`; + await handler.handleChatModelStart(serializedLLM, [[]], runId, undefined, { + invocation_params: { model: "gpt-4" }, + }); + await handler.handleLLMEnd( + { + generations: [ + [ + { + text: "hello", + message: new AIMessage({ content: "hello", usage_metadata: {} }), + generationInfo: { finish_reason: "stop" }, + } as any, + ], + ], + llmOutput: { + tokenUsage: { + promptTokens: 1000, + completionTokens: 50, + totalTokens: 1050, + }, + }, + }, + runId, + ); + + const span = exporter.getFinishedSpans().at(-1); + assert.ok(span); + assert.strictEqual(span.attributes[ATTR_GEN_AI_USAGE_INPUT_TOKENS], 1000); + assert.strictEqual(span.attributes[ATTR_GEN_AI_USAGE_OUTPUT_TOKENS], 50); + assert.strictEqual( + span.attributes[SpanAttributes.GEN_AI_USAGE_TOTAL_TOKENS], + 1050, + ); + });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-langchain/test/cache-token-fold-in.test.ts` around lines 169 - 205, Add a new regression test case following the existing test that verifies the fallback behavior when usage_metadata is an empty object instead of completely absent. This test should be structured similarly to the current "falls back to llmOutput.tokenUsage when usage_metadata is absent" test but should pass an empty usage_metadata object (e.g., usage_metadata: {}) in the handleLLMEnd call to ensure the fallback to llmOutput.tokenUsage still works correctly in this edge case. Verify that the token usage attributes are still properly extracted from llmOutput.tokenUsage when usage_metadata is present but empty.packages/instrumentation-vertexai/tests/gemini.test.ts (1)
220-240: ⚡ Quick winAdd a zero-value cache token regression test.
This test only exercises a positive cache count. Add a
cachedContentTokenCount: 0case and assert the span attribute is0to prevent falsy-check regressions.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-vertexai/tests/gemini.test.ts` around lines 220 - 240, The current test only verifies the behavior when cachedContentTokenCount has a positive value (5), but does not test the zero-value case which could expose falsy-check regressions. Add a new test case similar to the existing one in the same file that configures MockGenerativeModel to return cachedContentTokenCount: 0 in the response, then assert that the span attribute ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS is still correctly set to 0 (not undefined or missing) to ensure zero values are properly handled and recorded.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/instrumentation-bedrock/src/instrumentation.ts`:
- Around line 715-726: The conditional spreads for
ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS use truthy checks that incorrectly
skip explicit zero values (since 0 is falsy in JavaScript), causing cache token
attributes to disappear from the output when tokens are legitimately zero.
Replace the truthy conditions with explicit null checks using the `!= null`
operator for both usage["cache_read_input_tokens"] and
usage["cache_creation_input_tokens"] to preserve zero values while still
filtering out undefined or null entries.
In `@packages/instrumentation-openai/src/instrumentation.ts`:
- Around line 972-978: The truthy check `if (cachedTokens)` at the beginning of
the block prevents recording cache-read usage when the API returns 0 cached
tokens, which is a valid telemetry value. Replace the condition with a numeric
type guard such as `typeof cachedTokens === 'number'` or `cachedTokens !==
undefined` to ensure the span attribute
ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS is set even when cachedTokens has a
value of 0.
In
`@packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-cache-read-input-tokens-in-span-for-responses-with-cached-tokens_3930499540/recording.har`:
- Around line 102-112: In the HAR fixture file, replace all sensitive data
values with generic placeholders while preserving the JSON structure.
Specifically, redact the cookie "value" field (like the __cf_bm cookie shown),
and replace actual values in response headers for set-cookie, openai-project,
openai-organization, and x-request-id with redacted placeholders such as
"redacted-cookie-value", "redacted-project-id", etc. Apply these changes
throughout the entire fixture file, including the additional locations mentioned
at lines 156-166 and 196-203, to ensure no sensitive identifiers or
authentication tokens are exposed in the committed test recording.
In `@packages/instrumentation-vertexai/src/vertexai-instrumentation.ts`:
- Around line 256-260: The conditional check for cachedContentTokenCount uses a
truthy evaluation which incorrectly skips setting the
ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS attribute when the value is 0, since 0
is falsy. Replace the truthy check with an explicit null and undefined check
using the nullish coalescing or explicit comparison operators so that a zero
value is correctly recognized as a valid usage metric and the span.setAttribute
call for ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS is executed with the
cachedContentTokenCount value.
---
Outside diff comments:
In `@packages/instrumentation-langchain/src/callback_handler.ts`:
- Around line 293-356: The extractUsageMetadataFromGenerations method returns
empty objects which are truthy in JavaScript, causing the code to skip the
fallback tokenUsage extraction in the subsequent else if block even when no
actual token data was extracted. Modify the condition that checks if
(usageMetadata) to verify that the returned object actually contains valid token
data (such as checking if input_tokens, output_tokens, or total_tokens
properties exist with numeric values) before treating it as a valid result, or
alternatively modify extractUsageMetadataFromGenerations to return null or
undefined instead of empty objects when no usage metadata is found.
---
Nitpick comments:
In `@packages/instrumentation-bedrock/tests/cache-token-fold-in.test.ts`:
- Around line 70-83: Add a new test case in the cache-token-fold-in.test.ts file
that explicitly tests the behavior when cache token values are set to zero (0)
rather than absent (undefined). Create a test similar to the existing "folds
only cache_read when cache_creation is absent" test, but set either
cache_read_input_tokens or cache_creation_input_tokens to 0 and verify the
folding logic handles the zero value correctly. This will ensure the code does
not have truthy-check regressions that might incorrectly treat 0 as absent.
In `@packages/instrumentation-langchain/src/callback_handler.ts`:
- Around line 33-34: In the callback_handler.ts file, replace the direct usage
of ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS constants with
SpanAttributes.ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS and
SpanAttributes.ATTR_GEN_AI_USAGE_CACHE_CREATION_INPUT_TOKENS respectively.
Ensure that SpanAttributes is imported from `@traceloop/ai-semantic-conventions`
and remove any direct imports of these individual constants to align with the
repository's instrumentation code guidelines.
In `@packages/instrumentation-langchain/test/cache-token-fold-in.test.ts`:
- Around line 169-205: Add a new regression test case following the existing
test that verifies the fallback behavior when usage_metadata is an empty object
instead of completely absent. This test should be structured similarly to the
current "falls back to llmOutput.tokenUsage when usage_metadata is absent" test
but should pass an empty usage_metadata object (e.g., usage_metadata: {}) in the
handleLLMEnd call to ensure the fallback to llmOutput.tokenUsage still works
correctly in this edge case. Verify that the token usage attributes are still
properly extracted from llmOutput.tokenUsage when usage_metadata is present but
empty.
In `@packages/instrumentation-vertexai/tests/gemini.test.ts`:
- Around line 220-240: The current test only verifies the behavior when
cachedContentTokenCount has a positive value (5), but does not test the
zero-value case which could expose falsy-check regressions. Add a new test case
similar to the existing one in the same file that configures MockGenerativeModel
to return cachedContentTokenCount: 0 in the response, then assert that the span
attribute ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS is still correctly set to 0
(not undefined or missing) to ensure zero values are properly handled and
recorded.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9ccf6e07-41cf-4928-b3f6-c7c88068f9c1
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (16)
packages/instrumentation-anthropic/src/instrumentation.tspackages/instrumentation-anthropic/test/instrumentation.test.tspackages/instrumentation-bedrock/src/instrumentation.tspackages/instrumentation-bedrock/tests/anthropic.test.tspackages/instrumentation-bedrock/tests/cache-token-fold-in.test.tspackages/instrumentation-langchain/src/callback_handler.tspackages/instrumentation-langchain/test/cache-token-fold-in.test.tspackages/instrumentation-openai/src/instrumentation.tspackages/instrumentation-openai/test/instrumentation.test.tspackages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-cache-read-input-tokens-in-span-for-responses-with-cached-tokens_3930499540/recording.harpackages/instrumentation-together/package.jsonpackages/instrumentation-together/src/instrumentation.tspackages/instrumentation-together/test/instrumentation.test.tspackages/instrumentation-vertexai/package.jsonpackages/instrumentation-vertexai/src/vertexai-instrumentation.tspackages/instrumentation-vertexai/tests/gemini.test.ts
Summary by CodeRabbit
New Features
gen_ai.usage.input_tokensand recomputegen_ai.usage.total_tokensaccordingly.usage_metadatafor token/cached-token reporting, with legacy fallback.Tests
Dependencies
^1.40.0(Together, VertexAI).