Summary
The Bedrock Converse instrumentation extracts inputTokens, outputTokens, and totalTokens from the response usage object, but silently drops the prompt caching fields cacheReadInputTokens, cacheWriteInputTokens, and cacheDetails. These fields are returned by the Bedrock Converse API when prompt caching is active and are important for understanding cache hit rates and cost savings.
Both the non-streaming (Converse) and streaming (ConverseStream) paths are affected.
What is missing
Non-streaming path
In InstrumentationSemConv.tagBedrockResponse() (lines 350–357), only three usage fields are extracted:
if (usage.has("inputTokens")) metrics.put("prompt_tokens", usage.get("inputTokens"));
if (usage.has("outputTokens")) metrics.put("completion_tokens", usage.get("outputTokens"));
if (usage.has("totalTokens")) metrics.put("tokens", usage.get("totalTokens"));
The following fields from the Bedrock usage object are never extracted:
cacheReadInputTokens — tokens served from the prompt cache
cacheWriteInputTokens — tokens written to the prompt cache
cacheDetails — array of per-checkpoint cache details including TTL
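A minimal sketch of how the extraction could be extended, not the SDK's actual code. A plain `Map<String, Object>` stands in for the JSON node that `tagBedrockResponse()` reads, so the example runs without a JSON library. The metric name `prompt_cached_tokens` follows the GenAI handler's naming cited in this issue; `prompt_cache_creation_tokens`, the class name, and the method names are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: a Map stands in for the parsed Bedrock "usage" JSON object.
final class BedrockUsageMetricsSketch {

    // Copies one usage field into the metrics map under the given metric name, if present.
    private static void copyIfPresent(
            Map<String, Object> usage, String field,
            Map<String, Object> metrics, String metricName) {
        if (usage.containsKey(field)) {
            metrics.put(metricName, usage.get(field));
        }
    }

    static Map<String, Object> toMetrics(Map<String, Object> usage) {
        Map<String, Object> metrics = new LinkedHashMap<>();
        // The three fields that are extracted today.
        copyIfPresent(usage, "inputTokens", metrics, "prompt_tokens");
        copyIfPresent(usage, "outputTokens", metrics, "completion_tokens");
        copyIfPresent(usage, "totalTokens", metrics, "tokens");
        // The cache fields that are currently dropped.
        copyIfPresent(usage, "cacheReadInputTokens", metrics, "prompt_cached_tokens");
        // Assumed metric name, not confirmed by the SDK.
        copyIfPresent(usage, "cacheWriteInputTokens", metrics, "prompt_cache_creation_tokens");
        return metrics;
    }

    public static void main(String[] args) {
        Map<String, Object> usage = new LinkedHashMap<>();
        usage.put("inputTokens", 1200);
        usage.put("outputTokens", 350);
        usage.put("totalTokens", 1550);
        usage.put("cacheReadInputTokens", 800);
        usage.put("cacheWriteInputTokens", 400);
        System.out.println(toMetrics(usage));
    }
}
```

`cacheDetails` is left out of the sketch because it is an array rather than a single counter, and how it should map onto a metric is an open design question.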
Streaming path
In BraintrustBedrockInterceptor.TeeingSubscriber.parseTokenUsage() (lines 362–379), only inputTokens and outputTokens are parsed from the metadata event payload. Cache token fields in the same payload are ignored. The buildConverseJson() method (lines 385–410) then constructs a synthetic response with only inputTokens, outputTokens, and totalTokens — cache fields are lost before they reach tagBedrockResponse.
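The streaming fix could look roughly like the sketch below: carry the cache counters from the `metadata` event into the synthetic usage object instead of rebuilding it from two fields. Maps replace the real JSON types, and the class name, method name, and the fallback that sums input and output tokens when no total is present are all assumptions, not the interceptor's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the streaming path; Maps replace the real JSON types.
final class StreamingUsageSketch {

    // Fields copied unchanged from the metadata event into the synthetic response.
    private static final String[] PASS_THROUGH_FIELDS = {
        "inputTokens", "outputTokens", "cacheReadInputTokens", "cacheWriteInputTokens"
    };

    // Builds the "usage" object of the synthetic response from the metadata event's usage.
    static Map<String, Object> buildSyntheticUsage(Map<String, Object> metadataUsage) {
        Map<String, Object> usage = new LinkedHashMap<>();
        for (String field : PASS_THROUGH_FIELDS) {
            Object value = metadataUsage.get(field);
            if (value != null) {
                usage.put(field, value);
            }
        }
        Object total = metadataUsage.get("totalTokens");
        Object in = usage.get("inputTokens");
        Object out = usage.get("outputTokens");
        if (total == null && in instanceof Number && out instanceof Number) {
            // Assumption: fall back to input + output when the event carries no total.
            total = ((Number) in).longValue() + ((Number) out).longValue();
        }
        if (total != null) {
            usage.put("totalTokens", total);
        }
        return usage;
    }

    public static void main(String[] args) {
        Map<String, Object> metadataUsage = new LinkedHashMap<>();
        metadataUsage.put("inputTokens", 1200);
        metadataUsage.put("outputTokens", 350);
        metadataUsage.put("cacheReadInputTokens", 800);
        metadataUsage.put("cacheWriteInputTokens", 400);
        System.out.println(buildSyntheticUsage(metadataUsage));
    }
}
```

Because the synthetic response then has the same shape as a non-streaming one, a single extraction change in `tagBedrockResponse` would cover both paths.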
A real Converse response with prompt caching looks like:
"usage": {
  "inputTokens": 1200,
  "outputTokens": 350,
  "totalTokens": 1550,
  "cacheReadInputTokens": 800,
  "cacheWriteInputTokens": 400,
  "cacheDetails": [
    { "inputTokens": 800, "ttl": "5m" }
  ]
}
Today, only inputTokens, outputTokens, and totalTokens are captured. The cache fields are silently dropped.
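As an illustration of what the dropped fields make possible, the sketch below computes one possible cache metric from the sample numbers above: the share of cache-related input tokens that were read rather than written. This definition, the class name, and the method name are assumptions for illustration; the issue does not prescribe how a hit rate should be calculated.

```java
// Hypothetical helper showing one way to summarize cache usage.
final class CacheShareSketch {

    // Fraction of cache-related input tokens that were reads; 0.0 when there are none.
    static double cacheReadShare(long cacheReadTokens, long cacheWriteTokens) {
        long cacheTokens = cacheReadTokens + cacheWriteTokens;
        return cacheTokens == 0 ? 0.0 : (double) cacheReadTokens / cacheTokens;
    }

    public static void main(String[] args) {
        // Sample values: cacheReadInputTokens = 800, cacheWriteInputTokens = 400.
        System.out.println(cacheReadShare(800, 400));
    }
}
```

With the sample response, 800 of the 1200 cache-related tokens are reads, so the share is two thirds. Without the cache fields in the span metrics, this number cannot be derived at all.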
For comparison, the Google GenAI handler in this repo already extracts cachedContentTokenCount as prompt_cached_tokens (lines 142–146 of BraintrustApiClient.java), showing that cache token extraction is an established pattern here. Similar gaps for Anthropic (#57) and OpenAI (#58, #70) cache tokens have already been filed.
Braintrust docs status
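Upstream sources
cacheReadInputTokens and cacheWriteInputTokens in Converse response usage
TokenUsage includes cacheReadInputTokens, cacheWriteInputTokens, cacheDetails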
Local files inspected
braintrust-sdk/src/main/java/dev/braintrust/instrumentation/InstrumentationSemConv.java — lines 350–357 (tagBedrockResponse: only inputTokens, outputTokens, totalTokens extracted from usage)
braintrust-sdk/instrumentation/aws_bedrock_2_30_0/src/main/java/dev/braintrust/instrumentation/awsbedrock/v2_30_0/BraintrustBedrockInterceptor.java — lines 362–379 (parseTokenUsage: only inputTokens and outputTokens parsed); lines 385–410 (buildConverseJson: synthetic response omits cache fields)
braintrust-sdk/instrumentation/aws_bedrock_2_30_0/src/test/java/dev/braintrust/instrumentation/awsbedrock/v2_30_0/BraintrustAWSBedrockTest.java — no test exercises prompt caching responses
braintrust-sdk/instrumentation/genai_1_18_0/src/main/java/com/google/genai/BraintrustApiClient.java — lines 142–146 (GenAI handler already extracts cachedContentTokenCount as prompt_cached_tokens)