feat(instrumentation-llamaindex): migrate to OTel 1.40 GenAI semantic conventions #925
lenatraceloop wants to merge 2 commits into main from
Conversation
📝 Walkthrough

Adds OpenAI GenAI semantic-convention support to LlamaIndex instrumentation: instruments OpenAI chat directly, replaces legacy attributes with GenAI attributes (provider, operation, input/output messages, finish reasons, usage), adjusts streaming handling, introduces a shared CustomLLMInstrumentation instance, and expands tests and SDK initialization options.

Changes
Sequence Diagram(s)

sequenceDiagram
participant App as Application
participant LI as LlamaIndex Chat
participant CLLI as CustomLLMInstrumentation
participant OAI as OpenAI Client
participant Span as Span
App->>LI: chat(messages)
activate LI
LI->>CLLI: chatWrapper invoked
activate CLLI
CLLI->>Span: set provider_name, operation_name, model
alt traceContent enabled
CLLI->>Span: formatInputMessages -> gen_ai.input.messages
end
CLLI->>OAI: forward request (streaming or non-streaming)
activate OAI
OAI-->>CLLI: response / stream chunks
deactivate OAI
alt Streaming
CLLI->>CLLI: collect chunks, record lastChunk
CLLI->>Span: on completion set finish_reasons and usage from lastChunk.raw
else Non-streaming
CLLI->>Span: set finish_reasons and usage from result.raw
end
alt traceContent enabled
CLLI->>Span: formatOutputMessage -> gen_ai.output.messages
end
CLLI-->>LI: return result
deactivate CLLI
LI-->>App: response
deactivate LI
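The non-streaming branch of the diagram above can be sketched in isolation. This is an illustrative stand-in, not the actual instrumentation code: `span` is a minimal fake recording attributes into a plain object, and the `gen_ai.*` attribute names are the ones this PR adopts from the OTel 1.40 GenAI conventions.

```typescript
// Minimal stand-in for an OTel span: records attributes in a plain object.
const span = {
  attrs: {} as Record<string, any>,
  setAttribute(key: string, value: any) {
    this.attrs[key] = value;
  },
};

// GenAI attribute names as used in this PR (OTel 1.40).
const ATTR_PROVIDER = "gen_ai.provider.name";
const ATTR_OPERATION = "gen_ai.operation.name";
const ATTR_FINISH_REASONS = "gen_ai.response.finish_reasons";
const ATTR_INPUT_TOKENS = "gen_ai.usage.input_tokens";
const ATTR_OUTPUT_TOKENS = "gen_ai.usage.output_tokens";

// Non-streaming: finish reason and usage come from result.raw.
const result = {
  raw: {
    choices: [{ finish_reason: "stop" }],
    usage: { prompt_tokens: 12, completion_tokens: 7 },
  },
};

span.setAttribute(ATTR_PROVIDER, "openai");
span.setAttribute(ATTR_OPERATION, "chat");
span.setAttribute(
  ATTR_FINISH_REASONS,
  result.raw.choices.map((c) => c.finish_reason),
);
span.setAttribute(ATTR_INPUT_TOKENS, result.raw.usage.prompt_tokens);
span.setAttribute(ATTR_OUTPUT_TOKENS, result.raw.usage.completion_tokens);
```

In the streaming branch the same attributes are set on completion, but sourced from `lastChunk.raw` instead of `result.raw`.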
sequenceDiagram
participant Init as Initialize
participant SDK as Traceloop SDK
participant LIINST as LlamaIndex Instrumentation
participant OAI as OpenAI Module
participant CLLI as CustomLLMInstrumentation
Init->>SDK: initialize({ instrumentModules })
activate SDK
SDK->>LIINST: manuallyInstrument(llamaIndex, llamaIndexOpenAI)
activate LIINST
LIINST->>CLLI: create persistent CustomLLMInstrumentation
activate CLLI
CLLI-->>LIINST: ready
deactivate CLLI
LIINST->>OAI: patch OpenAI.prototype.chat (if provided)
LIINST-->>SDK: instrumentation wired
deactivate LIINST
SDK-->>Init: done
deactivate SDK
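The "patch OpenAI.prototype.chat" step in the diagram above boils down to prototype wrapping. A runnable sketch under stated assumptions: `ToyOpenAI` is a hypothetical stand-in for the `OpenAI` LLM class from `@llamaindex/openai` (made synchronous here only to keep the demo self-contained; the real `chat` is async), and `spanNames` stands in for real span creation.

```typescript
// Hypothetical stand-in for the OpenAI LLM class from "@llamaindex/openai".
class ToyOpenAI {
  metadata = { model: "gpt-4o" };
  chat(messages: string[]): string {
    return `echo:${messages.join("|")}`;
  }
}

const spanNames: string[] = [];

// manuallyInstrument-style patch: swap the prototype method for a wrapper
// that records a span name and delegates to the original implementation.
function patchChat(klass: typeof ToyOpenAI): void {
  const original = klass.prototype.chat;
  klass.prototype.chat = function (this: ToyOpenAI, messages: string[]): string {
    spanNames.push(`chat ${this.metadata.model}`); // "{operation} {model}"
    return original.call(this, messages);
  };
}

patchChat(ToyOpenAI);
```

Because the patch lives on the prototype, every instance created afterwards (and before) goes through the wrapper, which is why the instrumentation only needs the class, not each LLM instance.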
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/traceloop-sdk/src/lib/tracing/index.ts (1)
Lines 216-224: ⚠️ Potential issue | 🟡 Minor

`llamaIndexOpenAI` is silently ignored unless `llamaIndex` is also set.

`InitializeOptions.instrumentModules` now exposes `llamaIndexOpenAI` as a sibling option, but this branch only runs when `llamaIndex` exists. Passing `llamaIndexOpenAI` alone becomes a silent no-op in the manual-init path, so please either validate that combination and warn/throw, or encode the dependency in the API shape.

Possible guard:
```diff
  if (instrumentModules?.llamaIndex) {
    llamaIndexInstrumentation = new LlamaIndexInstrumentation({
      exceptionLogger,
    });
    instrumentations.push(llamaIndexInstrumentation);
    llamaIndexInstrumentation.manuallyInstrument(
      instrumentModules.llamaIndex,
      instrumentModules.llamaIndexOpenAI,
    );
+ } else if (instrumentModules?.llamaIndexOpenAI) {
+   diag.warn(
+     "[Traceloop] instrumentModules.llamaIndexOpenAI requires instrumentModules.llamaIndex",
+   );
  }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/traceloop-sdk/src/lib/tracing/index.ts` around lines 216 - 224, The code currently only initializes llamaIndexInstrumentation when instrumentModules.llamaIndex is present, so instrumentModules.llamaIndexOpenAI passed alone is ignored; update the initialization logic in the block that references instrumentModules to detect when llamaIndexOpenAI is provided without llamaIndex and either (a) emit a clear warning/error (using exceptionLogger or process logger) or (b) enforce the dependency by throwing an error; specifically, check instrumentModules.llamaIndexOpenAI && !instrumentModules.llamaIndex and then call exceptionLogger.error (or throw) with a message explaining that llamaIndexOpenAI requires llamaIndex, or alternatively adjust the creation of LlamaIndexInstrumentation / the call to llamaIndexInstrumentation.manuallyInstrument to accept and handle llamaIndexOpenAI-only usage so manuallyInstrument receives instrumentModules.llamaIndexOpenAI when llamaIndex is present or otherwise fail fast.
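The suggested guard can be exercised in isolation. A runnable sketch, assuming simplified shapes: `InstrumentModules` mirrors only the two relevant SDK options, and the `warnings` array stands in for `diag.warn`.

```typescript
type InstrumentModules = {
  llamaIndex?: object;
  llamaIndexOpenAI?: object;
};

const warnings: string[] = [];
const instrumentations: string[] = [];

// Guarded branch: llamaIndexOpenAI passed alone no longer disappears
// silently — it produces a diagnostic instead.
function wireLlamaIndex(instrumentModules?: InstrumentModules): void {
  if (instrumentModules?.llamaIndex) {
    instrumentations.push("llamaIndex");
  } else if (instrumentModules?.llamaIndexOpenAI) {
    warnings.push(
      "[Traceloop] instrumentModules.llamaIndexOpenAI requires instrumentModules.llamaIndex",
    );
  }
}

wireLlamaIndex({ llamaIndexOpenAI: {} });
```

The alternative mentioned in the comment — encoding the dependency in the API shape (e.g. nesting the OpenAI module under the llamaIndex option) — would make the invalid combination unrepresentable at the type level.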
🧹 Nitpick comments (1)
packages/instrumentation-llamaindex/test/instrumentation.test.ts (1)
Lines 234-240: Consider adding test coverage for streaming with usage data.

The streaming mock only yields `{ delta }` without a `raw` field. This tests the case where `stream_options: { include_usage: true }` is not enabled on the LLM.

Per the PR description, the instrumentation now extracts token usage and finish reason from the last chunk's `raw` field when available. Consider adding a test that verifies this behavior works correctly when the final streaming chunk includes `raw.choices[0].finish_reason` and `raw.usage`.

💡 Example streaming mock with usage:
```typescript
function makeMockChatWithStreamUsage(options: {
  responseContent?: string;
  finishReason?: string;
  promptTokens?: number;
  completionTokens?: number;
}) {
  const responseContent = options.responseContent ?? "Hello!";
  const finishReason = options.finishReason ?? "stop";
  const promptTokens = options.promptTokens ?? 10;
  const completionTokens = options.completionTokens ?? 5;

  return async function chat({ stream }: any) {
    if (stream) {
      async function* generate() {
        yield { delta: responseContent.slice(0, 3) };
        // Final chunk with raw usage (like OpenAI with include_usage: true)
        yield {
          delta: responseContent.slice(3),
          raw: {
            choices: [{ finish_reason: finishReason }],
            usage: {
              prompt_tokens: promptTokens,
              completion_tokens: completionTokens,
              total_tokens: promptTokens + completionTokens,
            },
          },
        };
      }
      return generate();
    }
    // ... non-streaming path
  };
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-llamaindex/test/instrumentation.test.ts` around lines 234 - 240, Add a new test that uses a streaming mock which yields multiple chunks and includes a final chunk with a raw field containing choices[0].finish_reason and usage (prompt_tokens, completion_tokens, total_tokens); update or add a helper like makeMockChatWithStreamUsage (or modify the existing chat/generate generator in the test) to yield partial delta(s) and then a final yield with raw. In the test, invoke the instrumented LLM with stream: true and assert that the instrumentation recorded the finish reason and the token usage values (prompt/completion/total) from the final chunk's raw field and that non-streaming behavior is unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/instrumentation-llamaindex/src/instrumentation.ts`:
- Around line 39-47: CustomLLMInstrumentation is being constructed with a stale
snapshot of this._config; change the constructor call in
LlamaIndexInstrumentation to pass a getter callback (e.g., () => this._config)
instead of this._config so the wrapper reads the live config after setConfig()
updates it, and then update CustomLLMInstrumentation's constructor signature and
any internal references in custom-llm-instrumentation.ts to call that getter
(e.g., configGetter()) rather than treating config as a static object; apply the
same change for the other instantiations mentioned (lines 50-52) so all uses
read the live config.
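The getter-callback pattern the comment above asks for can be demonstrated in isolation. A minimal sketch with illustrative class names (not the actual `CustomLLMInstrumentation` code): the snapshot wrapper keeps the config object it was constructed with, while the getter wrapper re-reads the owner's live config on every call.

```typescript
interface Config {
  traceContent: boolean;
}

// Snapshot style: keeps whatever config object existed at construction time.
class SnapshotWrapper {
  private config: Config;
  constructor(config: Config) {
    this.config = config;
  }
  traceContent(): boolean {
    return this.config.traceContent;
  }
}

// Getter style: re-reads the owner's live config on every call.
class GetterWrapper {
  private getConfig: () => Config;
  constructor(getConfig: () => Config) {
    this.getConfig = getConfig;
  }
  traceContent(): boolean {
    return this.getConfig().traceContent;
  }
}

class Instrumentation {
  private _config: Config = { traceContent: true };
  readonly snapshot = new SnapshotWrapper(this._config);
  readonly live = new GetterWrapper(() => this._config);
  setConfig(config: Config): void {
    this._config = config; // replaces the object — the snapshot goes stale
  }
}

const inst = new Instrumentation();
inst.setConfig({ traceContent: false });
```

Because `setConfig` replaces the config object rather than mutating it, only the getter-based wrapper observes the update — which is exactly the bug the review comment describes.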
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 55b87ea2-35bf-4754-95ad-f41fd1b11b02
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (11)
- packages/instrumentation-llamaindex/package.json
- packages/instrumentation-llamaindex/src/custom-llm-instrumentation.ts
- packages/instrumentation-llamaindex/src/instrumentation.ts
- packages/instrumentation-llamaindex/src/utils.ts
- packages/instrumentation-llamaindex/test/finish_reasons.test.ts
- packages/instrumentation-llamaindex/test/instrumentation.test.ts
- packages/instrumentation-llamaindex/test/semconv.test.ts
- packages/sample-app/src/sample_llama_index_openai_agent.ts
- packages/sample-app/src/sample_llamaindex.ts
- packages/traceloop-sdk/src/lib/interfaces/initialize-options.interface.ts
- packages/traceloop-sdk/src/lib/tracing/index.ts
🧹 Nitpick comments (3)
packages/instrumentation-llamaindex/test/instrumentation.test.ts (3)
Line 215: Misleading variable name — `noopDiag` is assigned the real diagnostics logger.

The variable is named `noopDiag`, but `diag` is the actual OpenTelemetry diagnostics logger, not a no-op. This could cause confusion for maintainers and may produce unexpected log output during tests.

🔧 Suggested fix: either rename or use a true no-op
Option 1: Rename to reflect actual behavior:

```diff
-const noopDiag: DiagLogger = diag;
+const testDiag: DiagLogger = diag;
```

Option 2: Use an actual no-op logger if silence is intended:

```diff
-const noopDiag: DiagLogger = diag;
+const noopDiag: DiagLogger = {
+  error: () => {},
+  warn: () => {},
+  info: () => {},
+  debug: () => {},
+  verbose: () => {},
+};
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-llamaindex/test/instrumentation.test.ts` at line 215, The variable named noopDiag is misleading because it references the real OpenTelemetry diagnostics logger (diag); either rename noopDiag to a truthful identifier like diagLogger or realDiag (update all usages in instrumentation.test.ts accordingly), or replace the assignment with an actual no-op DiagLogger implementation (an object implementing the DiagLogger methods such as debug/info/warn/error as no-ops) so tests are truly silent; update the variable type (DiagLogger) and any assertions to match the chosen approach.
Lines 228-234: Missing test coverage for streaming with usage data.

The streaming mock only yields `{ delta: responseContent }` without the `raw` field. This correctly tests the scenario where OpenAI streaming doesn't include usage. However, the PR description mentions fixing streaming handling "when `stream_options: { include_usage: true }` is set on the LLM" — but there's no test covering the scenario where streaming DOES provide finish_reason and usage in the last chunk.

🧪 Suggested: Add test for streaming with usage
Add a variant mock that includes `raw` in the final chunk:

```typescript
function makeMockChatWithStreamUsage(options: {
  responseContent?: string;
  finishReason?: string;
  promptTokens?: number;
  completionTokens?: number;
}) {
  const responseContent = options.responseContent ?? "Hello!";
  const finishReason = options.finishReason ?? "stop";
  const promptTokens = options.promptTokens ?? 10;
  const completionTokens = options.completionTokens ?? 5;

  return async function chat({ stream }: any) {
    if (stream) {
      async function* generate() {
        yield { delta: "partial" };
        // Final chunk includes raw with usage (OpenAI behavior with stream_options)
        yield {
          delta: responseContent,
          raw: {
            choices: [{ finish_reason: finishReason }],
            usage: {
              prompt_tokens: promptTokens,
              completion_tokens: completionTokens,
              total_tokens: promptTokens + completionTokens,
            },
          },
        };
      }
      return generate();
    }
    // ... non-streaming case
  };
}
```

Then add a test:

```typescript
it("sets finish_reasons and usage when streaming with stream_options", async () => {
  // test that attributes ARE set when raw data is available
});
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-llamaindex/test/instrumentation.test.ts` around lines 228 - 234, The streaming mock in the test only yields { delta: responseContent } and doesn't cover the case where the final streaming chunk includes raw usage/finish_reason; add a new mock (e.g., makeMockChatWithStreamUsage) that returns an async generator from chat({ stream }) which yields a partial delta and then a final chunk containing delta plus raw: { choices: [{ finish_reason }], usage: { prompt_tokens, completion_tokens, total_tokens } }, then add a test (e.g., it("sets finish_reasons and usage when streaming with stream_options")) that uses this mock to assert the code under test records finish_reasons and usage fields when the streaming raw data is present; update any existing test harness to swap in the new chat mock for this scenario.
Lines 502-520: Consider updating the test description for accuracy.

The test name says "not available", but finish_reasons can be available in streaming when `stream_options: { include_usage: true }` is set. The current mock simply doesn't provide this data. Consider renaming for clarity:

```diff
- it("does NOT set finish_reasons in streaming (not available)", async () => {
+ it("does NOT set finish_reasons in streaming when raw data is absent", async () => {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-llamaindex/test/instrumentation.test.ts` around lines 502 - 520, The test description is misleading — it claims finish_reasons are "not available" in streaming, but they can be present when stream_options include usage; update the test title string in the it(...) block to accurately reflect that the mock simply doesn't include finish_reasons (e.g., "does NOT set finish_reasons in streaming when mock omits them"), keeping the test body and assertions (references: makeInstrumentation, chatWrapper, wrapped.call, ATTR_GEN_AI_RESPONSE_FINISH_REASONS) unchanged so the behavior is still validated against the mocked stream.
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (1)
packages/instrumentation-llamaindex/test/instrumentation.test.ts
42af26a to 51cd26b — Compare
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/instrumentation-llamaindex/src/utils.ts (1)
Lines 69-83: ⚠️ Potential issue | 🟠 Major

Ensure the callback always runs on stream close/error, not only on normal completion.

On line 82, `fn(message, lastChunk)` runs only after full iteration. If the consumer stops early or iteration throws, callback execution is skipped, which can miss usage/finish_reason emission and leave streaming spans unfinalized.

Proposed fix:
```diff
 export async function* llmGeneratorWrapper(
   streamingResult:
     | AsyncIterable<llamaindex.ChatResponseChunk>
     | AsyncIterable<llamaindex.CompletionResponse>,
   ctx: Context,
   fn: (message: string, lastChunk?: any) => void,
 ) {
   let message = "";
   // Track the last chunk so the callback can extract usage/finish_reason from
   // chunk.raw — OpenAI sends these in the final streaming chunk when
   // stream_options: { include_usage: true } is set on the LLM.
   let lastChunk: any;
-  for await (const messageChunk of bindAsyncGenerator(
-    ctx,
-    streamingResult as AsyncGenerator,
-  )) {
-    if ((messageChunk as llamaindex.ChatResponseChunk).delta) {
-      message += (messageChunk as llamaindex.ChatResponseChunk).delta;
-    }
-    if ((messageChunk as llamaindex.CompletionResponse).text) {
-      message += (messageChunk as llamaindex.CompletionResponse).text;
-    }
-    lastChunk = messageChunk;
-    yield messageChunk;
-  }
-  fn(message, lastChunk);
+  try {
+    for await (const messageChunk of bindAsyncGenerator(
+      ctx,
+      streamingResult as AsyncGenerator,
+    )) {
+      if ((messageChunk as llamaindex.ChatResponseChunk).delta) {
+        message += (messageChunk as llamaindex.ChatResponseChunk).delta;
+      }
+      if ((messageChunk as llamaindex.CompletionResponse).text) {
+        message += (messageChunk as llamaindex.CompletionResponse).text;
+      }
+      lastChunk = messageChunk;
+      yield messageChunk;
+    }
+  } finally {
+    fn(message, lastChunk);
+  }
 }
```

As per coding guidelines: "Instrumentations must extract request/response data and token usage from wrapped calls" and "Instrumentations must capture and record errors appropriately."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/instrumentation-llamaindex/src/utils.ts` around lines 69 - 83, The callback fn(message, lastChunk) currently only runs after successful full iteration; wrap the for-await loop in try/catch/finally so fn is always invoked on normal completion or on error/early return: use try { for await (const messageChunk of bindAsyncGenerator(ctx, streamingResult as AsyncGenerator)) { ... } } catch (err) { fn(message, lastChunk); throw err; } finally { if not already called, fn(message, lastChunk); } — ensure this change is applied around the bindAsyncGenerator loop and preserves rethrowing errors from streamingResult so streaming spans are finalized and errors still propagate.
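The reason `try`/`finally` matters here: a generator's `finally` block runs not only on normal completion but also when the consumer abandons the stream (which triggers the generator's `return()`) or when iteration throws. A minimal synchronous sketch — async generators behave the same way, and the names below are illustrative, not the actual `llmGeneratorWrapper`:

```typescript
function* wrapStream(chunks: string[], onDone: (message: string) => void) {
  let message = "";
  try {
    for (const chunk of chunks) {
      message += chunk;
      yield chunk;
    }
  } finally {
    // Runs on completion, early return(), or throw — so the span
    // finalization callback is never skipped.
    onDone(message);
  }
}

let captured = "unset";
const iter = wrapStream(["Hel", "lo"], (m) => {
  captured = m;
});
iter.next(); // consumer reads one chunk: "Hel"
iter.return?.(undefined); // consumer abandons the stream early
```

Without the `try`/`finally`, `onDone` would only fire after the loop finished, and the early `return()` above would skip it entirely.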
⛔ Files ignored due to path filters (1)
`pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (8)
- packages/instrumentation-llamaindex/package.json
- packages/instrumentation-llamaindex/src/custom-llm-instrumentation.ts
- packages/instrumentation-llamaindex/src/instrumentation.ts
- packages/instrumentation-llamaindex/src/utils.ts
- packages/instrumentation-llamaindex/test/finish_reasons.test.ts
- packages/instrumentation-llamaindex/test/instrumentation.test.ts
- packages/instrumentation-llamaindex/test/semconv.test.ts
- packages/sample-app/src/sample_llamaindex.ts
✅ Files skipped from review due to trivial changes (2)
- packages/instrumentation-llamaindex/package.json
- packages/sample-app/src/sample_llamaindex.ts
🚧 Files skipped from review as they are similar to previous changes (5)
- packages/instrumentation-llamaindex/test/finish_reasons.test.ts
- packages/instrumentation-llamaindex/test/semconv.test.ts
- packages/instrumentation-llamaindex/test/instrumentation.test.ts
- packages/instrumentation-llamaindex/src/custom-llm-instrumentation.ts
- packages/instrumentation-llamaindex/src/instrumentation.ts
be5d69f to 97367b4 — Compare
Thanks for the thorough reviews! Here's a rundown of everything we addressed:

- Callback only ran after full iteration: fixed in a previous commit — wrapped the `for await` loop in `try`/`finally`, and added a JSDoc comment.
- Nitpick: misleading `noopDiag` name: fixed — renamed.
- Nitpick: missing test for streaming with usage data: fixed — added.
- Nitpick: test description "not available in streaming" is misleading: fixed — updated.
123df85 to ed7af2a — Compare
```typescript
  ATTR_GEN_AI_USAGE_OUTPUT_TOKENS,
  GEN_AI_OPERATION_NAME_VALUE_CHAT,
  GEN_AI_PROVIDER_NAME_VALUE_OPENAI,
} from "@opentelemetry/semantic-conventions/incubating";
```
The import from `/incubating` dates from when the attributes were not official yet. See if some of the attributes can be imported from the official entry point now:

```typescript
} from "@opentelemetry/semantic-conventions";
```

If not, leave it, of course.
Checked it — in v1.40.0, the main `@opentelemetry/semantic-conventions` entry point only exports stable attributes (HTTP, network, URL, etc.). All GenAI attributes are still in Development status and only available via `/incubating`. Confirmed in the GenAI semantic conventions spec (https://opentelemetry.io/docs/specs/semconv/gen-ai/) and the v1.40.0 release notes (https://github.com/open-telemetry/semantic-conventions/releases/tag/v1.40.0).
So keeping the import from `/incubating` for now.
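For orientation, the incubating constants are plain attribute-name strings. The sketch below lists the values as this PR uses them — hedged: in real code they come from `@opentelemetry/semantic-conventions/incubating`, and the exact exports should be checked against the installed semconv version.

```typescript
// Attribute-name strings matching the gen_ai.* attributes used in this PR;
// in real code these are imported from
// "@opentelemetry/semantic-conventions/incubating".
const ATTR_GEN_AI_PROVIDER_NAME = "gen_ai.provider.name";
const ATTR_GEN_AI_OPERATION_NAME = "gen_ai.operation.name";
const ATTR_GEN_AI_INPUT_MESSAGES = "gen_ai.input.messages";
const ATTR_GEN_AI_OUTPUT_MESSAGES = "gen_ai.output.messages";
const ATTR_GEN_AI_RESPONSE_FINISH_REASONS = "gen_ai.response.finish_reasons";
const ATTR_GEN_AI_USAGE_INPUT_TOKENS = "gen_ai.usage.input_tokens";
const ATTR_GEN_AI_USAGE_OUTPUT_TOKENS = "gen_ai.usage.output_tokens";

// Well-known values for operation and provider.
const GEN_AI_OPERATION_NAME_VALUE_CHAT = "chat";
const GEN_AI_PROVIDER_NAME_VALUE_OPENAI = "openai";
```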
```typescript
        .startSpan(`llamaindex.${lodash.snakeCase(className)}.chat`, {
          kind: SpanKind.CLIENT,
        });
const span = plugin.tracer().startSpan(`chat ${this.metadata.model}`, {
```
Just verifying — is this the new convention?
Yes. This is the new OTel 1.40 GenAI convention — the span name format is `{operation} {model}` (e.g. `chat gpt-4o`), as specified in the GenAI spans spec (https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-spans/).
The old `llamaindex.open_ai.chat` format was replaced as part of the migration.
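The naming rule above fits in a one-line helper. A sketch with an illustrative function name (not part of the instrumentation), following the spec's guidance that the span name falls back to the bare operation when the model is unknown:

```typescript
// Build a GenAI span name per OTel 1.40: "{operation} {model}",
// or just the operation when the model is not available.
function genAiSpanName(operation: string, model?: string): string {
  return model ? `${operation} ${model}` : operation;
}
```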
Fixes TLP-2036
Summary
- `@llamaindex/openai` — `customLLMInstrumentation` promoted to a class field; `patchOpenAI()` now correctly wraps the `OpenAI` LLM class from the separate `@llamaindex/openai` package
- Replaced legacy `gen_ai.system`, `gen_ai.prompt`, `gen_ai.completion` attributes with the OTel 1.40 standard: `gen_ai.provider.name`, `gen_ai.input.messages`, `gen_ai.output.messages`
- `gen_ai.response.finish_reasons` and token usage (`gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) — always emitted as metadata regardless of the `traceContent` setting
- Streaming: the last chunk's `raw` field is used to extract token usage and finish reason when `stream_options: { include_usage: true }` is set on the LLM
- Span name changed from `llamaindex.open_ai.chat` to `chat {model}` (OTel 1.40 convention)

New tests
- `test/semconv.test.ts` — pure unit tests for OTel constants, `mapOpenAIContentBlock`, `formatInputMessages`, `formatOutputMessage`; covers `traceContent: true/false` behavior
- `test/finish_reasons.test.ts` — unit tests for `openAIFinishReasonMap`; verifies every OpenAI raw finish reason maps to a valid OTel value
- `test/instrumentation.test.ts` — new mock LLM tests covering span attributes for non-streaming, streaming, and `traceContent: false` scenarios

Test plan
`pnpm nx run @traceloop/instrumentation-llamaindex:test` — 40 tests pass

Link for data
https://app.traceloop.dev/projects/lena-bedrock-anthropic-js-tests/trace?projectSlug=lena-bedrock-anthropic-js-tests