feat: Send GenAI spans as V2 envelope items #6079
5 issues
find-bugs: Found 5 issues (1 high, 3 medium, 1 low)
High
GenAI V2 spans missing metadata because wrong event object passed to converter - `sentry_sdk/client.py:1138`
At line 1138, `_serialized_v1_span_to_serialized_v2_span` is called with `event` (the original, unprepared event dict) instead of `event_opt` (the prepared event, with release, environment, SDK info, and user data populated by `_prepare_event`). The converter reads `event.get("release")`, `event.get("environment")`, `event.get("sdk")`, `event.get("user")`, and `event.get("contexts")` to populate V2 span attributes such as `sentry.release`, `sentry.environment`, and `sentry.sdk.name`. Because the original event may not have these fields populated, the converted GenAI spans can be missing critical tracing metadata.
Also found at:
tests/integrations/langchain/test_langchain.py:1370
tests/integrations/openai/test_openai.py:3767-3769
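A minimal sketch of this failure mode, assuming the prepared event is a separate dict from the original, as the finding describes. `prepare_event` and `convert_span` are hypothetical stand-ins for `_prepare_event` and `_serialized_v1_span_to_serialized_v2_span`, not the SDK's real implementations, and the option values are invented for illustration.

```python
from typing import Any, Dict

Event = Dict[str, Any]


def prepare_event(event: Event, options: Dict[str, Any]) -> Event:
    """Hypothetical stand-in for _prepare_event: returns a copy with defaults filled in."""
    prepared = dict(event)
    prepared.setdefault("release", options.get("release"))
    prepared.setdefault("environment", options.get("environment"))
    return prepared


def convert_span(span: Dict[str, Any], event: Event) -> Dict[str, Any]:
    """Hypothetical stand-in for the V1-to-V2 converter: copies event metadata onto the span."""
    attributes = dict(span.get("data", {}))
    for key in ("release", "environment"):
        value = event.get(key)
        if value is not None:
            attributes["sentry." + key] = value
    return {"name": span.get("description"), "attributes": attributes}


options = {"release": "app@1.0.0", "environment": "production"}
event = {"type": "transaction"}             # original, unprepared event
event_opt = prepare_event(event, options)   # prepared event
span = {"description": "chat", "data": {"gen_ai.system": "openai"}}

from_raw = convert_span(span, event)            # metadata silently missing
from_prepared = convert_span(span, event_opt)   # metadata present
```

Under these assumptions, only the span converted from the prepared event carries `sentry.release` and `sentry.environment`; the other conversion succeeds without any error, which is why the problem is easy to miss.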
Medium
test_async_exception_handling patches embeddings client instead of completions client - `tests/integrations/litellm/test_litellm.py:866-868`
The test `test_async_exception_handling` patches `client.embeddings._client._client.send` but calls `litellm.acompletion()`, which uses the completions API. Because of this mismatch the mock won't intercept the actual request, so the test may fail, or pass for the wrong reasons. The sync version, `test_exception_handling`, correctly patches `client.completions._client._client.send`.
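A minimal sketch of a patch that targets the wrong object, using only `unittest.mock`. The `Transport` and `Client` classes are hypothetical and assume the two resources hold distinct transport objects; if both attribute paths resolved to the same underlying HTTP client, the patch would intercept the call either way.

```python
from unittest import mock


class Transport:
    """Hypothetical transport; stands in for the object whose send() is patched."""

    def send(self, request):
        return "real response"


class Client:
    """Hypothetical client with two independent resources."""

    def __init__(self):
        self.embeddings = Transport()
        self.completions = Transport()


def complete(client, prompt):
    # The code under test only ever touches the completions transport.
    return client.completions.send(prompt)


client = Client()

# Patching the embeddings transport: the mock is never called.
with mock.patch.object(
    client.embeddings, "send", side_effect=RuntimeError("boom")
) as wrong:
    result_wrong = complete(client, "hi")
    wrong_called = wrong.called

# Patching the completions transport: the simulated error is raised.
with mock.patch.object(
    client.completions, "send", side_effect=RuntimeError("boom")
) as right:
    try:
        complete(client, "hi")
        raised = False
    except RuntimeError:
        raised = True
    right_called = right.called
```

Asserting that the mock was actually called (for example with `assert_called_once()`) is one way to make this kind of mismatch fail loudly.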
test_multiple_providers never validates span attributes due to missing span capture - `tests/integrations/litellm/test_litellm.py:945`
The `test_multiple_providers` function calls `capture_items("transaction")` at line 945, but later attempts to assert on span attributes at lines 1020-1023. Since spans are not captured, the `spans` list will always be empty and the `for span in spans:` loop will never execute, making the assertion `assert SPANDATA.GEN_AI_SYSTEM in span["attributes"]` ineffective. Bugs where the GenAI system attribute is missing could therefore go undetected.
Also found at:
tests/integrations/litellm/test_litellm.py:1020-1023
tests/integrations/google_genai/test_google_genai.py:330-331
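A minimal sketch of why an assertion inside a loop over an empty list is vacuous, and one possible guard. The helper names are hypothetical, and the literal key `"gen_ai.system"` is assumed here as a stand-in for `SPANDATA.GEN_AI_SYSTEM`.

```python
def check_spans_unguarded(spans):
    """Passes for an empty list: the loop body never runs."""
    for span in spans:
        assert "gen_ai.system" in span["attributes"]
    return True


def check_spans_guarded(spans):
    """Fails fast when nothing was captured, then checks every span."""
    assert len(spans) > 0, "expected at least one captured span"
    for span in spans:
        assert "gen_ai.system" in span["attributes"]
    return True
```

With the guard in place, a test that forgets to capture spans fails on the length check instead of silently passing.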
Test checks wrong key 'attributes' instead of 'data' for transaction context - `tests/integrations/openai_agents/test_openai_agents.py:3540-3542`
The test that verifies the absence of `conversation_id` uses `transaction["contexts"]["trace"].get("attributes", {})` at line 3540, but transactions store trace context data under `data`, not `attributes`. The `capture_items` fixture only transforms attributes for the 'metric', 'log', and 'span' types (conftest.py lines 361-367), while transactions are passed through unchanged (lines 369-370). Other tests in this file correctly use `["data"]` (lines 3341, 3479). As a result, the assertion always passes regardless of whether `conversation_id` is incorrectly set, defeating the purpose of the test.
Also found at:
tests/integrations/pydantic_ai/test_pydantic_ai.py:831-833
tests/integrations/openai_agents/test_openai_agents.py:2257
tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496
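A minimal sketch of why the absence check always passes when it reads the wrong key. The transaction payload and the key name `"gen_ai.conversation.id"` are made up for illustration; only the `contexts` / `trace` / `data` nesting is taken from the finding.

```python
# Hypothetical transaction in which the conversation id IS (incorrectly) set.
transaction = {
    "contexts": {
        "trace": {
            "data": {"gen_ai.conversation.id": "conv-123"},
        }
    }
}

trace = transaction["contexts"]["trace"]

# Wrong key: "attributes" does not exist, so .get() returns the empty default
# and the absence check is trivially true.
wrong_key_passes = "gen_ai.conversation.id" not in trace.get("attributes", {})

# Right key: the check sees the value and correctly reports that it is present.
right_key_passes = "gen_ai.conversation.id" not in trace.get("data", {})
```

Indexing with `trace["data"]` instead of `.get(..., {})` would additionally surface a missing key as a `KeyError` rather than as a passing check.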
Low
Test no longer validates that transaction event is captured - `tests/integrations/google_genai/test_google_genai.py:1104`
The old test asserted `len(events) == 2` to verify that both the transaction and error events were captured. The new test only extracts and validates the error event, using `(error_event,) = (item.payload for item in items if item.type == "event")`, and silently ignores whether a transaction was captured. This removes a regression check: if the transaction event stops being sent due to a bug, this test would still pass.
Also found at:
tests/integrations/langchain/test_langchain.py:1842-1846
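A minimal sketch of how the filtered unpacking ignores a missing transaction, and one way the check could be restored. `Item` is a hypothetical stand-in for the captured envelope items, and the `"transaction"` item type is an assumption; only the `"event"` filter comes from the test quoted above.

```python
from collections import namedtuple

# Hypothetical stand-in for a captured envelope item.
Item = namedtuple("Item", ["type", "payload"])


def extract_error_only(items):
    """Mirrors the new test: items of any other type are ignored."""
    (error_event,) = (item.payload for item in items if item.type == "event")
    return error_event


def extract_error_and_transaction(items):
    """Also requires exactly one transaction item; raises ValueError otherwise."""
    (error_event,) = (item.payload for item in items if item.type == "event")
    (transaction,) = (
        item.payload for item in items if item.type == "transaction"
    )
    return error_event, transaction


only_error = [Item("event", {"level": "error"})]
both = [Item("transaction", {"type": "transaction"}), Item("event", {"level": "error"})]
```

Single-element tuple unpacking raises `ValueError` when the filtered generator yields zero or more than one item, so the second helper restores a check similar to the old `len(events) == 2` assertion.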
Duration: 45m 18s · Tokens: 23.3M in / 239.3k out · Cost: $31.99 (+extraction: $0.04, +merge: $0.01, +fix_gate: $0.02)
Annotations
Check failure on line 1138 in sentry_sdk/client.py
sentry-warden / warden: find-bugs
GenAI V2 spans missing metadata because wrong event object passed to converter
At line 1138, `_serialized_v1_span_to_serialized_v2_span` is called with `event` (the original, unprepared event dict) instead of `event_opt` (the prepared event with release, environment, SDK info, and user data populated by `_prepare_event`). The converter function reads `event.get("release")`, `event.get("environment")`, `event.get("sdk")`, `event.get("user")`, and `event.get("contexts")` to populate V2 span attributes like `sentry.release`, `sentry.environment`, `sentry.sdk.name`, etc. Since the original event may not have these fields populated, the converted GenAI spans will be missing critical tracing metadata.
Check failure on line 1370 in tests/integrations/langchain/test_langchain.py
sentry-warden / warden: find-bugs
[DGA-NPF] GenAI V2 spans missing metadata because wrong event object passed to converter (additional location)
Check failure on line 3769 in tests/integrations/openai/test_openai.py
sentry-warden / warden: find-bugs
[DGA-NPF] GenAI V2 spans missing metadata because wrong event object passed to converter (additional location)
Check warning on line 868 in tests/integrations/litellm/test_litellm.py
sentry-warden / warden: find-bugs
test_async_exception_handling patches embeddings client instead of completions client
The test `test_async_exception_handling` patches `client.embeddings._client._client.send` but calls `litellm.acompletion()`, which uses the completions API. Because of this mismatch the mock won't intercept the actual request, so the test may fail, or pass for the wrong reasons. The sync version, `test_exception_handling`, correctly patches `client.completions._client._client.send`.
Check warning on line 945 in tests/integrations/litellm/test_litellm.py
sentry-warden / warden: find-bugs
test_multiple_providers never validates span attributes due to missing span capture
The `test_multiple_providers` function calls `capture_items("transaction")` at line 945, but later attempts to assert on span attributes at lines 1020-1023. Since spans are not captured, the `spans` list will always be empty and the `for span in spans:` loop will never execute, making the assertion `assert SPANDATA.GEN_AI_SYSTEM in span["attributes"]` ineffective. This could allow bugs where the GenAI system attribute is missing to go undetected.
Check warning on line 1023 in tests/integrations/litellm/test_litellm.py
sentry-warden / warden: find-bugs
[X7J-3WW] test_multiple_providers never validates span attributes due to missing span capture (additional location)
Check warning on line 331 in tests/integrations/google_genai/test_google_genai.py
sentry-warden / warden: find-bugs
[X7J-3WW] test_multiple_providers never validates span attributes due to missing span capture (additional location)
Check warning on line 3542 in tests/integrations/openai_agents/test_openai_agents.py
sentry-warden / warden: find-bugs
Test checks wrong key 'attributes' instead of 'data' for transaction context
The test that verifies the absence of `conversation_id` uses `transaction["contexts"]["trace"].get("attributes", {})` at line 3540, but transactions store trace context data under `data`, not `attributes`. The `capture_items` fixture only transforms attributes for the 'metric', 'log', and 'span' types (conftest.py lines 361-367), while transactions are passed through unchanged (lines 369-370). Other tests in this file correctly use `["data"]` (lines 3341, 3479). As a result, the assertion always passes regardless of whether `conversation_id` is incorrectly set, defeating the purpose of the test.
Check warning on line 833 in tests/integrations/pydantic_ai/test_pydantic_ai.py
sentry-warden / warden: find-bugs
[6UX-SYX] Test checks wrong key 'attributes' instead of 'data' for transaction context (additional location)
Check warning on line 2257 in tests/integrations/openai_agents/test_openai_agents.py
sentry-warden / warden: find-bugs
[6UX-SYX] Test checks wrong key 'attributes' instead of 'data' for transaction context (additional location)
Check warning on line 496 in tests/integrations/pydantic_ai/test_pydantic_ai.py
sentry-warden / warden: find-bugs
[6UX-SYX] Test checks wrong key 'attributes' instead of 'data' for transaction context (additional location)