add constant again

7bd12ae
Draft

feat: Send GenAI spans as V2 envelope items #6079

@sentry/warden / warden: code-review completed Apr 20, 2026 in 41m 15s

6 issues

code-review: Found 6 issues (6 medium)

Medium

GenAI spans lose release, environment, and SDK metadata due to passing unprepared event - `sentry_sdk/client.py:1134`

On line 1134, `event` (the original input) is passed to `_serialized_v1_span_to_serialized_v2_span` instead of `event_opt` (the prepared event). The `_prepare_event` method enriches the event with `release`, `environment`, `server_name`, `dist`, and `sdk` info (lines 811-817), but `_serialized_v1_span_to_serialized_v2_span` reads these fields (lines 231-247) from the passed event to populate attributes like `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version`. As a result, GenAI v2 spans will be missing these critical metadata attributes.

Also found at:

  • tests/integrations/huggingface_hub/test_huggingface_hub.py:521

Sort key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:330`

The sorting lambda was changed from sorting by `(name, description)` to `(name, name)`, but the comment says 'sort by name and description for comparison'. This makes the sort key effectively single-field, which could lead to non-deterministic ordering when multiple tools have the same name but different descriptions, causing flaky test failures.

Also found at:

  • tests/integrations/litellm/test_litellm.py:945

Mock patches wrong client attribute in test_async_exception_handling - `tests/integrations/litellm/test_litellm.py:866-868`

The test mocks `client.embeddings._client._client.send` but calls `litellm.acompletion` which uses the completions endpoint. The synchronous `test_exception_handling` correctly mocks `client.completions._client._client.send`. This mismatch means the mock may not intercept the request, causing the test to not properly verify exception handling behavior.

Also found at:

  • tests/integrations/openai_agents/test_openai_agents.py:1966

Test assertions never execute because spans list is always empty - `tests/integrations/litellm/test_litellm.py:1020-1023`

The `test_multiple_providers` function calls `capture_items("transaction")` on line 945 which only captures transaction items, not span items. The newly added code on lines 1020-1023 tries to iterate over spans with `spans = [item.payload for item in items if item.type == "span"]`, but this will always be empty since span items are not being captured. The for loop never executes, making the `SPANDATA.GEN_AI_SYSTEM` assertion dead code that doesn't validate anything.

Also found at:

  • tests/integrations/litellm/test_litellm.py:1279

Test checks wrong field 'attributes' instead of 'data' for transaction context - `tests/integrations/openai_agents/test_openai_agents.py:3540-3542`

The test checks `transaction["contexts"]["trace"].get("attributes", {})` but should check `transaction["contexts"]["trace"].get("data", {})`. Earlier in the same file (lines 3341, 3479), the transaction context uses `data` to store attributes like `gen_ai.conversation.id`. This inconsistency means the test may pass regardless of whether `gen_ai.conversation.id` is actually absent from the transaction, since it's checking the wrong field.

Also found at:

  • tests/integrations/pydantic_ai/test_pydantic_ai.py:831-834

Missing event validation when handled_tool_call_exceptions is False - `tests/integrations/pydantic_ai/test_pydantic_ai.py:490-496`

In `test_agent_with_tool_validation_error`, when `handled_tool_call_exceptions=False`, the test no longer validates that a `model_behaviour_error` event is captured. The original code had an `else` branch that unpacked `(model_behaviour_error, transaction) = events` to verify the expected events were present. Without this validation, if the integration fails to emit the expected error event when the flag is False, the test won't catch it.

Also found at:

  • tests/integrations/langchain/test_langchain.py:1842-1846

Duration: 40m 58s · Tokens: 16.2M in / 192.4k out · Cost: $20.99 (+extraction: $0.02, +merge: $0.00, +fix_gate: $0.02)

Annotations

Check warning on line 1134 in sentry_sdk/client.py

@sentry-warden sentry-warden / warden: code-review

GenAI spans lose release, environment, and SDK metadata due to passing unprepared event

On line 1134, `event` (the original input) is passed to `_serialized_v1_span_to_serialized_v2_span` instead of `event_opt` (the prepared event). The `_prepare_event` method enriches the event with `release`, `environment`, `server_name`, `dist`, and `sdk` info (lines 811-817), but `_serialized_v1_span_to_serialized_v2_span` reads these fields (lines 231-247) from the passed event to populate attributes like `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version`. As a result, GenAI v2 spans will be missing these critical metadata attributes.
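
The effect described above can be illustrated with a minimal sketch. The two functions below are hypothetical, simplified stand-ins written for this example; they are not the SDK's `_prepare_event` or `_serialized_v1_span_to_serialized_v2_span`, and the option names and SDK name are assumptions.

```python
from typing import Any, Dict


def prepare_event(event: Dict[str, Any], options: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical enrichment step: returns a copy with defaults filled in."""
    prepared = dict(event)
    for key in ("release", "environment", "server_name", "dist"):
        if prepared.get(key) is None and options.get(key) is not None:
            prepared[key] = options[key]
    # Assumed placeholder SDK info, for illustration only.
    prepared.setdefault("sdk", {"name": "example.sdk", "version": "0.0.0"})
    return prepared


def span_v1_to_v2(span: Dict[str, Any], event: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical converter: copies event-level metadata onto span attributes."""
    attributes = dict(span.get("data", {}))
    if event.get("release") is not None:
        attributes["sentry.release"] = event["release"]
    if event.get("environment") is not None:
        attributes["sentry.environment"] = event["environment"]
    sdk = event.get("sdk") or {}
    if "name" in sdk:
        attributes["sentry.sdk.name"] = sdk["name"]
    if "version" in sdk:
        attributes["sentry.sdk.version"] = sdk["version"]
    return {"name": span.get("description"), "attributes": attributes}


options = {"release": "1.0.0", "environment": "production"}
raw_event = {"type": "transaction"}            # what the caller passed in
prepared_event = prepare_event(raw_event, options)
span = {"description": "chat", "data": {"gen_ai.system": "openai"}}

from_raw = span_v1_to_v2(span, raw_event)            # metadata missing
from_prepared = span_v1_to_v2(span, prepared_event)  # metadata present
```

In this sketch only `from_prepared` carries the `sentry.*` attributes, which is the behavior the warning expects from passing `event_opt`.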

Check warning on line 521 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: code-review

[YSC-XNK] GenAI spans lose release, environment, and SDK metadata due to passing unprepared event (additional location)

On line 1134, `event` (the original input) is passed to `_serialized_v1_span_to_serialized_v2_span` instead of `event_opt` (the prepared event). The `_prepare_event` method enriches the event with `release`, `environment`, `server_name`, `dist`, and `sdk` info (lines 811-817), but `_serialized_v1_span_to_serialized_v2_span` reads these fields (lines 231-247) from the passed event to populate attributes like `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version`. As a result, GenAI v2 spans will be missing these critical metadata attributes.

Check warning on line 330 in tests/integrations/google_genai/test_google_genai.py

@sentry-warden sentry-warden / warden: code-review

Sort key uses 'name' twice instead of 'name' and 'description'

The sorting lambda was changed from sorting by `(name, description)` to `(name, name)`, but the comment says 'sort by name and description for comparison'. This makes the sort key effectively single-field, which could lead to non-deterministic ordering when multiple tools have the same name but different descriptions, causing flaky test failures.
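
A small sketch of why the duplicated key matters, using made-up tool dictionaries (the data and helper names are assumptions, not taken from the test file). Python's `sorted` is stable, so with a name-only key the relative order of same-named tools simply follows the input order, and a comparison against a fixed expected list can fail whenever that input order changes.

```python
from typing import Dict, List

Tool = Dict[str, str]


def sort_by_name_only(tools: List[Tool]) -> List[Tool]:
    # Equivalent to the flagged key: the second field repeats the first.
    return sorted(tools, key=lambda t: (t["name"], t["name"]))


def sort_by_name_and_description(tools: List[Tool]) -> List[Tool]:
    # The key the comment describes: ties on name are broken by description.
    return sorted(tools, key=lambda t: (t["name"], t["description"]))


tools = [
    {"name": "search", "description": "web search"},
    {"name": "search", "description": "code search"},
]
reversed_tools = list(reversed(tools))

# Name-only key: output depends on the order the tools arrived in.
name_only_differs = sort_by_name_only(tools) != sort_by_name_only(reversed_tools)

# Two-field key: output is the same for either input order.
two_field_stable = (
    sort_by_name_and_description(tools)
    == sort_by_name_and_description(reversed_tools)
)
```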

Check warning on line 945 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

[3VH-9Y9] Sort key uses 'name' twice instead of 'name' and 'description' (additional location)

The sorting lambda was changed from sorting by `(name, description)` to `(name, name)`, but the comment says 'sort by name and description for comparison'. This makes the sort key effectively single-field, which could lead to non-deterministic ordering when multiple tools have the same name but different descriptions, causing flaky test failures.

Check warning on line 868 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

Mock patches wrong client attribute in test_async_exception_handling

The test mocks `client.embeddings._client._client.send` but calls `litellm.acompletion` which uses the completions endpoint. The synchronous `test_exception_handling` correctly mocks `client.completions._client._client.send`. This mismatch means the mock may not intercept the request, causing the test to not properly verify exception handling behavior.
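
The general failure mode can be sketched with `unittest.mock` and a fake client. Everything below is hypothetical: the classes are stand-ins, and the sketch assumes each resource owns a separate transport object. Whether the real client shares one transport between `completions` and `embeddings` is not stated in the review, which is presumably why it says the mock "may not" intercept the request.

```python
from unittest import mock


class Transport:
    def send(self, request: str) -> str:
        return "real response"


class Resource:
    def __init__(self) -> None:
        # Assumption for this sketch: one transport per resource.
        self.transport = Transport()


class FakeClient:
    def __init__(self) -> None:
        self.completions = Resource()
        self.embeddings = Resource()


def complete(client: FakeClient) -> str:
    """Code under test: always goes through the completions path."""
    return client.completions.transport.send("request")


client = FakeClient()

# Patching the embeddings path: the error is never injected.
with mock.patch.object(
    client.embeddings.transport, "send", side_effect=RuntimeError("boom")
) as wrong_mock:
    result_with_wrong_patch = complete(client)
    wrong_mock_called = wrong_mock.called

# Patching the completions path: the error reaches the code under test.
with mock.patch.object(
    client.completions.transport, "send", side_effect=RuntimeError("boom")
):
    try:
        complete(client)
        error_raised = False
    except RuntimeError:
        error_raised = True
```

Asserting that the mock was actually called is a cheap guard against this class of mistake.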

Check warning on line 1966 in tests/integrations/openai_agents/test_openai_agents.py

@sentry-warden sentry-warden / warden: code-review

[N8C-36U] Mock patches wrong client attribute in test_async_exception_handling (additional location)

The test mocks `client.embeddings._client._client.send` but calls `litellm.acompletion` which uses the completions endpoint. The synchronous `test_exception_handling` correctly mocks `client.completions._client._client.send`. This mismatch means the mock may not intercept the request, causing the test to not properly verify exception handling behavior.

Check warning on line 1023 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

Test assertions never execute because spans list is always empty

The `test_multiple_providers` function calls `capture_items("transaction")` on line 945 which only captures transaction items, not span items. The newly added code on lines 1020-1023 tries to iterate over spans with `spans = [item.payload for item in items if item.type == "span"]`, but this will always be empty since span items are not being captured. The for loop never executes, making the `SPANDATA.GEN_AI_SYSTEM` assertion dead code that doesn't validate anything.
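
A minimal sketch of the dead-assertion pattern and one possible guard. The `Item` class and `check_spans` helper are hypothetical, and the `"gen_ai.system"` key and payload shape are assumptions made for the example; only the list comprehension mirrors the line quoted above.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Item:
    type: str
    payload: Dict[str, Any]


def check_spans(items: List[Item], require_spans: bool) -> int:
    """Return how many spans were actually checked."""
    spans = [item.payload for item in items if item.type == "span"]
    if require_spans:
        # Guard: fail loudly instead of silently skipping the loop.
        assert spans, "no span items were captured"
    checked = 0
    for span in spans:
        assert span["attributes"]["gen_ai.system"] == "openai"
        checked += 1
    return checked


# Only transaction items were captured, as in the flagged test.
only_transactions = [Item(type="transaction", payload={})]

# Without the guard the loop body never runs and nothing is verified.
checked_without_guard = check_spans(only_transactions, require_spans=False)

# With the guard the missing spans surface as a failure.
try:
    check_spans(only_transactions, require_spans=True)
    guard_failed = False
except AssertionError:
    guard_failed = True
```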

Check warning on line 1279 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: code-review

[824-3SV] Test assertions never execute because spans list is always empty (additional location)

The `test_multiple_providers` function calls `capture_items("transaction")` on line 945 which only captures transaction items, not span items. The newly added code on lines 1020-1023 tries to iterate over spans with `spans = [item.payload for item in items if item.type == "span"]`, but this will always be empty since span items are not being captured. The for loop never executes, making the `SPANDATA.GEN_AI_SYSTEM` assertion dead code that doesn't validate anything.

Check warning on line 3542 in tests/integrations/openai_agents/test_openai_agents.py

@sentry-warden sentry-warden / warden: code-review

Test checks wrong field 'attributes' instead of 'data' for transaction context

The test checks `transaction["contexts"]["trace"].get("attributes", {})` but should check `transaction["contexts"]["trace"].get("data", {})`. Earlier in the same file (lines 3341, 3479), the transaction context uses `data` to store attributes like `gen_ai.conversation.id`. This inconsistency means the test may pass regardless of whether `gen_ai.conversation.id` is actually absent from the transaction, since it's checking the wrong field.
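
The vacuous check can be reproduced with a plain dictionary. The payload below is made up for illustration (including the `conv-123` value); it only assumes, as the annotation states, that the trace context stores these values under `data`.

```python
# Hypothetical transaction payload where the ID is present under "data".
transaction = {
    "contexts": {
        "trace": {
            "data": {"gen_ai.conversation.id": "conv-123"},
        }
    }
}

trace = transaction["contexts"]["trace"]

# "attributes" does not exist, so .get() returns the {} default and the
# absence check is true no matter what the transaction contains.
absent_per_attributes = "gen_ai.conversation.id" not in trace.get("attributes", {})

# Reading "data" reflects the real contents of the trace context.
absent_per_data = "gen_ai.conversation.id" not in trace.get("data", {})
```

Here `absent_per_attributes` is true even though the ID is present, while `absent_per_data` correctly reports that it is not absent.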

Check warning on line 834 in tests/integrations/pydantic_ai/test_pydantic_ai.py

@sentry-warden sentry-warden / warden: code-review

[VV8-US7] Test checks wrong field 'attributes' instead of 'data' for transaction context (additional location)

The test checks `transaction["contexts"]["trace"].get("attributes", {})` but should check `transaction["contexts"]["trace"].get("data", {})`. Earlier in the same file (lines 3341, 3479), the transaction context uses `data` to store attributes like `gen_ai.conversation.id`. This inconsistency means the test may pass regardless of whether `gen_ai.conversation.id` is actually absent from the transaction, since it's checking the wrong field.

Check warning on line 496 in tests/integrations/pydantic_ai/test_pydantic_ai.py

@sentry-warden sentry-warden / warden: code-review

Missing event validation when handled_tool_call_exceptions is False

In `test_agent_with_tool_validation_error`, when `handled_tool_call_exceptions=False`, the test no longer validates that a `model_behaviour_error` event is captured. The original code had an `else` branch that unpacked `(model_behaviour_error, transaction) = events` to verify the expected events were present. Without this validation, if the integration fails to emit the expected error event when the flag is False, the test won't catch it.
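
One way to restore the check is to keep the tuple unpacking, since it fails when the number of captured events is not what the test expects. The helper below is a hypothetical sketch; the event dictionaries and the `level` and `type` fields it inspects are assumptions for illustration, and only the unpacking line comes from the annotation.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]


def validate_unhandled_case(events: List[Event]) -> Event:
    """Checks for the handled_tool_call_exceptions=False case."""
    # Unpacking raises ValueError unless exactly two events were captured.
    (model_behaviour_error, transaction) = events
    assert model_behaviour_error.get("level") == "error"
    assert transaction.get("type") == "transaction"
    return model_behaviour_error


# Expected shape: one error event followed by the transaction.
good_events = [
    {"level": "error", "message": "model behaviour error"},
    {"type": "transaction"},
]
validated = validate_unhandled_case(good_events)

# If the integration stops emitting the error event, the unpacking fails.
try:
    validate_unhandled_case([{"type": "transaction"}])
    missing_event_detected = False
except ValueError:
    missing_event_detected = True
```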

Check warning on line 1846 in tests/integrations/langchain/test_langchain.py

@sentry-warden sentry-warden / warden: code-review

[8RN-KST] Missing event validation when handled_tool_call_exceptions is False (additional location)

In `test_agent_with_tool_validation_error`, when `handled_tool_call_exceptions=False`, the test no longer validates that a `model_behaviour_error` event is captured. The original code had an `else` branch that unpacked `(model_behaviour_error, transaction) = events` to verify the expected events were present. Without this validation, if the integration fails to emit the expected error event when the flag is False, the test won't catch it.