fix common tests

00733f9
Draft

feat: Send GenAI spans as V2 envelope items #6079

@sentry/warden / warden: find-bugs completed Apr 17, 2026 in 23m 50s

7 issues

find-bugs: Found 7 issues (2 high, 4 medium, 1 low)

High

test_multiple_providers captures only transactions but asserts on spans - `tests/integrations/litellm/test_litellm.py:945`

Line 945 calls `capture_items("transaction")` which only captures transactions. However, lines 1020-1023 (outside the hunk but testing `items` set up in the hunk) filter for `item.type == "span"` and assert on span attributes. Since spans are never captured, the `spans` list will be empty and the for-loop assertion will trivially pass, making the test ineffective at verifying span attributes.

Also found at:

  • tests/integrations/litellm/test_litellm.py:1020-1023
  • tests/integrations/litellm/test_litellm.py:866-868

Test expects V2 span envelope for non-gen_ai op span, will fail - `tests/tracing/test_misc.py:618-629`

The test `test_conversation_id_propagates_to_span_with_gen_ai_operation_name` was modified to use `capture_items("span")`, which captures V2 envelope span items. However, the span being created has `op="http.client"`, and `_split_gen_ai_spans()` in client.py only splits spans whose op starts with `gen_ai.`. This span will not be sent as a V2 envelope item; it will remain in the transaction event. The test will fail because the `spans` list will be empty or will not contain the expected span.

Also found at:

  • tests/tracing/test_misc.py:636-647

Medium

V2 GenAI spans may be missing release, environment, and SDK metadata - `sentry_sdk/client.py:1130`

Line 1130 passes `event` instead of `event_opt` to `_serialized_v1_span_to_serialized_v2_span`. The `_prepare_event` function (lines 811-817) populates `release`, `environment`, and `sdk` from options into the event, but those values only exist in the returned `event_opt`. The original `event` parameter may not contain these fields, causing V2 GenAI spans to be missing `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version` attributes.

Also found at:

  • tests/integrations/google_genai/test_google_genai.py:2153

Hardcoded SDK version '2.58.0' will cause test failure on version change - `tests/integrations/huggingface_hub/test_huggingface_hub.py:523`

In `test_text_generation`, the expected `sentry.sdk.version` attribute is hardcoded as `"2.58.0"` (line 523) instead of using `mock.ANY` like all other similar tests in this file. This test will fail when the SDK version changes, unlike `test_text_generation_streaming`, `test_chat_completion`, and other tests which correctly use `mock.ANY` for version comparison.

Also found at:

  • tests/integrations/openai_agents/test_openai_agents.py:1097

Test accesses orphaned _meta after gen_ai span is removed from transaction - `tests/integrations/openai/test_openai.py:3758-3760`

After gen_ai spans are split from the transaction and sent as V2 envelope items, the transaction's spans list no longer contains the gen_ai span. However, the test still accesses `event["_meta"]["spans"]["0"]["data"]` expecting truncation metadata. Since the span at index 0 has been moved to the V2 envelope, `_meta["spans"]["0"]` now references metadata for a span that no longer exists in the transaction's spans array. This test will likely fail or assert against orphaned/stale metadata.

Also found at:

  • tests/integrations/pydantic_ai/test_pydantic_ai.py:830-833

Test checks wrong field 'attributes' instead of 'data' for transaction trace context - `tests/integrations/openai_agents/test_openai_agents.py:3560-3562`

At lines 3560-3561, the test checks `transaction["contexts"]["trace"].get("attributes", {})` to verify that conversation_id is not set. However, all other tests in this file (lines 3359, 3497) and throughout the test suite access transaction trace data via `transaction["contexts"]["trace"]["data"]`. Because of this inconsistency, the test always passes: it checks a nonexistent `attributes` field, while the `data` field might still contain the conversation_id.

Low

Duplicated sort key uses 'name' twice instead of 'name' and 'description' - `tests/integrations/google_genai/test_google_genai.py:330`

The sorting lambda for tools on line 330 was changed from `(t.get("name", ""), t.get("description", ""))` to `(t.get("name", ""), t.get("name", ""))`. This duplicates `name` as both primary and secondary sort keys, making the secondary sort redundant. While this works for the current test data (since tool names are distinct), it loses the intended secondary sort by description and appears to be a copy-paste error.

Also found at:

  • tests/integrations/langchain/test_langchain.py:1840-1844
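The effect of the duplicated key only shows up when two tools share a name. The sketch below is a minimal illustration with made-up tool data (the names and descriptions are hypothetical, not taken from the test file); the two lambdas are the ones quoted in the finding.

```python
# Hypothetical tool list in which two entries share a name.
tools = [
    {"name": "search", "description": "b"},
    {"name": "search", "description": "a"},
    {"name": "calc", "description": "x"},
]

# Key from the changed line: the second element repeats the first, so ties
# on name fall back to the input order (sorted() is stable).
by_name_twice = sorted(tools, key=lambda t: (t.get("name", ""), t.get("name", "")))

# Key from before the change: ties on name are broken by description.
by_name_desc = sorted(
    tools, key=lambda t: (t.get("name", ""), t.get("description", ""))
)

print([t["description"] for t in by_name_twice])  # ['x', 'b', 'a']
print([t["description"] for t in by_name_desc])   # ['x', 'a', 'b']
```

With distinct names both keys give the same order, which is why the current test data does not expose the difference.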

Duration: 23m 34s · Tokens: 18.5M in / 212.2k out · Cost: $26.27 (+extraction: $0.03, +merge: $0.01, +fix_gate: $0.02)

Annotations

Check failure on line 945 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: find-bugs

test_multiple_providers captures only transactions but asserts on spans

Line 945 calls `capture_items("transaction")` which only captures transactions. However, lines 1020-1023 (outside the hunk but testing `items` set up in the hunk) filter for `item.type == "span"` and assert on span attributes. Since spans are never captured, the `spans` list will be empty and the for-loop assertion will trivially pass, making the test ineffective at verifying span attributes.
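The vacuous pass can be reproduced without the SDK. The sketch below is a minimal illustration, assuming captured items are simple objects with a `type` field; `vacuous_check`, `guarded_check`, and the `example.key` attribute are hypothetical names, not part of the test suite.

```python
from types import SimpleNamespace

# Hypothetical stand-in for what a transaction-only capture would return.
items = [SimpleNamespace(type="transaction", payload={})]


def vacuous_check(captured):
    """Mirror the pattern in the finding: filter for spans, assert in a loop."""
    spans = [item for item in captured if item.type == "span"]
    for span in spans:
        # Never reached when no span items were captured.
        assert span.payload["attributes"]["example.key"] == "expected"
    return True


def guarded_check(captured):
    """Same check, but fail loudly when the filtered list is empty."""
    spans = [item for item in captured if item.type == "span"]
    assert spans, "expected at least one captured span item"
    for span in spans:
        assert span.payload["attributes"]["example.key"] == "expected"
    return True
```

Calling `vacuous_check(items)` returns normally even though nothing was verified, while `guarded_check(items)` raises `AssertionError`. An explicit non-empty assertion is one way to keep such a test honest; capturing span items in the first place is the other.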

Check failure on line 1023 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: find-bugs

[KB4-XQE] test_multiple_providers captures only transactions but asserts on spans (additional location)

Line 945 calls `capture_items("transaction")` which only captures transactions. However, lines 1020-1023 (outside the hunk but testing `items` set up in the hunk) filter for `item.type == "span"` and assert on span attributes. Since spans are never captured, the `spans` list will be empty and the for-loop assertion will trivially pass, making the test ineffective at verifying span attributes.

Check failure on line 868 in tests/integrations/litellm/test_litellm.py

@sentry-warden sentry-warden / warden: find-bugs

[KB4-XQE] test_multiple_providers captures only transactions but asserts on spans (additional location)

Line 945 calls `capture_items("transaction")` which only captures transactions. However, lines 1020-1023 (outside the hunk but testing `items` set up in the hunk) filter for `item.type == "span"` and assert on span attributes. Since spans are never captured, the `spans` list will be empty and the for-loop assertion will trivially pass, making the test ineffective at verifying span attributes.

Check failure on line 629 in tests/tracing/test_misc.py

@sentry-warden sentry-warden / warden: find-bugs

Test expects V2 span envelope for non-gen_ai op span, will fail

The test `test_conversation_id_propagates_to_span_with_gen_ai_operation_name` was modified to use `capture_items("span")`, which captures V2 envelope span items. However, the span being created has `op="http.client"`, and `_split_gen_ai_spans()` in client.py only splits spans whose op starts with `gen_ai.`. This span will not be sent as a V2 envelope item; it will remain in the transaction event. The test will fail because the `spans` list will be empty or will not contain the expected span.
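A prefix-based split of the kind described here can be sketched as follows. This is not the SDK's `_split_gen_ai_spans()`; `split_by_op_prefix` is a hypothetical helper, and the `gen_ai.chat` op value is an illustrative assumption.

```python
def split_by_op_prefix(spans, prefix="gen_ai."):
    """Partition spans into those whose op starts with prefix and the rest."""
    matching, remaining = [], []
    for span in spans:
        op = span.get("op") or ""
        if op.startswith(prefix):
            matching.append(span)
        else:
            remaining.append(span)
    return matching, remaining


spans = [{"op": "http.client"}, {"op": "gen_ai.chat"}]
gen_ai_spans, transaction_spans = split_by_op_prefix(spans)

print(gen_ai_spans)       # [{'op': 'gen_ai.chat'}]
print(transaction_spans)  # [{'op': 'http.client'}]
```

Under this rule an `http.client` span always lands in the second list, so a test that looks for it among the split-off items finds nothing.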

Check failure on line 647 in tests/tracing/test_misc.py

@sentry-warden sentry-warden / warden: find-bugs

[76L-VVC] Test expects V2 span envelope for non-gen_ai op span, will fail (additional location)

The test `test_conversation_id_propagates_to_span_with_gen_ai_operation_name` was modified to use `capture_items("span")`, which captures V2 envelope span items. However, the span being created has `op="http.client"`, and `_split_gen_ai_spans()` in client.py only splits spans whose op starts with `gen_ai.`. This span will not be sent as a V2 envelope item; it will remain in the transaction event. The test will fail because the `spans` list will be empty or will not contain the expected span.

Check warning on line 1130 in sentry_sdk/client.py

@sentry-warden sentry-warden / warden: find-bugs

V2 GenAI spans may be missing release, environment, and SDK metadata

Line 1130 passes `event` instead of `event_opt` to `_serialized_v1_span_to_serialized_v2_span`. The `_prepare_event` function (lines 811-817) populates `release`, `environment`, and `sdk` from options into the event, but those values only exist in the returned `event_opt`. The original `event` parameter may not contain these fields, causing V2 GenAI spans to be missing `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version` attributes.
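The difference between reading from the original and the prepared event can be illustrated in isolation. The sketch below assumes, as the finding describes, that preparation returns a distinct object carrying the defaults; `prepare_event` and `span_attributes` are hypothetical stand-ins, not the SDK's `_prepare_event` or its span conversion function.

```python
def prepare_event(event, options):
    """Return a new dict with defaults filled in; the input is left untouched."""
    prepared = dict(event)
    for key in ("release", "environment"):
        if prepared.get(key) is None and options.get(key) is not None:
            prepared[key] = options[key]
    return prepared


def span_attributes(source_event):
    """Copy event-level fields onto span attributes, skipping missing ones."""
    attributes = {}
    for key in ("release", "environment"):
        if source_event.get(key) is not None:
            attributes["sentry." + key] = source_event[key]
    return attributes


options = {"release": "my-app@1.0.0", "environment": "production"}
event = {"type": "transaction"}
event_opt = prepare_event(event, options)

from_original = span_attributes(event)      # no release or environment
from_prepared = span_attributes(event_opt)  # both attributes present
```

Here `from_original` is empty while `from_prepared` carries `sentry.release` and `sentry.environment`, which is the gap the finding warns about.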

Check warning on line 2153 in tests/integrations/google_genai/test_google_genai.py

@sentry-warden sentry-warden / warden: find-bugs

[ZP5-7W4] V2 GenAI spans may be missing release, environment, and SDK metadata (additional location)

Line 1130 passes `event` instead of `event_opt` to `_serialized_v1_span_to_serialized_v2_span`. The `_prepare_event` function (lines 811-817) populates `release`, `environment`, and `sdk` from options into the event, but those values only exist in the returned `event_opt`. The original `event` parameter may not contain these fields, causing V2 GenAI spans to be missing `sentry.release`, `sentry.environment`, `sentry.sdk.name`, and `sentry.sdk.version` attributes.

Check warning on line 523 in tests/integrations/huggingface_hub/test_huggingface_hub.py

@sentry-warden sentry-warden / warden: find-bugs

Hardcoded SDK version '2.58.0' will cause test failure on version change

In `test_text_generation`, the expected `sentry.sdk.version` attribute is hardcoded as `"2.58.0"` (line 523) instead of using `mock.ANY` like all other similar tests in this file. This test will fail when the SDK version changes, unlike `test_text_generation_streaming`, `test_chat_completion`, and other tests which correctly use `mock.ANY` for version comparison.
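For reference, `unittest.mock.ANY` compares equal to any value, so it can stand in for the version inside an expected dict. The attribute values below are illustrative placeholders, not output from the SDK.

```python
from unittest import mock

# Illustrative attributes as they might look after a future version bump.
actual = {"sentry.sdk.name": "example-sdk", "sentry.sdk.version": "9.9.9"}

hardcoded = {"sentry.sdk.name": "example-sdk", "sentry.sdk.version": "2.58.0"}
flexible = {"sentry.sdk.name": "example-sdk", "sentry.sdk.version": mock.ANY}

matches_hardcoded = actual == hardcoded  # False once the version changes
matches_flexible = actual == flexible    # True for any version value
```

The trade-off is that `mock.ANY` also accepts a missing-looking value such as `None`, so it checks presence of the key rather than the shape of the version string.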

Check warning on line 1097 in tests/integrations/openai_agents/test_openai_agents.py

@sentry-warden sentry-warden / warden: find-bugs

[75Z-DFJ] Hardcoded SDK version '2.58.0' will cause test failure on version change (additional location)

In `test_text_generation`, the expected `sentry.sdk.version` attribute is hardcoded as `"2.58.0"` (line 523) instead of using `mock.ANY` like all other similar tests in this file. This test will fail when the SDK version changes, unlike `test_text_generation_streaming`, `test_chat_completion`, and other tests which correctly use `mock.ANY` for version comparison.

Check warning on line 3760 in tests/integrations/openai/test_openai.py

@sentry-warden sentry-warden / warden: find-bugs

Test accesses orphaned _meta after gen_ai span is removed from transaction

After gen_ai spans are split from the transaction and sent as V2 envelope items, the transaction's spans list no longer contains the gen_ai span. However, the test still accesses `event["_meta"]["spans"]["0"]["data"]` expecting truncation metadata. Since the span at index 0 has been moved to the V2 envelope, `_meta["spans"]["0"]` now references metadata for a span that no longer exists in the transaction's spans array. This test will likely fail or assert against orphaned/stale metadata.
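The index mismatch can be shown with a toy event. The shape below is a simplified, hypothetical stand-in (the real `_meta` payload is richer), and the sketch assumes, as the finding alleges, that `_meta` is left unchanged when spans are removed.

```python
# Simplified, hypothetical event: _meta is keyed by the span's list index.
event = {
    "spans": [
        {"op": "gen_ai.chat", "data": {"prompt": "..."}},
        {"op": "http.client", "data": {}},
    ],
    "_meta": {"spans": {"0": {"data": {"note": "truncated"}}}},
}

# Remove the gen_ai spans, as a split into separate envelope items would.
event["spans"] = [s for s in event["spans"] if not s["op"].startswith("gen_ai.")]

# The metadata key "0" survives, but index 0 now holds a different span.
span_at_zero = event["spans"][0]["op"]
meta_still_present = "0" in event["_meta"]["spans"]

print(span_at_zero)        # http.client
print(meta_still_present)  # True
```

A test that reads `_meta["spans"]["0"]` after the split would therefore be looking at metadata recorded for a span that is no longer at that position.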

Check warning on line 833 in tests/integrations/pydantic_ai/test_pydantic_ai.py

@sentry-warden sentry-warden / warden: find-bugs

[AZN-DCX] Test accesses orphaned _meta after gen_ai span is removed from transaction (additional location)

After gen_ai spans are split from the transaction and sent as V2 envelope items, the transaction's spans list no longer contains the gen_ai span. However, the test still accesses `event["_meta"]["spans"]["0"]["data"]` expecting truncation metadata. Since the span at index 0 has been moved to the V2 envelope, `_meta["spans"]["0"]` now references metadata for a span that no longer exists in the transaction's spans array. This test will likely fail or assert against orphaned/stale metadata.

Check warning on line 3562 in tests/integrations/openai_agents/test_openai_agents.py

@sentry-warden sentry-warden / warden: find-bugs

Test checks wrong field 'attributes' instead of 'data' for transaction trace context

At lines 3560-3561, the test checks `transaction["contexts"]["trace"].get("attributes", {})` to verify that conversation_id is not set. However, all other tests in this file (lines 3359, 3497) and throughout the test suite access transaction trace data via `transaction["contexts"]["trace"]["data"]`. Because of this inconsistency, the test always passes: it checks a nonexistent `attributes` field, while the `data` field might still contain the conversation_id.
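A small reproduction shows why a negative check against a missing field cannot fail. The key name and value below are hypothetical placeholders for whatever the real test looks up.

```python
KEY = "conversation_id"  # hypothetical placeholder for the real attribute key

# Transaction whose trace context still carries the value under "data".
transaction = {"contexts": {"trace": {"data": {KEY: "conv-123"}}}}
trace = transaction["contexts"]["trace"]

# Checking a field that does not exist: the default {} never contains KEY.
absent_in_attributes = KEY not in trace.get("attributes", {})

# Checking the field the rest of the suite uses exposes the leftover value.
absent_in_data = KEY not in trace.get("data", {})

print(absent_in_attributes)  # True
print(absent_in_data)        # False
```

Asserting `absent_in_attributes` passes regardless of what the transaction contains, while asserting `absent_in_data` would catch the leftover value.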