feat(pydantic-ai): Support ImageUrl content type in span instrumentation #5629
2 issues
find-bugs: Found 2 issues (1 medium, 1 low)
Medium
Regex fails to match valid MIME types, leaking base64 data instead of redacting it - `sentry_sdk/integrations/pydantic_ai/consts.py:7-9`
The DATA_URL_BASE64_REGEX pattern `([a-zA-Z]+/[a-zA-Z]+)` only matches MIME types containing letters, but RFC 2046 allows digits, hyphens, periods, and plus signs. Valid data URLs like `data:image/svg+xml;base64,...` or `data:video/3gpp;base64,...` will not match, causing `_serialize_image_url_item` to fall through and return the full data URL including base64-encoded content that should be redacted.
Also found at:
sentry_sdk/integrations/pydantic_ai/spans/utils.py:29-42
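A minimal sketch of one possible fix, not the SDK's actual code. `NARROW_MIME` reproduces only the capture group quoted in this report. `BROAD_DATA_URL`, `redact_data_url`, and the `[redacted]` placeholder are hypothetical names chosen for illustration. The broader character class assumes the "restricted-name" characters from RFC 6838, section 4.2.

```python
import re

# The capture group quoted in the report: letters only on both sides of "/".
NARROW_MIME = re.compile(r"([a-zA-Z]+/[a-zA-Z]+)")

# Hypothetical broader pattern. Type and subtype start with an alphanumeric
# character and may continue with the RFC 6838 restricted-name characters.
_MIME_NAME = r"[a-zA-Z0-9][a-zA-Z0-9!#$&^_.+-]*"
BROAD_DATA_URL = re.compile(
    rf"^data:({_MIME_NAME}/{_MIME_NAME})(?:;[^;,]+)*;base64,",
    re.IGNORECASE,
)


def redact_data_url(url: str) -> str:
    """Return the URL unchanged unless it is a base64 data URL.

    For base64 data URLs, keep the MIME type and replace the payload
    (and any extra parameters) with a placeholder. Hypothetical helper.
    """
    match = BROAD_DATA_URL.match(url)
    if match is None:
        return url
    return f"data:{match.group(1)};base64,[redacted]"


# The narrow group rejects the MIME types named in the report.
print(NARROW_MIME.fullmatch("image/svg+xml"))
print(NARROW_MIME.fullmatch("video/3gpp"))

# The broader pattern matches them, so the payload is dropped.
print(redact_data_url("data:image/svg+xml;base64,PHN2Zz4="))
print(redact_data_url("data:video/3gpp;charset=binary;base64,AAAA"))
```

A stricter alternative would be to redact anything that starts with `data:` and contains `;base64,`, whether or not the MIME type parses. A malformed type then cannot leak the payload.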
Low
Test may silently pass without verifying redaction behavior - `tests/integrations/pydantic_ai/test_pydantic_ai.py:2867-2869`
The test `test_image_url_redacts_base64_data_url_via_agent_run` wraps its assertion in the conditional `if "gen_ai.request.messages" in chat_span["data"]`, so it passes even when messages are absent from the span data. Other similar tests assert unconditionally, for example with `assert found_image, "..."`. Here, if messages are not captured (e.g., due to configuration issues or code bugs), the test passes silently without verifying the redaction behavior.
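One way to close this gap is to assert that the key is present before checking its contents. The helper below is a hypothetical sketch, not code from the test suite. It assumes `chat_span` is a dict with a `"data"` mapping, and that the messages value is either a JSON string or a JSON-serializable structure.

```python
import json


def assert_base64_redacted(chat_span: dict, payload: str) -> None:
    """Fail if messages are missing or if the base64 payload leaked.

    Hypothetical helper: `payload` is the base64 string the test sent
    in its data URL.
    """
    data = chat_span["data"]

    # Unconditional assertion: a span without messages is a test failure,
    # not a reason to skip the redaction check.
    assert "gen_ai.request.messages" in data, "messages missing from span data"

    raw = data["gen_ai.request.messages"]
    serialized = raw if isinstance(raw, str) else json.dumps(raw)

    # The payload must not appear anywhere in the captured messages.
    assert payload not in serialized, "base64 payload leaked into span data"
```

With this shape, a configuration problem that drops messages from the span shows up as a failure with a clear message instead of a silent pass.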
Duration: 6m 3s · Tokens: 2.1M in / 22.9k out · Cost: $2.64 (+extraction: $0.01, +merge: $0.00, +fix_gate: $0.00)
Annotations
Check warning on line 9 in sentry_sdk/integrations/pydantic_ai/consts.py
github-actions / warden: find-bugs
Regex fails to match valid MIME types, leaking base64 data instead of redacting it
The DATA_URL_BASE64_REGEX pattern `([a-zA-Z]+/[a-zA-Z]+)` only matches MIME types containing letters, but RFC 2046 allows digits, hyphens, periods, and plus signs. Valid data URLs like `data:image/svg+xml;base64,...` or `data:video/3gpp;base64,...` will not match, causing `_serialize_image_url_item` to fall through and return the full data URL including base64-encoded content that should be redacted.
Check warning on line 42 in sentry_sdk/integrations/pydantic_ai/spans/utils.py
github-actions / warden: find-bugs
[53N-D7M] Regex fails to match valid MIME types, leaking base64 data instead of redacting it (additional location)