fix(fastapi): Stop eagerly consuming request bodies for streamed spans #6286
5 issues
High
Passwords and PII written to streamed span attributes without scrubbing - `tests/integrations/fastapi/test_fastapi.py:232-233`
When span streaming is enabled, cached request body form/JSON values are serialized verbatim and attached to the segment span as SPANDATA.HTTP_REQUEST_BODY_DATA. Unlike the normal event path—where the event processor / data scrubber replaces sensitive fields such as password with [Filtered]—the span attribute path applies no PII filtering. As a result, plaintext passwords and other sensitive form fields are sent to Sentry as span data. The code comment acknowledges sanitization is deferred to a future before_send_span hook that does not yet exist.
Cached request body (form/JSON) written to streamed span attributes without PII scrubbing - `tests/integrations/fastapi/test_fastapi.py:233`
_get_cached_request_body_attribute in starlette.py serializes the cached request body (request._json or request._form) directly with json.dumps and the result is stored on the streamed span via current_span._segment.set_attribute(SPANDATA.HTTP_REQUEST_BODY_DATA, ...) in both starlette.py (_wrap_async_handler) and fastapi.py (_wrap_async_handler). Unlike the event path, which runs through the Sentry scrubbing pipeline (so a password field becomes [Filtered]), the span-attribute path applies no denylist/redaction. Sensitive fields such as passwords therefore land in plaintext in span data. This is a known limitation acknowledged in the PR/test comments (scrubbing deferred to future before_send_span hooks), but it ships in this state.
Medium
Passwords and sensitive form fields sent unfiltered to Sentry in streamed spans - `tests/integrations/fastapi/test_fastapi.py:209-214`
When span streaming is enabled, _get_cached_request_body_attribute serializes request._form/request._json (including password fields) directly to JSON and attaches it as the HTTP_REQUEST_BODY_DATA span attribute without any PII scrubbing, bypassing the denylist filtering that redacts sensitive fields in the event/non-streaming path. Sanitization should be applied (e.g. in a span hook) before attaching the body.
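One way to picture the suggested sanitization is a small recursive scrubber applied to the cached body before it is serialized. This is only a sketch under stated assumptions: `SENSITIVE_KEYS`, `scrub_body`, and `serialize_scrubbed_body` are hypothetical names invented for illustration, and the denylist shown is not the SDK's actual default list or scrubbing pipeline.

```python
import json
from typing import Any

# Hypothetical denylist for illustration only; not the SDK's real default list.
SENSITIVE_KEYS = frozenset(
    {"password", "passwd", "secret", "token", "api_key", "authorization"}
)
FILTERED = "[Filtered]"


def scrub_body(value: Any) -> Any:
    """Return a copy of a parsed body with denylisted keys replaced."""
    if isinstance(value, dict):
        return {
            key: FILTERED if str(key).lower() in SENSITIVE_KEYS else scrub_body(item)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [scrub_body(item) for item in value]
    return value


def serialize_scrubbed_body(body: Any) -> str:
    """Scrub first, then serialize, so plaintext secrets never reach the span."""
    return json.dumps(scrub_body(body), default=str)


if __name__ == "__main__":
    form = {"username": "alice", "password": "hunter2"}
    print(serialize_scrubbed_body(form))
```

The scrubbed string, rather than the raw `json.dumps` output, would then be the value passed to `set_attribute(SPANDATA.HTTP_REQUEST_BODY_DATA, ...)`. Where exactly this belongs (in the helper itself or in a later span hook) is a design decision for the PR author.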
Low
Switch to cached _json/_form does not stop eager body consumption (extract_request_info still consumes) - `sentry_sdk/integrations/fastapi.py:180`
In _wrap_async_handler (sentry_sdk/integrations/starlette.py), extract_request_info() is invoked unconditionally before the handler runs. That method calls self.json() (which calls request.json() and caches _json) and self.form() (which calls request.body() and request.form() and caches _form), so the request body is still eagerly consumed on every JSON/form request. As a result, the PR's change to read cached _json/_form in _get_cached_request_body_attribute does not actually avoid eager consumption for streamed spans: by the time the finally block runs, the SDK itself has already populated _json/_form. The 'omit the attribute if the request body is not cached, since the endpoint may not have accessed it' rationale from the PR description is therefore defeated for JSON/form requests, because the body is always cached by the SDK and thus always attached to the streamed span. This is a pre-existing, acknowledged consumption path rather than a new defect, but it means the stated goal of the change is not realized.
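The ordering problem can be reproduced in isolation with a stand-in request object. The sketch below is not the SDK's code: `FakeRequest`, `cached_body_attribute`, `sdk_extracts_first`, and `endpoint_never_reads` are hypothetical names, and `FakeRequest` only imitates the "parse once, cache on `_json`" behavior described above.

```python
import asyncio
import json


class FakeRequest:
    """Hypothetical stand-in for a request that caches its parsed JSON body."""

    def __init__(self, raw: bytes) -> None:
        self._raw = raw

    async def json(self):
        # Parse once and cache on the instance, like the behavior described above.
        if not hasattr(self, "_json"):
            self._json = json.loads(self._raw)
        return self._json


def cached_body_attribute(request):
    """Return the serialized body only if something has already cached it."""
    cached = getattr(request, "_json", None)
    return None if cached is None else json.dumps(cached)


async def endpoint_never_reads(request):
    # Nothing touches the body, so the attribute is omitted.
    return cached_body_attribute(request)


async def sdk_extracts_first(request):
    # Stands in for an unconditional extract_request_info() call before the handler.
    await request.json()
    return cached_body_attribute(request)


untouched = asyncio.run(endpoint_never_reads(FakeRequest(b'{"a": 1}')))
eager = asyncio.run(sdk_extracts_first(FakeRequest(b'{"a": 1}')))
print(untouched, eager)
```

In the first call the attribute is omitted because nothing read the body. In the second, the eager read has already populated the cache, so the "only if cached" check always succeeds, which is the situation this finding describes.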
`_serialize_request_body_data` is now dead code after its only caller was removed - `sentry_sdk/integrations/starlette.py:255`
The _set_request_body_data_on_streaming_segment function deleted in this diff was the only caller of _serialize_request_body_data; the helper now goes unused and should be removed too.
4 skills analyzed
| Skill | Findings | Duration | Cost |
|---|---|---|---|
| security-review | 1 | 1m 53s | $0.62 |
| code-review | 2 | 15m 16s | $2.65 |
| find-bugs | 2 | 21m 34s | $3.89 |
| skill-scanner | 0 | 6.0s | $0.09 |
⏱ 38m 49s · 3.4M in / 247.4k out · $7.25