feat(streaming): emit ReasoningDeltaEvent for reasoning/thinking deltas (#825) by adityasingh2400 · Pull Request #3 · adityasingh2400/openai-agents-python

adityasingh2400 · 2026-04-14T22:42:05Z

Summary

When models like o3 or DeepSeek-R1 produce reasoning/thinking tokens during streaming, those deltas currently surface only as RawResponsesStreamEvent wrappers around the low-level response.reasoning_summary_text.delta or response.reasoning_text.delta events. To consume them, callers have to inspect .data.type and cast the event themselves; the StreamEvent union has no dedicated signal for reasoning output.

This PR adds ReasoningDeltaEvent to StreamEvent and emits it alongside the existing raw event so reasoning deltas are as easy to consume as message deltas.
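To make the contrast concrete, here is a minimal sketch of the two consumption styles. The classes are simplified stand-ins written for this example (the real ones live in the SDK and carry more fields); only the event type strings and the delta/snapshot/type field names come from this PR.

```python
from dataclasses import dataclass
from typing import Literal


# Hypothetical stand-ins for illustration only; not the SDK's real classes.
@dataclass
class RawData:
    type: str
    delta: str = ""


@dataclass
class RawResponsesStreamEvent:
    data: RawData
    type: Literal["raw_response_event"] = "raw_response_event"


@dataclass
class ReasoningDeltaEvent:
    delta: str
    snapshot: str
    type: Literal["reasoning_delta"] = "reasoning_delta"


REASONING_DELTA_TYPES = {
    "response.reasoning_summary_text.delta",
    "response.reasoning_text.delta",
}


def reasoning_text_from_raw(events) -> str:
    """Before: unwrap each raw event and filter on its low-level type string."""
    return "".join(
        e.data.delta
        for e in events
        if isinstance(e, RawResponsesStreamEvent)
        and e.data.type in REASONING_DELTA_TYPES
    )


def reasoning_text_from_deltas(events) -> str:
    """After: one isinstance check against the dedicated event."""
    return "".join(e.delta for e in events if isinstance(e, ReasoningDeltaEvent))


events = [
    RawResponsesStreamEvent(RawData("response.reasoning_summary_text.delta", "Let me ")),
    ReasoningDeltaEvent(delta="Let me ", snapshot="Let me "),
    RawResponsesStreamEvent(RawData("response.reasoning_summary_text.delta", "think.")),
    ReasoningDeltaEvent(delta="think.", snapshot="Let me think."),
    RawResponsesStreamEvent(RawData("response.output_text.delta", "Answer")),
]
```

Both helpers recover the same reasoning text; the second one does not need to know any wire-level type strings.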

Closes openai#825

What changed

- Added ReasoningDeltaEvent dataclass to stream_events.py with delta, snapshot, and type fields
- Updated the StreamEvent type alias to include ReasoningDeltaEvent
- Exported ReasoningDeltaEvent from agents/__init__.py
- In run_internal/run_loop.py, the run_single_turn_streamed loop now emits a ReasoningDeltaEvent after each ResponseReasoningSummaryTextDeltaEvent (o-series) and ResponseReasoningTextDeltaEvent (DeepSeek/LiteLLM)
- The snapshot field accumulates the full reasoning text so far in the turn, so callers don't have to maintain their own buffer
- Raw events are still emitted unchanged, so the change is fully backwards compatible
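The relationship between delta and snapshot can be sketched as follows. This is an illustrative stand-in, not the code in stream_events.py: the dataclass mirrors the three fields named above, and to_reasoning_events is a hypothetical helper invented for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Literal


@dataclass
class ReasoningDeltaEvent:
    """Stand-in mirroring the fields listed in this PR."""

    delta: str  # incremental fragment from the current chunk
    snapshot: str  # all reasoning text accumulated so far in the turn
    type: Literal["reasoning_delta"] = "reasoning_delta"


def to_reasoning_events(deltas: Iterable[str]) -> Iterator[ReasoningDeltaEvent]:
    """Hypothetical helper: pair each delta with the running snapshot."""
    snapshot = ""
    for delta in deltas:
        snapshot += delta
        yield ReasoningDeltaEvent(delta=delta, snapshot=snapshot)


events = list(to_reasoning_events(["Step 1. ", "Step 2. ", "Done."]))
```

Because each event carries the running snapshot, a consumer that only wants the latest full text can read `events[-1].snapshot` instead of joining deltas itself.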

Usage example

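import asyncio

from agents import Agent, Runner
from agents.stream_events import ReasoningDeltaEvent


async def main() -> None:
    agent = Agent(name="thinker", model="o3-mini")
    # A later commit in this PR makes emission opt-in via RunConfig.emit_reasoning_deltas=True
    result = Runner.run_streamed(agent, "prove P != NP")

    # stream_events() is an async iterator, so it has to be consumed inside a coroutine
    async for event in result.stream_events():
        if isinstance(event, ReasoningDeltaEvent):
            print(event.delta, end="", flush=True)

    print()  # reasoning complete


asyncio.run(main())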

Tests

Added tests/test_reasoning_delta_stream_event.py covering:

- ReasoningDeltaEvent is emitted for reasoning items
- Snapshot grows monotonically and ends with full text
- No event emitted for plain text responses
- Raw events still emitted alongside
- Importable directly from agents
- Correct dataclass fields

Also updated tests/test_stream_events.py::test_complete_streaming_events to account for the new event in the event sequence (count goes from 27 → 28).
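The monotonic-snapshot property from the list above could be checked roughly like this. The fake stream and the FakeReasoningDelta class are assumptions made for the sketch; the actual test file drives the real runner with a fake model instead.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FakeReasoningDelta:
    """Hypothetical stand-in for ReasoningDeltaEvent."""

    delta: str
    snapshot: str


async def fake_stream():
    # Simulates a stream that emits three reasoning deltas.
    snapshot = ""
    for delta in ["Hello", " ", "world"]:
        snapshot += delta
        yield FakeReasoningDelta(delta, snapshot)


async def collect_snapshots():
    return [event.snapshot async for event in fake_stream()]


def check_monotonic(snapshots, full_text):
    # Each snapshot must extend the previous one, and the last must be complete.
    for prev, cur in zip(snapshots, snapshots[1:]):
        assert cur.startswith(prev) and len(cur) > len(prev)
    assert snapshots[-1] == full_text


snapshots = asyncio.run(collect_snapshots())
check_monotonic(snapshots, "Hello world")
```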

Summary by CodeRabbit

New Features
- Streamed reasoning deltas are now emitted as discrete events carrying incremental delta text and an accumulated snapshot.
Tests
- Added and updated tests to verify reasoning-delta emission, snapshot accumulation across deltas, event ordering adjustments, and continued emission of existing stream event types.

coderabbitai · 2026-04-14T22:42:23Z

Important

Review skipped

Too many files!

This PR contains 299 files, which is 149 over the limit of 150.

To get a review, narrow the scope:
• coderabbit review --type committed # exclude uncommitted changes
• coderabbit review --dir # limit to a subdirectory
• coderabbit review --base # compare against a closer base

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: d2ccfe57-d257-43fc-8d93-0773ee475040

📥 Commits

Reviewing files that changed from the base of the PR and between a2d4974 and 8102932.

⛔ Files ignored due to path filters (1)

docs/assets/images/harness_with_compute.png is excluded by !**/*.png

📒 Files selected for processing (299)

.agents/skills/code-change-verification/SKILL.md
.agents/skills/code-change-verification/agents/openai.yaml
.agents/skills/code-change-verification/scripts/run.ps1
.agents/skills/code-change-verification/scripts/run.sh
.agents/skills/docs-sync/agents/openai.yaml
.agents/skills/examples-auto-run/SKILL.md
.agents/skills/examples-auto-run/agents/openai.yaml
.agents/skills/examples-auto-run/scripts/run.sh
.agents/skills/final-release-review/agents/openai.yaml
.agents/skills/implementation-strategy/SKILL.md
.agents/skills/implementation-strategy/agents/openai.yaml
.agents/skills/openai-knowledge/agents/openai.yaml
.agents/skills/pr-draft-summary/SKILL.md
.agents/skills/pr-draft-summary/agents/openai.yaml
.agents/skills/runtime-behavior-probe/SKILL.md
.agents/skills/runtime-behavior-probe/agents/openai.yaml
.agents/skills/runtime-behavior-probe/references/error-cases.md
.agents/skills/runtime-behavior-probe/references/openai-runtime-patterns.md
.agents/skills/runtime-behavior-probe/references/reporting-format.md
.agents/skills/runtime-behavior-probe/references/validation-matrix.md
.agents/skills/runtime-behavior-probe/templates/python_probe.py
.agents/skills/test-coverage-improver/agents/openai.yaml
.codex/config.toml
.codex/hooks.json
.codex/hooks/stop_repo_tidy.py
.github/ISSUE_TEMPLATE/bug_report.md
.github/ISSUE_TEMPLATE/model_provider.md
.github/codex/prompts/pr-labels.md
.github/codex/prompts/release-review.md
.github/codex/schemas/pr-labels.json
.github/scripts/pr_labels.py
.github/workflows/docs.yml
.github/workflows/pr-labels.yml
.github/workflows/publish.yml
.github/workflows/release-pr-update.yml
.github/workflows/release-pr.yml
.github/workflows/release-tag.yml
.github/workflows/tests.yml
.github/workflows/update-docs.yml
.gitignore
AGENTS.md
CLAUDE.md
README.md
SECURITY.md
docs/agents.md
docs/config.md
docs/context.md
docs/examples.md
docs/index.md
docs/ja/agents.md
docs/ja/config.md
docs/ja/context.md
docs/ja/examples.md
docs/ja/guardrails.md
docs/ja/handoffs.md
docs/ja/human_in_the_loop.md
docs/ja/index.md
docs/ja/mcp.md
docs/ja/models/index.md
docs/ja/models/litellm.md
docs/ja/multi_agent.md
docs/ja/quickstart.md
docs/ja/realtime/guide.md
docs/ja/realtime/quickstart.md
docs/ja/realtime/transport.md
docs/ja/release.md
docs/ja/repl.md
docs/ja/results.md
docs/ja/running_agents.md
docs/ja/sandbox/clients.md
docs/ja/sandbox/guide.md
docs/ja/sandbox/memory.md
docs/ja/sandbox_agents.md
docs/ja/sessions/advanced_sqlite_session.md
docs/ja/sessions/encrypted_session.md
docs/ja/sessions/index.md
docs/ja/sessions/sqlalchemy_session.md
docs/ja/streaming.md
docs/ja/tools.md
docs/ja/tracing.md
docs/ja/usage.md
docs/ja/visualization.md
docs/ja/voice/pipeline.md
docs/ja/voice/quickstart.md
docs/ja/voice/tracing.md
docs/ko/agents.md
docs/ko/config.md
docs/ko/context.md
docs/ko/examples.md
docs/ko/guardrails.md
docs/ko/handoffs.md
docs/ko/human_in_the_loop.md
docs/ko/index.md
docs/ko/mcp.md
docs/ko/models/index.md
docs/ko/models/litellm.md
docs/ko/multi_agent.md
docs/ko/quickstart.md
docs/ko/realtime/guide.md
docs/ko/realtime/quickstart.md
docs/ko/realtime/transport.md
docs/ko/release.md
docs/ko/repl.md
docs/ko/results.md
docs/ko/running_agents.md
docs/ko/sandbox/clients.md
docs/ko/sandbox/guide.md
docs/ko/sandbox/memory.md
docs/ko/sandbox_agents.md
docs/ko/sessions/advanced_sqlite_session.md
docs/ko/sessions/encrypted_session.md
docs/ko/sessions/index.md
docs/ko/sessions/sqlalchemy_session.md
docs/ko/streaming.md
docs/ko/tools.md
docs/ko/tracing.md
docs/ko/usage.md
docs/ko/visualization.md
docs/ko/voice/pipeline.md
docs/ko/voice/quickstart.md
docs/ko/voice/tracing.md
docs/llms-full.txt
docs/llms.txt
docs/mcp.md
docs/models/index.md
docs/models/litellm.md
docs/quickstart.md
docs/realtime/guide.md
docs/realtime/quickstart.md
docs/ref/extensions/litellm.md
docs/ref/extensions/memory/mongodb_session.md
docs/ref/extensions/models/any_llm_model.md
docs/ref/extensions/models/any_llm_provider.md
docs/ref/extensions/sandbox/blaxel/mounts.md
docs/ref/extensions/sandbox/blaxel/sandbox.md
docs/ref/extensions/sandbox/cloudflare/mounts.md
docs/ref/extensions/sandbox/cloudflare/sandbox.md
docs/ref/extensions/sandbox/daytona/mounts.md
docs/ref/extensions/sandbox/daytona/sandbox.md
docs/ref/extensions/sandbox/e2b/mounts.md
docs/ref/extensions/sandbox/e2b/sandbox.md
docs/ref/extensions/sandbox/modal/mounts.md
docs/ref/extensions/sandbox/modal/sandbox.md
docs/ref/extensions/sandbox/runloop/mounts.md
docs/ref/extensions/sandbox/runloop/sandbox.md
docs/ref/extensions/sandbox/vercel/sandbox.md
docs/ref/models/openai_agent_registration.md
docs/ref/models/openai_client_utils.md
docs/ref/models/reasoning_content_replay.md
docs/ref/run_internal/agent_bindings.md
docs/ref/run_internal/prompt_cache_key.md
docs/ref/run_internal/run_grouping.md
docs/ref/sandbox.md
docs/ref/sandbox/apply_patch.md
docs/ref/sandbox/capabilities/capabilities.md
docs/ref/sandbox/capabilities/capability.md
docs/ref/sandbox/capabilities/compaction.md
docs/ref/sandbox/capabilities/filesystem.md
docs/ref/sandbox/capabilities/memory.md
docs/ref/sandbox/capabilities/shell.md
docs/ref/sandbox/capabilities/skills.md
docs/ref/sandbox/capabilities/tools/apply_patch_tool.md
docs/ref/sandbox/capabilities/tools/shell_tool.md
docs/ref/sandbox/capabilities/tools/view_image.md
docs/ref/sandbox/config.md
docs/ref/sandbox/entries.md
docs/ref/sandbox/entries/artifacts.md
docs/ref/sandbox/entries/base.md
docs/ref/sandbox/entries/mounts/base.md
docs/ref/sandbox/entries/mounts/patterns.md
docs/ref/sandbox/entries/mounts/providers/azure_blob.md
docs/ref/sandbox/entries/mounts/providers/base.md
docs/ref/sandbox/entries/mounts/providers/box.md
docs/ref/sandbox/entries/mounts/providers/gcs.md
docs/ref/sandbox/entries/mounts/providers/r2.md
docs/ref/sandbox/entries/mounts/providers/s3.md
docs/ref/sandbox/entries/mounts/providers/s3_files.md
docs/ref/sandbox/errors.md
docs/ref/sandbox/files.md
docs/ref/sandbox/manifest.md
docs/ref/sandbox/manifest_render.md
docs/ref/sandbox/materialization.md
docs/ref/sandbox/memory/interface.md
docs/ref/sandbox/memory/manager.md
docs/ref/sandbox/memory/phase_one.md
docs/ref/sandbox/memory/phase_two.md
docs/ref/sandbox/memory/prompts.md
docs/ref/sandbox/memory/rollouts.md
docs/ref/sandbox/memory/storage.md
docs/ref/sandbox/permissions.md
docs/ref/sandbox/remote_mount_policy.md
docs/ref/sandbox/runtime.md
docs/ref/sandbox/runtime_agent_preparation.md
docs/ref/sandbox/runtime_session_manager.md
docs/ref/sandbox/sandbox_agent.md
docs/ref/sandbox/sandboxes/docker.md
docs/ref/sandbox/sandboxes/unix_local.md
docs/ref/sandbox/session/archive_extraction.md
docs/ref/sandbox/session/archive_ops.md
docs/ref/sandbox/session/base_sandbox_session.md
docs/ref/sandbox/session/dependencies.md
docs/ref/sandbox/session/events.md
docs/ref/sandbox/session/manager.md
docs/ref/sandbox/session/manifest_application.md
docs/ref/sandbox/session/manifest_ops.md
docs/ref/sandbox/session/mount_lifecycle.md
docs/ref/sandbox/session/pty_types.md
docs/ref/sandbox/session/runtime_helpers.md
docs/ref/sandbox/session/sandbox_client.md
docs/ref/sandbox/session/sandbox_session.md
docs/ref/sandbox/session/sandbox_session_state.md
docs/ref/sandbox/session/sinks.md
docs/ref/sandbox/session/snapshot_lifecycle.md
docs/ref/sandbox/session/tar_workspace.md
docs/ref/sandbox/session/utils.md
docs/ref/sandbox/session/workspace_payloads.md
docs/ref/sandbox/snapshot.md
docs/ref/sandbox/snapshot_defaults.md
docs/ref/sandbox/types.md
docs/ref/sandbox/util/checksums.md
docs/ref/sandbox/util/deep_merge.md
docs/ref/sandbox/util/github.md
docs/ref/sandbox/util/iterator_io.md
docs/ref/sandbox/util/parse_utils.md
docs/ref/sandbox/util/retry.md
docs/ref/sandbox/util/tar_utils.md
docs/ref/sandbox/util/token_truncation.md
docs/ref/sandbox/workspace_paths.md
docs/release.md
docs/results.md
docs/running_agents.md
docs/sandbox/clients.md
docs/sandbox/guide.md
docs/sandbox/memory.md
docs/sandbox_agents.md
docs/scripts/translate_docs.py
docs/sessions/index.md
docs/streaming.md
docs/stylesheets/extra.css
docs/tools.md
docs/tracing.md
docs/usage.md
docs/voice/pipeline.md
docs/voice/quickstart.md
docs/zh/agents.md
docs/zh/config.md
docs/zh/context.md
docs/zh/examples.md
docs/zh/guardrails.md
docs/zh/handoffs.md
docs/zh/human_in_the_loop.md
docs/zh/index.md
docs/zh/mcp.md
docs/zh/models/index.md
docs/zh/models/litellm.md
docs/zh/multi_agent.md
docs/zh/quickstart.md
docs/zh/realtime/guide.md
docs/zh/realtime/quickstart.md
docs/zh/realtime/transport.md
docs/zh/release.md
docs/zh/repl.md
docs/zh/results.md
docs/zh/running_agents.md
docs/zh/sandbox/clients.md
docs/zh/sandbox/guide.md
docs/zh/sandbox/memory.md
docs/zh/sandbox_agents.md
docs/zh/sessions/advanced_sqlite_session.md
docs/zh/sessions/encrypted_session.md
docs/zh/sessions/index.md
docs/zh/sessions/sqlalchemy_session.md
docs/zh/streaming.md
docs/zh/tools.md
docs/zh/tracing.md
docs/zh/usage.md
docs/zh/visualization.md
docs/zh/voice/pipeline.md
docs/zh/voice/quickstart.md
docs/zh/voice/tracing.md
examples/basic/hello_world_gpt_5.py
examples/basic/lifecycle_example.py
examples/basic/non_strict_output_type.py
examples/basic/stream_function_call_args.py
examples/basic/stream_ws.py
examples/customer_service/main.py
examples/financial_research_agent/agents/search_agent.py
examples/financial_research_agent/agents/verifier_agent.py
examples/financial_research_agent/agents/writer_agent.py
examples/financial_research_agent/main.py
examples/hosted_mcp/simple.py
examples/mcp/manager_example/README.md
examples/mcp/manager_example/smoke_test.py
examples/mcp/sse_example/main.py
examples/mcp/sse_example/server.py
examples/mcp/sse_remote_example/README.md
examples/mcp/sse_remote_example/main.py
examples/mcp/streamable_http_remote_example/README.md

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

A new ReasoningDeltaEvent stream event type was added and exported from the package; the streaming run loop now accumulates reasoning-text deltas into per-delta ReasoningDeltaEvent(delta, snapshot) events; tests and expected event sequences were added/updated.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Event Type Definition `src/agents/stream_events.py` | Add `ReasoningDeltaEvent` dataclass with fields `delta: str`, `snapshot: str`, `type: "reasoning_delta"` and include it in the `StreamEvent` union. |
| Package Exports `src/agents/__init__.py` | Import `ReasoningDeltaEvent` and add it to `__all__` to expose the new event at package top level. |
| Streaming Loop Implementation `src/agents/run_internal/run_loop.py` | Accumulate reasoning deltas in `_reasoning_snapshot`; on `ResponseReasoningTextDeltaEvent`/`ResponseReasoningSummaryTextDeltaEvent` append `event.delta` and enqueue `ReasoningDeltaEvent(delta=..., snapshot=...)`; reset snapshot on `ResponseCreatedEvent`. |
| Test Coverage `tests/test_reasoning_delta_stream_event.py` | New async tests asserting emission, shape (`delta`, `snapshot`, `type`), monotonic snapshot accumulation, coexistence with other events, and top-level importability. |
| Stream Event Test Updates `tests/test_stream_events.py` | Adjusted expected streamed-event sequence and counts (total events increased to account for `ReasoningDeltaEvent`); indices realigned accordingly. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Runner
    participant Model
    participant StreamLoop
    participant EventQueue

    Client->>Runner: run_streamed()
    Runner->>Model: request reasoning + response
    Model-->>StreamLoop: ResponseReasoningTextDeltaEvent / ResponseReasoningSummaryTextDeltaEvent
    StreamLoop->>StreamLoop: append delta → _reasoning_snapshot
    StreamLoop->>EventQueue: enqueue ReasoningDeltaEvent(delta, snapshot)
    EventQueue-->>Client: stream_events() yields ReasoningDeltaEvent and other events
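The flow in the diagram can be approximated with a small asyncio producer/consumer. This is a sketch under assumptions: RawEvent and ReasoningDelta are stand-ins, the None sentinel is an invention of the example, and forwarding the raw event before the derived event is inferred from the PR description rather than copied from run_loop.py.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class RawEvent:
    """Stand-in for a raw model stream event."""

    type: str
    delta: str = ""


@dataclass
class ReasoningDelta:
    """Stand-in for ReasoningDeltaEvent."""

    delta: str
    snapshot: str


REASONING_TYPES = {
    "response.reasoning_text.delta",
    "response.reasoning_summary_text.delta",
}


async def stream_loop(model_events, queue: asyncio.Queue) -> None:
    snapshot = ""
    for event in model_events:
        queue.put_nowait(event)  # the raw event is still forwarded unchanged
        if event.type in REASONING_TYPES:
            snapshot += event.delta
            queue.put_nowait(ReasoningDelta(event.delta, snapshot))
    queue.put_nowait(None)  # sentinel (assumption of this sketch): stream finished


async def consume(model_events):
    queue: asyncio.Queue = asyncio.Queue()
    producer = asyncio.create_task(stream_loop(model_events, queue))
    seen = []
    while (item := await queue.get()) is not None:
        seen.append(item)
    await producer
    return seen


model_events = [
    RawEvent("response.reasoning_text.delta", "a"),
    RawEvent("response.reasoning_text.delta", "b"),
    RawEvent("response.output_text.delta", "c"),
]
seen = asyncio.run(consume(model_events))
```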

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I nibble deltas, stitch thought by bit,
Snapshots swell steady, piece after bit,
From model to queue my whiskers prance,
Small hops of reason join into dance,
A rabbit cheers for each streaming advance.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: introducing a new ReasoningDeltaEvent that emits reasoning/thinking deltas in the streaming API. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agents/run_internal/run_loop.py`:
- Around line 1113-1114: The _reasoning_snapshot field is never cleared across
stream retries, causing ReasoningDeltaEvent.snapshot to include duplicated text;
import ResponseCreatedEvent and, inside the event loop where stream events are
handled (the loop that processes events from stream_response_with_retry /
get_stream after rewind()), detect events of type ResponseCreatedEvent and reset
_reasoning_snapshot = "" when such an event is received to mark a fresh response
attempt; ensure you reference and update the existing _reasoning_snapshot
variable (not a new local) so subsequent ReasoningDeltaEvent handling produces a
clean snapshot for the new stream.

In `@tests/test_reasoning_delta_stream_event.py`:
- Around line 100-104: The test currently breaks out when a ReasoningDeltaEvent
is seen but does nothing if none are emitted; update the test to explicitly fail
when no ReasoningDeltaEvent is observed by either setting a flag (e.g.,
seen_reasoning = False) and asserting seen_reasoning is True after the async for
loop, or by using an else branch on the loop to raise an AssertionError;
reference the async iterator result.stream_events(), the ReasoningDeltaEvent
type check and the event.type assertion to locate where to add the post-loop
failure assertion.
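The second finding can be illustrated with Python's for/else, one of the two options the comment mentions. The dictionaries below are placeholder events made up for the sketch; the point is only that the else branch runs when the loop finishes without hitting break.

```python
def first_reasoning_type(events):
    """Return the type of the first reasoning event, failing loudly if none exists."""
    for event in events:
        if event.get("type") == "reasoning_delta":
            found = event["type"]
            break
    else:
        # Reached only when the loop ended without break, i.e. nothing matched.
        raise AssertionError("no ReasoningDeltaEvent was emitted")
    return found


with_reasoning = [{"type": "raw_response_event"}, {"type": "reasoning_delta"}]
without_reasoning = [{"type": "raw_response_event"}]
```

Without the else branch, the loop over `without_reasoning` would simply end and the test body would pass without having asserted anything.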

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: da5ae3ef-04e2-4c5c-a0aa-b83bf9abdb96

📥 Commits

Reviewing files that changed from the base of the PR and between 5c9fb2c and 531b9ea.

📒 Files selected for processing (5)

src/agents/__init__.py
src/agents/run_internal/run_loop.py
src/agents/stream_events.py
tests/test_reasoning_delta_stream_event.py
tests/test_stream_events.py

- Import ResponseCreatedEvent and reset _reasoning_snapshot to "" when a ResponseCreatedEvent is received inside the retry stream loop, fixing the bug where snapshot text would be duplicated across retries
- In test_reasoning_delta_event_type_field: add found=False flag and assert found after the loop so the test properly fails when no ReasoningDeltaEvent is emitted
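The retry duplication and its fix can be reproduced in isolation. In this sketch, Event is a stand-in and the event list simulates a first attempt that is abandoned after one delta, followed by a retry; "response.created" is the type string of ResponseCreatedEvent in the Responses API. The reset_on_created switch exists only to show both behaviors side by side.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Stand-in for a raw stream event."""

    type: str
    delta: str = ""


def collect_snapshots(events, reset_on_created: bool):
    snapshot = ""
    out = []
    for e in events:
        if reset_on_created and e.type == "response.created":
            snapshot = ""  # a fresh response attempt starts with an empty buffer
        elif e.type == "response.reasoning_text.delta":
            snapshot += e.delta
            out.append(snapshot)
    return out


# First attempt emits one delta and is then retried from the start.
events = [
    Event("response.created"),
    Event("response.reasoning_text.delta", "Think"),
    Event("response.created"),
    Event("response.reasoning_text.delta", "Think"),
    Event("response.reasoning_text.delta", "ing"),
]

buggy = collect_snapshots(events, reset_on_created=False)
fixed = collect_snapshots(events, reset_on_created=True)
```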

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_reasoning_delta_stream_event.py`:
- Around line 59-70: The test currently allows a vacuous pass when no
ReasoningDeltaEvent snapshots are emitted; update the test around the snapshots
collection from result.stream_events() to require at least one snapshot before
performing length-order and content checks: after collecting snapshots (variable
snapshots) add an assertion that snapshots is not empty (e.g., assert snapshots,
"no reasoning snapshots emitted") so the subsequent loop and final check that
"Hello world" appears in snapshots[-1] will fail if no ReasoningDeltaEvent
objects were produced.
- Around line 82-85: The test currently only asserts that no individual event is
a ReasoningDeltaEvent but doesn't ensure the stream produced any events; update
the test that uses result.stream_events() to also verify the stream yielded at
least one event (e.g., accumulate events or increment a counter while iterating)
and assert the collected events list length (or counter) is greater than zero,
while still asserting none of the yielded events are instances of
ReasoningDeltaEvent.
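Both findings describe vacuous passes: checks over an empty collection succeed without testing anything. A small sketch of the effect and of the guard the review asks for follows; strict_check is a hypothetical helper, not a function from the test file.

```python
snapshots = []

# Vacuous truth: with no snapshots there are no pairs to compare, so all() is True.
vacuous_ok = all(len(a) <= len(b) for a, b in zip(snapshots, snapshots[1:]))


def strict_check(snapshots, expected):
    """Hypothetical helper: require data before asserting properties of it."""
    assert snapshots, "no reasoning snapshots emitted"
    assert all(len(a) <= len(b) for a, b in zip(snapshots, snapshots[1:]))
    assert expected in snapshots[-1]


def rejects_empty():
    try:
        strict_check([], "Hello world")
    except AssertionError:
        return True
    return False
```

The same reasoning applies to the negative test: asserting that no event is a ReasoningDeltaEvent is only meaningful once the stream is known to have yielded at least one event.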

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b02a6b46-9175-4392-97c4-428c546aef1f

📥 Commits

Reviewing files that changed from the base of the PR and between 531b9ea and 996be13.

📒 Files selected for processing (5)

src/agents/__init__.py
src/agents/run_internal/run_loop.py
src/agents/stream_events.py
tests/test_reasoning_delta_stream_event.py
tests/test_stream_events.py

✅ Files skipped from review due to trivial changes (1)

src/agents/run_internal/run_loop.py

🚧 Files skipped from review as they are similar to previous changes (1)

src/agents/__init__.py

### Summary

This pull request fixes sandbox compaction defaults so Azure/custom deployment names do not fail before the model request is sent. It makes compaction model-window lookup separator-insensitive for known OpenAI-style names like `gpt-5-2`, adds a non-throwing lookup path, and falls back to the static compaction threshold when the model window cannot be inferred.

### Test plan

- `make format`
- `make lint`
- `uv run pytest tests/sandbox/test_compaction.py tests/sandbox/capabilities/test_compaction_capability.py tests/test_sandbox_runtime_agent_preparation.py -q`
- `uv run mypy src/agents/sandbox/capabilities/compaction.py tests/sandbox/test_compaction.py tests/sandbox/capabilities/test_compaction_capability.py`
- `uv run pyright src/agents/sandbox/capabilities/compaction.py tests/sandbox/test_compaction.py tests/sandbox/capabilities/test_compaction_capability.py`

Full `make tests` / `make typecheck` are currently blocked in this workspace by missing optional dependencies and unrelated existing failures (`numpy`, `litellm`, `sqlalchemy`, `temporalio`, `boto3`, runloop extra, and one PTY timing assertion).

### Issue number

Refs openai#2927

### Checks

- [x] I've added new tests (if relevant)
- [ ] I've added/updated the relevant documentation
- [x] I've run `make lint` and `make format`
- [ ] I've made sure tests pass
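The lookup behavior described in that commit's summary (separator-insensitive matching, a non-throwing path, and a static fallback) might look roughly like the sketch below. Every name and number here is hypothetical: the table entry, the window size, the 80 percent ratio, and the static threshold are made up for illustration and are not the values in compaction.py.

```python
import re
from typing import Optional

# Made-up values for illustration only; not real model limits.
_KNOWN_WINDOWS = {"gpt-5.2": 1_000}
STATIC_THRESHOLD = 500

_SEPARATORS = re.compile(r"[-_.\s]+")


def _normalize(name: str) -> str:
    """Drop separators and case so 'gpt-5-2', 'gpt-5.2' and 'GPT_5.2' compare equal."""
    return _SEPARATORS.sub("", name.lower())


_NORMALIZED_WINDOWS = {_normalize(k): v for k, v in _KNOWN_WINDOWS.items()}


def lookup_window(model: str) -> Optional[int]:
    """Non-throwing lookup: unknown names return None instead of raising."""
    return _NORMALIZED_WINDOWS.get(_normalize(model))


def compaction_threshold(model: str, percent: int = 80) -> int:
    window = lookup_window(model)
    if window is None:
        # Custom or Azure deployment names fall back to the static threshold.
        return STATIC_THRESHOLD
    return window * percent // 100
```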

- Add SandboxPathGrant manifest support for explicit access to absolute paths outside sandbox workspace.
- Centralize path handling in WorkspacePathPolicy.normalize_path(...), including extra grant matching, symlink-aware host validation, and most-specific grant selection.
- Harden access boundaries by rejecting filesystem-root grants, // root aliases, and grants that resolve to /.
- Preserve nested grant semantics, including writable parent + read-only child cases through remote symlink targets and macOS exec confinement.
- Update sandbox provider integrations to use shared path policy across Docker, Unix-local, Runloop, Vercel, Cloudflare, E2B, Modal, Daytona, and Blaxel.
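Two of the rules in that list, rejecting root grants and choosing the most specific grant, can be sketched with purely lexical path handling. PathGrant and both functions are hypothetical stand-ins rather than the SDK's SandboxPathGrant or WorkspacePathPolicy, and the symlink-aware validation mentioned in the commit is deliberately not modeled.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Iterable, Optional


@dataclass(frozen=True)
class PathGrant:
    """Hypothetical stand-in for SandboxPathGrant; field names are assumed."""

    path: str
    read_only: bool = False


def normalize(path: str) -> PurePosixPath:
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute: {path!r}")
    # normpath folds "." and ".." lexically; lstrip folds the "//" root alias.
    return PurePosixPath("/" + posixpath.normpath(path).lstrip("/"))


def validate_grant(grant: PathGrant) -> None:
    if normalize(grant.path) == PurePosixPath("/"):
        raise ValueError(f"grant resolves to the filesystem root: {grant.path!r}")


def most_specific_grant(target: str, grants: Iterable[PathGrant]) -> Optional[PathGrant]:
    """Pick the deepest grant whose path contains the target, if any."""
    target_path = normalize(target)
    best: Optional[PathGrant] = None
    best_depth = -1
    for grant in grants:
        root = normalize(grant.path)
        covers = target_path == root or root in target_path.parents
        if covers and len(root.parts) > best_depth:
            best, best_depth = grant, len(root.parts)
    return best


# Writable parent with a read-only child, as in the nested-grant case above.
grants = [PathGrant("/data"), PathGrant("/data/secrets", read_only=True)]
```

Selecting by depth is what lets the read-only child override its writable parent for paths under /data/secrets while leaving the rest of /data writable.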

## Summary

- Refresh runtime handling around session and tool-call flows.
- Adjust model configuration metadata used by runtime integrations.
- Add focused coverage for the updated behavior.

## Validation

- .venv/bin/python -m pytest tests/model_settings/test_serialization.py tests/models/test_trace_config.py tests/mcp/test_streamable_http_client_factory.py tests/test_run_context_approvals.py tests/test_run_state.py::TestRunState::test_trace_api_key_serialization_is_opt_in tests/realtime/test_session.py
- .venv/bin/ruff check <touched files>
- .venv/bin/ruff format --check <touched files>
- git diff --check

…as (openai#825)

Add a new ReasoningDeltaEvent to StreamEvent so callers can react to reasoning/thinking tokens in real time without unpacking low-level raw response events.

The event is emitted whenever a ResponseReasoningSummaryTextDeltaEvent (o-series extended thinking via the Responses API) or a ResponseReasoningTextDeltaEvent (third-party models like DeepSeek-R1 via LiteLLM) passes through the stream. The underlying RawResponsesStreamEvent is still emitted as well, so nothing breaks for consumers that already inspect raw events.

Fields:
  delta - the incremental text fragment from this chunk
  snapshot - full accumulated reasoning text so far in this turn
  type - always 'reasoning_delta'

Closes openai#825

The two stream-event tests were only asserting on data conditional on a ReasoningDeltaEvent being emitted at all, so a regression that stopped emitting the event entirely would have passed silently.

* test_reasoning_delta_snapshot_accumulates: assert that snapshots is non-empty before checking monotonic length and the "Hello world" inclusion (previously gated on `if snapshots:`).
* test_no_reasoning_delta_event_without_reasoning: count yielded events and assert the stream produced at least one, so the negative not-isinstance assertion can't pass on an empty event stream.

Picked up the remaining nitpicks from the CodeRabbit review of PR #3.

…ning_deltas

Default the new ReasoningDeltaEvent emission to off so the streamed event count is unchanged for existing consumers. Opt in via RunConfig.emit_reasoning_deltas=True to receive reasoning deltas without unwrapping the raw events. Addresses the event-volume concern raised in review.
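The effect of the opt-in flag on event volume can be sketched as below. RunConfigSketch is a hypothetical stand-in that borrows only the flag name from the commit message; the tuple-based events are simplifications made for the example.

```python
from dataclasses import dataclass


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in; only the flag name comes from this PR."""

    emit_reasoning_deltas: bool = False


def emitted_events(raw_events, config: RunConfigSketch):
    out = []
    snapshot = ""
    for kind, delta in raw_events:
        out.append(("raw", delta))  # raw events are emitted either way
        if kind == "reasoning":
            snapshot += delta
            if config.emit_reasoning_deltas:
                out.append(("reasoning_delta", snapshot))
    return out


raw_events = [("reasoning", "a"), ("reasoning", "b"), ("text", "c")]
default_run = emitted_events(raw_events, RunConfigSketch())
opt_in_run = emitted_events(raw_events, RunConfigSketch(emit_reasoning_deltas=True))
```

With the default, the emitted sequence has the same length as the raw stream, which is the property the commit is protecting for existing consumers.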

Adding ReasoningDeltaEvent to the StreamEvent union widened PrintableEvent, so mypy could no longer prove the trailing match handled a RunItemStreamEvent. Add an early return for ReasoningDeltaEvent, mirroring the existing RawResponsesStreamEvent guard, which restores narrowing and fixes typecheck.
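The narrowing issue is easy to reproduce with a three-member union. The classes below are empty stand-ins and describe is a hypothetical function written for the sketch; it uses isinstance guards rather than the match statement the commit refers to, but the narrowing logic is the same.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class RawResponsesStreamEvent:
    """Stand-in with no fields, for illustration."""


@dataclass
class ReasoningDeltaEvent:
    """Stand-in with no fields, for illustration."""


@dataclass
class RunItemStreamEvent:
    name: str


PrintableEvent = Union[RawResponsesStreamEvent, ReasoningDeltaEvent, RunItemStreamEvent]


def describe(event: PrintableEvent) -> str:
    if isinstance(event, RawResponsesStreamEvent):
        return ""
    # Without this guard a type checker still sees ReasoningDeltaEvent as possible
    # below and rejects the access to .name.
    if isinstance(event, ReasoningDeltaEvent):
        return ""
    # Only RunItemStreamEvent remains in the union at this point.
    return event.name
```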

adityasingh2400 · 2026-05-26T16:09:01Z

Closing this fork staging PR. The corresponding upstream work is tracked separately, so this internal branch is no longer needed.

coderabbitai Bot reviewed Apr 14, 2026

Comment thread src/agents/run_internal/run_loop.py

Comment thread tests/test_reasoning_delta_stream_event.py

adityasingh2400 force-pushed the feat/reasoning-delta-stream-event-825 branch from 01d8b3d to 996be13 on April 16, 2026 16:09

coderabbitai Bot reviewed Apr 16, 2026

Comment thread tests/test_reasoning_delta_stream_event.py Outdated

Comment thread tests/test_reasoning_delta_stream_event.py

scotttrinh and others added 24 commits April 16, 2026 20:41

fix: Trust filesystem permissions for Vercel roots (openai#2910)

b58d059

Instead of trying to constrain the agent to only work in the workspace, allow it to fail freely without sudo.

fix: openai#604 handle None choices in ChatCompletion response (openai#2850)

b7ba446

Co-authored-by: Kazuhiro Sera <seratch@openai.com>

fix: normalize compacted Responses user inputs before session reuse (openai#2925)

f84ef7f

feat(extensions): add MongoDB session backend (openai#2902)

67fb85a

docs: update translated document pages (openai#2935)

55c8900

fix: openai#2929 surface run-loop exceptions after stream_events() completes (openai#2931)

61443ca

Release 0.14.2 (openai#2899)

e80d2d2

fix: openai#2938 make sandboxes importable on Windows (openai#2948)

82eaf15

docs: move module docstring to top of handoff_filters.py (openai#2950)

cebc763

fix: openai#2951 warn for tool name character replacement (openai#2953)

da82b2c

fix: openai#2962 normalize sandbox paths and add Windows CI (openai#2963)

cc57bb1

fix: prepare Daytona workspace root before start (openai#2956)

12b5471

Add Datadog as an external tracer in the tracing docs (openai#2965)

66cc689

docs: update translated document pages (openai#2978)

35c497e

fix: windows errors with openai#2956 (openai#2979)

5515283

fix: tighten LocalSnapshot restorable checks (openai#2975)

902c599

fix: bound manifest description truncation (openai#2974)

9d963e8

Release 0.14.3 (openai#2980)

5d300f0

docs: remove duplicate word in voice interruptions section (openai#2981)

bf3e9d1

docs: update translated document pages (openai#2982)

9e228fc

test: add sandbox compatibility guards (openai#2984)

2a515f0

fix: ignore relative snapshot base overrides (openai#2976)

106ef05

Co-authored-by: Kazuhiro Sera <seratch@openai.com>

cty-ut and others added 25 commits May 15, 2026 15:39

fix: skip wait_for_status when Vercel sandbox is in a terminal state (openai#3410)

43a389d

fix: filter hosted_tool_call types in remove_all_tools handoff filter (openai#3386)

cb7211b

fix: guard None text in ItemHelpers.extract_last_content (openai#3394)

5e71d09

fix: log exception when output guardrail raises instead of silently ignoring (openai#3411)

cb0461d

fix: reject relative sandbox workspace roots (openai#3422)

94523f9

fix: normalize leading question marks in exposed port queries (openai#3424)

e37b3d2

fix: openai#3363 honor short custom voice splitter chunks (openai#3364)

4bd459e

fix: keep mountpoint credentials out of sandbox commands (openai#3429)

4970fd6

docs: fix LiteLLM API reference redirect (openai#3444)

41fe113

docs: re-fix openai#3444

13d1815

docs: fix duplicated word in usaspending glossary example (openai#3445)

65774ce

Release 0.17.3 (openai#3417)

17f7cae

docs: add SECURITY.md in the same way with openai-agents-js repo

445ad22

fix: apply hardened http client default to MCP SSE transport (openai#3466)

9514473

fix: use non-None value for output in FunctionSpanData (openai#3475)

9303389

fix: openai#3459 add opt-in recovery for missing function tools (openai#3461)

45effb4

fix: add missing entries to span __slots__ (openai#3483)

eda7b51

fix: redact invalid JSON payload in ModelBehaviorError data (openai#3485)

813a003

fix: export more tracing related functions & types from agents (openai#3489)

573530f

fix: export MCPListToolsItem, ToolSearchCallItem, and ToolSearchOutputItem from agents (openai#3490)

fedc809

chore: ruff format/isort fixes after rebase

5b04cf8

adityasingh2400 force-pushed the feat/reasoning-delta-stream-event-825 branch from 5c36515 to 5b04cf8 on May 22, 2026 05:10

adityasingh2400 added 2 commits May 24, 2026 02:16

adityasingh2400 closed this May 26, 2026

adityasingh2400 wants to merge 348 commits into main from feat/reasoning-delta-stream-event-825

20 participants
