Commit c84859c

Merge main into feat/track-tool-origin
- Resolved merge conflicts from main branch refactoring
- Moved _get_tool_origin_info() from deleted _run_impl.py to tool.py
- Updated all run_internal modules to support tool_origin tracking
- Preserved tool origin setting logic for agent-as-tool and MCP tools
- Fixed all RunImpl references to use new run_internal module structure
- Added tool_origin support to ToolCallItem and ToolCallOutputItem in all execution paths
1 parent 8b2d12a commit c84859c

457 files changed

+53077
-6416
lines changed

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
---
name: code-change-verification
description: Run the mandatory verification stack when changes affect runtime code, tests, or build/test behavior in the OpenAI Agents Python repository.
---

# Code Change Verification

## Overview

Ensure work is only marked complete after formatting, linting, type checking, and tests pass. Use this skill when changes affect runtime code, tests, or build/test configuration. You can skip it for docs-only or repository metadata unless a user asks for the full stack.

## Quick start

1. Keep this skill at `./.agents/skills/code-change-verification` so it loads automatically for the repository.
2. macOS/Linux: `bash .agents/skills/code-change-verification/scripts/run.sh`.
3. Windows: `powershell -ExecutionPolicy Bypass -File .agents/skills/code-change-verification/scripts/run.ps1`.
4. If any command fails, fix the issue, rerun the script, and report the failing output.
5. Confirm completion only when all commands succeed with no remaining issues.

## Manual workflow

- If dependencies are not installed or have changed, run `make sync` first to install dev requirements via `uv`.
- Run from the repository root in this order: `make format`, `make lint`, `make mypy`, `make tests`.
- Do not skip steps; stop and fix issues immediately when a command fails.
- Re-run the full stack after applying fixes so the commands execute in the required order.

## Resources

### scripts/run.sh

- Executes the full verification sequence with fail-fast semantics from the repository root. Prefer this entry point to ensure the required commands run in the correct order.

### scripts/run.ps1

- Windows-friendly wrapper that runs the same verification sequence with fail-fast semantics. Use from PowerShell with execution policy bypass if required by your environment.
Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
Set-StrictMode -Version Latest
$ErrorActionPreference = "Stop"

$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Definition
$repoRoot = $null

try {
    $repoRoot = (& git -C $scriptDir rev-parse --show-toplevel 2>$null)
} catch {
    $repoRoot = $null
}

if (-not $repoRoot) {
    $repoRoot = Resolve-Path (Join-Path $scriptDir "..\\..\\..\\..")
}

Set-Location $repoRoot

function Invoke-MakeStep {
    param(
        [Parameter(Mandatory = $true)][string]$Step
    )

    Write-Host "Running make $Step..."
    & make $Step

    if ($LASTEXITCODE -ne 0) {
        Write-Error "code-change-verification: make $Step failed with exit code $LASTEXITCODE."
        exit $LASTEXITCODE
    }
}

Invoke-MakeStep -Step "format"
Invoke-MakeStep -Step "lint"
Invoke-MakeStep -Step "mypy"
Invoke-MakeStep -Step "tests"

Write-Host "code-change-verification: all commands passed."
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
#!/usr/bin/env bash
# Fail fast on any error or undefined variable.
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
if command -v git >/dev/null 2>&1; then
  REPO_ROOT="$(git -C "${SCRIPT_DIR}" rev-parse --show-toplevel 2>/dev/null || true)"
fi
REPO_ROOT="${REPO_ROOT:-$(cd "${SCRIPT_DIR}/../../../.." && pwd)}"

cd "${REPO_ROOT}"

echo "Running make format..."
make format

echo "Running make lint..."
make lint

echo "Running make mypy..."
make mypy

echo "Running make tests..."
make tests

echo "code-change-verification: all commands passed."

.agents/skills/docs-sync/SKILL.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
---
name: docs-sync
description: Analyze main branch implementation and configuration to find missing, incorrect, or outdated documentation in docs/. Use when asked to audit doc coverage, sync docs with code, or propose doc updates/structure changes. Only update English docs under docs/** and never touch translated docs under docs/ja, docs/ko, or docs/zh. Provide a report and ask for approval before editing docs.
---

# Docs Sync

## Overview

Identify doc coverage gaps and inaccuracies by comparing main branch features and configuration options against the current docs structure, then propose targeted improvements.

## Workflow

1. Confirm scope and base branch
   - Identify the current branch and default branch (usually `main`).
   - Prefer analyzing the current branch to keep work aligned with in-flight changes.
   - If the current branch is not `main`, analyze only the diff vs `main` to scope doc updates.
   - Avoid switching branches if it would disrupt local changes; use `git show main:<path>` or `git worktree add` when needed.

2. Build a feature inventory from the selected scope
   - If on `main`: inventory the full surface area and review docs comprehensively.
   - If not on `main`: inventory only changes vs `main` (feature additions/changes/removals).
   - Focus on user-facing behavior: public exports, configuration options, environment variables, CLI commands, default values, and documented runtime behaviors.
   - Capture evidence for each item (file path + symbol/setting).
   - Use targeted search to find option types and feature flags (for example: `rg "Settings"`, `rg "Config"`, `rg "os.environ"`, `rg "OPENAI_"`).
   - When the topic involves OpenAI platform features, invoke `$openai-knowledge` to pull current details from the OpenAI Developer Docs MCP server instead of guessing, while treating the SDK source code as the source of truth when discrepancies appear.

3. Doc-first pass: review existing pages
   - Walk each relevant page under `docs/` (excluding `docs/ja`, `docs/ko`, and `docs/zh`).
   - Identify missing mentions of important, supported options (opt-in flags, env vars), customization points, or new features from `src/agents/` and `examples/`.
   - Propose additions where users would reasonably expect to find them on that page.

4. Code-first pass: map features to docs
   - Review the current docs information architecture under `docs/` and `mkdocs.yml`.
   - Determine the best page/section for each feature based on existing patterns and the API reference structure under `docs/ref`.
   - Identify features that lack any doc page or have a page but no corresponding content.
   - Note when a structural adjustment would improve discoverability.
   - When improving `docs/ref/*` pages, treat the corresponding docstrings/comments in `src/` as the source of truth. Prefer updating those code comments so regenerated reference docs stay correct, instead of hand-editing the generated pages.

5. Detect gaps and inaccuracies
   - **Missing**: features/configs present in main but absent in docs.
   - **Incorrect/outdated**: names, defaults, or behaviors that diverge from main.
   - **Structural issues** (optional): pages overloaded, missing overviews, or mis-grouped topics.

6. Produce a Docs Sync Report and ask for approval
   - Provide a clear report with evidence, suggested doc locations, and proposed edits.
   - Ask the user whether to proceed with doc updates.

7. If approved, apply changes (English only)
   - Edit only English docs in `docs/**`.
   - Do **not** edit `docs/ja`, `docs/ko`, or `docs/zh`.
   - Keep changes aligned with the existing docs style and navigation.
   - Update `mkdocs.yml` when adding or renaming pages.
   - Build docs with `make build-docs` after edits to verify the docs site still builds.
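
The scope rule in steps 1 and 2 can be sketched in shell. This is a minimal illustration, not part of the skill: the helper names `docs_sync_scope` and `docs_sync_changed_files` are hypothetical, and the base branch is assumed to be `main` unless another name is passed.

```shell
# Hypothetical helper: decide the analysis scope from a branch name.
# "full" = inventory the whole surface area; "diff" = only changes vs the base.
docs_sync_scope() {
  local branch="$1"
  local base="${2:-main}"
  if [ "$branch" = "$base" ]; then
    echo "full"
  else
    echo "diff"
  fi
}

# Hypothetical helper: list files changed on the current branch since it
# diverged from the base branch, without switching branches.
docs_sync_changed_files() {
  local base="${1:-main}"
  git diff --name-only "${base}...HEAD"
}
```

Inside a checkout, `docs_sync_scope "$(git rev-parse --abbrev-ref HEAD)"` would report which mode applies, and `docs_sync_changed_files` narrows the inventory when the answer is `diff`.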

## Output format

Use this template when reporting findings:

Docs Sync Report

- Doc-first findings
  - Page + missing content -> evidence + suggested insertion point
- Code-first gaps
  - Feature + evidence -> suggested doc page/section (or missing page)
- Incorrect or outdated docs
  - Doc file + issue + correct info + evidence
- Structural suggestions (optional)
  - Proposed change + rationale
- Proposed edits
  - Doc file -> concise change summary
- Questions for the user

## References

- `references/doc-coverage-checklist.md`
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
# Doc Coverage Checklist

Use this checklist to scan the selected scope (main = comprehensive, or current-branch diff) and validate documentation coverage.

## Feature inventory targets

- Public exports: classes, functions, types, and module entry points.
- Configuration options: `*Settings` types, default config objects, and builder patterns.
- Environment variables or runtime flags.
- CLI commands, scripts, and example entry points that define supported usage.
- User-facing behaviors: retry, timeouts, streaming, errors, logging, telemetry, and data handling.
- Deprecations, removals, or renamed settings.

## Doc-first pass (page-by-page)

- Review each relevant English page (excluding `docs/ja`, `docs/ko`, and `docs/zh`).
- Look for missing opt-in flags, env vars, or customization options that the page implies.
- Add new features that belong on that page based on user intent and navigation.

## Code-first pass (feature inventory)

- Map features to the closest existing page based on the docs navigation in `mkdocs.yml`.
- Prefer updating existing pages over creating new ones unless the topic is clearly new.
- Use conceptual pages for cross-cutting concerns (auth, errors, streaming, tracing, tools).
- Keep quick-start flows minimal; move advanced details into deeper pages.

## Evidence capture

- Record the main-branch file path and symbol/setting name.
- Note defaults or behavior-critical details for accuracy checks.
- Avoid large code dumps; a short identifier is enough.

## Red flags for outdated or incorrect docs

- Option names/types no longer exist or differ from code.
- Default values or allowed ranges do not match implementation.
- Features removed in code but still documented.
- New behaviors introduced without corresponding docs updates.

## When to propose structural changes

- A page mixes unrelated audiences (quick-start + deep reference) without clear separation.
- Multiple pages duplicate the same concept without cross-links.
- New feature areas have no obvious home in the nav structure.

## Diff mode guidance (current branch vs main)

- Focus only on changed behavior: new exports/options, modified defaults, removed features, or renamed settings.
- Use `git diff main...HEAD` (or equivalent) to constrain analysis.
- Document removals explicitly so docs can be pruned if needed.

## Patch guidance

- Keep edits scoped and aligned with existing tone and format.
- Update cross-links when moving or renaming sections.
- Leave translated docs untouched; English-only updates.
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
---
name: examples-auto-run
description: Run python examples in auto mode with logging, rerun helpers, and background control.
---

# examples-auto-run

## What it does

- Runs `uv run examples/run_examples.py` with:
  - `EXAMPLES_INTERACTIVE_MODE=auto` (auto-input/auto-approve).
  - Per-example logs under `.tmp/examples-start-logs/`.
  - Main summary log path passed via `--main-log` (also under `.tmp/examples-start-logs/`).
- Generates a rerun list of failures at `.tmp/examples-rerun.txt` when `--write-rerun` is set.
- Provides start/stop/status/logs/tail/collect/rerun helpers via `run.sh`.
- Background option keeps the process running with a pidfile; `stop` cleans it up.

## Usage

```bash
# Start (auto mode; interactive included by default)
.agents/skills/examples-auto-run/scripts/run.sh start [extra args to run_examples.py]
# Examples:
.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio

# Check status
.agents/skills/examples-auto-run/scripts/run.sh status

# Stop running job
.agents/skills/examples-auto-run/scripts/run.sh stop

# List logs
.agents/skills/examples-auto-run/scripts/run.sh logs

# Tail latest log (or specify one)
.agents/skills/examples-auto-run/scripts/run.sh tail
.agents/skills/examples-auto-run/scripts/run.sh tail main_20260113-123000.log

# Collect rerun list from a main log (defaults to latest main_*.log)
.agents/skills/examples-auto-run/scripts/run.sh collect

# Rerun only failed entries from rerun file (auto mode)
.agents/skills/examples-auto-run/scripts/run.sh rerun
```

## Defaults (overridable via env)

- `EXAMPLES_INTERACTIVE_MODE=auto`
- `EXAMPLES_INCLUDE_INTERACTIVE=1`
- `EXAMPLES_INCLUDE_SERVER=0`
- `EXAMPLES_INCLUDE_AUDIO=0`
- `EXAMPLES_INCLUDE_EXTERNAL=0`
- Auto-approvals in auto mode: `APPLY_PATCH_AUTO_APPROVE=1`, `SHELL_AUTO_APPROVE=1`, `AUTO_APPROVE_MCP=1`
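
One common way a wrapper script makes defaults like these overridable is the `:=` parameter expansion, which assigns a value only when the variable is unset or empty. The sketch below applies that pattern to the variables listed above. It is an assumption about the mechanism, not a copy of `run.sh`, and `apply_example_defaults` is a hypothetical name.

```shell
# Hypothetical helper: apply the documented defaults without clobbering
# values the caller has already set in the environment.
apply_example_defaults() {
  # ":" is a no-op command; the expansion performs the assignment.
  : "${EXAMPLES_INTERACTIVE_MODE:=auto}"
  : "${EXAMPLES_INCLUDE_INTERACTIVE:=1}"
  : "${EXAMPLES_INCLUDE_SERVER:=0}"
  : "${EXAMPLES_INCLUDE_AUDIO:=0}"
  : "${EXAMPLES_INCLUDE_EXTERNAL:=0}"
  # Export so a child process such as the Python runner sees the values.
  export EXAMPLES_INTERACTIVE_MODE EXAMPLES_INCLUDE_INTERACTIVE \
    EXAMPLES_INCLUDE_SERVER EXAMPLES_INCLUDE_AUDIO EXAMPLES_INCLUDE_EXTERNAL
}
```

With this pattern, a caller who sets `EXAMPLES_INCLUDE_SERVER=1` before invoking the script keeps that value, while every variable left unset falls back to its documented default.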

## Log locations

- Main logs: `.tmp/examples-start-logs/main_*.log`
- Per-example logs (from `run_examples.py`): `.tmp/examples-start-logs/<module_path>.log`
- Rerun list: `.tmp/examples-rerun.txt`
- Stdout logs: `.tmp/examples-start-logs/stdout_*.log`

## Notes

- The runner delegates to `uv run examples/run_examples.py`, which already writes per-example logs and supports `--collect`, `--rerun-file`, and `--print-auto-skip`.
- `start` uses `--write-rerun` so failures are captured automatically.
- If `.tmp/examples-rerun.txt` exists and is non-empty, invoking the skill with no args runs `rerun` by default.
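
The "exists and is non-empty" check in the last note maps directly onto the shell `-s` file test. A minimal sketch follows, assuming a hypothetical `default_command` helper; the fallback to `start` when the file is missing or empty is also an assumption, since the note only specifies the `rerun` case.

```shell
# Hypothetical helper: pick the command to run when no argument is given.
# "-s" is true only if the file exists and its size is greater than zero.
default_command() {
  local rerun_file="${1:-.tmp/examples-rerun.txt}"
  if [ -s "$rerun_file" ]; then
    echo "rerun"
  else
    # Assumed fallback; the skill notes do not state this case.
    echo "start"
  fi
}
```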

## Behavioral validation (Codex/LLM responsibility)

The runner does not perform any automated behavioral validation. After every foreground `start` or `rerun`, **Codex must manually validate** all exit-0 entries:

1. Read the example source (and comments) to infer intended flow, tools used, and expected key outputs.
2. Open the matching per-example log under `.tmp/examples-start-logs/`.
3. Confirm the intended actions/results occurred; flag omissions or divergences.
4. Do this for **all passed examples**, not just a sample.
5. Report immediately after the run with concise citations to the exact log lines that justify the validation.
