openai
diff --git a/.agents/skills/code-change-verification/SKILL.md b/.agents/skills/code-change-verification/SKILL.md
Lines changed: 35 additions & 0 deletions
diff --git a/.agents/skills/code-change-verification/scripts/run.ps1 b/.agents/skills/code-change-verification/scripts/run.ps1
Lines changed: 38 additions & 0 deletions
diff --git a/.agents/skills/code-change-verification/scripts/run.sh b/.agents/skills/code-change-verification/scripts/run.sh
Lines changed: 25 additions & 0 deletions
diff --git a/.agents/skills/docs-sync/SKILL.md b/.agents/skills/docs-sync/SKILL.md
Lines changed: 76 additions & 0 deletions
diff --git a/.agents/skills/docs-sync/references/doc-coverage-checklist.md b/.agents/skills/docs-sync/references/doc-coverage-checklist.md
Lines changed: 56 additions & 0 deletions
diff --git a/.agents/skills/examples-auto-run/SKILL.md b/.agents/skills/examples-auto-run/SKILL.md
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,35 @@
+---
+name: code-change-verification
+description: Run the mandatory verification stack when changes affect runtime code, tests, or build/test behavior in the OpenAI Agents Python repository.
+---
+
+# Code Change Verification
+
+## Overview
+
+Mark work complete only after formatting, linting, type checking, and tests all pass. Use this skill when changes affect runtime code, tests, or build/test configuration. You can skip it for docs-only or repository-metadata changes unless a user asks for the full stack.
+
+## Quick start
+
+1. Keep this skill at `./.agents/skills/code-change-verification` so it loads automatically for the repository.
+2. macOS/Linux: `bash .agents/skills/code-change-verification/scripts/run.sh`.
+3. Windows: `powershell -ExecutionPolicy Bypass -File .agents/skills/code-change-verification/scripts/run.ps1`.
+4. If any command fails, fix the issue, rerun the script, and report the failing output.
+5. Confirm completion only when all commands succeed with no remaining issues.
+
+## Manual workflow
+
+- If dependencies are not installed or have changed, run `make sync` first to install dev requirements via `uv`.
+- Run from the repository root in this order: `make format`, `make lint`, `make mypy`, `make tests`.
+- Do not skip steps; stop and fix issues immediately when a command fails.
+- Re-run the full stack after applying fixes so the commands execute in the required order.
+
+## Resources
+
+### scripts/run.sh
+
+- Executes the full verification sequence with fail-fast semantics from the repository root. Prefer this entry point to ensure the required commands run in the correct order.
+
+### scripts/run.ps1
+
+- Windows-friendly wrapper that runs the same verification sequence with fail-fast semantics. Run it from PowerShell, adding `-ExecutionPolicy Bypass` if your environment requires it.
@@ -0,0 +1,38 @@
+Set-StrictMode -Version Latest
+$ErrorActionPreference = "Stop"
+
+$scriptDir = Split-Path -Parent $MyInvocation.MyCommand.Definition
+$repoRoot = $null
+
+try {
+    $repoRoot = (& git -C $scriptDir rev-parse --show-toplevel 2>$null)
+} catch {
+    $repoRoot = $null
+}
+
+if (-not $repoRoot) {
+    $repoRoot = Resolve-Path (Join-Path $scriptDir "..\..\..\..")
+}
+
+Set-Location $repoRoot
+
+function Invoke-MakeStep {
+    param(
+        [Parameter(Mandatory = $true)][string]$Step
+    )
+
+    Write-Host "Running make $Step..."
+    & make $Step
+
+    if ($LASTEXITCODE -ne 0) {
+        [Console]::Error.WriteLine("code-change-verification: make $Step failed with exit code $LASTEXITCODE.")
+        exit $LASTEXITCODE  # Write-Error would throw under "Stop" before this line, losing make's exit code.
+    }
+}
+
+Invoke-MakeStep -Step "format"
+Invoke-MakeStep -Step "lint"
+Invoke-MakeStep -Step "mypy"
+Invoke-MakeStep -Step "tests"
+
+Write-Host "code-change-verification: all commands passed."
@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+# Fail fast on any error or undefined variable.
+set -euo pipefail
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+if command -v git >/dev/null 2>&1; then
+  REPO_ROOT="$(git -C "${SCRIPT_DIR}" rev-parse --show-toplevel 2>/dev/null || true)"
+fi
+REPO_ROOT="${REPO_ROOT:-$(cd "${SCRIPT_DIR}/../../../.." && pwd)}"
+
+cd "${REPO_ROOT}"
+
+echo "Running make format..."
+make format
+
+echo "Running make lint..."
+make lint
+
+echo "Running make mypy..."
+make mypy
+
+echo "Running make tests..."
+make tests
+
+echo "code-change-verification: all commands passed."
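Note on `run.sh`: with `set -e` the script stops at the first failing `make` target, but unlike `run.ps1` it does not say which step failed. The following is a minimal sketch of a loop-based variant that names the failing step, assuming the same four targets in the same order. The function name `run_verification` is hypothetical and not part of this change.

```shell
# Hypothetical variant of run.sh (not part of this change):
# runs the same make targets in order and reports the first failure.
run_verification() {
  local step status
  for step in format lint mypy tests; do
    echo "Running make ${step}..."
    make "${step}" || {
      # $? still holds the exit status of the failed make invocation here.
      status=$?
      echo "code-change-verification: make ${step} failed with exit code ${status}." >&2
      return "${status}"
    }
  done
  echo "code-change-verification: all commands passed."
}

# Example usage, from the repository root:
#   run_verification
```

The loop keeps the fail-fast behavior, since `return` leaves the function before any later target runs, and it mirrors the message format that `Invoke-MakeStep` uses in the PowerShell wrapper.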
@@ -0,0 +1,76 @@
+---
+name: docs-sync
+description: Analyze main branch implementation and configuration to find missing, incorrect, or outdated documentation in docs/. Use when asked to audit doc coverage, sync docs with code, or propose doc updates/structure changes. Only update English docs under docs/** and never touch translated docs under docs/ja, docs/ko, or docs/zh. Provide a report and ask for approval before editing docs.
+---
+
+# Docs Sync
+
+## Overview
+
+Identify doc coverage gaps and inaccuracies by comparing main branch features and configuration options against the current docs structure, then propose targeted improvements.
+
+## Workflow
+
+1. Confirm scope and base branch
+   - Identify the current branch and default branch (usually `main`).
+   - Prefer analyzing the current branch to keep work aligned with in-flight changes.
+   - If the current branch is not `main`, analyze only the diff vs `main` to scope doc updates.
+   - Avoid switching branches if it would disrupt local changes; use `git show main:<path>` or `git worktree add` when needed.
+
+2. Build a feature inventory from the selected scope
+   - If on `main`: inventory the full surface area and review docs comprehensively.
+   - If not on `main`: inventory only changes vs `main` (feature additions/changes/removals).
+   - Focus on user-facing behavior: public exports, configuration options, environment variables, CLI commands, default values, and documented runtime behaviors.
+   - Capture evidence for each item (file path + symbol/setting).
+   - Use targeted search to find option types and feature flags (for example: `rg "Settings"`, `rg "Config"`, `rg "os.environ"`, `rg "OPENAI_"`).
+   - When the topic involves OpenAI platform features, invoke `$openai-knowledge` to pull current details from the OpenAI Developer Docs MCP server instead of guessing, while treating the SDK source code as the source of truth when discrepancies appear.
+
+3. Doc-first pass: review existing pages
+   - Walk each relevant page under `docs/` (excluding `docs/ja`, `docs/ko`, and `docs/zh`).
+   - Identify missing mentions of important, supported options (opt-in flags, env vars), customization points, or new features from `src/agents/` and `examples/`.
+   - Propose additions where users would reasonably expect to find them on that page.
+
+4. Code-first pass: map features to docs
+   - Review the current docs information architecture under `docs/` and `mkdocs.yml`.
+   - Determine the best page/section for each feature based on existing patterns and the API reference structure under `docs/ref`.
+   - Identify features that lack any doc page or have a page but no corresponding content.
+   - Note when a structural adjustment would improve discoverability.
+   - When improving `docs/ref/*` pages, treat the corresponding docstrings/comments in `src/` as the source of truth. Prefer updating those code comments so regenerated reference docs stay correct, instead of hand-editing the generated pages.
+
+5. Detect gaps and inaccuracies
+   - **Missing**: features/configs present in main but absent in docs.
+   - **Incorrect/outdated**: names, defaults, or behaviors that diverge from main.
+   - **Structural issues** (optional): pages overloaded, missing overviews, or mis-grouped topics.
+
+6. Produce a Docs Sync Report and ask for approval
+   - Provide a clear report with evidence, suggested doc locations, and proposed edits.
+   - Ask the user whether to proceed with doc updates.
+
+7. If approved, apply changes (English only)
+   - Edit only English docs in `docs/**`.
+   - Do **not** edit `docs/ja`, `docs/ko`, or `docs/zh`.
+   - Keep changes aligned with the existing docs style and navigation.
+   - Update `mkdocs.yml` when adding or renaming pages.
+   - Build docs with `make build-docs` after edits to verify the docs site still builds.
+
+## Output format
+
+Use this template when reporting findings:
+
+Docs Sync Report
+
+- Doc-first findings
+  - Page + missing content -> evidence + suggested insertion point
+- Code-first gaps
+  - Feature + evidence -> suggested doc page/section (or missing page)
+- Incorrect or outdated docs
+  - Doc file + issue + correct info + evidence
+- Structural suggestions (optional)
+  - Proposed change + rationale
+- Proposed edits
+  - Doc file -> concise change summary
+- Questions for the user
+
+## References
+
+- `references/doc-coverage-checklist.md`
@@ -0,0 +1,56 @@
+# Doc Coverage Checklist
+
+Use this checklist to scan the selected scope (the full surface area when on `main`, otherwise the current-branch diff) and validate documentation coverage.
+
+## Feature inventory targets
+
+- Public exports: classes, functions, types, and module entry points.
+- Configuration options: `*Settings` types, default config objects, and builder patterns.
+- Environment variables or runtime flags.
+- CLI commands, scripts, and example entry points that define supported usage.
+- User-facing behaviors: retry, timeouts, streaming, errors, logging, telemetry, and data handling.
+- Deprecations, removals, or renamed settings.
+
+## Doc-first pass (page-by-page)
+
+- Review each relevant English page (excluding `docs/ja`, `docs/ko`, and `docs/zh`).
+- Look for missing opt-in flags, env vars, or customization options that the page implies.
+- Add new features that belong on that page based on user intent and navigation.
+
+## Code-first pass (feature inventory)
+
+- Map features to the closest existing page based on the docs navigation in `mkdocs.yml`.
+- Prefer updating existing pages over creating new ones unless the topic is clearly new.
+- Use conceptual pages for cross-cutting concerns (auth, errors, streaming, tracing, tools).
+- Keep quick-start flows minimal; move advanced details into deeper pages.
+
+## Evidence capture
+
+- Record the main-branch file path and symbol/setting name.
+- Note defaults or behavior-critical details for accuracy checks.
+- Avoid large code dumps; a short identifier is enough.
+
+## Red flags for outdated or incorrect docs
+
+- Option names/types no longer exist or differ from code.
+- Default values or allowed ranges do not match implementation.
+- Features removed in code but still documented.
+- New behaviors introduced without corresponding docs updates.
+
+## When to propose structural changes
+
+- A page mixes unrelated audiences (quick-start + deep reference) without clear separation.
+- Multiple pages duplicate the same concept without cross-links.
+- New feature areas have no obvious home in the nav structure.
+
+## Diff mode guidance (current branch vs main)
+
+- Focus only on changed behavior: new exports/options, modified defaults, removed features, or renamed settings.
+- Use `git diff main...HEAD` (or equivalent) to constrain analysis.
+- Document removals explicitly so docs can be pruned if needed.
+
+## Patch guidance
+
+- Keep edits scoped and aligned with existing tone and format.
+- Update cross-links when moving or renaming sections.
+- Leave translated docs untouched; English-only updates.
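Note on the English-only rule: the restriction to `docs/**` minus `docs/ja`, `docs/ko`, and `docs/zh` can be checked mechanically before any edit is applied. The following is a minimal sketch using a shell `case` pattern match on repository-relative paths. The helper name `is_editable_doc` is hypothetical and not part of this change.

```shell
# Hypothetical helper (not part of this change): succeed only for paths that
# docs-sync may edit, i.e. under docs/ but outside the translated trees.
is_editable_doc() {
  case "$1" in
    docs/ja/*|docs/ko/*|docs/zh/*) return 1 ;;  # translated docs: never touch
    docs/*) return 0 ;;                          # English docs: allowed
    *) return 1 ;;                               # anything outside docs/
  esac
}

# Example usage: list changed files that are safe to edit.
#   git diff --name-only main...HEAD | while read -r path; do
#     if is_editable_doc "${path}"; then echo "${path}"; fi
#   done
```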
@@ -0,0 +1,77 @@
+---
+name: examples-auto-run
+description: Run Python examples in auto mode with logging, rerun helpers, and background control.
+---
+
+# examples-auto-run
+
+## What it does
+
+- Runs `uv run examples/run_examples.py` with:
+  - `EXAMPLES_INTERACTIVE_MODE=auto` (auto-input/auto-approve).
+  - Per-example logs under `.tmp/examples-start-logs/`.
+  - Main summary log path passed via `--main-log` (also under `.tmp/examples-start-logs/`).
+  - A rerun list of failures written to `.tmp/examples-rerun.txt` when `--write-rerun` is set.
+- Provides start/stop/status/logs/tail/collect/rerun helpers via `run.sh`.
+- Background option keeps the process running with a pidfile; `stop` cleans it up.
+
+## Usage
+
+```bash
+# Start (auto mode; interactive included by default)
+.agents/skills/examples-auto-run/scripts/run.sh start [extra args to run_examples.py]
+# Examples:
+.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
+.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio
+
+# Check status
+.agents/skills/examples-auto-run/scripts/run.sh status
+
+# Stop running job
+.agents/skills/examples-auto-run/scripts/run.sh stop
+
+# List logs
+.agents/skills/examples-auto-run/scripts/run.sh logs
+
+# Tail latest log (or specify one)
+.agents/skills/examples-auto-run/scripts/run.sh tail
+.agents/skills/examples-auto-run/scripts/run.sh tail main_20260113-123000.log
+
+# Collect rerun list from a main log (defaults to latest main_*.log)
+.agents/skills/examples-auto-run/scripts/run.sh collect
+
+# Rerun only failed entries from rerun file (auto mode)
+.agents/skills/examples-auto-run/scripts/run.sh rerun
+```
+
+## Defaults (overridable via env)
+
+- `EXAMPLES_INTERACTIVE_MODE=auto`
+- `EXAMPLES_INCLUDE_INTERACTIVE=1`
+- `EXAMPLES_INCLUDE_SERVER=0`
+- `EXAMPLES_INCLUDE_AUDIO=0`
+- `EXAMPLES_INCLUDE_EXTERNAL=0`
+- Auto-approvals in auto mode: `APPLY_PATCH_AUTO_APPROVE=1`, `SHELL_AUTO_APPROVE=1`, `AUTO_APPROVE_MCP=1`
+
+## Log locations
+
+- Main logs: `.tmp/examples-start-logs/main_*.log`
+- Per-example logs (from `run_examples.py`): `.tmp/examples-start-logs/<module_path>.log`
+- Rerun list: `.tmp/examples-rerun.txt`
+- Stdout logs: `.tmp/examples-start-logs/stdout_*.log`
+
+## Notes
+
+- The runner delegates to `uv run examples/run_examples.py`, which already writes per-example logs and supports `--collect`, `--rerun-file`, and `--print-auto-skip`.
+- `start` uses `--write-rerun` so failures are captured automatically.
+- If `.tmp/examples-rerun.txt` exists and is non-empty, invoking the skill with no args runs `rerun` by default.
+
+## Behavioral validation (Codex/LLM responsibility)
+
+The runner does not perform any automated behavioral validation. After every foreground `start` or `rerun`, **Codex must manually validate** all exit-0 entries:
+
+1. Read the example source (and comments) to infer intended flow, tools used, and expected key outputs.
+2. Open the matching per-example log under `.tmp/examples-start-logs/`.
+3. Confirm the intended actions/results occurred; flag omissions or divergences.
+4. Do this for **all passed examples**, not just a sample.
+5. Report immediately after the run with concise citations to the exact log lines that justify the validation.
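Note on the no-argument default described under "Notes": choosing `rerun` when `.tmp/examples-rerun.txt` exists and is non-empty maps directly onto the shell `-s` file test. The following is a minimal sketch of that check. The helper name `default_command` is hypothetical, and falling back to `start` when the file is missing or empty is an assumption, since this change does not show what the runner does in that case.

```shell
# Hypothetical helper (not part of this change): pick the subcommand to run
# when the skill is invoked with no arguments.
default_command() {
  local rerun_file="${1:-.tmp/examples-rerun.txt}"
  # -s is true only when the file exists and its size is greater than zero.
  if [ -s "${rerun_file}" ]; then
    echo "rerun"
  else
    echo "start"  # assumption: fallback when there is nothing to rerun
  fi
}

# Example usage:
#   .agents/skills/examples-auto-run/scripts/run.sh "$(default_command)"
```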