feat(metrics): Add Prometheus metrics integration for agent monitoring #2855
Conversation
Thanks for sharing this, and I like the idea of using the hooks this way. We don't plan to include this in the core SDK, but please feel free to share it as your own example and/or package for other developers!
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 084a14d86b
```python
async def on_tool_start(
    self,
    context: RunContextWrapper[Any],
    agent: Agent[Any],
    tool_name: str,
    input_data: dict[str, Any],
```
Match on_tool_start signature to RunHooks
RunHooks.on_tool_start is invoked with (context, agent, tool), but this override requires an extra input_data argument, so any run that uses MetricsHooks and executes a tool will raise a TypeError before the tool call completes. This makes metrics hooks unsafe for agents that use tools.
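The suggested fix can be sketched as follows. This is a self-contained illustration, assuming simplified stand-ins for the SDK's `RunContextWrapper`, `Agent`, `Tool`, and `RunHooks` types: the override takes `(context, agent, tool)`, matching how the runner invokes `RunHooks.on_tool_start`, with no extra `input_data` parameter.

```python
import asyncio

class RunContextWrapper:  # stub for the SDK's context wrapper
    pass

class Agent:  # stub agent carrying only a name
    def __init__(self, name):
        self.name = name

class Tool:  # stub tool; the real SDK tool object also exposes .name
    def __init__(self, name):
        self.name = name

class RunHooks:  # stub base class mirroring the runner's call shape
    async def on_tool_start(self, context, agent, tool):
        pass

class MetricsHooks(RunHooks):
    """Counts tool invocations; the signature matches the base class."""
    def __init__(self):
        self.tool_starts = {}

    async def on_tool_start(self, context, agent, tool):
        # Read the name from the tool object instead of a separate argument.
        self.tool_starts[tool.name] = self.tool_starts.get(tool.name, 0) + 1

hooks = MetricsHooks()
asyncio.run(hooks.on_tool_start(RunContextWrapper(), Agent("demo"), Tool("calculator")))
print(hooks.tool_starts)  # {'calculator': 1}
```

Because the override now accepts exactly the arguments the runner passes, tool-using runs no longer raise a `TypeError`.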
```python
async def on_start(
    self,
    context: RunContextWrapper[Any],
    agent: Agent[Any],
) -> None:
```
Rename lifecycle handlers to actual RunHooks callbacks
These handlers are declared as on_start/on_end (and on_error below), but the runner dispatches RunHooks via on_agent_start/on_agent_end and has no on_error callback. As a result, run start/end/error metrics are never emitted through MetricsHooks, so the integration silently misses core counters and durations.
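A sketch of the rename, again using simplified stand-ins for the SDK types: the runner dispatches `RunHooks` via `on_agent_start`/`on_agent_end`, and there is no `on_error` callback to override, so error metrics need to be recorded elsewhere (the metric calls in the comments are illustrative).

```python
import asyncio

class RunHooks:  # stub base class mirroring the runner's callback names
    async def on_agent_start(self, context, agent):
        pass

    async def on_agent_end(self, context, agent, output):
        pass

class MetricsHooks(RunHooks):
    def __init__(self):
        self.events = []

    async def on_agent_start(self, context, agent):
        self.events.append("start")  # e.g. increment a runs counter here

    async def on_agent_end(self, context, agent, output):
        self.events.append("end")  # e.g. observe a run-duration histogram here

async def demo():
    hooks = MetricsHooks()
    await hooks.on_agent_start(None, None)
    await hooks.on_agent_end(None, None, "final output")
    return hooks.events

events = asyncio.run(demo())
print(events)  # ['start', 'end']
```

With the handlers named to match the base class, start/end metrics are actually emitted instead of silently skipped.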
```python
result = await Runner.run(
    agent,
    f"Solve this math problem: {problem}",
    hooks=[hooks],
```
Pass a RunHooks instance instead of a list in example
Runner.run expects hooks to be a single RunHooks object, not a list, so passing hooks=[hooks] will fail when the runner tries to call hook methods on the list. In this example, /solve and /chat requests will return errors instead of running the agent.
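A runnable sketch of why `hooks=[hooks]` breaks, using a simplified stand-in `Runner` (not the SDK's): the runner invokes hook methods directly on the object it receives, and a plain list has no such methods.

```python
import asyncio

class MetricsHooks:
    async def on_agent_start(self, context, agent):
        pass

class Runner:  # stub runner that dispatches to the hooks object it is given
    @staticmethod
    async def run(agent, prompt, hooks=None):
        await hooks.on_agent_start(None, agent)  # AttributeError if hooks is a list
        return "ok"

async def demo():
    hooks = MetricsHooks()
    try:
        await Runner.run("agent", "2 + 2", hooks=[hooks])  # wrong: a list
    except AttributeError:
        print("passing a list fails")
    return await Runner.run("agent", "2 + 2", hooks=hooks)  # correct: the instance

result = asyncio.run(demo())
print(result)  # ok
```

So the example should pass the single `RunHooks` instance, `hooks=hooks`, rather than wrapping it in a list.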
OK, I will do that. Thank you for your quick response :)
Summary
This PR adds Prometheus metrics integration to the OpenAI Agents SDK, enabling users to monitor agent performance in production environments.
Changes
- `prometheus-client>=0.21.0` added to `pyproject.toml`

Metrics Exposed

- `agents_llm_latency_seconds`
- `agents_tokens_total`
- `agents_errors_total`
- `agents_runs_total`
- `agents_run_duration_seconds`
- `agents_turns_total`
- `agents_tool_executions_total`
- `agents_tool_latency_seconds`

Usage
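A hypothetical sketch of how two of the listed metrics might be declared and exercised with `prometheus-client`. The metric names come from the PR description; the label names, help strings, and the use of a dedicated `CollectorRegistry` are assumptions for illustration.

```python
from prometheus_client import CollectorRegistry, Counter, Histogram

registry = CollectorRegistry()

# Counter for completed agent runs, labeled by agent name (label is assumed).
agents_runs_total = Counter(
    "agents_runs_total",
    "Total number of agent runs",
    ["agent"],
    registry=registry,
)

# Histogram for end-to-end run duration in seconds.
agents_run_duration_seconds = Histogram(
    "agents_run_duration_seconds",
    "Duration of agent runs in seconds",
    ["agent"],
    registry=registry,
)

# Record one simulated run.
agents_runs_total.labels(agent="math_tutor").inc()
agents_run_duration_seconds.labels(agent="math_tutor").observe(0.42)

runs = registry.get_sample_value("agents_runs_total", {"agent": "math_tutor"})
print(runs)  # 1.0
```

In a real deployment these metrics would be scraped via `prometheus_client.start_http_server` or an equivalent exposition endpoint.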