diff --git a/.changeset/tricky-bats-pay.md b/.changeset/tricky-bats-pay.md
new file mode 100644
index 000000000..5eee2902b
--- /dev/null
+++ b/.changeset/tricky-bats-pay.md
@@ -0,0 +1,5 @@
+---
+"@browserbasehq/stagehand": patch
+---
+
+Update docs / logging to reflect gpt 5.4 and gemini 3.1 family compatibility with agent hybrid mode
diff --git a/claude.md b/claude.md
index fcffa95b4..c784f88a1 100644
--- a/claude.md
+++ b/claude.md
@@ -238,8 +238,9 @@ Hybrid mode uses both DOM-based and coordinate-based tools (act, click, type, dr
 
 **Recommended models for hybrid mode:**
 
-- `google/gemini-3-flash-preview`
-- `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-haiku-4-5-20251001`
+- `google/gemini-3-flash-preview`, `google/gemini-3.1-flash-lite-preview`, `google/gemini-3.1-pro-preview`
+- `openai/gpt-5.4`, `openai/gpt-5.4-mini`
+- Any `anthropic/claude-*` model
 
 ```typescript
 const stagehand = new Stagehand({
diff --git a/packages/core/lib/v3/handlers/v3AgentHandler.ts b/packages/core/lib/v3/handlers/v3AgentHandler.ts
index eda528df5..6ee953d0c 100644
--- a/packages/core/lib/v3/handlers/v3AgentHandler.ts
+++ b/packages/core/lib/v3/handlers/v3AgentHandler.ts
@@ -166,6 +166,8 @@ export class V3AgentHandler {
     if (
       this.mode === "hybrid" &&
       !baseModel.modelId.includes("gemini-3-flash") &&
+      !baseModel.modelId.includes("gemini-3.1") &&
+      !baseModel.modelId.includes("gpt-5.4") &&
       !baseModel.modelId.includes("claude")
     ) {
       this.logger({
diff --git a/packages/docs/v3/basics/agent.mdx b/packages/docs/v3/basics/agent.mdx
index f3274b110..350047149 100644
--- a/packages/docs/v3/basics/agent.mdx
+++ b/packages/docs/v3/basics/agent.mdx
@@ -137,8 +137,9 @@ Both DOM and CUA modes have their strengths and weaknesses. Hybrid mode combines
 **Model Requirements:** Hybrid mode requires models that can reliably perform coordinate-based actions from screenshots.
 The following models are recommended:
 
-- **Google:** `google/gemini-3-flash-preview`
-- **Anthropic:** `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-haiku-4-5-20251001`
+- **Google:** `google/gemini-3-flash-preview`, `google/gemini-3.1-flash-lite-preview`, `google/gemini-3.1-pro-preview`
+- **OpenAI:** `openai/gpt-5.4`, `openai/gpt-5.4-mini`
+- **Anthropic:** Any `anthropic/claude-*` model
 
 Other models may not reliably produce accurate coordinates for clicking and typing.
diff --git a/packages/docs/v3/references/agent.mdx b/packages/docs/v3/references/agent.mdx
index be702649f..65d3e9aa7 100644
--- a/packages/docs/v3/references/agent.mdx
+++ b/packages/docs/v3/references/agent.mdx
@@ -131,8 +131,9 @@ interface AgentInstance {
   **Hybrid Mode Model Requirements:** Only use hybrid mode with models that can reliably perform coordinate-based actions:
 
-  - **Google:** `google/gemini-3-flash-preview`
-  - **Anthropic:** `anthropic/claude-sonnet-4-20250514`, `anthropic/claude-sonnet-4-5-20250929`, `anthropic/claude-haiku-4-5-20251001`
+  - **Google:** `google/gemini-3-flash-preview`, `google/gemini-3.1-flash-lite-preview`, `google/gemini-3.1-pro-preview`
+  - **OpenAI:** `openai/gpt-5.4`, `openai/gpt-5.4-mini`
+  - **Anthropic:** Any `anthropic/claude-*` model
 
   Requires `experimental: true` in Stagehand constructor.
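
Reviewer note: the `v3AgentHandler.ts` hunk above extends a chain of substring checks on `baseModel.modelId`. A minimal standalone sketch of that condition follows, so the matching behavior can be checked against the documented model names. The names `HYBRID_MODEL_MARKERS`, `isHybridCompatible`, and `shouldLogHybridNotice` are hypothetical and do not appear in the PR; only the four substrings and the `mode === "hybrid"` test are taken from the diff. What the handler actually logs is not shown in the hunk and is not modeled here.

```typescript
// Substrings the updated check accepts, copied from the diff above.
const HYBRID_MODEL_MARKERS: readonly string[] = [
  "gemini-3-flash",
  "gemini-3.1",
  "gpt-5.4",
  "claude",
];

// Hypothetical helper: true when the model ID contains any accepted substring.
function isHybridCompatible(modelId: string): boolean {
  return HYBRID_MODEL_MARKERS.some((marker) => modelId.includes(marker));
}

// Hypothetical helper mirroring the `if (...)` condition in the hunk:
// hybrid mode is selected and none of the substrings match.
function shouldLogHybridNotice(mode: string, modelId: string): boolean {
  return mode === "hybrid" && !isHybridCompatible(modelId);
}
```

Under this sketch, every model ID listed in the updated docs passes the check, while an ID with none of the four substrings triggers the logging branch only when the mode is `"hybrid"`.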