[Claimed #1912] Feat: Add Anthropic CUA adaptive thinking by github-actions[bot] · Pull Request #1954 · browserbase/stagehand

github-actions · 2026-04-03T23:29:31Z

Mirrored from external contributor PR #1912 after approval by @miguelg719.

Original author: @chromiebot
Original PR: #1912
Approved source head SHA: 0ea0332c525017c727743d11a68b8eb74f76b646

@chromiebot, please continue any follow-up discussion on this mirrored PR. When the external PR gets new commits, this same internal PR will be marked stale until the latest external commit is approved and refreshed here.

Original description

why

what changed

test plan

Summary by cubic

Add adaptive thinking for Anthropic Claude 4.6 models in the CUA client with effort controls and automatic temperature=1. Keeps legacy thinkingBudget for older models and improves model detection and tool versioning.

New Features
- Detects 4.6 models (e.g., claude-opus-4-6*, claude-sonnet-4-6*, incl. provider-prefixed) and sends thinking: { type: "adaptive" } with output_config.effort (defaults to "medium").
- Adds ThinkingEffort and thinkingEffort in ClientOptions ("none" | "low" | "medium" | "high" | "max"); "none" disables adaptive thinking.
- Forces temperature: 1 with adaptive thinking and logs when overriding a user value; logs when thinkingBudget is provided on 4.6 models.
- Uses computer_20251124 for 4.6 and claude-opus-4-5-20251101; older models continue with thinking: { type: "enabled", budget_tokens }.

Migration
- For 4.6 models, set thinkingEffort; thinkingBudget is ignored.
- Adaptive thinking forces temperature: 1; set thinkingEffort: "none" to disable.
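
The behaviour summarized above can be sketched roughly as follows. This is an illustrative sketch, not the code from the PR: every function and interface name here is hypothetical, the provider prefix is assumed to be separated by a slash (for example "anthropic/claude-opus-4-6"), and leaving the user's temperature untouched when thinkingEffort is "none" is an assumption drawn from the migration notes. Only the payload fields (thinking, output_config.effort, budget_tokens, temperature) come from the summary.

```typescript
// Hypothetical helper names; only the payload field names come from the PR summary.
type ThinkingEffort = "none" | "low" | "medium" | "high" | "max";

interface ThinkingOptions {
  thinkingEffort?: ThinkingEffort;
  thinkingBudget?: number;
  temperature?: number;
}

interface ThinkingParams {
  thinking?: { type: "adaptive" } | { type: "enabled"; budget_tokens: number };
  output_config?: { effort: Exclude<ThinkingEffort, "none"> };
  temperature?: number;
}

// Assumption: a provider prefix, if present, ends with "/".
function baseModelName(model: string): string {
  const slash = model.lastIndexOf("/");
  return slash === -1 ? model : model.slice(slash + 1);
}

// Matches claude-opus-4-6* and claude-sonnet-4-6* after stripping the prefix.
function isAdaptiveThinkingModel(model: string): boolean {
  return /^claude-(opus|sonnet)-4-6/.test(baseModelName(model));
}

// Per the summary, computer_20251124 applies to 4.6 models and claude-opus-4-5-20251101.
function usesComputer20251124(model: string): boolean {
  return (
    isAdaptiveThinkingModel(model) ||
    baseModelName(model) === "claude-opus-4-5-20251101"
  );
}

function buildThinkingParams(
  model: string,
  opts: ThinkingOptions = {},
): ThinkingParams {
  if (isAdaptiveThinkingModel(model)) {
    const effort = opts.thinkingEffort ?? "medium";
    if (effort === "none") {
      // Assumption: with adaptive thinking disabled, temperature is passed through.
      return opts.temperature === undefined
        ? {}
        : { temperature: opts.temperature };
    }
    // thinkingBudget is ignored for 4.6 models and temperature is forced to 1.
    return {
      thinking: { type: "adaptive" },
      output_config: { effort },
      temperature: 1,
    };
  }

  // Older models keep the legacy budget-based contract.
  const params: ThinkingParams = {};
  if (opts.temperature !== undefined) params.temperature = opts.temperature;
  if (opts.thinkingBudget !== undefined) {
    params.thinking = { type: "enabled", budget_tokens: opts.thinkingBudget };
  }
  return params;
}
```

Under these assumptions, calling buildThinkingParams("anthropic/claude-opus-4-6", { thinkingBudget: 2048 }) yields an adaptive payload with effort "medium" and temperature 1, and the budget is dropped.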

Written for commit ebe95e7. Summary will update on new commits.

Add comprehensive tests for the new adaptive thinking API used by Claude 4.6 models (claude-opus-4-6, claude-sonnet-4-6).

Tests verify:
- Adaptive thinking uses thinking.type: 'adaptive' (not 'enabled')
- Effort levels are passed via output_config.effort (not budget_tokens)
- All effort levels: low, medium, high, max
- Older models continue using deprecated budget_tokens API
- Model name detection handles provider-prefixed names

These tests define the expected API contract per: https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Update AnthropicCUAClient to use the correct API contract for Claude 4.6 models (claude-opus-4-6, claude-sonnet-4-6).

Claude 4.6 models use adaptive thinking:
- thinking: { type: "adaptive" }
- output_config: { effort: "low" | "medium" | "high" | "max" }

This replaces the deprecated API:
- thinking: { type: "enabled", budget_tokens: N }

Changes:
- Add ThinkingEffort type for effort levels
- Add thinkingEffort option to ClientOptions
- Detect 4.6 models and use adaptive thinking with output_config
- Keep backward compatibility with thinkingBudget for older models
- Add deprecation notice for thinkingBudget on 4.6 models

The implementation follows the API contract documented at: https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add test for default "medium" effort when thinkingEffort not set
- Add tests verifying temperature=1 is set for adaptive thinking
- Add test that older models don't force temperature=1

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Always set temperature=1 when adaptive thinking is enabled (required by API)
- Default to "medium" effort for Claude 4.6 models when thinkingEffort not set
- This ensures adaptive thinking works out of the box for 4.6 models

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
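
The temperature override described in this commit, together with the logging mentioned in the summary, might look something like the sketch below. The function name, the logger signature, and the message wording are all hypothetical and are not taken from the PR's diff; the only behaviour drawn from the source is that adaptive thinking forces temperature to 1 and logs when a user value is overridden.

```typescript
// Hypothetical logger type; the real client's logging interface is not shown in the source.
type LogFn = (message: string) => void;

// Returns the temperature to send, warning when a user-supplied value is replaced.
function resolveTemperature(
  adaptiveThinking: boolean,
  userTemperature: number | undefined,
  log: LogFn,
): number | undefined {
  if (!adaptiveThinking) {
    // Older models: pass the user's value through unchanged (may be undefined).
    return userTemperature;
  }
  if (userTemperature !== undefined && userTemperature !== 1) {
    log(
      `Adaptive thinking requires temperature=1; overriding user value ${userTemperature}`,
    );
  }
  return 1;
}
```

Keeping the user's original value separate from the value actually sent is what makes the warning possible, which appears to be the purpose of the userTemperature field tracked in the diff.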

github-actions · 2026-04-03T23:29:35Z

This mirrored PR has been merged into main. The original external PR #1912 is now completed.

changeset-bot · 2026-04-03T23:29:39Z

🦋 Changeset detected

Latest commit: ebe95e7

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages

Name	Type
@browserbasehq/stagehand	Patch
@browserbasehq/stagehand-evals	Patch
@browserbasehq/stagehand-server-v3	Patch
@browserbasehq/stagehand-server-v4	Patch


cubic-dev-ai

1 issue found across 3 files

Confidence score: 5/5

This looks low risk to merge: the only reported issue is a documentation/behavior mismatch with low severity (3/10), not a functional break in core logic.
In packages/core/lib/v3/types/public/model.ts, docs say ThinkingEffort defaults to high, while runtime behavior uses medium when thinkingEffort is unset, which could mislead integrators about default model behavior.
Pay close attention to packages/core/lib/v3/types/public/model.ts - align the ThinkingEffort default in docs with the actual adaptive-thinking fallback (medium).
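
One way to keep the documented default and the runtime fallback from drifting apart is to define the default once and reference it from both places. The sketch below assumes this approach; the constant and function names are hypothetical and the doc comment wording is not the text in model.ts.

```typescript
/**
 * Effort level for adaptive thinking on Claude 4.6 models.
 * "none" disables adaptive thinking.
 * Defaults to "medium" when thinkingEffort is unset.
 */
type ThinkingEffort = "none" | "low" | "medium" | "high" | "max";

// Hypothetical single source of truth for the fallback named in the doc comment.
const DEFAULT_THINKING_EFFORT: ThinkingEffort = "medium";

// Hypothetical helper: resolves the effort that would actually be sent.
function effectiveEffort(effort?: ThinkingEffort): ThinkingEffort {
  return effort ?? DEFAULT_THINKING_EFFORT;
}
```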

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/types/public/model.ts">

<violation number="1" location="packages/core/lib/v3/types/public/model.ts:101">
P3: The new `ThinkingEffort` docs claim `high` is the default, but the implementation defaults adaptive thinking to `medium` when `thinkingEffort` is unset.</violation>
</file>

Architecture diagram

sequenceDiagram
    participant App as Application Logic
    participant Client as AnthropicCUAClient
    participant SDK as Anthropic SDK (Beta)
    participant API as Anthropic API

    Note over App,API: Runtime Flow for Adaptive Thinking

    App->>Client: getAction(messages)
    
    Client->>Client: Detect model version (4.6 vs older)

    alt NEW: Model is Claude 4.6 (Opus or Sonnet)
        Client->>Client: NEW: Set thinking.type = "adaptive"
        Client->>Client: NEW: Set output_config.effort = thinkingEffort (default: "medium")
        Client->>Client: NEW: Force temperature = 1
        Note right of Client: Required for adaptive thinking mode
    else CHANGED: Older Claude models (e.g., 4.5)
        opt thinkingBudget provided
            Client->>Client: CHANGED: Set thinking.type = "enabled"
            Client->>Client: CHANGED: Set budget_tokens = thinkingBudget
        end
    end

    Client->>SDK: beta.messages.create({ model, messages, thinking, ... })
    
    Note over SDK,API: Uses computer_20251124 header
    
    SDK->>API: POST /v1/messages
    API-->>SDK: Response (with thinking blocks)
    SDK-->>Client: Message Object
    Client-->>App: Action Result


filip-michalsky · 2026-04-21T23:57:52Z

+    }
+
+    // Track user-specified temperature so we can warn if adaptive thinking overrides it
+    this.userTemperature = clientOptions?.temperature;


will this not clash with Opus 4.7 not supporting temp?

https://platform.claude.com/docs/en/api/cli/messages/create#create.temperature

pirate

needs a test for claude-opus-4-7 to make sure temperature/adaptive thinking doesn't break

miguelg719 · 2026-04-22T06:38:36Z

@filip-michalsky @pirate this PR doesn't add support for Opus 4.7 on CUA. Regardless, the default temperature of 1.0 is still supported but deprecated per the documentation here. Any non-default value will throw a 400, but we should scope removing the temperature parameter as its own PR.

filip-michalsky · 2026-04-22T16:49:16Z

sounds good

requested changes are out of scope for this PR

@shrey150

This PR was opened by the [Changesets release](https://github.com/changesets/action) GitHub action. When you're ready to do a release, you can merge this and the packages will be published to npm automatically. If you're not ready to do a release yet, that's fine, whenever you add more changesets to main, this PR will be updated.

# Releases

## @browserbasehq/stagehand@3.3.0

### Minor Changes

- [#1980](#1980) [`e471d2e`](e471d2e) Thanks [@shrey150](https://github.com/shrey150)! - Support Browserbase verified session settings and bump the Browserbase SDK.

### Patch Changes

- [#1954](#1954) [`732b384`](732b384) Thanks [@github-actions](https://github.com/apps/github-actions)! - Update Anthropic CUA to use adaptive thinking
- [#2001](#2001) [`20b601d`](20b601d) Thanks [@shrey150](https://github.com/shrey150)! - Include `agent.execute()` usage in `stagehand.metrics` for API-backed sessions.
- [#1983](#1983) [`8543c11`](8543c11) Thanks [@github-actions](https://github.com/apps/github-actions)! - Add variable substitution to the keys tool in both live execution and cache replay paths. When keys steps with `method="type"` contain `%variableName%` tokens, they are now resolved against the provided variables. This brings the keys tool to parity with the type tool's variable handling.
- [#1973](#1973) [`14b64ec`](14b64ec) Thanks [@monadoid](https://github.com/monadoid)! - Enable strict structured outputs for supported model paths.
- [#2028](#2028) [`a500de1`](a500de1) Thanks [@tkattkat](https://github.com/tkattkat)! - Remove deprecated provider option
- [#1975](#1975) [`8f7192c`](8f7192c) Thanks [@seanmcguire12](https://github.com/seanmcguire12)! - make file upload elements more explicit in page snapshot

## @browserbasehq/stagehand-docs@1.0.1

### Patch Changes

- [#2017](#2017) [`6b9b46d`](6b9b46d) Thanks [@monadoid](https://github.com/monadoid)! - Document the optional MCP `start` `sessionId` parameter for attaching to an existing Browserbase session.

## @browserbasehq/stagehand-evals@1.1.11

### Patch Changes

- Updated dependencies \[[`732b384`](732b384), [`20b601d`](20b601d), [`8543c11`](8543c11), [`14b64ec`](14b64ec), [`a500de1`](a500de1), [`e471d2e`](e471d2e), [`8f7192c`](8f7192c)]:
  - @browserbasehq/stagehand@3.3.0

## @browserbasehq/stagehand-server-v3@3.6.3

### Patch Changes

- Updated dependencies \[[`732b384`](732b384), [`20b601d`](20b601d), [`8543c11`](8543c11), [`14b64ec`](14b64ec), [`a500de1`](a500de1), [`e471d2e`](e471d2e), [`8f7192c`](8f7192c)]:
  - @browserbasehq/stagehand@3.3.0

## @browserbasehq/stagehand-server-v4@3.6.3

### Patch Changes

- Updated dependencies \[[`732b384`](732b384), [`20b601d`](20b601d), [`8543c11`](8543c11), [`14b64ec`](14b64ec), [`a500de1`](a500de1), [`e471d2e`](e471d2e), [`8f7192c`](8f7192c)]:
  - @browserbasehq/stagehand@3.3.0

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Chromie Bot and others added 4 commits March 30, 2026 06:20

github-actions Bot assigned miguelg719 Apr 3, 2026

github-actions Bot added external-contributor Tracks PRs mirrored from external contributor forks. external-contributor:mirrored An internal mirrored PR currently exists for this external contributor PR. labels Apr 3, 2026

github-actions Bot mentioned this pull request Apr 3, 2026

Feat: Add Anthropic CUA adaptive thinking #1912

Closed

cubic-dev-ai Bot reviewed Apr 3, 2026

Comment thread packages/core/lib/v3/types/public/model.ts Outdated

miguelg719 added 4 commits April 6, 2026 16:37

override logging and fixes

52fbfec

changeset

61946d2

update model check

ee9683d

remove outdated tests

ebe95e7

filip-michalsky reviewed Apr 21, 2026

pirate previously requested changes Apr 22, 2026

miguelg719 requested review from filip-michalsky and pirate April 22, 2026 06:41

filip-michalsky approved these changes Apr 22, 2026

miguelg719 merged commit 732b384 into main Apr 22, 2026
207 checks passed

github-actions Bot added external-contributor:completed The mirrored PR has been merged and the external contributor flow is complete. and removed external-contributor:mirrored An internal mirrored PR currently exists for this external contributor PR. labels Apr 22, 2026

This was referenced Apr 22, 2026

Version Packages #1979

Merged

Version Packages qsobad/stagehand#1

Open

Version Packages edisplay/stagehand#5

Open

BABTUNA mentioned this pull request Apr 24, 2026

core(cua): implement addContextNote parity for Google/Anthropic/Microsoft clients #2037

Open

This was referenced Apr 24, 2026

Version Packages azaj01/stagehand#1

Open

Version Packages nxtreaming/stagehand#3

Open

miguelg719 merged 8 commits into main from external-contributor-pr-1912

3 participants
