Commit ee15098

Support Claude Opus 4.7+ adaptive thinking (#119)
Anthropic rejects manual extended thinking on Claude Opus 4.7 and 4.8 with HTTP 400. These models require adaptive thinking, where depth is controlled via a top-level `output_config.effort` parameter rather than the legacy `thinking.budget_tokens` field. The previous auto-detection had only a single thinking mode (manual) and matched any `claude-opus-4*` ID, so any request to Opus 4.7+ failed before reaching the model.

Replace the boolean `supports_thinking` switch with a `ThinkingMode` enum that distinguishes three cases:

- Adaptive: emits `thinking: {type: "adaptive"}` together with `output_config: {effort: "high"}`. Selected for `claude-opus-4-7`, `claude-opus-4-8`, and the `claude-opus-latest` alias.
- Manual: keeps the existing `{type: "enabled", budget_tokens: N}` payload. Selected for Sonnet 4.x, Claude 3.7 Sonnet, and Opus 4 through 4.6.
- None: unchanged for non-thinking models.

Users can still override the default effort level (or any other field) per model via the `config` block in `models.json`, which is shallow-merged into the request.

Add a unit test covering all three branches of the new detection, extend `models.example.json` with an Opus 4.7 entry, and document the new shape and override mechanism in the README with links to Anthropic's extended-thinking and effort documentation.
1 parent 98ebeba commit ee15098

3 files changed

Lines changed: 162 additions & 19 deletions

README.md

Lines changed: 23 additions & 0 deletions
@@ -184,6 +184,19 @@ The code-assistant uses two JSON configuration files to manage LLM providers and
 **`~/.config/code-assistant/models.json`** - Define available models:
 ```json
 {
+  "Claude Opus 4.7 (Adaptive Thinking)": {
+    "provider": "anthropic",
+    "id": "claude-opus-4-7",
+    "config": {
+      "max_tokens": 64000,
+      "thinking": {
+        "type": "adaptive"
+      },
+      "output_config": {
+        "effort": "high"
+      }
+    }
+  },
   "Claude Sonnet 4.5 (Thinking)": {
     "provider": "anthropic",
     "id": "claude-sonnet-4-5",
@@ -212,6 +225,16 @@ The code-assistant uses two JSON configuration files to manage LLM providers and
 }
 ```
 
+**Note on Claude Opus 4.7+ extended thinking**: Starting with Claude Opus 4.7, Anthropic
+no longer accepts the manual `thinking: { type: "enabled", budget_tokens: N }` form
+(it returns a 400 error). These models require *adaptive* thinking, where depth is
+controlled via the `output_config.effort` parameter (`low`, `medium`, `high`, `xhigh`,
+`max`). code-assistant detects Opus 4.7+ model IDs (`claude-opus-4-7`,
+`claude-opus-4-8`, `claude-opus-latest`) and emits the correct request shape by default.
+You can override the effort level (or any other field) via the model's `config` block,
+as shown in the example above. See Anthropic's [extended thinking](https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking)
+and [effort](https://docs.anthropic.com/en/docs/build-with-claude/effort) docs for details.
+
 **Environment Variable Substitution**: Use `${VAR_NAME}` in provider configs to reference environment variables for API keys.
 
 **Full Examples**: See [`providers.example.json`](providers.example.json) and [`models.example.json`](models.example.json) for complete configuration examples with all supported providers (Anthropic, OpenAI, Ollama, SAP AI Core, Vertex AI, Groq, Cerebras, MistralAI, OpenRouter).
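
The README note says the model's `config` block is shallow-merged into the request. The merge code itself is not part of this diff, so the following is only a sketch of what shallow merging would imply. The `Value` enum, `obj`, `shallow_merge`, and `adaptive_defaults` are hypothetical stand-ins (the client itself works with `serde_json::Value`), used here to keep the example dependency-free.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value. Hypothetical: the real client uses
/// `serde_json::Value`; this enum only keeps the sketch free of dependencies.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Num(i64),
    Obj(BTreeMap<String, Value>),
}

/// Build an object value from key/value pairs.
fn obj(pairs: &[(&str, Value)]) -> Value {
    Value::Obj(
        pairs
            .iter()
            .map(|(key, value)| (key.to_string(), value.clone()))
            .collect(),
    )
}

/// Shallow merge: each top-level key in `config` replaces the same key in
/// `request`. Nested objects are swapped wholesale, not merged field by field.
fn shallow_merge(request: &mut BTreeMap<String, Value>, config: &BTreeMap<String, Value>) {
    for (key, value) in config {
        request.insert(key.clone(), value.clone());
    }
}

/// Defaults the commit describes for an adaptive-thinking model.
fn adaptive_defaults() -> BTreeMap<String, Value> {
    let mut request = BTreeMap::new();
    request.insert("max_tokens".to_string(), Value::Num(64000));
    request.insert(
        "thinking".to_string(),
        obj(&[("type", Value::Str("adaptive".to_string()))]),
    );
    request.insert(
        "output_config".to_string(),
        obj(&[("effort", Value::Str("high".to_string()))]),
    );
    request
}
```

Under this assumption, a `config` that sets only `output_config` would replace that whole object and leave `thinking` and `max_tokens` at their defaults. Any sibling fields inside an overridden object would have to be repeated in the override.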

crates/llm/src/anthropic.rs

Lines changed: 124 additions & 19 deletions
@@ -565,17 +565,63 @@ pub struct AnthropicClient {
     custom_config: Option<serde_json::Value>,
 }
 
+/// Thinking strategy used for a given Claude model.
+///
+/// See: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum ThinkingMode {
+    /// Manual extended thinking via `thinking: {type: "enabled", budget_tokens: N}`.
+    /// Supported on Claude Sonnet 4.x, Claude 3.7 Sonnet, and Claude Opus 4.x up to 4.6.
+    Manual,
+    /// Adaptive thinking via `thinking: {type: "adaptive"}` combined with
+    /// `output_config: {effort: "..."}`. Required on Claude Opus 4.7 and later
+    /// (manual thinking returns a 400 error on these models).
+    Adaptive,
+    /// Model does not support extended thinking.
+    None,
+}
+
 impl AnthropicClient {
-    /// Substrings of model IDs that should enable thinking mode and higher limits
-    fn thinking_model_substrings() -> &'static [&'static str] {
+    /// Substrings of model IDs that require adaptive thinking.
+    ///
+    /// Manual extended thinking (`type: "enabled"` with `budget_tokens`) is rejected
+    /// with a 400 error on these models. They must use `type: "adaptive"` together
+    /// with the `output_config.effort` parameter.
+    fn adaptive_thinking_model_substrings() -> &'static [&'static str] {
+        &[
+            "claude-opus-4-7",
+            "claude-opus-4-8",
+            // Anthropic alias that currently points to the latest Opus release,
+            // which uses adaptive thinking. Some proxies expose this alias too.
+            "claude-opus-latest",
+        ]
+    }
+
+    /// Substrings of model IDs that support manual extended thinking.
+    fn manual_thinking_model_substrings() -> &'static [&'static str] {
         &["claude-sonnet-4", "claude-3-7-sonnet", "claude-opus-4"]
     }
 
-    /// Returns true if the current model should have thinking mode enabled
-    fn supports_thinking(&self) -> bool {
-        Self::thinking_model_substrings()
+    /// Returns the thinking mode that should be used for the current model.
+    fn thinking_mode(&self) -> ThinkingMode {
+        if Self::adaptive_thinking_model_substrings()
+            .iter()
+            .any(|substr| self.model.contains(substr))
+        {
+            return ThinkingMode::Adaptive;
+        }
+        if Self::manual_thinking_model_substrings()
             .iter()
             .any(|substr| self.model.contains(substr))
+        {
+            return ThinkingMode::Manual;
+        }
+        ThinkingMode::None
+    }
+
+    /// Returns true if the current model should have thinking mode enabled
+    fn supports_thinking(&self) -> bool {
+        !matches!(self.thinking_mode(), ThinkingMode::None)
     }
 
     pub fn default_base_url() -> String {
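
The order of the two checks in `thinking_mode` matters, because the manual substring `claude-opus-4` is itself contained in `claude-opus-4-7` and `claude-opus-4-8`. The standalone sketch below mirrors the detection from this hunk as free functions and contrasts it with a hypothetical manual-first variant. `matches_any` and `thinking_mode_manual_first` are names invented for this illustration and do not appear in the commit.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThinkingMode {
    Manual,
    Adaptive,
    None,
}

// Substring lists copied from the diff above.
const ADAPTIVE_SUBSTRINGS: &[&str] = &["claude-opus-4-7", "claude-opus-4-8", "claude-opus-latest"];
const MANUAL_SUBSTRINGS: &[&str] = &["claude-sonnet-4", "claude-3-7-sonnet", "claude-opus-4"];

/// True if `model` contains any of the given substrings.
fn matches_any(model: &str, substrings: &[&str]) -> bool {
    substrings.iter().any(|substr| model.contains(*substr))
}

/// Same ordering as the commit: adaptive list first, then manual.
fn thinking_mode(model: &str) -> ThinkingMode {
    if matches_any(model, ADAPTIVE_SUBSTRINGS) {
        ThinkingMode::Adaptive
    } else if matches_any(model, MANUAL_SUBSTRINGS) {
        ThinkingMode::Manual
    } else {
        ThinkingMode::None
    }
}

/// Hypothetical reversed ordering, shown only to illustrate the pitfall:
/// "claude-opus-4" matches first, so Opus 4.7 would be classified as Manual.
fn thinking_mode_manual_first(model: &str) -> ThinkingMode {
    if matches_any(model, MANUAL_SUBSTRINGS) {
        ThinkingMode::Manual
    } else if matches_any(model, ADAPTIVE_SUBSTRINGS) {
        ThinkingMode::Adaptive
    } else {
        ThinkingMode::None
    }
}
```

With these lists, `thinking_mode("claude-opus-4-7")` yields `Adaptive`, while the reversed variant yields `Manual`. By the same substring logic, an Opus ID that is not on the adaptive list (say a hypothetical `claude-opus-4-9`) would fall through to `Manual` until the list is extended.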
@@ -1311,16 +1357,11 @@ impl LLMProvider for AnthropicClient {
         });
 
         // Configure thinking mode and max_tokens based on model
-        let (thinking_config, max_tokens) = if self.supports_thinking() {
-            (
-                Some(ThinkingConfiguration {
-                    thinking_type: "enabled".to_string(),
-                    budget_tokens: 16000,
-                }),
-                64000,
-            )
+        let thinking_mode = self.thinking_mode();
+        let max_tokens = if matches!(thinking_mode, ThinkingMode::None) {
+            8192
         } else {
-            (None, 8192)
+            64000
         };
 
         // Convert messages using the message converter
@@ -1330,20 +1371,34 @@ impl LLMProvider for AnthropicClient {
         let mut anthropic_request = serde_json::json!({
             "model": self.model,
             "max_tokens": max_tokens,
-            "temperature": if thinking_config.is_some() {
+            "temperature": if matches!(thinking_mode, ThinkingMode::None) {
+                0.7
+            } else {
                 // Anthropic requires this to be 1.0 if you enable "thinking"
                 1.0
-            } else {
-                0.7
             },
             "system": system,
             "stream": streaming_callback.is_some(),
             "messages": messages_json,
         });
 
-        if let Some(thinking_config) = thinking_config {
-            anthropic_request["thinking"] = serde_json::to_value(thinking_config)?;
+        match thinking_mode {
+            ThinkingMode::Manual => {
+                anthropic_request["thinking"] = serde_json::to_value(ThinkingConfiguration {
+                    thinking_type: "enabled".to_string(),
+                    budget_tokens: 16000,
+                })?;
+            }
+            ThinkingMode::Adaptive => {
+                // Opus 4.7+ require adaptive thinking; depth is controlled via
+                // `output_config.effort`. Users can override either via the model's
+                // `config` block (shallow-merged below).
+                anthropic_request["thinking"] = serde_json::json!({ "type": "adaptive" });
+                anthropic_request["output_config"] = serde_json::json!({ "effort": "high" });
+            }
+            ThinkingMode::None => {}
         }
+
         if let Some(tool_choice) = tool_choice {
             anthropic_request["tool_choice"] = tool_choice;
         }
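
Taken together, the two hunks above give each thinking mode its own default `max_tokens`, `temperature`, and thinking payload. The sketch below collects those defaults in one place as a dependency-free summary. `RequestDefaults` and `request_defaults` are hypothetical names, and the JSON fragments are written as plain strings on the assumption that `ThinkingConfiguration` serializes to the `{type, budget_tokens}` shape the README describes.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ThinkingMode {
    Manual,
    Adaptive,
    None,
}

/// Per-mode request defaults as read from the diff. The JSON fields are
/// plain-string fragments here only to avoid a serde_json dependency.
#[derive(Debug, Clone, PartialEq)]
struct RequestDefaults {
    max_tokens: u32,
    temperature: f64,
    thinking: Option<&'static str>,
    output_config: Option<&'static str>,
}

fn request_defaults(mode: ThinkingMode) -> RequestDefaults {
    match mode {
        // Manual extended thinking: fixed token budget, no effort parameter.
        ThinkingMode::Manual => RequestDefaults {
            max_tokens: 64000,
            temperature: 1.0,
            thinking: Some(r#"{"type":"enabled","budget_tokens":16000}"#),
            output_config: None,
        },
        // Adaptive thinking: no budget; depth is set through `effort`.
        ThinkingMode::Adaptive => RequestDefaults {
            max_tokens: 64000,
            temperature: 1.0,
            thinking: Some(r#"{"type":"adaptive"}"#),
            output_config: Some(r#"{"effort":"high"}"#),
        },
        // Non-thinking models: lower output limit, neither field is sent.
        ThinkingMode::None => RequestDefaults {
            max_tokens: 8192,
            temperature: 0.7,
            thinking: None,
            output_config: None,
        },
    }
}
```

Both thinking modes share the 64000-token limit and the temperature of 1.0. Only the `thinking` payload and the presence of `output_config` differ between them.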
@@ -1938,4 +1993,54 @@ mod tests {
             panic!("Expected ToolResult content");
         }
     }
+
+    fn make_client(model: &str) -> AnthropicClient {
+        AnthropicClient::new(
+            "test-key".to_string(),
+            model.to_string(),
+            AnthropicClient::default_base_url(),
+        )
+    }
+
+    #[test]
+    fn test_thinking_mode_detection() {
+        // Adaptive-only models (Opus 4.7+).
+        for id in [
+            "claude-opus-4-7",
+            "claude-opus-4-8",
+            "claude-opus-latest",
+            "vendor-prefix/claude-opus-4-7",
+        ] {
+            assert_eq!(
+                make_client(id).thinking_mode(),
+                ThinkingMode::Adaptive,
+                "expected adaptive for {id}",
+            );
+        }
+
+        // Manual extended thinking models.
+        for id in [
+            "claude-sonnet-4-6",
+            "claude-sonnet-4-5",
+            "claude-3-7-sonnet",
+            "claude-opus-4",
+            "claude-opus-4-5",
+            "claude-opus-4-6",
+        ] {
+            assert_eq!(
+                make_client(id).thinking_mode(),
+                ThinkingMode::Manual,
+                "expected manual for {id}",
+            );
+        }
+
+        // Models that don't support extended thinking.
+        for id in ["claude-3-5-sonnet", "claude-haiku-4-5", "gpt-4o"] {
+            assert_eq!(
+                make_client(id).thinking_mode(),
+                ThinkingMode::None,
+                "expected none for {id}",
+            );
+        }
+    }
 }

models.example.json

Lines changed: 15 additions & 0 deletions
@@ -1,4 +1,19 @@
 {
+  "Claude Opus 4.7 (Adaptive Thinking)": {
+    "provider": "anthropic-main",
+    "id": "claude-opus-4-7",
+    "context_token_limit": 200000,
+    "config": {
+      "max_tokens": 64000,
+      "temperature": 1.0,
+      "thinking": {
+        "type": "adaptive"
+      },
+      "output_config": {
+        "effort": "high"
+      }
+    }
+  },
   "Claude Sonnet 4.6 (Thinking)": {
     "provider": "anthropic-main",
     "id": "claude-sonnet-4-6",