feat(m3): smolagents + claude_agent_sdk bridge for CC subscription #13
Open
suzuke wants to merge 2 commits into
…3 PR 19)
Adds `provider: "claude-subscription"` to SmolagentsConfig. When set,
the smolagents backend drives Claude via `claude_agent_sdk` (OAuth
from `~/.claude/credentials.json`) instead of LiteLLM + API key. No
API token burn.
**ACL invariant — the load-bearing design pin** (reviewer round 1 Q3):
`claude_agent_sdk.query()` is NOT a token completion API — it's a
complete agent product that runs its own loop with its own tools.
Naive use would re-create the §3.3 agent-loop-in-agent-loop problem
(same as cli-subscription) and silently void smolagents' ACL.
Fix: configure SDK as a degenerate single-turn text generator —
- `allowed_tools=[]`
- `disallowed_tools=[Read, Edit, Write, Glob, Grep, Bash, ...]` (exhaustive)
- `max_turns=1`
- `can_use_tool=lambda *a, **kw: {"behavior": "deny"}` (defense in depth)
This forces the SDK to return a single text response with NO internal
tool execution. smolagents parses the text for tool calls and dispatches
via its OWN tools — where `CheatResistancePolicy` ACL fires.
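A minimal sketch of the four settings above, written as a plain dict so it runs without the SDK installed. The helper names `deny_all_tools` and `build_single_turn_options` are hypothetical, the tool tuple holds only the six names spelled out above (the real exhaustive list is longer), and how the mapping is handed to `claude_agent_sdk` is not shown here.

```python
from typing import Any, Dict

# Only the tool names spelled out above; the real exhaustive list is longer.
NAMED_DISALLOWED_TOOLS = ("Read", "Edit", "Write", "Glob", "Grep", "Bash")


def deny_all_tools(*args: Any, **kwargs: Any) -> Dict[str, str]:
    """Defense in depth: refuse every tool request, whatever the arguments."""
    return {"behavior": "deny"}


def build_single_turn_options() -> Dict[str, Any]:
    """Collect the four invariant settings in one place (hypothetical helper)."""
    return {
        "allowed_tools": [],  # nothing is pre-approved
        "disallowed_tools": list(NAMED_DISALLOWED_TOOLS),
        "max_turns": 1,  # one response, no internal agent loop
        "can_use_tool": deny_all_tools,
    }
```

Keeping all four settings in one builder means a single call site can be asserted on, rather than chasing options scattered across the module.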
The invariant is locked in by `test_sdk_is_invoked_with_no_internal_tools`.
If a future SDK update adds a default tool that bypasses
`disallowed_tools`, that test must trip — re-verify against
`claude_agent_sdk.__version__` before merge.
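The capture-and-assert pattern behind such an invariant test can be sketched as follows. This is an illustration only, not the real `test_sdk_is_invoked_with_no_internal_tools`: `call_sdk` and `check_invariant` are hypothetical names, the SDK module is replaced by a `SimpleNamespace` stand-in, and passing the options as a dict under an `options` keyword is an assumption.

```python
import types
from unittest import mock


def _deny(*args, **kwargs):
    """Deny-all callback, mirroring the defense-in-depth setting."""
    return {"behavior": "deny"}


def call_sdk(sdk, prompt):
    """Hypothetical call site: every request carries the locked-down options."""
    options = {
        "allowed_tools": [],
        # Only the names the commit message spells out; the real list is longer.
        "disallowed_tools": ["Read", "Edit", "Write", "Glob", "Grep", "Bash"],
        "max_turns": 1,
        "can_use_tool": _deny,
    }
    return sdk.query(prompt=prompt, options=options)


def check_invariant():
    """Patch `query`, capture the options it receives, assert all four properties."""
    sdk = types.SimpleNamespace(query=lambda **kw: None)  # stand-in for the SDK module
    with mock.patch.object(sdk, "query") as fake_query:
        call_sdk(sdk, "hello")
    options = fake_query.call_args.kwargs["options"]
    assert options["allowed_tools"] == []
    assert {"Read", "Edit", "Write", "Glob", "Grep", "Bash"} <= set(options["disallowed_tools"])
    assert options["max_turns"] == 1
    assert options["can_use_tool"]("Bash", {}) == {"behavior": "deny"}
    return options
```

Asserting on the captured argument, rather than on the builder's return value, is what catches a call site that quietly stops passing the options.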
**Reviewer round 1 Q1-Q7 + 4 didn't-ask items**:
- Q1 flat module placement (`smolagents_claude_sdk_model.py`) ✓
- Q2 provider name `claude-subscription` (discoverable) ✓
- Q3 ACL invariant locked + invariant test (this commit's headline)
- Q4 marketing wording: medium framing + explicit ACL invariant clause
- Q5 no separate compliance gate (smolagents tool surface is the gate)
- Q6 asyncio.run() with loud failure inside running loop
- Q7 transitional shim — module docstring marks remove-when condition
- Cost framing: comment in module docstring noting `total_cost_usd` is an API-equivalent estimate, not the actual subscription bill
- Auth-failure UX: `ClaudeAgentSDKAuthError` with "run claude login" hint
- Default opt-in: `provider="anthropic"` default unchanged; users must explicitly set `claude-subscription`
- SDK version pin: `claude-agent-sdk` is already a base dep
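The Q6 behaviour (`asyncio.run()` with a loud failure inside a running loop) could look roughly like this. A sketch under assumptions: `run_blocking` is a hypothetical name and the error text is illustrative, not the module's actual message.

```python
import asyncio
from typing import Any, Callable, Coroutine


def run_blocking(make_coro: Callable[[], Coroutine[Any, Any, Any]]) -> Any:
    """Run an async call from sync code, failing loudly if a loop is already running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(make_coro())
    # A loop is running: asyncio.run() cannot nest, so raise a clear error
    # instead of deadlocking or silently doing nothing.
    raise RuntimeError(
        "synchronous generate() called from inside a running event loop; "
        "call it from a plain thread instead"
    )
```

Taking a factory rather than a coroutine object means the failure path never creates a coroutine that would go un-awaited.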
**Tests** (12 new in `test_smolagents_claude_sdk_model.py`):
- ACL invariant (THE critical test)
- can_use_tool deny callback
- _SDK_DISALLOWED_TOOLS exhaustive set
- Message format conversion (dict + list-content)
- Stream draining
- generate() returns ChatMessage with text
- Auth failure → ClaudeAgentSDKAuthError
- OSError variants → AuthError
- asyncio.run() loud failure in running loop
- SmolagentsBackend dispatches to ClaudeAgentSDKModel for "claude-subscription"
- SmolagentsBackend keeps LiteLLMModel for other providers (back-compat)
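The last two tests pin the dispatch rule. A sketch of that rule, with stand-in classes so it runs without smolagents or the SDK; `select_model_class` is a hypothetical helper, and the real `SmolagentsBackend` may structure this differently.

```python
class LiteLLMModel:
    """Stand-in for the existing LiteLLM + API key model."""


class ClaudeAgentSDKModel:
    """Stand-in for the new OAuth-backed model."""


def select_model_class(provider: str) -> type:
    """Only the explicit opt-in value switches models; everything else is unchanged."""
    if provider == "claude-subscription":
        return ClaudeAgentSDKModel
    return LiteLLMModel
```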
Stats: 4 files changed, +459 / -16 LOC. 12 new tests. Full suite
2774 passed + 1 pre-existing failure + 4 skipped, 0 regressions.
**ToS notice**: Anthropic has not publicly endorsed `claude_agent_sdk`
use outside their first-party CC + Claude Code Skills products.
Module docstring + future user doc must say so. Operators should
review their CC ToS before relying on this in production.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer round 2 was VERIFIED with 2 non-blocking issues. Both folded in.
**R2 #1 (suggested before merge): typed auth classification**
`ClaudeAgentSDKAuthError` was mapping to `AgentErrorType.AUTH` only because the error message happened to contain "api key" (in the "switch to provider: anthropic with an explicit API key" sentence). String-match coincidence; rewording the message would silently break the classification.
New `_classify_error_typed(exc)` does an isinstance check on `ClaudeAgentSDKAuthError` BEFORE falling through to the string-match path. Generic exceptions still go through `_classify_error` for backward compat.
End-to-end regression test added: `test_auth_error_classification_end_to_end` constructs a real SmolagentsBackend, simulates the SDK raising ClaudeAgentSDKAuthError via `agent.run`, and asserts the resulting AgentResult.error_type == AgentErrorType.AUTH. Catches future docstring/message rewording that would silently demote to UNKNOWN.
**R2 #2 (docs-only): future tense for `usage_source` plumbing**
Module docstring previously claimed AttemptNode "records this as `usage_source=\"oauth_estimated\"`", but the new enum value isn't yet in spec §4.1's `Literal[...]` and the orchestrator doesn't read `backend_metadata` for this field. Reworded as future-tense: "WILL record once PR 19a lands the orchestrator plumbing; today, the field is unset and falls back to default."
Stats: 2 files, +52/-7 LOC. 13 tests in test_smolagents_claude_sdk_model.py (was 12). Full suite: 2775 passed + 1 pre-existing failure unchanged + 4 skipped. 0 regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Stacked on #12 (M3 PR 18 marketing audit). Adds `provider: "claude-subscription"` to the smolagents backend so it can drive Claude via CC OAuth credentials (no API key burn).
Best of both worlds: smolagents' strong ACL boundary (`CheatResistancePolicy` at tool `forward()`) + CC subscription's auth path.
ACL invariant: the load-bearing pin
`claude_agent_sdk.query()` is NOT a token completion API; it's a complete agent product that runs its own loop with its own tools. Naive use would re-create the §3.3 agent-loop-in-agent-loop problem and silently void smolagents' ACL.
Fix: configure the SDK as a degenerate single-turn text generator: `allowed_tools=[]`, an exhaustive `disallowed_tools`, `max_turns=1`, and a `can_use_tool` callback that always denies.
This is locked in by `test_sdk_is_invoked_with_no_internal_tools`, which patches `claude_agent_sdk.query`, captures the actual options arg, and asserts all 4 invariant properties. If a future SDK update adds a default tool that bypasses `disallowed_tools`, this test breaks.
Reviewer trail
- Round 1: … `claude_agent_sdk.query()` as a completion API; reviewer corrected: it's a complete agent. Forced single-turn config + invariant test added. Plus 4 didn't-ask items: SDK version pin, cost framing, auth UX, default opt-in.
- Round 2: VERIFIED with 2 non-blocking issues, both folded in: (1) typed auth classification via `_classify_error_typed()` + e2e regression test; (2) docstring claimed `usage_source="oauth_estimated"` already exists but it's deferred to PR 19a; reworded as future-tense.
Stats
- Commits: 2 (`55d788b` + `a15b606` R2 fixes)
Configuration
Set `provider: "claude-subscription"` to opt in. Existing `provider: "anthropic"` (default) and other LiteLLM providers unchanged.
Known limitations / non-blockers
- `usage_source="oauth_estimated"` not yet plumbed onto AttemptNode (spec §4.1 enum needs the new value + orchestrator change). Deferred to PR 19a; cost field falls back to orchestrator default for now. Module docstring uses future tense.
- Anthropic has not publicly endorsed `claude_agent_sdk` use outside their first-party CC + Claude Code Skills products. Module docstring documents the risk; users should review their CC ToS before relying on this in production.
🤖 Generated with Claude Code