Skip to content

feat(m2): smolagents AgentBackend adapter (ToolCallingAgent + LiteLLM)#7

Open
suzuke wants to merge 3 commits into
feat/m2-hmac-sealfrom
feat/m2-smolagents-backend
Open

feat(m2): smolagents AgentBackend adapter (ToolCallingAgent + LiteLLM)#7
suzuke wants to merge 3 commits into
feat/m2-hmac-sealfrom
feat/m2-smolagents-backend

Conversation

@suzuke
Copy link
Copy Markdown
Owner

@suzuke suzuke commented Apr 25, 2026

Summary

Stacked on #6 (M2 PR 12 HMAC seal). Adds opt-in smolagents AgentBackend wrapping ToolCallingAgent + LiteLLMModel as a drop-in alternative to the existing claude_agent_sdk backend. Default config still uses claude-code; users opt in via agent.type: smolagents.

This is the largest M2 PR — productionising the smolagents adapter that POC validated in tests/security/. Spec ref: §1 module table — AgentBackend "smolagents adapter" for M2; §2.1 default-safe-mode tools; §INV-3 forbidden-tool list.

What's new

src/crucible/agents/_path_acl.py (new, ~290 LOC)

The Crucible-owned filesystem-access layer. Independent of any agent framework — testable without smolagents installed (reviewer Q7 pin).

  • safe_read / safe_write / safe_edit / safe_glob / safe_grep enforce CheatResistancePolicy ACL inside each helper.
  • Read = editable | readonly; write = editable; hidden / unlisted ALWAYS denied.
  • Path normalization rejects .. traversal, absolute escapes, symlink jumps before classification.
  • safe_glob rejects absolute patterns (/tmp/*, \\server\share) before calling Path.glob() — fixes a NotImplementedError escape (reviewer round 2 F2).
  • Error messages sanitised: hidden file errors use generic "not visible to the agent" — agent cannot probe existence via tool errors. Eval-command details never surface.
  • safe_edit is search/replace contract (reviewer Q4): rejects 0 or multiple matches.

src/crucible/agents/_smolagents_tools.py (new, ~240 LOC)

5 Tool subclasses (read_file, write_file, edit_file, glob, grep) wrapping the safe_* helpers. Forbidden tools (run_python / run_shell / eval_code / execute / python_interpreter / code_act) are absent by construction — no registration path could add them. Registry inspection tests verify this structurally.

src/crucible/agents/smolagents_backend.py (new, ~225 LOC)

SmolagentsBackend(AgentInterface) with:

  • Stable backend_kind = "smolagents" (not provider/model-specific).
  • Best-effort backend_version via importlib.metadata.version()__version__"unknown". Never raises.
  • API key VALUE never stored — only the env var NAME is referenced. LiteLLM reads VALUE at request time.
  • Hard-fails on workspace mismatch in generate_edit() (reviewer round 2 F1) — backend's tools and policy are bound to one workspace at construction; cross-workspace reuse raises ValueError. Regression test asserts _agent.run is never called and no files in either workspace are touched on mismatch.
  • Lazy import via _import_smolagents() — module load doesn't pull in smolagents/litellm; only __init__ triggers, raising SmolagentsImportError with pip install 'autocrucible[smolagents]' hint if missing.

Factory (agents/__init__.py)

create_agent(config, ...) dispatches by config.agent.type:

  • "claude-code" → existing ClaudeCodeAgent (default; back-compat preserved)
  • "smolagents" → SmolagentsBackend (lazy import)
  • Unknown → ValueError (fail closed, reviewer pin)

Optional dependency

pyproject.toml adds:

[project.optional-dependencies]
smolagents = ["smolagents>=1.24.0", "litellm>=1.50.0"]

Default pip install autocrucible doesn't pull these in. Importing crucible.agents works without the extra; only opting into agent.type: smolagents triggers the lazy import + clear install error.

Orchestrator integration

  • backend_kind and backend_version on AttemptNode now come from the agent (getattr(self.agent, "backend_kind", None) or "claude_sdk") — preserves legacy ClaudeCodeAgent behaviour while letting SmolagentsBackend declare its own identity.
  • CLI init and run commands build the policy and pass workspace + policy kwargs to create_agent so the smolagents path has what it needs.

Reviewer trail

Round Verdict Key finding
1 (design) NEEDS_TWEAK 7 main Qs + 8 implementation pins: actionable ImportError, lazy import, single-key API, ACL inside forward(), own search/replace edit tool, best-effort version, importorskip strategy, forbidden-tools-absent invariant, no probabilistic CI gate
2 (impl) REJECTED F1: generate_edit warned and continued on workspace mismatch — isolation/correctness bug if backend reused across workspaces. F2: safe_glob('/tmp/*') raised NotImplementedError outside the sanitised-denial channel — escapes tool boundary as backend exception.
3 (fix) VERIFIED F1: generate_edit raises ValueError on mismatch; regression asserts _agent.run never called and no files touched. F2: safe_glob rejects absolute patterns before Path.glob(); reviewer re-ran their repro and confirmed all variants now return sanitised ToolDenied.

Stats

  • 3 commits (f3f3fe8 step 1 + b20045b step 2 + 9aba8d3 F1+F2 fix), 11 files changed, +1,473 / -20 LOC
  • 2,608 passed / 4 skipped, 0 regressions vs PR 12 baseline (2,448). +160 net = 81 new PR 13 tests + ~79 previously-skipped security tests now running with smolagents installed.
  • 81 PR-13 tests across 4 files:
    • test_agents_path_acl.py (38, never skipped) — ACL boundary, traversal/absolute rejection, error message hygiene, ambiguous-edit rejection, hidden-file existence-leak defence
    • test_agents_factory.py (12, never skipped) — factory dispatch, fail-closed on unknown type, missing-extra ImportError, lazy-import contract, forbidden-tools-absent registry inspection
    • test_smolagents_tools.py (16, importorskip) — Tool subclasses delegate correctly; ACL denial returns [denied] string; hidden file content never leaks
    • test_smolagents_backend.py (15, importorskip) — backend_kind stable, version best-effort + "unknown" fallback, construction is pure, generate_edit success/no-edit/exception paths, error classifier, workspace mismatch hard-fail regression, API key never on backend attributes

Test plan

Crucible-owned (always run, no smolagents required)

  • ACL: read=editable|readonly, write=editable, hidden/unlisted denied
  • Path traversal: .., absolute outside workspace, symlink rejected
  • Absolute glob patterns: /tmp/*, /etc/*.conf, \\server\share\* → ToolDenied (reviewer F2 regression)
  • Error messages don't leak hidden file existence or content
  • safe_edit rejects 0 / multiple matches
  • Factory dispatch + fail-closed on unknown type
  • Missing-extra ImportError carries pip install 'autocrucible[smolagents]' hint
  • Forbidden tools absent from registry by construction

Smolagents-dependent (importorskip)

  • Tool subclasses delegate to safe_* helpers
  • ACL denial returns [denied] string (recoverable in agent loop)
  • Hidden file content never leaks via tool output
  • Backend identity: backend_kind == "smolagents", version best-effort
  • Construction is pure (no network calls)
  • generate_edit mocked-agent.run mapping (success / no-edit / exception)
  • Error classifier: auth / timeout / unknown
  • Workspace mismatch hard-fails: _agent.run not called, no files touched (reviewer F1 regression)
  • API key value never stored on backend attributes

Known limitations / non-blockers

  • Regex DoS in safe_grep: model-controlled regex could in principle craft catastrophic-backtrack patterns. Reviewer accepted as deferred to future hardening since safe_grep already caps result count and this PR's scope didn't include regex sandboxing. A later pass should bound file/line sizes or move to a regex engine with timeout if model-controlled grep is exposed in hostile settings.
  • Cost tracking: smolagents v1.24 doesn't surface LiteLLM cost in a stable way; usage is None in AgentResult for now. Cost scraping from response objects can land in a follow-up.
  • CodeAct mode: explicitly NOT exposed in PR 13 per spec §INV-3. Requires --agent-mode=code-act --executor=docker in M3+.
  • SubscriptionCLIBackend: M3 territory, not in this PR.

Migration / usage

# .crucible/config.yaml
agent:
  type: smolagents          # opt-in (default: claude-code)
  smolagents:
    provider: anthropic     # passed to LiteLLM
    model: claude-3-5-sonnet-20241022
    api_key_env: ANTHROPIC_API_KEY
    max_steps: 12
pip install 'autocrucible[smolagents]'
export ANTHROPIC_API_KEY=sk-...
crucible run --tag t1

🤖 Generated with Claude Code

suzuke and others added 3 commits April 26, 2026 00:25
The Crucible-owned filesystem-access layer that all backends call from
their tools. Independent of smolagents — testable without the framework
installed (reviewer round 1 Q7 pin).

`crucible/agents/_path_acl.py`:
- `safe_read` / `safe_write` / `safe_edit` / `safe_glob` / `safe_grep`
  enforce CheatResistancePolicy ACL inside each helper.
- Read = editable | readonly. Write = editable. Hidden / unlisted ALWAYS
  denied. Path normalization rejects `..` traversal and absolute paths
  that escape workspace.
- Error messages sanitised (reviewer round 1 pin): hidden file errors
  use generic "not visible to the agent" phrasing — agent cannot probe
  existence via tool errors. Eval-command details never surface.
- `safe_edit` is search/replace contract (reviewer Q4): rejects missing
  / ambiguous substrings. Caller can use `safe_write` for full
  overwrite. Differentiating these makes the contract predictable.

`crucible/config.py`:
- New `SmolagentsConfig` dataclass: provider / model / api_key_env /
  max_steps. API key VALUE is never stored — only the env var name.
- `AgentConfig.type` discriminator: "claude-code" | "smolagents".
  Validated in `_build_agent` — unknown values fail closed (reviewer
  pin: no CodeAct mode field in PR 13).
- `_build_smolagents` validator added.

Tests: 36 unit tests in `test_agents_path_acl.py` covering ACL boundary
+ path normalization + traversal defence + error message hygiene +
existence-leak defence + truncation + ambiguous-edit rejection.

This is the first of ~5 commits for M2 PR 13. Next: smolagents Tool
subclasses (importorskip), SmolagentsBackend, factory, pyproject extra.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Production-grade adapter wrapping smolagents ToolCallingAgent +
LiteLLMModel as a drop-in alternative to the existing claude_agent_sdk
backend. Default config still uses claude-code; users opt into smolagents
via `agent.type: smolagents` in `.crucible/config.yaml`.

New modules:
- `agents/_smolagents_tools.py` — 5 Tool subclasses (read_file /
  write_file / edit_file / glob / grep) wrapping the safe_* helpers
  from step 1. Forbidden tools (run_python / run_shell / eval_code /
  execute / python_interpreter / code_act) are absent BY CONSTRUCTION;
  there is no registration path that could add them. Tests assert this
  structurally.
- `agents/smolagents_backend.py` — `SmolagentsBackend(AgentInterface)`
  with stable `backend_kind="smolagents"` and best-effort
  `backend_version` via `importlib.metadata.version()` (fallback to
  `__version__`, then "unknown" — never raise).

Factory wiring (`agents/__init__.py`):
- `create_agent(config, ...)` dispatches by `config.agent.type`.
- claude-code path unchanged (back-compat preserved).
- smolagents path requires `workspace` + `policy` kwargs; CLI's `run`
  and `init` commands now build the policy and pass both.
- Lazy import: loading `crucible.agents` module does NOT pull in
  smolagents — only `_create_smolagents` triggers the import, raising
  `SmolagentsImportError` with `pip install 'autocrucible[smolagents]'`
  hint if missing.

Orchestrator integration:
- `backend_kind` and `backend_version` on AttemptNode now come from the
  agent (was hardcoded `claude_sdk`). `getattr(self.agent, "backend_kind",
  None) or "claude_sdk"` preserves legacy ClaudeCodeAgent behaviour.

API key handling (reviewer round 1 Q2):
- `SmolagentsConfig.api_key_env` stores only the env var NAME.
- The VALUE is never on the backend object; LiteLLM reads it from the
  env at request time. Test asserts no attribute on SmolagentsBackend
  contains the key string.

Optional dependency (reviewer round 1 Q1):
- `pyproject.toml` adds `[smolagents]` extra: `smolagents>=1.24.0` +
  `litellm>=1.50.0`.
- Default `pip install autocrucible` does NOT pull these in.
- Importing `crucible.agents` works without the extra; only opting
  into `agent.type: smolagents` triggers the lazy import + clear
  error if missing.

Tests (+77):
- `test_agents_factory.py` (12) — type discriminator, default claude-code,
  unknown type fail-closed, smolagents missing-extra error message,
  workspace/policy required, lazy-import contract, forbidden-tools-absent
  registry inspection.
- `test_smolagents_tools.py` (16, importorskip) — Tool subclasses
  delegate to safe_* helpers; ACL denial returns `[denied] ...` string
  (not raise); hidden file reads/writes/edits all denied without leaking
  content; explicit forbidden-name checks.
- `test_smolagents_backend.py` (13, importorskip) — backend_kind stable,
  backend_version best-effort + "unknown" fallback, construction is
  pure (no network), generate_edit success/no-edit/exception paths,
  error classifier (auth/timeout/unknown), API key never leaks into
  backend attributes, capabilities exclude run_python/shell.

Existing test fix:
- `test_integration.py` used `agent.type: "fake"` which the new
  validator rejects. Changed to `claude-code` (orchestrator overrides
  the agent instance directly via FakeAgent injection).

Full suite: 2604 passed / 4 skipped, 0 regressions vs PR 12 (2448);
the +156 includes 77 new PR 13 tests + ~79 previously-skipped security
tests now running with smolagents installed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewer round 2 REJECTED PR 13 with two real correctness blockers.

F1 — workspace mismatch must hard-fail (orchestrator isolation bug):
- `SmolagentsBackend.generate_edit(prompt, workspace)` previously logged
  a warning and continued using the construction-time workspace if the
  passed workspace differed.
- This is an isolation/correctness hole: if a backend object is reused
  / rebound across workspaces, calls to `generate_edit(prompt,
  workspace_b)` would read/write/snapshot from configured workspace_a
  with the wrong ACL.
- Fix: raise ValueError on mismatch — the backend's tools and
  CheatResistancePolicy are bound to one workspace at construction.
  Callers needing a different workspace should construct a new backend.
- Regression: `test_generate_edit_rejects_workspace_mismatch_hard`
  asserts (a) ValueError raised, (b) `_agent.run` NOT called, (c) NO
  files in either workspace touched.

F2 — `safe_glob()` had an uncaught absolute-pattern crash:
- `Path('/tmp/*').glob()` raises NotImplementedError ("Non-relative
  patterns are unsupported") which `_smolagents_tools.GlobTool.forward`
  doesn't catch (only `ToolDenied`). The exception would bubble out of
  the tool boundary and surface as a backend exception to the agent
  loop — bypassing the sanitised-denial channel entirely.
- Fix: reject `Path(pattern).is_absolute()` and `/`/`\\` prefixes
  BEFORE calling `Path.glob()`, with clear ToolDenied("workspace-relative").
- Regressions: `test_safe_glob_rejects_absolute_pattern` and
  `test_safe_glob_rejects_root_slash_pattern`.

Both fixes are ~6 lines of code each plus tests. Full suite:
2608 passed / 4 skipped (was 2604; +4 from new regression tests),
0 regressions.

Reviewer's verified items remain verified: lazy import / API key
handling / forbidden tools / backend identity / orchestrator wire-up.
Non-blocker (regex DoS in `safe_grep`) deferred to a future hardening
pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant