Interfaces

Project: armor
Last updated: 2026-05-17

The system's contact surface — everything that calls into the system, everything the system calls out to, and the public boundaries within the system. Each interface is a stable contract.


Inbound interfaces

Build & test entry points (Makefile)

Target Purpose Usage
make sync Install/sync dependencies via uv CI and local development
make lint Run ruff linter on source and test code CI and pre-commit
make format Auto-format code with ruff; fail if not formatted CI and pre-commit
make typecheck Run mypy strict mode on src/ CI and pre-commit
make test Run all pytest tests (unit + integration) CI and local development
make eval Run eval corpus tests (pytest tests/eval/) CI and detector validation
make check Run lint, format, typecheck, test, eval (full CI gate) CI before merge; local sanity check
make fitness Run fitness checks (architecture invariants) Optional pre-commit; advisory in CI
make fitness-smoke Run smoke-marked fitness checks only Fast pre-push / local agent gate
make fitness-full Run all fitness checks, including slow and LLM-gated checks Nightly and release verification
make demo Run end-to-end demo scenario Manual testing and onboarding
make release-check Run staged pre-tag verification (check, fitness, demo, offline-smoke examples; optional Docker stage) Release readiness
make help Print available targets Reference

Exit codes:

  • 0 — all checks passed
  • 1 — any check failed (lint, typecheck, test, eval, fitness)

CLI

armor <subcommand> [flags] [args]

Subcommands:
  daemon              Start the long-lived guardrail daemon
  check input         Check a user-input payload for injection signals
  check output        Check a model-output payload for exfiltration signals
  check tool          Check a tool-call (name + params) against the command denylist
  check fetched       Check a tool-call result (e.g., Read/WebFetch output) for indirect injection
  session close       Mark a session ended; flush state
  canary list         List the active canary catalogue (IDs + kinds, never values)
  canary generate     Generate a new canary values file at install time
  canary honeypot     Write a fake-credential .env file seeded with canary values (filesystem honeypot)
  canary pii-context  Write a system-prompt snippet with fake PII identity records seeded as canary values
  canary seed         One-step setup: generate values + write all honeypot files (.env, pii-context, user-profile)
  config show         Show runtime configuration (selected section)
  incidents list      Paginated table of incidents (filterable by session, category, age)
  incidents show      Full record for a single incident (canary_id only — never values)
  incidents tail      Live-updating Rich table of new incidents (polls; survives daemon restart)
  incidents export    Export incidents as NDJSON
  sessions list       Active sessions with state name + risk score
  sessions show       Full session state, signal count, rolling-buffer hash (no raw content)
  sessions unblock    Operator-cleared transition out of `Blocked` (→ `Watching`); writes audit row
  hooks install       Install the bundled Claude Code hook configuration into a settings.json
  health              Expanded health report; exits 0 healthy / 1 degraded / 2 critical

Global flags:
  --socket <path>    Daemon socket path (default: /var/run/armor.sock)
  --session-id <id>  Session ID for stateful checks (default: derived from env)
  --json             Machine-readable output

Flag reference:

Subcommand / flag Type Default Effect
daemon --socket path /var/run/armor.sock Where to bind the IPC socket
daemon --model path /models/<chosen>.gguf Validator LLM weights file
daemon --db path /var/lib/armor/armor.db SQLite file path
daemon --canary-values path <unset> Path to canary values file (generated by armor canary generate); env var ARMOR_CANARY_VALUES_PATH overrides
daemon --quarantine-key-path path <unset> Path to the quarantine-encryption key file; if unset, the daemon writes/reads <db_dir>/.key (autogenerated on first start) per ADR-011
hooks install --settings path ./.claude/settings.json Target settings.json to merge the bundled Claude Code hook stanzas into
daemon --catalogue path <unset> Deprecated — legacy alias retained for backward compatibility; new deployments must use --canary-values.
check input <text> string (stdin OK) Payload to evaluate
check output <text> string (stdin OK) Payload to evaluate
check tool --name <n> --params <json> strings Tool name + params blob. In --hook-mode, both fields may be derived from Claude Code/Codex hook JSON on stdin (tool_name/tool_input, or Codex tool_input.command inferred as Bash).
check fetched <text> --source-tool <name> string Tool-call result + source tool name to evaluate for indirect injection (per ADR-041). In --hook-mode, both fields may be derived from hook JSON on stdin (tool_name plus tool_response/tool_result).
canary generate --out <path> path Output path for generated values file (required)
canary generate --seed <hex> int <RNG> Optional seed for deterministic generation (e.g., 0xCAFEBABE); if unset, uses OS RNG
canary honeypot --values <path> path Canary values file produced by canary generate (required)
canary honeypot --out <path> path Destination path for the generated fake-credential .env file (required)
canary pii-context --values <path> path Canary values file produced by canary generate (required)
canary pii-context --out <path> path Destination path for the generated system-prompt PII context snippet (required)
canary seed --out-dir <path> path Directory to write all honeypot files: canary-values.json, .env, pii-context.txt, user-profile.json (required)
canary seed --seed-value <hex> int <RNG> Optional seed for deterministic generation
config show --section <name> string Show a config section (e.g., pipeline.exempt, pipeline.source_multipliers) in TOML format; --json outputs JSON
config show --section <name> --json bool false Render config as JSON instead of TOML
incidents list --since <duration> duration string Show incidents newer than the relative age (e.g. 1h, 30m)
incidents list --session <id> string Filter to one session
incidents list --category <pat> glob Filter on detector category
incidents list --limit <N> int 50 Page size
incidents show <incident_id> string Full record for one incident
incidents tail --filter <expr> string Live tail with comma-separated key=value filters: session/session_id, category, since, severity
incidents export --since <duration> duration string Export rows newer than the relative age
incidents export --session <id> string Export rows for one session
incidents export --severity <level> string Export rows with one persisted verdict severity
incidents export --output <path> path - Write NDJSON to a file instead of stdout
sessions list --state <name> string Filter on state (Normal, Watching, Elevated, High, Blocked)
sessions show <session_id> string Full session state
sessions unblock <session_id> --reason <text> string Clear Blocked → Watching; --reason is required for the audit row
--session-id <id> string $ARMOR_SESSION_ID Cross-call session correlation
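
The --filter expression of incidents tail is described above as comma-separated key=value pairs. A minimal sketch of how such an expression might be parsed follows. The helper name parse_filter is hypothetical and not part of armor, and both the rejection of unknown keys and the normalisation of session to session_id are assumptions rather than documented behaviour.

```python
# Hypothetical helper: parse an `incidents tail --filter` expression.
# Key names come from the flag table; the error handling is an assumption.
ALLOWED_KEYS = {"session_id", "category", "since", "severity"}
ALIASES = {"session": "session_id"}


def parse_filter(expr: str) -> dict:
    """Turn 'session=abc,severity=high' into a dict keyed by canonical names."""
    filters = {}
    for part in expr.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty segments such as a trailing comma
        key, sep, value = part.partition("=")
        key = ALIASES.get(key.strip(), key.strip())
        if not sep or key not in ALLOWED_KEYS:
            raise ValueError(f"invalid filter term: {part!r}")
        filters[key] = value.strip()
    return filters
```

For example, parse_filter("session=abc,severity=high") yields a dict with the keys session_id and severity.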

Exit codes (check family):

  • 0 — pass (allowed)
  • 1 — internal error (daemon unreachable, IPC failed)
  • 2 — usage error (bad flags)
  • 78 — daemon configuration error (e.g. model not found at startup)
  • 100 — block (the check returned block)
  • 101 — advisory (returned advisory; caller decides whether to allow)

The split between exit codes 0 and 100 is intentional, and it differs from the Claude Code hook contract, where exit code 2 signals "block and show stderr to the model" (in armor's native scheme, 2 is a usage error). Passing --hook-mode makes the armor check wrapper translate its verdicts to the hook convention. In hook mode, check input, check tool, check fetched, and check output read hook JSON from stdin when positional/flag inputs are absent.
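
A caller that shells out to armor check (without --hook-mode) has to interpret the native exit codes listed above. The sketch below shows one way to do so. The function names are hypothetical, the fail-closed policy in should_allow is an assumption and not documented armor behaviour, and check_input assumes the payload can be fed on stdin as the flag table's "stdin OK" suggests.

```python
import subprocess

# Exit codes of the `armor check` family, as listed above.
EXIT_MEANINGS = {
    0: "pass",
    1: "internal_error",
    2: "usage_error",
    78: "config_error",
    100: "block",
    101: "advisory",
}


def interpret_exit_code(code: int) -> str:
    """Map a check-family exit code to a label; unlisted codes become 'unknown'."""
    return EXIT_MEANINGS.get(code, "unknown")


def should_allow(code: int, allow_advisory: bool = False) -> bool:
    """Assumed policy: only an explicit pass, or an accepted advisory, allows the payload."""
    if code == 0:
        return True
    if code == 101:
        return allow_advisory
    return False  # block, errors, and unknown codes all deny


def check_input(text: str) -> str:
    """Run `armor check input` with the payload on stdin and label the result."""
    proc = subprocess.run(
        ["armor", "check", "input"], input=text, text=True, capture_output=True
    )
    return interpret_exit_code(proc.returncode)
```

Treating errors as a denial is a deliberately conservative choice; a caller that prefers availability over strictness could allow on code 1 instead.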

Exit codes (health):

  • 0 — healthy (daemon responsive, db reachable, model loaded, recent traffic normal)
  • 1 — degraded (e.g. db near full, p95 latency elevated, model load slow)
  • 2 — critical (e.g. daemon unresponsive, db unreachable, model not loaded)

Output rendering: incidents list, incidents tail, sessions list, and health use Rich tables on a TTY and degrade to plain text (no ANSI escapes) when stdout is not a TTY.

Daemon IPC (Unix socket)

See data-model.md § Wire / interchange formats for the request/response schema. The transport is newline-delimited JSON: the client sends one request and receives one response, after which the connection may be reused or closed.
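
The framing described above (one JSON object per line, one response per request) can be sketched with the standard library alone. This is an illustration of the framing only: the function names are hypothetical, and the request and response fields are defined in data-model.md and are not reproduced or assumed here.

```python
import json
import socket


def connect(socket_path: str = "/var/run/armor.sock") -> socket.socket:
    """Open a stream connection to the daemon's Unix socket."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.connect(socket_path)
    return sock


def send_request(sock: socket.socket, request: dict) -> dict:
    """Send one JSON object terminated by a newline and read one JSON line back."""
    # json.dumps escapes embedded newlines, so the payload stays on a single line.
    sock.sendall(json.dumps(request).encode("utf-8") + b"\n")
    buf = bytearray()
    while not buf.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("daemon closed the connection mid-response")
        buf.extend(chunk)
    return json.loads(buf.decode("utf-8"))
```

Because the connection may be reused, send_request can be called repeatedly on the same socket returned by connect.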

Python SDK

The armor library exports a typed, ergonomic client for the daemon IPC transport. It does not run detectors locally.

Public surface:

from armor import (
    ArmorClient,
    AsyncArmorClient,
    DaemonUnreachableError,
    HealthReport,
    Incident,
    Verdict,
)

# Synchronous client
client = ArmorClient(socket_path="/var/run/armor.sock")
v: Verdict = client.check_input("user input", session_id="sess-1")
if v.blocked:
    return safe_response()

response = llm_client.messages.create(...)
v = client.check_output(response.content[0].text, session_id="sess-1")
if v.blocked:
    return safe_response()

# Check a tool result for indirect injection
tool_result = read_file("config.txt")
v = client.check_fetched(tool_result, source_tool="Read", session_id="sess-1")
if v.blocked:
    return safe_response()

# Asynchronous client
async_client = AsyncArmorClient(socket_path="/var/run/armor.sock")
v = await async_client.check_input("user input", session_id="sess-1")

# Session-bound context manager (sync)
with client.session("user-123") as s:
    v1 = s.check_input("message 1")  # Implicitly uses session_id="user-123"
    v2 = s.check_input("message 2")

# Session-bound context manager (async)
async with async_client.session("user-123") as s:
    v1 = await s.check_input("message 1")
    v2 = await s.check_input("message 2")

# Health check
report: HealthReport = client.health()
if not report.daemon_reachable:
    raise RuntimeError("Daemon unreachable")

# Retrieve a forensic incident
incident: Incident | None = client.incident("inc-abc123")

Classes:

Class Purpose
ArmorClient(socket_path) Synchronous client for daemon communication. Methods: check_input, check_output, check_tool_call, check_fetched, health, incident, session.
AsyncArmorClient(socket_path) Asynchronous client (same interface, returns awaitables).
Verdict Security verdict with decision (pass/block/advisory/error), signal_id, severity, message, details. Properties: blocked, passed, is_error.
HealthReport Daemon health status: daemon_reachable, socket_reachable, db_reachable, model_loaded, version, uptime_seconds, active_connections, max_concurrent, total_checks, and optional rolling P95 latency fields.
Incident Forensic incident: id, timestamp, session_id, payload_hash, verdict_decision, signal_id, details.

Exceptions:

Exception Raised when
DaemonUnreachableError Daemon socket does not exist or connection fails. Signals a hard dependency failure; SDK calls do not degrade gracefully.

Stability: The re-exported classes (ArmorClient, AsyncArmorClient, Verdict, HealthReport, Incident) are stable across minor versions. See ADR-028 for the semver contract.


Outbound interfaces

Dependency What we call Library / version Failure mode
Local file system Read model weights, read/write SQLite, read/write socket stdlib Daemon refuses to start if any required path is unwritable
llama.cpp (via llama-cpp-python) Inference on the validator/honeypot model Pinned in pyproject.toml LLM unavailable → checks degrade to static-only with advisory confidence=0

armor makes no outbound network calls by default. There is no telemetry or upload interface in the current runtime.


Internal public surface

Trait: Detector

from typing import Protocol

class Detector(Protocol):
    id: str                # e.g. "regex.instruction_override" or "llm.validator"
    category: str          # taxonomy bucket from ADR-001 (e.g. "direct_injection", "meta")
    cost_tier: str         # "static" | "semantic" | "llm"

    def check(self, payload: Payload, ctx: SessionContext) -> Verdict: ...

Cost tiers and per-call budgets:

  • static — pure regex/parsing with no learned models (≤100 ms typical; budget per config pipeline.per_detector_budget_ms)

  • semantic — local non-LLM ML inference (e.g., ONNX embeddings for topic-coherence; 50 ms default per ADR-026; budget per config pipeline.per_detector_budget_ms)

  • llm — local LLM inference (validator 500ms per model.validator_budget_ms, honeypot 16s per model.honeypot_budget_ms)

  • Implementors: Every concrete detector module under src/armor/detectors/.

  • Consumers: The daemon's Pipeline only.

  • Stability: The signature is stable across minor versions. Adding fields is a breaking change.

  • SessionContext access: ctx.signal_history is a public read-only list of Signal objects, exposing the session's prior event history. Detectors may read this to contextualize their decisions (e.g., meta.conversation_hijack reads signal count to calibrate confidence). Signal history persists for the session lifetime and reflects all detected signals (blocks, advisories, errors) in temporal order.

  • Required behavior:

    • Must be deterministic given (payload, ctx).
    • Must not raise — must catch internal errors and return Verdict.error(reason).
    • Must not perform I/O outside the daemon (no network, no filesystem writes).
    • Must complete within the configured per-detector budget (default 100 ms for static/semantic; 500 ms for llm validator; 16s for honeypot).
    • Canary isolation (v0.2+): Detectors must NOT read canary values (via catalogue.values() or the .value field). The honeypot detector (task 019) has exclusive read access. Validator detector passes only canary_id references to forensic logs and verdicts.
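
The required behaviours above (deterministic, never raises, no I/O) can be illustrated with a small static detector. This is a sketch under stated assumptions: Verdict, Payload, and SessionContext below are simplified stand-ins for the real types, the regular expression and chosen severities are illustrative only, and the severity attached to an error verdict is an assumption.

```python
import re
from dataclasses import dataclass, field
from typing import Literal, Optional


@dataclass(frozen=True)
class Verdict:  # stand-in mirroring the fields documented for Verdict
    decision: Literal["pass", "block", "advisory", "error"]
    signal_id: Optional[str]
    severity: Literal["low", "medium", "high", "critical"]
    message: str
    details: dict = field(default_factory=dict)

    @classmethod
    def error(cls, reason: str) -> "Verdict":
        # Assumption: the severity of an error verdict is not specified in the source.
        return cls("error", None, "low", reason)


@dataclass(frozen=True)
class Payload:  # hypothetical stand-in; the real Payload type is defined elsewhere
    text: str


@dataclass(frozen=True)
class SessionContext:  # hypothetical stand-in exposing only signal_history
    signal_history: tuple = ()


class InstructionOverrideDetector:
    id = "regex.instruction_override"
    category = "direct_injection"
    cost_tier = "static"

    # Illustrative pattern only; not armor's actual rule.
    _PATTERN = re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)

    def check(self, payload: Payload, ctx: SessionContext) -> Verdict:
        try:
            match = self._PATTERN.search(payload.text)
            if match is None:
                return Verdict("pass", None, "low", "no override phrase found")
            return Verdict("block", self.id, "high", "instruction-override phrase",
                           {"span": match.span()})
        except Exception as exc:  # never raise: convert failures into an error verdict
            return Verdict.error(f"{self.id} failed: {exc}")
```

The detector holds no mutable state and touches neither the network nor the filesystem, so repeated calls with the same (payload, ctx) return equal verdicts.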

Trait: Verdict

from dataclasses import dataclass
from typing import Literal

@dataclass(frozen=True)
class Verdict:
    decision: Literal["pass", "block", "advisory", "error"]
    signal_id: str | None       # which rule fired
    severity: Literal["low", "medium", "high", "critical"]
    message: str                # human-readable reason
    details: dict               # detector-specific structured details

Verdicts compose in the pipeline by aggregation: any block short-circuits to block; otherwise the highest-severity advisory propagates and feeds the session risk score.
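
The aggregation rule can be sketched as a single pass over the detectors' verdicts. The following is a minimal illustration, not the pipeline's actual code: the aggregate function and SEVERITY_RANK table are hypothetical names, the Verdict class is a stand-in with the documented fields, and the handling of error verdicts (ignored here) and of severity ties (first one wins) are assumptions.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional, Sequence

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Verdict:  # stand-in mirroring the documented fields
    decision: Literal["pass", "block", "advisory", "error"]
    signal_id: Optional[str]
    severity: Literal["low", "medium", "high", "critical"]
    message: str
    details: dict = field(default_factory=dict)


def aggregate(verdicts: Sequence[Verdict]) -> Verdict:
    """Any block wins immediately; otherwise the highest-severity advisory; else pass."""
    best_advisory = None
    for v in verdicts:
        if v.decision == "block":
            return v  # short-circuit: later verdicts are not inspected
        if v.decision == "advisory":
            if (best_advisory is None
                    or SEVERITY_RANK[v.severity] > SEVERITY_RANK[best_advisory.severity]):
                best_advisory = v
    if best_advisory is not None:
        return best_advisory
    # Assumption: with no block and no advisory, the overall result is a pass.
    return Verdict("pass", None, "low", "no detector objected")
```

Feeding the session risk score is out of scope for this sketch; it only shows which verdict would propagate.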


Extension points

  • New detectors are added by dropping a module under src/armor/detectors/ that registers via the entry-point armor.detectors. The pipeline auto-discovers at boot. No core changes needed.
  • New canary types are added by editing the canary generator script and re-running armor canary generate (which writes a new canary values file).
  • Custom hook clients (e.g. for non-Claude-Code agents) speak the IPC directly — see data-model.md for the protocol.

armor does not support runtime detector hot-loading in v1; reloading detectors requires a daemon restart.