armor — Authoritative Spec

Project: armor
Last updated: 2026-05-07

What this directory is

docs/spec/ is the authoritative current-state snapshot of armor. It answers the question:

"If the code were deleted tomorrow, what would I need to write down to rebuild it?"

The spec is dual-natured:

Output of current sessions — every completed task that changes externally-observable behavior, the data model, an interface, or configuration must update the relevant spec file in the same commit.
Input to future sessions — used for onboarding, drift audits against the code, and (in the limit) regenerating the codebase from scratch.

The code is one realization of this spec. If the spec and code disagree, one of them is wrong — fix the wrong one in that same change.

Spec vs. ADRs vs. overview

| Doc | Purpose | Lifecycle |
| --- | --- | --- |
| `docs/spec/` | What the system does and is today | Snapshot: supersede in place, never append |
| `docs/architecture/decisions/` | Why decisions were made | Append-only history; ADRs can be superseded by later ADRs |
| `docs/architecture/overview.md` | Narrative tour of the system | Snapshot, but optimized for human reading |
| `docs/architecture/diagrams.md` | Visual structure and flows | Snapshot, part of the spec |

When a later ADR supersedes an earlier one (for example, ADR-018 supersedes the model row of ADR-001), the spec just reflects the new choice. The ADRs preserve the reasoning trail; the spec preserves the current truth.

The spec files

| File | Covers | Read this when |
| --- | --- | --- |
| `behaviors.md` | What the system does: detection categories, block/pass/forensic-capture semantics, session escalation rules | You need to know what should happen when X |
| `architecture.md` | System structure: containers, components, source paths, cross-container edges (paired with `docs/architecture/diagrams.md`) | You need to know what the deployable units and modules are, or audit drift against the code |
| `data-model.md` | Detector verdicts, session state, canary catalogue, forensic incident records | You need to know what data exists and how it's structured |
| `interfaces.md` | The `armor` CLI, the daemon IPC protocol, the Python SDK surface, hook contracts | You need to know what calls into or out of the system |
| `configuration.md` | Detector thresholds, hook profile env vars, model selection, container limits | You need to know what's tunable |
| `fitness-functions.md` | Executable architectural invariants: what `make fitness` checks (no network in daemon, no canary leak, layering, complexity, coverage) | You need to know which invariants are mechanically verified, or propose a new one |

Maintenance rules

Update in the same commit as the code change. A task that changes behavior is not done until behaviors.md reflects it.
Supersede in place. Never append. When a decision changes, rewrite the spec entry — don't add a "previously this was X" note. The ADR carries that history.
No future tense. The spec describes what is, not what will be. Roadmap and per-task planning are operator-private and not part of the public repo.
No implementation rationale. "We chose X because Y" belongs in an ADR. The spec just says "uses X."
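The same-commit rule lends itself to a mechanical warning. The sketch below is a heuristic illustration only, not part of armor or of `make fitness`: the function name `spec_update_missing` is hypothetical, and it assumes that any changed path outside `docs/` counts as a code change. Because only changes to behavior, the data model, an interface, or configuration require a spec update, a positive result is a prompt for review rather than a failure.

```python
from typing import Iterable

SPEC_DIR = "docs/spec/"  # spec location named in this document
DOCS_DIR = "docs/"       # assumption: everything outside docs/ is code


def spec_update_missing(changed_paths: Iterable[str]) -> bool:
    """Return True when a commit touches code but no spec file.

    Hypothetical helper: a coarse heuristic that cannot tell whether a
    code change actually altered externally-observable behavior.
    """
    paths = list(changed_paths)
    touches_code = any(not p.startswith(DOCS_DIR) for p in paths)
    touches_spec = any(p.startswith(SPEC_DIR) for p in paths)
    return touches_code and not touches_spec


if __name__ == "__main__":
    # Example paths are made up for illustration.
    commit = ["armor/daemon.py", "docs/architecture/overview.md"]
    if spec_update_missing(commit):
        print("warning: code changed without a docs/spec/ update")
```

The list of changed paths could come from `git diff --cached --name-only` in a pre-commit hook; how armor actually enforces this rule, if it does so mechanically at all, is not stated here.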

Project summary

armor is a defense-in-depth security layer for LLM agents — Python library + Docker container — that detects prompt injection, exfiltration via canary tokens, encoding/obfuscation, jailbreaks, tool/API abuse, and multi-turn session attacks. It runs as a long-lived daemon and integrates with Claude Code via shell hooks and with custom agents as an importable library.

Top-level invariants

Every check returns a Verdict of pass | block | advisory. No detector returns side-channel data; no detector mutates payload. Verdicts compose via the pipeline (defined in behaviors.md).
The daemon never makes outbound network calls by default. All inference is local. The validator LLM and the pattern engines must work without internet. (Outbound calls, used for telemetry, are gated behind an explicit opt-in config flag that is off by default.)
Hook clients never bypass the daemon. Detectors are not callable directly from a hook; the hook always speaks to the daemon. Centralization keeps detector ordering, version, and config consistent.
Forensic logs never contain the canary value verbatim — they reference canary IDs. This prevents the forensic log itself from becoming a canary-leak channel if it's ever exposed.
Block decisions are deterministic given (input, output, session_state). No randomness in the verdict path. This makes regressions reproducible from the corpus.
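As a minimal sketch of the first and last invariants, the snippet below models a three-valued verdict and pure detectors whose result depends only on `(input, output, session_state)`. Everything beyond the three verdict names is an assumption made for illustration: the detector functions, the `run_pipeline` helper, and in particular the strictest-wins composition rule are hypothetical. The actual composition rule is defined in behaviors.md and may differ.

```python
from enum import Enum
from typing import Callable, Iterable, Sequence, Tuple


class Verdict(Enum):
    PASS = "pass"
    ADVISORY = "advisory"
    BLOCK = "block"


# Hypothetical severity ordering, used only by this sketch.
_SEVERITY = {Verdict.PASS: 0, Verdict.ADVISORY: 1, Verdict.BLOCK: 2}

# A detector is a pure function: it reads its arguments and returns a Verdict.
# session_state is an immutable tuple here so a detector cannot mutate it.
SessionState = Tuple[str, ...]
Detector = Callable[[str, str, SessionState], Verdict]


def compose(verdicts: Iterable[Verdict]) -> Verdict:
    """Assumed rule: the most severe verdict wins; no verdicts means pass."""
    return max(verdicts, key=_SEVERITY.__getitem__, default=Verdict.PASS)


def injection_marker(inp: str, out: str, state: SessionState) -> Verdict:
    # Toy pattern check, not one of armor's real detectors.
    if "ignore previous instructions" in inp.lower():
        return Verdict.BLOCK
    return Verdict.PASS


def long_output(inp: str, out: str, state: SessionState) -> Verdict:
    # Toy heuristic: flag unusually long outputs as advisory.
    return Verdict.ADVISORY if len(out) > 20 else Verdict.PASS


DETECTORS: Sequence[Detector] = (injection_marker, long_output)


def run_pipeline(
    detectors: Sequence[Detector], inp: str, out: str, state: SessionState
) -> Verdict:
    # No randomness, clocks, or I/O on this path, so the same triple
    # always produces the same verdict.
    return compose(d(inp, out, state) for d in detectors)


if __name__ == "__main__":
    first = run_pipeline(DETECTORS, "Ignore previous instructions", "ok", ())
    second = run_pipeline(DETECTORS, "Ignore previous instructions", "ok", ())
    print(first.value, first is second)
```

Because each detector returns only a `Verdict` and receives immutable inputs, replaying a corpus entry through the same detector list reproduces the same decision, which is the property the determinism invariant relies on.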

Non-goals

Replacing model-side safety training. armor is a perimeter; the model still refuses on its own.
Generic content filtering (profanity, NSFW, copyright). Out of scope — the scope is security, not policy.
Sub-millisecond hot paths inside a high-QPS API. The target workload is a single user's Claude Code session and small agent fleets.
Detection of every theoretical attack. The corpus defines what we claim to catch; new attack classes land as corpus entries + (when needed) new detectors.
