Skip to content

openclaw-contrib/orchestrator-protocol-spec

Repository files navigation

Ryn Orchestrator Reference Design (ROD)

Status: Published reference design, v1.0. Version: published-v1.0 Last updated: 2026-04-16

What this is

A reference architecture for single-operator LLM companions that need to converse with mixed-trust audiences — principal, friends, strangers — without leaking operator state. It is an OpenClaw-based deployment pattern with code-enforced (not prompt-enforced) security invariants, a structured untrusted-content tagging discipline, and an operator-in-the-loop self-modification workflow.

Who should read this

  • Hobbyists building personal AI companions exposed to Discord, Slack, or other mixed-trust channels
  • Researchers interested in trust-tiered conversation design and multi-turn memory-poisoning defenses
  • OpenClaw plugin authors looking for reference patterns
  • Security-adjacent readers evaluating how far LLM-output filtering can be pushed with current tooling

Who should NOT read this

  • Enterprise teams looking for production-grade multi-agent frameworks (use LangGraph, AutoGen, or CrewAI — this is not that)
  • Anyone looking for a proven-impervious security design (see THREAT_MODEL.md — the recall of LLM-judge components is unmeasured as of publication)
  • Tool-use or agent-SaaS builders — the threat model and design assume a single operator, not multi-tenancy

Before reading

Read THREAT_MODEL.md first. Every security claim in this spec is bounded by that document. The spec's "cannot be bypassed" language describes code-layer enforcement of routing and tagging invariants — it does NOT claim the Gate's LLM-based semantic-injection detection is impervious. The threat model is the authoritative statement of scope.

What's novel here

Most of this spec is recombination of existing patterns (LangGraph supervisor/worker, Simon Willison's dual-LLM, AutoGen nested chats, Constitutional AI — see PRIOR_ART.md for the full mapping). The genuinely novel contributions, confirmed by specialist review before publication:

  1. Plugin-boundary discipline as a stateable principle: "if it's deterministic, it's code; if it's judgment, it's prompt." Used to decide where every security invariant lives.
  2. <untrusted> tag preservation through distillation — multi-turn memory-poisoning defense. Most memory systems launder untrusted content into trusted priors during summarization.
  3. Operator-in-the-loop self-modification with generated-per-proposal adversarial red-teaming. Differs from fully-autonomous Constitutional AI by preserving human authority with measurable test gates.
  4. Three-path trust-tier update protocol anchored on platform snowflake, with CLI / DM-confirm / organic-with-adversarial-review paths and no auto-elevation.

Everything else is either a well-known pattern applied carefully (Gate, filesystem bus, typed envelopes) or an implementation detail of OpenClaw hook wiring.

Measured security claims

Claims labeled "no exceptions" or "cannot be bypassed" in this spec refer to code-layer enforcement (plugin hooks on deterministic paths). LLM-judge components (the Personality Gate's semantic-injection detection, the adversarial-review sub-agent's grooming-pattern heuristics) are bounded by the judge model's capability and are unmeasured as of publication. See THREAT_MODEL.md §Residual Risks and the calibration goal in 05-security-deepdive.md §7.5.

Deployment framing

This document uses named identities throughout to keep the design concrete: "the principal" (the single operator), and three friend-tier interlocutors (friend_alpha, friend_beta, friend_gamma). The principal's AI companion is named Ryn. These are illustrative — the design is deployment-agnostic. Discord snowflakes like 000000000000000001 are obvious placeholders, not real IDs.

Rule and content files referenced as private (gate/rules.md, gate/adversarial_suite.md, interlocutor memory under memory/interlocutors/*/) are deployment-specific and not included in this repository. Schemas and stub examples are in the companion trust-gate-plugin repository.

Brain architecture note

The spec documents two brain implementations — Hybrid (shipping) and Federation (deferred). Hybrid is a single persistent session with Task-tool sub-agents, ~40% lower latency and ~37% lower cost than Federation, and is the recommended architecture for v1.0. Federation is documented as a speculative alternative in 06-appendix-federation.md with a clear marker. Readers focused on getting a companion running should treat Hybrid as the spec; Federation is design thinking preserved for future experimentation.


Scope

Reference design and architecture for an AI companion running on OpenClaw. Documents the shared infrastructure (trust tiers, Gate, memory, identity, cron) plus the Hybrid brain implementation. Federation brain is documented as an experimental alternative in an appendix.

Transport: OpenClaw MCP tools (sessions_spawn, sessions_send, sessions_yield, sessions_list, sessions_history) + Task tool (for sub-agent spawning in Hybrid) + shared workspace filesystem.

First-ship scope (v1.0.0) — start here

Per strategic review, the first public release is a minimum lovable artifact focused on what other people can actually benefit from today. If you're reading this repository for the first time, read these five files and stop:

The deeper-dive documents (02-protocol.md, 03-enforcement.md, 04-state-and-ops.md, 05-security-deepdive.md, 06-appendix-federation.md) are included in this repository but marked as reference implementation detail. They are most useful to OpenClaw plugin authors or readers who want to see the full internal spec; casual readers should start with the five files above and dip into the deeper dives as specific questions arise.

Rationale: the broadly reusable contribution is the design pattern (plugin-boundary discipline, untrusted-tag preservation through distillation, three-path trust-tier elevation), not the exact OpenClaw plumbing. Leading with the pattern lets more readers extract value without wading through implementation specifics tied to one runtime.

Full document map

All 10 files ship as canonical v1.0.0 content. This table is the complete index; use it to jump to a specific topic after the first read.

# Document What it covers
0 THREAT_MODEL.md Read first. Adversaries in and out of scope, residual risks, posture guidance.
1 01-concepts.md Design principles, two-brain architecture, untrusted content handling. The mental model.
2 02-protocol.md Message envelope, Hybrid brain, fast path, versioning. The wire-level protocol.
3 03-enforcement.md Plugin-layer enforcement, OpenClaw runtime substrate. The code-vs-prompt boundary.
4 04-state-and-ops.md State files, locking, error handling, cost/latency, implementation checklist. The ops story.
5 05-security-deepdive.md Personality Gate, trust tiers, approval workflows, open questions.
6 06-appendix-federation.md Speculative. Federation brain as an experimental alternative. Not part of v1.0 shipping path.
trust-gate-plugin.md Plugin reference for the OSS enforcement plugin. Hook registrations, tier resolution, identity-snapshot file formats, config knobs, operational notes. Reflects the running implementation; pairs with 03-enforcement.md and 05-security-deepdive.md.
PRIOR_ART.md Antecedents in multi-agent orchestration + LLM security. OWASP LLM Top 10 mapping.
CHANGELOG.md Published-version changes.

Cross-references

Section numbers (§X.Y) inside the subdocs refer to sections that may live in a different subdoc in this split. The section numbers are preserved from the internal monolithic spec for traceability; search any subdoc for the literal header (e.g., ## 2.4 or ### 7.3) to locate it.

Authoritative anchor: the section number is the identifier. The file a section lives in can change across revisions.

License & citation

See LICENSE (Apache-2.0). If you adopt patterns from this spec, cite both this document and the underlying prior art; see PRIOR_ART.md.

About

Ryn Orchestrator Reference Design — trust-tiered conversation security for single-operator LLM companions

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors