Surfaces and integrations

Current external surface map.

CLI

Main file: src/dory_cli/main.py

Packaged commands:

  • dory
  • dory-http
  • dory-mcp

Global flags on dory:

  • --corpus-root (Path, optional)
  • --index-root (Path, optional)
  • --auth-tokens-path (Path, optional)

Top-level dory commands:

  • init
  • wake: --budget (default 600), --agent (default "codex"), --profile
  • active-memory: prompt (required), --agent, --cwd, --project, --session-key, repeated --session-id, repeated --session-agent, repeated --device, repeated --session-status, --since, --until, --include-wake/--no-include-wake
  • memory-write: content (required), --subject (required), --action (default "write"), --kind (default "fact"), --scope, --confidence, --reason, --source, --agent, --session-id, --origin-surface, --soft/--no-soft, --dry-run/--no-dry-run, --force-inbox/--no-force-inbox, --allow-canonical/--no-allow-canonical
  • proposals (review queue for semantic memory writes):
    • create: content and --subject required; accepts semantic write fields plus --proposal-id, repeated --source-path, and provenance flags
    • list: --status (pending, applied, rejected)
    • show: proposal_id, --status
    • apply: proposal_id, optional provenance flags
    • reject: proposal_id, --reason
  • purge: target (required), --expected-hash, --reason, --dry-run/--no-dry-run, --allow-canonical/--no-allow-canonical, --include-related-tombstone/--no-include-related-tombstone
  • research: question (required), --kind (default "report"), --corpus (default "all"), --limit (default 8), --save/--no-save
  • migrate: legacy_root (required), --llm/--no-llm, --jobs, --estimate, --interactive, --folder (repeatable), --sample, --pricing-file
  • search: query (required), -n/--limit (default 10), --corpus (default "durable"), --mode (default "hybrid"), --path-glob, --type (repeatable), --status (repeatable), --tag (repeatable), repeated --agent, repeated --device, repeated --session-id, --session-key, --since, --until
  • get: path (required), --from (default 1), --lines/-n
  • status
  • reindex: --force
  • neighbors: path (required), --direction (default "out"; accepts out, in, both), --depth (default 1), --max-edges (default 40), repeated --exclude-prefix
  • backlinks: path (required)
  • lint

Nested command groups:

  • auth
    • new: name (required)
  • dream
    • list
    • apply: proposal_id (required)
    • distill: session_path (required), --agent
    • propose: distilled_id (required)
    • reject: proposal_id (required)
  • maintain
    • inspect: path (required), --write-report
    • wiki-health: --write-report
    • backfill-privacy-metadata: --path (repeatable), --refresh, --apply
  • ops
    • dream-once: --session (repeatable explicit legacy path; default input is digests/recall)
    • daily-digest-once: --date, --today, --overwrite, --dry-run, --min-age-minutes, --limit, --reindex/--no-reindex
    • maintain-once: --path (repeatable)
    • wiki-health: --write-report
    • wiki-refresh-once
    • wiki-refresh-indexes
    • eval-once: --reindex/--no-reindex (default True), --questions-root (default eval/public/questions), --runs-root, --top-k, --corpus-root (override the configured corpus, e.g. examples/corpus for the public suite), --index-root (override the per-run index)
    • watch: --debounce-seconds (default 1.0), --dream/--no-dream (default True), --poll-interval (default 0.25)
  • eval
    • run: question_id (optional), --questions-root, --runs-root, --top-k, --list-only

Note: neighbors, backlinks, and lint are top-level commands, not nested under link. Treat CLI help and tests as authoritative.

Useful wrapper:

  • scripts/codex/dory — runs python -m dory_cli.main, defaults corpus root to data/corpus and index root to .dory/index
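
As a rough illustration of how the flags listed above combine, the sketch below assembles an argument vector for dory search from Python. It uses only flags and defaults documented in this section. Placing the global flags before the subcommand is an assumption, and build_search_argv is a hypothetical helper, not part of dory_cli.

```python
import shlex
from typing import Optional


def build_search_argv(
    query: str,
    *,
    limit: int = 10,
    corpus: str = "durable",
    mode: str = "hybrid",
    corpus_root: Optional[str] = None,
    index_root: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `dory search` (hypothetical helper)."""
    argv = ["dory"]
    # Assumption: global flags are passed before the subcommand name.
    if corpus_root is not None:
        argv += ["--corpus-root", corpus_root]
    if index_root is not None:
        argv += ["--index-root", index_root]
    # Defaults mirror the documented CLI defaults (limit 10, durable, hybrid).
    argv += ["search", query, "--limit", str(limit), "--corpus", corpus, "--mode", mode]
    return argv


if __name__ == "__main__":
    argv = build_search_argv(
        "deployment notes",
        corpus_root="data/corpus",
        index_root=".dory/index",
    )
    # Print a shell-quoted command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

The resulting list can be handed to subprocess.run(argv, check=True) on a machine where the dory command is installed; dory --help remains the authoritative reference for flag placement.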

HTTP API

Main file: src/dory_http/app.py

| Method | Path | Request Model | Notes |
| --- | --- | --- | --- |
| POST | /v1/wake | WakeReq | |
| POST | /v1/search | SearchReq | |
| POST | /v1/active-memory | ActiveMemoryReq | |
| POST | /v1/research | ResearchReq | returns composite {artifact, research} |
| POST | /v1/migrate | MigrateReq | response via asdict() on dataclass |
| GET | /v1/get | query params: path, from, lines | no Pydantic body |
| POST | /v1/write | WriteReq | |
| POST | /v1/purge | PurgeReq | hard-delete exact scratch/generated artifacts with hash guard |
| POST | /v1/memory-write | MemoryWriteReq | |
| POST | /v1/memory-proposals | MemoryProposalCreateReq | creates a reviewable semantic write proposal with a dry-run preview |
| GET/POST | /v1/memory-proposals / /v1/memory-proposals/list | query/body status | lists pending/applied/rejected proposals |
| GET/POST | /v1/memory-proposals/{proposal_id} / /v1/memory-proposals/get | path/body id | fetches a proposal by id |
| POST | /v1/memory-proposals/apply | MemoryProposalApplyReq | applies a pending proposal after stale-route validation |
| POST | /v1/memory-proposals/reject | MemoryProposalRejectReq | archives a pending proposal as rejected |
| POST | /v1/recall-event | RecallEventReq | |
| GET | /v1/public-artifacts | none | |
| POST | /v1/session-ingest | SessionIngestReq | |
| POST | /v1/link | LinkReq | |
| GET | /v1/status | none | |
| GET | /v1/tools | none | live MCP tool schema for HTTP bridges |
| GET | /healthz | none | unauthenticated container healthcheck |
| GET | /metrics | none | plain text, Prometheus format |
| GET | /v1/stream | query params: reindex, force | SSE with status, reindex, error, done events |
| GET | / | none | redirects to /app |
| GET | /app | none | browser Dory home with status, wiki, and proposal summary |
| GET | /app/proposals | query params: status, selected, notice | browser proposal review queue |
| POST | /app/proposals/{proposal_id}/apply | form post | applies a pending proposal from the browser UI |
| POST | /app/proposals/{proposal_id}/reject | form post | rejects a pending proposal from the browser UI |
| GET | /app/settings | none | browser runtime/settings view |
| GET | /wiki | none | browser wiki index |
| GET/POST | /wiki/login | form fields | cookie-backed wiki login |
| GET | /wiki/search | query params: q, limit | browser wiki search |
| GET | /wiki/{page} | path param | browser wiki page renderer |

Notes:

  • HTTP auth is fail-closed by default: bearer tokens are enforced unless DORY_ALLOW_NO_AUTH=true. /healthz stays unauthenticated for container healthchecks. Browser /app and /wiki routes use their own web session login. Browser login requires DORY_WEB_PASSWORD; if unset, /wiki/login form submission returns 503.
  • /v1/search can use src/dory_core/retrieval_planner.py when an OpenRouter client is configured and the relevant DORY_QUERY_* feature flags are enabled, including strict-schema result selection over the final candidate set. /v1/active-memory uses the same planner/composer, but picks its backend from DORY_ACTIVE_MEMORY_LLM_PROVIDER (off / local / openrouter / auto); the local path targets any OpenAI-compatible endpoint via DORY_LOCAL_LLM_*. Planner/composer/selection failures fall back to deterministic behavior instead of failing the request.
  • Semantic memory-write responses include the planned or written evidence_path, matched_by, and dry-run preview metadata. Successful resolved mutations still write immutable semantic evidence artifacts. Parity coverage: tests/integration/http/test_memory_write_http.py.
  • No route declares a response_model, so OpenAPI response schemas aren't auto-generated.
  • Error responses use the contract envelope {"error": {"code", "message", "type", "request_id"?}} for all /v1/*, /healthz, /metrics, and /v1/stream routes. Validation failures (422) include an errors list. Browser /app and /wiki routes still use FastAPI's default {"detail": "..."} for uncaught errors because the browser surface is HTML-redirect-driven, not part of the JSON v1 contract.
  • /v1/stream query params reindex and force trigger an optional reindex during the stream.
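
The bearer-token and error-envelope notes above can be sketched from a client's point of view. This is a minimal illustration, not Dory's client code: DoryHTTPError, build_request, and parse_error_envelope are hypothetical names, the {"query": ...} payload is an assumed shape for SearchReq, and the position of the 422 errors list (inside or beside the error object) is not stated here, so the sketch checks both.

```python
import json
import urllib.request
from typing import Any, Optional


class DoryHTTPError(Exception):
    """Hypothetical client-side error type; not part of Dory itself."""

    def __init__(
        self,
        status: int,
        code: str,
        message: str,
        type_: str = "",
        request_id: Optional[str] = None,
        errors: Optional[list] = None,
    ) -> None:
        super().__init__(f"HTTP {status} [{code}] {message}")
        self.status = status
        self.code = code
        self.message = message
        self.type = type_
        self.request_id = request_id
        self.errors = errors or []


def build_request(
    base_url: str, path: str, payload: dict[str, Any], token: Optional[str]
) -> urllib.request.Request:
    """Build a JSON POST request carrying an Authorization: Bearer header."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def parse_error_envelope(status: int, body: dict[str, Any]) -> DoryHTTPError:
    """Map a decoded error body to a typed exception."""
    err = body.get("error")
    if isinstance(err, dict):
        # JSON v1 contract: {"error": {"code", "message", "type", "request_id"?}}
        return DoryHTTPError(
            status,
            code=str(err.get("code", "unknown")),
            message=str(err.get("message", "")),
            type_=str(err.get("type", "")),
            request_id=err.get("request_id"),
            # Assumption: the 422 "errors" list may sit in either position.
            errors=err.get("errors") or body.get("errors"),
        )
    # Browser routes fall back to FastAPI's default {"detail": "..."} shape.
    return DoryHTTPError(status, code="unknown", message=str(body.get("detail", "")))
```

A caller would send the request with urllib.request.urlopen, catch urllib.error.HTTPError, decode its body as JSON, and pass the status and body to parse_error_envelope so that recovery logic can branch on code and type.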

Native MCP bridge

Main files:

  • src/dory_mcp/server.py
  • src/dory_mcp/tools.py

Transports:

  • stdio
  • TCP

Implemented MCP tools (16):

Tool Required Optional
dory_wake budget_tokens, agent, profile, project, include_recent_sessions, include_pinned_decisions
dory_active_memory prompt, agent cwd, project, scope, profile, timeout_ms, budget_tokens, include_wake, rerank
dory_research question kind, corpus, limit, save
dory_search query k, mode, corpus, scope, include_content, min_relevance_score, rerank, debug
dory_digest kind, date, week, from_line, lines, debug
dory_get path from, lines
dory_memory_write action, kind, subject, content scope, confidence, source, soft, dry_run, force_inbox, allow_canonical, agent, session_id, origin_surface, reason
dory_memory_propose action, kind, subject, content scope, confidence, source, soft, force_inbox, agent, session_id, origin_surface, source_paths, proposal_id, reason
dory_memory_proposals status
dory_memory_proposal_get proposal_id status
dory_memory_proposal_apply proposal_id agent, session_id, origin_surface
dory_memory_proposal_reject proposal_id reason, agent, session_id, origin_surface
dory_write kind, target content, soft, dry_run, frontmatter, agent, session_id, expected_hash, reason
dory_purge target expected_hash, reason, dry_run, allow_canonical, include_related_tombstone
dory_link op path, direction, depth, max_edges, exclude_prefixes
dory_status

Notes:

  • Native MCP schemas expose the finalized tool fields: search mode aliases, per-request rerank control, debug-only search internals, wake profiles, active-memory limits, dry-run write guards, purge guards.
  • Search hides score, score_normalized, rank_score, and frontmatter by default. Use debug=true only for retrieval diagnostics.
  • dory_digest reads daily/weekly recap files directly from digests/daily/, digests/weekly/, and the legacy logs/daily/ and logs/weekly/ directories; it does not depend on a digest tag or hybrid-search indexing.
  • dory_write is the exact-path write surface; dory_memory_write is semantic. Use dory_memory_propose when a write needs human/agent review before promotion.
  • dory_get mirrors the HTTP metadata payload (from, lines_returned, total_lines, frontmatter, hash, content) and accepts both native from and legacy from_line.
  • dory_get reads exact paths inside the configured Dory corpus. Repo paths cited as external implementation evidence are not guaranteed to be retrievable unless they are also present in that corpus.
  • Native dory_search and dory_active_memory share retrieval/runtime behavior with CLI and HTTP because they call the same SearchEngine and ActiveMemoryEngine. Search scope can narrow recall/session retrieval by agent, device, session_id, session_key, status, since, and until. LLM-assisted query planning/reranking is opt-in.
  • Mutating HTTP errors use structured detail.code, detail.message, and detail.type fields for agent-friendly recovery.
  • TCP MCP requires a bearer token loaded from the same auth-tokens.json file as the HTTP daemon. Clients pass it as params._auth.token on the first JSON-RPC request over a connection; the connection latches as authorized for its lifetime. Stdio MCP has no auth (process-local trust). Set --allow-no-auth (or DORY_ALLOW_NO_AUTH=true) only on trusted local-only deployments.
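
The TCP auth handshake described above (token in params._auth.token on the first JSON-RPC request, after which the connection stays authorized) might look like this on the client side. This is a sketch under assumptions: FirstRequestAuth is a hypothetical helper, and the wire framing (for example newline-delimited JSON) is not specified in this document, so the code only produces the serialized message.

```python
import json
from typing import Any


def attach_auth(message: dict[str, Any], token: str) -> dict[str, Any]:
    """Return a copy of a JSON-RPC request with params._auth.token set."""
    params = dict(message.get("params") or {})
    params["_auth"] = {"token": token}
    return {**message, "params": params}


class FirstRequestAuth:
    """Attach the bearer token to the first outgoing request only.

    Mirrors the documented behavior that the server latches the connection
    as authorized after the first authenticated request.
    """

    def __init__(self, token: str) -> None:
        self._token = token
        self._sent_first = False

    def prepare(self, message: dict[str, Any]) -> str:
        if not self._sent_first:
            message = attach_auth(message, self._token)
            self._sent_first = True
        # Compact separators keep the serialized message small.
        return json.dumps(message, separators=(",", ":"))
```

One FirstRequestAuth instance would be created per TCP connection; a reconnect needs a fresh instance so the token is sent again.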

Claude Code bridge

Main file: scripts/claude-code/dory-mcp-http-bridge.py

This is not the same implementation as the native MCP server. It is a separate bridge that forwards tool calls to the Dory HTTP API.

Implemented bridge tools (16):

Tool Key differences from native
dory_wake adds defaults: budget=1200, profile="coding", agent="claude-code", sessions=0, pinned=True
dory_search adds mode enum (bm25|text|keyword|lexical|vector|semantic|hybrid|recall|exact), rerank enum (auto|true|false), debug, default k=5
dory_digest HTTP-backed direct daily/weekly digest lookup; defaults to latest daily digest
dory_research HTTP-backed research call with bounded artifact options
dory_active_memory HTTP-backed staged active-memory call with defaults and optional project, scope, include_wake, and rerank
dory_get accepts native from and legacy from_line; adds defaults
dory_link adds op enum (neighbors|backlinks|lint), direction enum (out|in|both), max_edges, and exclude_prefixes
dory_memory_write adds kind enum plus dry_run, force_inbox, allow_canonical
dory_memory_propose creates dry-run-backed semantic write proposals in inbox/proposed/
dory_memory_proposals lists proposals by status
dory_memory_proposal_get fetches a proposal by id/status
dory_memory_proposal_apply applies a proposal through HTTP after stale-route validation
dory_memory_proposal_reject archives a proposal with an optional reason
dory_write exact-path write with dry_run support
dory_purge hard-delete exact scratch/generated artifacts with dry-run/hash guards
dory_status shorter description

Known issues:

  • Bridge fetches live tool schemas from /v1/tools and falls back to bundled schema if the server can't provide them.
  • Bridge and native MCP tool results use compact JSON to reduce tool-result tokens.
  • Already-open agent sessions may need restart after schema changes — MCP hosts can cache tool schemas for the running process.
  • Bridge forwards DORY_HTTP_TOKEN / DORY_CLIENT_AUTH_TOKEN as Authorization: Bearer ....
  • Bridge returns structured HTTP/transport error envelopes from _perform_request() instead of flattening to naked strings.
  • Bridge defaults to http://127.0.0.1:8766; installed agents should set DORY_HTTP_URL / ~/.config/dory/env for remote or TLS deployments.
  • Bridge has a 30-second HTTP timeout; native MCP has no timeout.
  • Bridge inherits HTTP retrieval-planner behavior for dory_search and dory_active_memory; it doesn't implement its own planner.
  • Before dory_wake, the bridge runs a one-shot local session sync through scripts/ops/client-session-shipper.py by default. It reads ~/.config/dory/client.env when present, uses the configured spool/checkpoint paths, and can be disabled with DORY_SYNC_SESSIONS_ON_WAKE=false.
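
The environment-driven settings mentioned in the list above can be summarized in a small resolver. This is a sketch, not the bridge's code: BridgeConfig and load_bridge_config are hypothetical, the precedence of DORY_HTTP_TOKEN over DORY_CLIENT_AUTH_TOKEN is an assumption, and only the literal value false is treated as disabling session sync because that is the only value documented here.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_URL = "http://127.0.0.1:8766"  # documented bridge default
TIMEOUT_SECONDS = 30.0                 # documented bridge HTTP timeout


@dataclass(frozen=True)
class BridgeConfig:
    base_url: str
    token: Optional[str]
    timeout_seconds: float
    sync_sessions_on_wake: bool


def load_bridge_config(env: Mapping[str, str]) -> BridgeConfig:
    """Resolve bridge settings from an environment-like mapping."""
    base_url = (env.get("DORY_HTTP_URL") or DEFAULT_URL).rstrip("/")
    # Assumption: DORY_HTTP_TOKEN wins when both token variables are set.
    token = env.get("DORY_HTTP_TOKEN") or env.get("DORY_CLIENT_AUTH_TOKEN") or None
    # Session sync on wake is on by default and disabled with "false".
    raw_sync = env.get("DORY_SYNC_SESSIONS_ON_WAKE", "true")
    sync = raw_sync.strip().lower() != "false"
    return BridgeConfig(base_url, token, TIMEOUT_SECONDS, sync)
```

Passing os.environ gives the live settings; passing a plain dict keeps the function easy to exercise in tests.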

Hermes integration

Main files:

  • plugins/hermes-dory/provider.py
  • plugins/hermes-dory/config.example.yaml
  • plugins/hermes-dory/README.md

Provider methods:

  • wake
  • search
  • get
  • write
  • memory_write
  • memory_propose
  • memory_proposals
  • memory_proposal_get
  • memory_proposal_apply
  • memory_proposal_reject
  • research
  • publish_research
  • link
  • status
  • prefetch
  • build_memory_section
  • store_memory
  • sync_memories

Search modes:

  • Accepts hybrid, recall, bm25, text, keyword, vector, exact
  • Accepts legacy compatibility names lexical and semantic
  • Legacy names normalized before HTTP:
    • text, keyword, lexical → bm25
    • semantic → vector
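
The alias normalization above amounts to a small lookup. The following is a minimal sketch of that mapping, not the provider's actual code: normalize_search_mode is a hypothetical name, and the case-insensitive matching and the ValueError on unknown modes are assumptions.

```python
# Alias table taken from the list above.
_ALIASES = {
    "text": "bm25",
    "keyword": "bm25",
    "lexical": "bm25",
    "semantic": "vector",
}
# Modes that pass through unchanged.
_CANONICAL = {"bm25", "vector", "hybrid", "recall", "exact"}


def normalize_search_mode(mode: str) -> str:
    """Map an accepted search mode or alias to the value sent over HTTP."""
    key = mode.strip().lower()  # assumption: matching ignores case and padding
    key = _ALIASES.get(key, key)
    if key not in _CANONICAL:
        raise ValueError(f"unsupported search mode: {mode!r}")
    return key
```

For example, normalize_search_mode("lexical") yields "bm25", while "hybrid" passes through unchanged.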

Known issues:

  • When no external client is provided, the provider keeps a reusable owned httpx.Client.
  • wake, active_memory, prefetch_bundle, and build_memory_section can forward a project handle. active_memory, prefetch_bundle, and build_memory_section can also forward session recall scope. memory_write can forward agent, session_id, and origin_surface provenance.
  • Hermes exposes the same proposal review queue as MCP/HTTP, so lightweight agents can propose memory without committing it immediately.
  • All provider-side errors raise DoryProviderError (a RuntimeError subclass) so callers can branch on a stable type without losing RuntimeError compatibility.
  • Hermes parity tests now assert semantic artifact creation on memory_write(write|forget) in tests/integration/http/test_hermes_shim_contract.py.
  • publish_research is a Hermes convenience wrapper around HTTP POST /v1/write; it creates knowledge/research/<timestamp>-<title>.md with type: knowledge and source_kind: research, defaults to dry-run, and Dory incrementally indexes the file on live writes. See hermes-research-publish.md for the exact runbook.

OpenClaw integration

Main files:

  • packages/openclaw-dory/src/index.ts
  • packages/openclaw-dory/openclaw.plugin.json
  • packages/openclaw-dory/package.json

Registers:

  • memory_search
  • memory_get
  • memory_write

Also implements:

  • status probing
  • recall-event submission
  • public artifact listing
  • Dory-backed flush planning
  • request timeout handling for Dory HTTP calls; timeoutMs / timeout_ms can override the default 10 second timeout

Search-mode normalization:

  • OpenClaw-side modes like query and vsearch are mapped to API values bm25 and vector.

Known issues:

  • memory_search forwards corpus=all|sessions to Dory and maps the OpenClaw sessionKey into SearchScope.session_key. The OpenClaw manager also maps sessionKey into active-memory scope when callers request active memory.
  • Debug metadata reports whether sessionKey was applied.
  • probeEmbeddingAvailability() and probeVectorAvailability() fail closed when Dory can't prove vectors are available.
  • search() debug hooks surface backend warnings such as query-expansion fallback.
  • Planner fallback warnings from Dory search surface through the same debug-warning channel.
  • status() is still a cached snapshot, but custom.statusSource, custom.statusAgeMs, and custom.statusStale make freshness explicit. sync() refreshes status opportunistically.
  • CI builds the plugin with npm ci && npm run build before Python packaging. The Python sdist force-includes packages/openclaw-dory/dist/index.js and excludes packages/openclaw-dory/node_modules/.

Common request models

Shared typed request/response models live in src/dory_core/types.py.

Key enums:

  • Search mode input: bm25 | text | keyword | lexical | vector | semantic | hybrid | recall | exact — aliases normalize before execution
  • Search corpus: durable | sessions | all
  • Search scope: durable filters use path_glob, type, status, tags, since, until; session recall also honors agent, device, session_id, and session_key
  • Write kind: append | create | replace | forget
  • Semantic write action: write | replace | forget
  • Semantic write kind: fact | preference | state | decision | note
  • Wake profile: default | casual | coding | writing | privacy
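
To make the enum constraints above concrete, here is a small validation sketch for a semantic write. It is not the model defined in src/dory_core/types.py: MemoryWriteSketch is a hypothetical stand-in, the action and kind defaults mirror the CLI defaults documented earlier, and defaulting dry_run to True is a cautious choice for this example rather than a documented default.

```python
from dataclasses import asdict, dataclass
from typing import Any

SEMANTIC_ACTIONS = ("write", "replace", "forget")
SEMANTIC_KINDS = ("fact", "preference", "state", "decision", "note")


@dataclass(frozen=True)
class MemoryWriteSketch:
    """Hypothetical stand-in for a semantic memory-write request."""

    subject: str
    content: str
    action: str = "write"   # CLI default for --action
    kind: str = "fact"      # CLI default for --kind
    dry_run: bool = True    # assumption: preview first in this sketch

    def __post_init__(self) -> None:
        # Reject values outside the documented enums before any request is sent.
        if self.action not in SEMANTIC_ACTIONS:
            raise ValueError(f"action must be one of {SEMANTIC_ACTIONS}")
        if self.kind not in SEMANTIC_KINDS:
            raise ValueError(f"kind must be one of {SEMANTIC_KINDS}")
        if not self.subject or not self.content:
            raise ValueError("subject and content are required")

    def to_payload(self) -> dict[str, Any]:
        """Return a plain dict suitable for JSON encoding."""
        return asdict(self)
```

Validating on the client keeps obviously invalid enum values from reaching the server, although the server-side models remain the source of truth.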

Best surface validation tests

  • tests/integration/http/test_http_routes.py
  • tests/integration/http/test_memory_write_http.py
  • tests/integration/http/test_memory_proposals_http.py
  • tests/integration/cli/test_semantic_write_commands.py
  • tests/integration/cli/test_proposals_commands.py
  • tests/integration/core/test_proposal_generation.py
  • tests/integration/cli/test_purge_command.py
  • tests/integration/core/test_purge_flow.py
  • tests/integration/http/test_session_ingest_http.py
  • tests/integration/http/test_research_http.py
  • tests/integration/mcp/test_stdio_server.py
  • tests/integration/mcp/test_tcp_server.py
  • tests/integration/mcp/test_http_bridge.py
  • tests/integration/mcp/test_tool_schema.py
  • tests/integration/acceptance/test_phase4_multiface.py

Surface drift to watch

  • CLI neighbors/backlinks/lint are top-level, not dory link subcommands as the spec defines.
  • TCP MCP enforces bearer auth (params._auth.token on the first request); stdio MCP is process-local trust. HTTP enforces bearer via the Authorization header.
  • Claude Code bridge keeps from_line as a legacy alias but accepts native from.
  • Native MCP and HTTP validate link paths against the corpus root before graph operations.
  • HTTP error responses follow the contract envelope {"error": {"code", "message", "type"}}. Wiki routes intentionally still use the FastAPI default {"detail": "..."}.
  • No HTTP route declares a response_model, so OpenAPI docs are incomplete.
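
The from / from_line alias noted in the list above can be handled with a small argument resolver. This is a sketch only: resolve_start_line is a hypothetical helper, the default of 1 follows the CLI's --from default, and the handling of conflicting values and non-positive line numbers is an assumption rather than documented behavior.

```python
from typing import Any, Mapping


def resolve_start_line(args: Mapping[str, Any]) -> int:
    """Pick the start line from native `from` or legacy `from_line`."""
    has_native = "from" in args
    has_legacy = "from_line" in args
    # Assumption: supplying both with different values is a caller error.
    if has_native and has_legacy and args["from"] != args["from_line"]:
        raise ValueError("from and from_line disagree")
    if has_native:
        value = args["from"]
    elif has_legacy:
        value = args["from_line"]
    else:
        value = 1  # matches the CLI default for --from
    start = int(value)
    # Assumption: line numbers are 1-indexed.
    if start < 1:
        raise ValueError("start line must be >= 1")
    return start
```

Resolving the alias at the edge lets the rest of a client use a single name for the start line.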