Retrieval is not just search. It is the process by which an agent assembles the right memory before it answers a question, changes code, reviews a design, or writes a new entry.
If architecture defines how memory is stored, retrieval defines how that memory becomes usable.
Good retrieval does two things at once:
- it brings in the rules that should shape the current task
- it keeps irrelevant memory out of the context window
Poor retrieval usually looks like one of these failures:
- the agent misses a constraint that should have applied
- the agent loads too much low-value context and buries the important rule
- the agent combines entries from the wrong scope and invents a contradiction
- the agent treats stale memory as current memory
Retrieval quality is therefore part of the memory model itself, not a convenience feature layered on top.
Retrieval has two stages:
- choose the right scope
- apply entries in the right authority order
That order matters. A good retrieval system does not start by asking "what looks semantically similar?" It starts by asking:
- which wing is active
- which room is most likely relevant
- which entry types must be applied first
In practice, retrieval is healthiest when it is:
- scoped before it is widened
- ordered by authority, not similarity alone
- filtered to the smallest useful set
- resilient when the preferred source is unavailable
Within retrieved memory, apply entries in this order:
- Invariants — hard constraints
- Decisions — current design direction
- Patterns — reusable ways of solving the problem
- Notes — supporting context
Entries with status: deprecated stay out of default retrieval unless explicitly requested.
The reasoning is straightforward:
- invariant answers what must not be violated
- decision answers which direction is currently authoritative
- pattern answers how similar problems are usually solved
- note answers what extra context may still help
Lower-priority entries must not silently override higher-priority ones.
If memory contains:
- one invariant saying callbacks must not block
- one decision saying this project uses a single-threaded executor by default
- one pattern showing how to offload long-running work
- one note describing a past debugging session
the note may add context, but it must never weaken the invariant or reinterpret the decision.
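The authority order above can be sketched in a few lines of Python. This is a minimal illustration, not a backend implementation: the `Entry` class, the `AUTHORITY` table, and the entry titles are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical priority table: a lower number means higher authority.
AUTHORITY = {"invariant": 0, "decision": 1, "pattern": 2, "note": 3}


@dataclass(frozen=True)
class Entry:
    title: str
    type: str              # "invariant" | "decision" | "pattern" | "note"
    status: str = "active"


def order_by_authority(entries, include_deprecated=False):
    """Drop deprecated entries unless requested, then sort by type authority."""
    kept = [e for e in entries if include_deprecated or e.status != "deprecated"]
    # sorted() is stable, so entries of the same type keep their incoming order.
    return sorted(kept, key=lambda e: AUTHORITY[e.type])


# Illustrative entries mirroring the example above, plus one deprecated entry.
entries = [
    Entry("Past debugging session", "note"),
    Entry("Offload long-running work", "pattern"),
    Entry("Callbacks must not block", "invariant"),
    Entry("Old executor guidance", "decision", status="deprecated"),
    Entry("Single-threaded executor by default", "decision"),
]

ordered = order_by_authority(entries)
for entry in ordered:
    print(f"{entry.type:<10} {entry.title}")
```

Sorting only fixes the order in which entries are applied. Making sure a note never weakens an invariant is still the job of whatever consumes the ordered list.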
Retrieval is scoped to the active wing by default.
If an agent is working on a ROS 2 project task, it should start with the relevant project wing or the shared ros2 wing rather than searching across unrelated memory.
Cross-wing retrieval is allowed, but it should be explicit.
In practice, retrieval usually starts from one of these scopes:
- a project wing such as lifecore_ros2
- a shared wing such as ros2 or react
- a specific room such as architecture or anti-patterns
The active scope should be narrow by default. Widen it only when the task clearly crosses boundaries.
If the task is "update a project-specific ROS 2 lifecycle node":
- retrieve from the project wing first
- retrieve from the shared ros2 wing second
- merge results
- let project entries override shared entries only when the override is explicit
If the task is "explain general ROS 2 callback behavior", the project wing may be unnecessary.
If the task is "review whether a local project rule duplicates shared React guidance", both project and shared wings are necessary from the start.
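The project-first merge from the lifecycle-node example could look roughly like the sketch below. It assumes each entry carries an `overrides` field naming the shared entry it replaces. That field, the `Entry` class, and the sample entry texts are hypothetical, used only to show what "explicit override" might mean in practice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    id: str
    wing: str
    text: str
    overrides: Optional[str] = None  # hypothetical explicit override marker


def merge_scopes(project, shared):
    """Merge project-wing and shared-wing entries, project entries first.

    A shared entry is dropped only when a project entry names it in
    `overrides`; without that marker both entries are kept.
    """
    overridden = {e.overrides for e in project if e.overrides is not None}
    kept_shared = [e for e in shared if e.id not in overridden]
    return list(project) + kept_shared


# Illustrative data only; these entries are not taken from a real wing.
project = [
    Entry("p1", "lifecore_ros2", "Driver node uses a multi-threaded executor",
          overrides="s1"),
    Entry("p2", "lifecore_ros2", "Lifecycle nodes log every transition"),
]
shared = [
    Entry("s1", "ros2", "Use a single-threaded executor by default"),
    Entry("s2", "ros2", "Callbacks must not block"),
]

merged = merge_scopes(project, shared)
print([e.id for e in merged])
```

Because nothing is dropped without a marker, a project entry that contradicts a shared entry without declaring an override stays visible next to it, where a reviewer can catch it.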
Within a scope, retrieval can be narrowed:
- By room
- By type
- By status
Default retrieval should focus on active entries and include under_review entries only when their uncertainty is made clear.
Useful defaults:
- retrieve by room when the task is tightly scoped
- retrieve by type when the task is evaluative, such as review or policy checking
- retrieve by status only when looking for stale or superseded content intentionally
These are conceptual query shapes, not backend-specific syntax:
project_wing=lifecore_ros2, room=architecture, type=invariant|decision, status=active
shared_wing=ros2, room=anti-patterns, type=invariant, status=active
project_wing=myapp, room=incident-log, status=under_review|active
project_wing=myapp, room=architecture|contracts, type=decision|pattern, status=active
The goal is not to search everything. The goal is to retrieve the smallest set of entries that can still govern the task correctly.
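As a rough illustration, the query shapes above can be read as plain filters over a list of entries. The sketch assumes a hypothetical in-memory `Entry` record and `query` helper. A real backend would expose its own query interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    wing: str
    room: str
    type: str
    status: str
    title: str


def query(entries, wing, rooms=None, types=None, statuses=("active",)):
    """Return entries in `wing` that pass the optional room and type filters.

    Status defaults to active only, so deprecated and under_review
    entries have to be requested explicitly.
    """
    return [
        e for e in entries
        if e.wing == wing
        and (rooms is None or e.room in rooms)
        and (types is None or e.type in types)
        and e.status in statuses
    ]


# Illustrative store; titles are made up for the example.
store = [
    Entry("lifecore_ros2", "architecture", "invariant", "active", "Callbacks must not block"),
    Entry("lifecore_ros2", "architecture", "note", "active", "Debugging session notes"),
    Entry("lifecore_ros2", "architecture", "decision", "deprecated", "Old executor choice"),
    Entry("lifecore_ros2", "incident-log", "decision", "active", "Retry policy after outage"),
    Entry("ros2", "anti-patterns", "invariant", "active", "No blocking calls in timers"),
]

# project_wing=lifecore_ros2, room=architecture, type=invariant|decision, status=active
governing = query(store, "lifecore_ros2",
                  rooms={"architecture"}, types={"invariant", "decision"})
print([e.title for e in governing])
```

The first query shape returns a single entry here. The note, the deprecated decision, and the incident-log entry are all filtered out before any ranking happens.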
Retrieval should degrade gracefully when the memory backend is unavailable, incomplete, or returns uncertain results.
Use this fallback order:
- memory backend
- in-repo architecture docs
- README or other overview docs
- workspace search
In shorthand:
memory backend -> docs -> README -> workspace search
The agent should not stop working just because the preferred retrieval layer failed. Memory is an input, not a prerequisite.
Suppose a task requires lifecycle rules, but the memory backend is down.
A sensible fallback is:
- read docs/architecture.md
- read README.md for the repo-level summary
- search the workspace for lifecycle-specific guidance
- continue the task while making the degraded retrieval path explicit
The fallback path should preserve momentum, not perfectly reproduce the backend.
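One way to sketch the fallback order is a list of sources tried in sequence, with every skipped source recorded so the degraded path stays explicit. The fetch functions below are hypothetical stubs standing in for a real backend client, file reads, and workspace search.

```python
def retrieve_with_fallback(task, sources):
    """Try each (name, fetch) pair in order.

    Returns (source_name, results, skipped), where `skipped` records why
    earlier sources were passed over.
    """
    skipped = []
    for name, fetch in sources:
        try:
            results = fetch(task)
        except Exception as exc:  # sketch only: real code would catch narrower errors
            skipped.append(f"{name}: {exc}")
            continue
        if results:
            return name, results, skipped
        skipped.append(f"{name}: no results")
    return None, [], skipped


# Hypothetical stubs for the four layers of the fallback order.
def memory_backend(task):
    raise ConnectionError("backend unavailable")


def architecture_docs(task):
    return ["docs/architecture.md: lifecycle rules"]


def readme(task):
    return ["README.md: repo-level summary"]


def workspace_search(task):
    return []


SOURCES = [
    ("memory backend", memory_backend),
    ("docs", architecture_docs),
    ("README", readme),
    ("workspace search", workspace_search),
]

source, results, skipped = retrieve_with_fallback("lifecycle rules", SOURCES)
print(source, results, skipped)
```

This version stops at the first source that returns anything. An agent could equally keep reading later sources to fill gaps, as long as it reports which layers it actually used.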
Retrieved memory consumes context-window space, so trimming it under pressure should be deliberate rather than accidental.
When context gets tight:
- keep invariants first
- keep decisions in full when they directly shape the task
- summarize patterns if needed
- include notes only when they add direct value
This is not an optimization detail. It is a retrieval policy.
- Keep all applicable invariant entries.
- Keep only the decision entries that shape the current task.
- Summarize matching pattern entries into short operational guidance if needed.
- Drop note entries unless they clarify uncertainty, explain a real exception, or prevent a likely mistake.
Suppose the agent is reviewing a change in a project wing with:
- 2 relevant invariants
- 3 active decisions
- 6 patterns
- 9 notes
Under context pressure, a good retrieval result may keep:
- both invariants in full
- the 2 decisions that directly constrain the touched component
- a short summary of the 2 most relevant patterns
- no notes at all
The point is not completeness. The point is preserving authoritative context.
When reducing retrieval output, compress in this order:
- drop irrelevant notes
- summarize patterns
- trim marginal decisions
- widen only if the remaining context is still insufficient
Do not compress invariants away.
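The compression order above can be expressed as a sequence of steps applied only while the result is still over budget. This is a sketch under stated assumptions: the `tokens`, `relevant`, and `summary_tokens` fields are hypothetical, and the numbers are invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entry:
    type: str                # "invariant" | "decision" | "pattern" | "note"
    title: str
    tokens: int              # hypothetical context cost of the full entry
    relevant: bool = True    # hypothetical flag: directly shapes the task
    summary_tokens: int = 0  # cost of a short form; 0 means none exists


def total(entries):
    return sum(e.tokens for e in entries)


def drop_irrelevant_notes(entries):
    return [e for e in entries if e.type != "note" or e.relevant]


def summarize_patterns(entries):
    # Swap a pattern for its short form when one exists and is smaller.
    return [
        replace(e, tokens=e.summary_tokens)
        if e.type == "pattern" and 0 < e.summary_tokens < e.tokens else e
        for e in entries
    ]


def trim_marginal_decisions(entries):
    return [e for e in entries if e.type != "decision" or e.relevant]


def compress(entries, budget):
    """Apply the steps in order, stopping once the budget is met.

    No step touches invariants, so they always survive.
    """
    current = list(entries)
    for step in (drop_irrelevant_notes, summarize_patterns, trim_marginal_decisions):
        if total(current) <= budget:
            break
        current = step(current)
    return current


entries = [
    Entry("invariant", "Callbacks must not block", 40),
    Entry("decision", "Single-threaded executor by default", 60),
    Entry("decision", "Marginal decision", 50, relevant=False),
    Entry("pattern", "Offload long-running work", 120, summary_tokens=30),
    Entry("note", "Unrelated debugging session", 80, relevant=False),
    Entry("note", "Explains a real exception", 20),
]

fitted = compress(entries, budget=200)
print(total(fitted), [e.title for e in fitted])
```

If the result is still over budget after all three steps, the function returns it anyway. At that point the caller has to decide what to do, because trimming invariants is not an option.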
For a code review on a lifecycle component:
- query project_wing/architecture
- query project_wing/anti-patterns
- query shared_wing/architecture or shared_wing/conventions if needed
- apply invariants first
- report any contradiction rather than resolving it silently
For modifying a repository-layer component:
- retrieve project architecture invariants and decisions
- retrieve shared framework patterns only if the change depends on them
- ignore incident logs unless the component has a known recurring failure
- keep only the entries that constrain the touched code path
For writing a new memory entry:
- retrieve the target room first
- retrieve nearby rooms only if overlap seems likely
- check the shared wing if the topic may be reusable
- decide whether to enrich or create only after that read
For explaining a general concept rather than editing code:
- start with the shared wing
- only bring in project memory if the user asks for project-specific behavior
- prefer decisions and patterns over notes
- include invariants whenever the explanation risks implying the wrong behavior
Retrieval usually fails in one of three ways:
- the agent uses an entry that does not exist
- the agent ignores an entry that does exist
- the agent applies an entry outside its scope
When that happens, the fix is usually to review:
- entry types
- scope assignment
- room placement
- status accuracy
- instruction-file behavior
Wrong-scope retrieval
- symptom: the agent applies a shared convention where a project override exists
- likely cause: retrieval skipped the project wing or ignored the override marker
- fix: query the project wing first and treat undocumented contradictions as review issues
Over-retrieval
- symptom: the agent cites many loosely related entries but misses the key rule
- likely cause: the query was too broad or had no room filter
- fix: narrow by room and keep authority order visible
Under-retrieval
- symptom: the answer is clean but violates an invariant
- likely cause: invariant retrieval was not unconditional
- fix: make invariant retrieval mandatory for the active scope
Stale retrieval
- symptom: the agent relies on an entry that no longer matches the current system
- likely cause: under_review or deprecated handling is weak
- fix: strengthen status handling and maintenance review
Similarity-led retrieval
- symptom: a semantically similar note outranks a binding decision
- likely cause: retrieval was driven by similarity without respecting type order
- fix: resolve scope and type before using similarity as a ranking aid
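The fix for similarity-led retrieval can be sketched as a two-part sort key: type authority first, similarity only as a tie-breaker within a type. The `AUTHORITY` table, the tuple layout, and the similarity scores below are assumptions made up for the example.

```python
# Hypothetical priority table: a lower number means higher authority.
AUTHORITY = {"invariant": 0, "decision": 1, "pattern": 2, "note": 3}


def rank(candidates):
    """Rank (type, title, similarity) tuples by authority, then by similarity."""
    return sorted(candidates, key=lambda c: (AUTHORITY[c[0]], -c[2]))


# Illustrative candidates, already restricted to the active scope.
candidates = [
    ("note", "Debugging session that mentions executors", 0.95),
    ("decision", "Single-threaded executor by default", 0.60),
    ("decision", "Executor configuration lives in launch files", 0.80),
    ("invariant", "Callbacks must not block", 0.30),
]

ranked = rank(candidates)
for entry_type, title, similarity in ranked:
    print(f"{entry_type:<10} {similarity:.2f}  {title}")
```

Here the note has the highest similarity score but still ranks last, because similarity only reorders entries that share a type.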
For most tasks, this sequence is enough:
- identify the active wing
- identify the most likely room
- retrieve invariant and decision entries first
- add pattern entries only if they help with implementation or explanation
- add note entries only if they clarify uncertainty
- fall back to local docs and workspace search if needed
- widen scope only when the task genuinely crosses boundaries
[ ] Active wing identified
[ ] Likely room identified
[ ] Invariants retrieved first
[ ] Decisions retrieved next
[ ] Patterns added only when useful
[ ] Notes added only when justified
[ ] Deprecated entries excluded by default
[ ] Fallback path available if the backend is unavailable
Retrieval is healthy when it is:
- scoped before it is widened
- ordered by authority
- filtered to the smallest useful set
- resilient when the backend is unavailable
- explicit about context-window tradeoffs
If retrieval feels noisy, the fix is usually not to search harder. It is to improve scope, types, status handling, and fallback behavior.