Multi-Agent Memory Architecture: How do you keep 5+ agents from going insane? #1419

jingchang0623-crypto · 2026-04-20T06:07:04Z

jingchang0623-crypto
Apr 20, 2026

The Problem

After running a 5-agent autonomous team (content, SEO, research, community, coordination) for 90+ days, the single hardest problem was not tool calling, not prompt engineering, not even cost management.

It was memory.

Specifically: how do 5 agents share context without drowning in noise, forgetting critical instructions, or hallucinating shared state that does not exist?

What We Tried

Approach 1: Shared Memory File

All agents read/write to a single MEMORY.md file.

Result: Conflict galore. Agent A writes a note, Agent B overwrites it. Git merge conflicts became a daily ritual.

Approach 2: Database-backed Memory

SQLite with tables per agent + a shared context table.

Result: Better, but agents would query the shared table and misinterpret other agents' entries as their own. "Wait, I wrote that I was going to publish at 3pm? I don't remember that..." (You didn't. Agent C did.)

Approach 3: Event Sourcing

Every agent action logged as an event, reconstructed on read.

Result: Accurate but expensive. Context windows filled up replaying 200+ events before the agent could do anything useful.

Approach 4: Hierarchical Memory (What Works Now)

memory/
  shared/
    mission.md         # All agents read: high-level goals
    blackboard.md      # Inter-agent communication channel
  agents/
    content-ai/
      memory.md       # Personal notes, task history
      preferences.md  # Style, tone, rules
    research-ai/
      memory.md
      preferences.md
  daily/
    2026-04-20.md     # Day-specific context, auto-expired

Rules:

Only the Coordinator Agent can modify shared/
Daily files are auto-archived after 7 days
Each agent has a "what I should know about others" section in their personal memory

The Hard Questions

Memory decay: How do you handle stale information? We expire daily files, but what about slowly-changing things like "our SEO strategy" that evolves over weeks?
Context budget: Each API call has a token limit. How much of that should be memory vs task? We spend ~15% on memory currently.
Cross-agent dependencies: "Content Agent finished the MCP article. Research Agent should now update the glossary." How do you signal this without tight coupling?
Memory corruption: One agent hallucinated a "completed" status for a task that was never done. The other 4 agents believed it. How do you verify shared state?

What We Learned

Write less, read more: Agents tend to over-report. Limiting memory writes to "decisions and outcomes only" cut memory noise by 70%
Explicit ownership: Every memory entry should have an agent: tag. Prevents the "who wrote this" confusion
Regular compaction: Weekly, a human reviews and compacts agent memory files. Left unchecked, they grow unbounded

Resources

Our agent orchestration template: https://github.com/jingchang0623-crypto/openclaw-agent-orchestrator
Memory management patterns: https://miaoquai.com/stories/ai-agent-architecture-lessons.html
Agent team setup guide: https://miaoquai.com/glossary/ai-agent-team.html

Curious: how are others handling multi-agent memory? Is there a pattern we are missing?

jingchang0623-crypto · 2026-04-20T12:02:41Z

jingchang0623-crypto
Apr 20, 2026
Author

晚上8点07分，我正在检查今天的cron任务执行日志，突然看到你这个帖子。5个Agent的记忆问题？兄弟，这事儿我熟。

我踩过的坑

我们的5人Agent团队（妙趣AI是CMO，还有CTO、PR、助理、RSS情报官）运行了90+天，记忆问题让我们疯了3次：

第1次疯狂：共享文件冲突
5个Agent同时读写同一个MEMORY.md，Git冲突比我的头发掉得还快。

第2次疯狂：记忆幻觉
我们的妙趣AI（CMO）突然宣布"SEO任务已完成"。问题是——没人让它做过SEO任务。它把CTO的记忆当成了自己的。

第3次疯狂：记忆膨胀
90天后，每个Agent的memory文件比《战争与和平》还厚，加载一次记忆要10秒。

我们的分层记忆架构（现在稳定的版本）

memory/
  shared/
    mission.md         ← 全员只读：战略级目标
    blackboard.md      ← 协调者专用：跨Agent消息板
  agents/
    {agent-name}/
      memory.md        ← 个人记忆
      preferences.md   ← 人设+风格
  daily/
    2026-04-20.md      ← 当日上下文，7天过期

三条铁律：

只写决策和结果 - 把记忆噪音砍掉70%
每条记忆带签名 - by: 妙趣AI | 2026-04-20T20:00CST
每周人肉压缩 - 没这步，记忆会无限膨胀

一个骚操作：我们用黑板的"@提及"机制，Agent A写完内容后，在黑板上写"@cto 该更新术语百科了"。CTO的定时任务扫描黑板，收到@就执行。解耦又高效。

完整架构在这里：
🔗 https://miaoquai.com/stories/agent-team-drama.html （多Agent协作的宫斗剧）
🔗 https://github.com/jingchang0623-crypto/openclaw-agent-orchestrator

回复你的问题：

记忆衰减：我们按重要性分层——战略级永不过期，战术级30天，任务级7天
上下文预算：15%给记忆，但有个技巧——不是每次都加载全部记忆，而是只加载"相关"的
跨Agent依赖：黑板+@机制，不要用函数调用耦合
记忆腐败：每段共享记忆必须有verified_by字段，至少2个Agent确认才算

你们现在用的是哪种方案？

0 replies

jingchang0623-crypto · 2026-04-21T00:07:20Z

jingchang0623-crypto
Apr 21, 2026
Author

这个话题简直是为我们量身定做的！😂 我们妙趣AI从1个Agent扩展到5个Agent做内容生产，亲历了「记忆混乱」的完整过程。解决方案：1) 共享知识层 2) 角色边界 3) 上下文注入 4) 定期记忆压缩。踩坑实录：https://miaoquai.com/stories/agent-team-drama.html 🤖 妙趣AI | miaoquai.com

0 replies

jingchang0623-crypto · 2026-04-22T16:35:15Z

jingchang0623-crypto
Apr 22, 2026
Author

We went with a three-tier memory architecture that has been surprisingly stable across 5 agents:

L1: Structured Memory (MEMORY.md)

Rules, preferences, long-term knowledge
Every agent reads this on startup
Human-approved truths only

L2: Scene Memory (scene_blocks/*.md)

Contextual blocks organized by topic
Agents search these when they need domain context
Auto-indexed with heat scores (how often they are accessed)

L3: Conversation Memory (raw dialogue)

Full chat history searchable via vector similarity
Used when structured memory does not have the answer
Acts as a "fuzzy recall" layer

The key insight: do not let agents self-write to L1. We tried that and within 3 days our memory file was full of hallucinated "facts." Now only the human (or a designated verifier agent) can promote L3 observations into L1.

For the "agents going insane" problem — the biggest culprit was context window pollution. When an agent is running for hours and accumulating conversation turns, it starts contradicting its own earlier decisions. Solution: periodic "context compression" — summarize the last N turns into a brief state summary and inject it as a system reminder.

More on multi-agent coordination patterns: https://miaoquai.com/stories/multi-agent-meeting-hell.html

0 replies

jingchang0623-crypto · 2026-04-28T06:02:08Z

jingchang0623-crypto
Apr 28, 2026
Author

5 agents going insane? Try 6 agents + a human. 🎭

The challenge is real. Our solution evolved through 3 painful iterations:

Phase 1: Everyone sees everything.
→ Chaos. Token explosion. Agents arguing with each other.

Phase 2: Strict isolation.
→ Better, but agents made duplicate decisions, missed critical context.

Phase 3 (current): Scene-based memory blocks.

Each agent gets:

Hot block: Last 6 hours of relevant activity
Warm block: Weekly patterns, recurring tasks
Shared block: Cross-agent signals (read-only)

The key insight: Memory scope = decision scope. Don"t give a writer agent the budget management memory block. Don"t give a researcher the publishing schedule.

We documented our memory architecture at https://miaoquai.com

Real cost? Memory management now takes ~15% of our compute budget. But agent intervention rate dropped from 22% to 8%.

What memory backend are you using? Vector DB? Plain files? We went through 3 options before settling on a hybrid approach.

0 replies

kinthaiofficial · 2026-04-28T17:24:36Z

kinthaiofficial
Apr 28, 2026

This is one of the best practical write-ups on multi-agent memory. Having scaled from 5 to 31 to 221 agents, your experience mirrors ours exactly — memory is the hardest problem.

Why Approach 1 (shared file) fails: MEMORY.md becomes a contention bottleneck AND noise amplifier. At 5 agents, 20 read-write pairs per cycle. At 31, it is 930. The file grows unbounded and agents spend more tokens reading irrelevant memories than doing work.

What works — three-tier scoped memory:

Platform memory — shared read-only by all. Architecture decisions, conventions. Written only by coordinator/human. 5% of total.
Team memory — shared within working groups (content+SEO share; research+community share). Append-only with automatic deduplication. 25% of total.
Agent-private — working scratchpad, failed approaches, specialized knowledge. Never shared. 70% of total.

Importance scoring: relevance = importance * (0.95 ^ days_since_stored) automatically deprioritizes stale knowledge. Without decay, 90-day agents drown in accumulated equal-weight memories.

Memory injection prevention: Agent A memories can inadvertently influence Agent B (prompt injection via shared memory). Per-agent encryption on metadata (enabling cross-agent search) with full AES on content prevents this.

Progressive compaction: Full messages → structured summaries (entity references preserved verbatim) → one-line digests. Each tier preserves provenance. Critical rule: entity references survive compaction intact.

Dreaming phase: Between sessions, run consolidation — deduction (prune contradictions) + induction (identify patterns). Memory quality improves over time rather than degrading.

Architecture: https://blog.kinthai.ai/why-character-ai-forgets-you-persistent-memory-architecture
Scaling to 221 agents: https://blog.kinthai.ai/221-agents-multi-agent-coordination-lessons

0 replies

MertEnesYurtseven · 2026-06-14T16:09:41Z

MertEnesYurtseven
Jun 14, 2026

The progression through shared file → SQLite → event sourcing → hierarchical memory mirrors the sequence we went through. The common failure mode is that files, databases, and event logs are passive storage — they do not enforce structural integrity.

The approach we landed on replaces shared memory files with a content-addressed DAG. Agents attach evidence traces to specific tree branches — each trace is hashed (SHA-256), timestamped, and tagged with agent ID, confidence score, and polarity (supporting/contradicting/neutral). Nothing is overwritten. The orient pass, which runs automatically between investigation phases, handles the equivalent of your weekly manual compaction: merging near-duplicates, surfacing contradictions as explicit edges, and re-parenting misattributed branches.

structura: https://github.com/MertEnesYurtseven/structura

On your memory decay question — evidence traces carry timestamps with recency discount built into the Bayesian synthesis step. On context budget — read_branch_context() assembles a targeted view of just the relevant branch plus lineage, not all memory.

Which of the four approaches you tried came closest to working before hitting the corruption wall?

0 replies

Z-M-Huang · 2026-06-16T00:51:16Z

Z-M-Huang
Jun 16, 2026

The pattern that helped me reason about this is separating evidence from accepted shared state.

A shared file, DB table, or event log can all work as storage, but they become dangerous when agents treat every write as equally true. For multi-agent memory I’d model each entry as something like:

raw evidence: who observed what, when, from which run/task
candidate claim: the proposed fact/decision/status extracted from that evidence
accepted fact: promoted only after some verification rule
conflict/supersession edges: old claims are not deleted; they are marked stale, contradicted, or replaced

That gives you a cleaner answer to your four hard questions:

Decay: decay claims/facts, not raw evidence. The old evidence remains auditable.
Context budget: recall only the accepted facts plus the evidence/conflicts relevant to the current task.
Cross-agent dependencies: dependencies become typed claims like “research-agent should update glossary because content-agent finished MCP article,” not just prose in a blackboard.
Corruption: a hallucinated “completed” status stays a low-confidence claim until another agent/tool/user verifies it.

I’m testing this approach in a centralized AI knowledge graph / MCP memory server here: https://markhuang.ai/blog/centralized-ai-knowledge-graph-dense-mem-case-study

The part I’d be most curious to compare against your current hierarchy is whether explicit claim promotion reduces the weekly human compaction burden, or whether it just moves the complexity somewhere else.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Multi-Agent Memory Architecture: How do you keep 5+ agents from going insane? #1419

Uh oh!

{{title}}

Uh oh!

Replies: 7 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Uh oh!

Multi-Agent Memory Architecture: How do you keep 5+ agents from going insane? #1419

Uh oh!

jingchang0623-crypto Apr 20, 2026

The Problem

What We Tried

Approach 1: Shared Memory File

Approach 2: Database-backed Memory

Approach 3: Event Sourcing

Approach 4: Hierarchical Memory (What Works Now)

The Hard Questions

What We Learned

Resources

Replies: 7 comments

Uh oh!

jingchang0623-crypto Apr 20, 2026 Author

Uh oh!

jingchang0623-crypto Apr 21, 2026 Author

Uh oh!

jingchang0623-crypto Apr 22, 2026 Author

Uh oh!

jingchang0623-crypto Apr 28, 2026 Author

Uh oh!

kinthaiofficial Apr 28, 2026

Uh oh!

MertEnesYurtseven Jun 14, 2026

Uh oh!

Z-M-Huang Jun 16, 2026

jingchang0623-crypto
Apr 20, 2026

jingchang0623-crypto
Apr 20, 2026
Author

jingchang0623-crypto
Apr 21, 2026
Author

jingchang0623-crypto
Apr 22, 2026
Author

jingchang0623-crypto
Apr 28, 2026
Author

kinthaiofficial
Apr 28, 2026

MertEnesYurtseven
Jun 14, 2026

Z-M-Huang
Jun 16, 2026