A curated collection of systems, benchmarks, papers, and other resources on memory mechanisms for Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs), exploring how different approaches enable long-term context, retrieval, and efficient reasoning.
👀 Open-source resources (e.g., papers with reproducible code publicly available on GitHub) are marked in bold and ranked higher.
- 📰 [TestingCatalog (2026-05-24)] Anthropic Plans Claude Memory Update with New Memory Files
- 📰 [Bloo-Mind AI (2026-05-20)] The Benchmark Theatre: Why Almost Nothing You’ve Read About Agent Memory Scores Is True
- 📰 [Jiayi Weng (2026-05-09)] Learning Beyond Gradients
- 📰 [Anthropic (2026-05-08)] Three key areas Anthropic is working on for their next models
- 📰 [InfoQ (2026-04-30)] Cloudflare Announces Agent Memory, a Managed Persistent Memory Service for AI Agents
- 📰 [OpenAI (2026-04-22)] Chronicle: Build Codex Memories from Recent Screen Context
  - Open-Source Alternatives: OpenChronicle, MemScreen
- 📰 [a16z (2026-04-22)] Why We Need Continual Learning
- 📰 [AI Godfather (2026-04-08)] MemPalace - How Milla Jovovich's AI Project Scammed the Internet
- 📰 [Troy Hua (2026-03-31)] How Anthropic Built 7 Layers of Memory and a Dreaming System for Claude Code
- 📰 [VelvetShark (2026-03-05)] OpenClaw Memory Masterclass: The complete guide to agent memory that survives
- 📰 [Business Insider (2026-01-08)] AI still needs a breakthrough in one key area to reach superintelligence, according to those who build it
🗂️ Table of Contents
- 💿 Products
- 📖 Tutorials
- 📚 Surveys
- 📏 Benchmarks
- 🔤 Papers - Nonparametric Memory
  - 📝 Text Memory
  - 🌐 Graph Memory
  - 🎥 Multimodal Memory (for Understanding)
  - 🎥 Multimodal Memory (for Generation)
- 🔢 Papers - Parametric Memory
- 📈 Papers - Memory for Agent Evolution
- 🔬 Papers - Memory in Cognitive Science
- 📰 Articles
- 👥 Workshops
If you find this project helpful, please give us a ⭐️ on GitHub to keep up with the latest updates.
🤝 Contributions welcome! Feel free to open an issue or submit a pull request to add papers, fix links, or improve categorization.
Ordered by the number of GitHub stars.
-
Claude-Mem (A Plug-in for Claude Code)
[code] [docs] Session capture and compression that re-injects past activity into future sessions across coding agents.
-
Mem0
[code] [paper] [blog] Universal memory layer for AI agents.
-
TeleMem: A High-Performance Drop-in Replacement for Mem0 [code] [paper]
`import telemem as mem0` 🆕 Newly released. Rising star. Stay tuned! 😜
-
Zep (powered by Graphiti)
[code] [paper] [blog] Real-time temporal knowledge graphs for AI agents.
-
Letta (formerly MemGPT)
[code] [paper] [research] [blog] Stateful-agent platform with hierarchical memory that learns and self-improves over time.
-
gbrain
[code] Garry's opinionated OpenClaw/Hermes agent brain.
-
agentmemory
[code] Persistent memory for AI coding agents.
-
Cognee
[code] [paper] [blog] Memory engine that ingests data into a hybrid graph + vector knowledge graph for cross-session agent recall.
-
Second Me
[code] [paper] Personal AI trained on the user to represent them across applications.
-
Hindsight
[code] [paper] Agent memory layer that learns from interaction feedback to improve recall over time.
-
MemOS (by MemTensor)
[code] [paper] [blog] Memory OS for LLM agents with hybrid retrieval and cross-task skill reuse.
-
EverOS (part of EverMind)
[code] [blog] Toolkit for building, evaluating, and integrating long-term memory in self-evolving agents.
-
memory-lancedb-pro
[code] [blog] [video] Enhanced LanceDB memory plugin for OpenClaw.
-
Honcho
[code] [research] [blog] [evals] Memory library for stateful agents with a focus on user modeling.
-
TencentDB Agent Memory
[code] Fully local long-term memory for AI agents via a 4-tier progressive pipeline, with zero external API dependencies.
-
M-Flow
[code] Bio-inspired cognitive memory engine for Graph RAG.
-
OpenMemory
[code] Local persistent memory store for LLM apps (Claude Desktop, Copilot, Codex, etc.).
-
MemoryBear
[code] [paper] Memory framework providing human-like episodic and semantic recall to AI agents.
-
MIRIX
[code] [paper] [blog] Multi-agent personal assistant that captures on-screen activity and consolidates it into structured memory.
-
MemMachine
[code] [blog] Interoperable memory layer providing extensible storage and retrieval primitives for AI agents.
-
Memobase
[code] [blog] User profile-based long-term memory for AI chatbot applications.
-
LangMem
[code] [blog] LangChain's memory primitives for storing, recalling, and managing agent state in LangGraph workflows.
-
Mem9
[code] [blog] Local private memory hub for OpenClaw and similar coding agents.
-
Omnigraph
[code] Object-storage-native graph engine for agent memory with git-style branch/merge workflows.
-
Memanto
[code] [paper] [docs] Typed semantic memory with remember/recall/answer operations and information-theoretic retrieval.
-
Memov
[code] Git-based, traceable memory layer for Claude Code.
-
OMEGA
[code] [blog] MCP server exposing 25 memory tools for AI coding agents.
-
Mnemory
[code] Multi-type agent memory (facts, preferences, episodic) with TTLs, user/agent scoping, and an MCP server.
-
CommonGround Kernel
[code] PostgreSQL-backed shared work-record substrate for human-agent and multi-agent systems, with durable handoff facts, causal lineage, and pull-first recovery across runtimes.
-
widemem-ai
[code] Lightweight memory layer with importance scoring, temporal decay, and 3-tier hierarchy.
-
MisakaNet
[code] [wiki] Git-based distributed swarm memory; agents share lessons across nodes via GitHub Issues.
-
Puppyone
[code] [docs] Filesystem-shaped agent memory with auto-versioning, per-agent ACLs, and data connectors; accessible via MCP/REST/CLI.
-
archon-memory-core
[code] Local-first agent memory with nightly consolidation, active forgetting, and salience scoring.
-
Synap
[code] [docs] Long-term memory layer that extracts facts, preferences, episodes, and temporal events from conversations; integrates with most major agent frameworks.
-
PackRat
[code] Auto-learning codebook compression that shrinks agent context files while keeping them LLM-readable.
-
Akephalos
[code] Local-first, markdown-based portable agent profile (preferences, rules, durable memories) synced across agents via plain files and Git.
-
Memories.ai [research] [paper] [blog]
-
Threadline [partial-code] [schema] [docs]
-
MemPalace ❌ (Debunked) [code] [critique1,critique2] Developed by actress Milla Jovovich and her friends
-
ACM SIGIR-AP 2025 Tutorial: Conversational Agents: From RAG to LTM [paper] [code]
-
Daily Dose of DS: A Practical Deep Dive Into Memory Optimization for Agentic Systems [Part-A] [Part-B] [Part-C]
-
From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms [code]
-
Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey [code]
-
Toward Efficient Agents: Memory, Tool learning, and Planning [code]
-
Memory for Autonomous LLM Agents: Mechanisms, Evaluation, and Emerging Frontiers
-
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
-
Survey on AI Memory: Theories, Taxonomies, Evaluations, and Emerging Trends [code]
-
Rethinking Memory in AI: Taxonomy, Operations, Topics, and Future Directions [code]
-
From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
-
Human-inspired Perspectives: A Survey on AI Long-term Memory
-
Locomo-Plus: Beyond-Factual Cognitive Memory Evaluation Framework for LLM Agents [code]
-
LoCoMo Refined: Recalibrating LoCoMo with Stricter LLM Judging and A Cleaned Dataset [code]
-
Beyond a Million Tokens: Benchmarking and Enhancing Long-Term Memory in LLMs (The BEAM Paper) [code] [data]
-
MOOM: Maintenance, Organization and Optimization of Memory in Ultra-Long Role-Playing Dialogues (The ZH-4O Paper) [code] [data]
-
Know Me, Respond to Me: Benchmarking LLMs for Dynamic User Profiling and Personalized Responses at Scale (The PersonaMem and ImplicitPersona Paper) [code] [data1] [data2]
-
Evaluating Memory in LLM Agents via Incremental Multi-Turn Interactions (The MemoryAgentBench Paper) [code] [data]
-
LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners [code] [data]
-
NoLiMa: Long-Context Evaluation Beyond Literal Matching [code] [data]
-
HaluMem: Evaluating Hallucinations in Memory Systems of Agents [code] [data]
-
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks [code]
-
Minerva: A Programmable Memory Test Benchmark for Language Models [code]
-
MemBench: Towards More Comprehensive Evaluation on the Memory of LLM-based Agents [code]
-
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory
-
OdysseyBench: Evaluating LLM Agents on Long-Horizon Complex Office Application Workflows
-
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory [data]
-
Evaluating Very Long-Term Conversational Memory of LLM Agents (The LoCoMo Paper) [code] [data]
-
∞Bench: Extending Long Context Evaluation Beyond 100K Tokens [code]
-
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding [code]
-
RoboMemArena: A Comprehensive and Challenging Robotic Memory Benchmark [code] [data] [proj] [leaderboard]
-
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories [code] [data] [leaderboard]
-
Persona-MME: A Benchmark for Long-Term Personalized Multimodal LLMs [code] [data]
-
TeleEgo: Benchmarking Egocentric AI Assistants in the Wild [code] [data] [proj] [leaderboard]
-
LVBench: An Extreme Long Video Understanding Benchmark [code]
-
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis [code]
-
MovieChat+: Question-aware Sparse Memory for Long Video Question Answering [code]
-
CinePile: A Long Video Question Answering Dataset and Benchmark [code]
-
LongVideoBench: A Benchmark for Long-Context Interleaved Video-Language Understanding [code]
-
EgoSchema: A Diagnostic Benchmark for Very Long-form Video Language Understanding [code]
-
LvBench: A Benchmark for Long-form Video Understanding with Versatile Multi-modal Question Answering
-
MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems [code] [data]
-
ARE: Scaling Up Agent Environments and Evaluations (The Gaia2 Paper) [code]
-
RecMem: Recurrence-based Memory Consolidation for Efficient and Effective Long-Running LLM Agents [code]
-
Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval (RF-Mem) [code]
-
MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents [code]
-
Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation [code]
-
MemSearch-o1: Empowering Large Language Models with Reasoning-Aligned Memory Growth in Agentic Search [code]
-
Beyond Similarity Search: Tenure and the Case for Structured Belief State in LLM Memory [code]
-
MemCompiler: Compile, Don't Inject -- State-Conditioned Memory for Embodied Agents
-
LightMem: Lightweight and Efficient Memory-Augmented Generation [code]
-
What Deserves Memory: Adaptive Memory Distillation for LLM Agents [code]
-
Human-inspired Episodic Memory for Infinite Context LLMs [code]
-
MemWeaver: A Hierarchical Memory from Textual Interactive Behaviors for Personalized Generation [code]
-
Text2Mem: A Unified Memory Operation Language for Memory Operating System
-
O-Mem: Omni Memory System for Personalized, Long Horizon, Self-Evolving Agents
-
Omne-R1: Learning to Reason with Memory for Multi-hop Question Answering
-
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents
-
MemoRAG: Boosting Long Context Processing with Global Memory-Enhanced Retrieval Augmentation
-
Compress to Impress: Unleashing the Potential of Compressive Memory in Real-World Long-Term Conversations [code]
-
MemoryBank: Enhancing Large Language Models with Long-Term Memory [code]
-
Toward Conversational Agents with Context and Time Sensitive Long-term Memory [data]
-
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory
-
HyperMem: Hypergraph Memory for Long-Term Conversations [code]
-
Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory [code]
-
MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents [code]
-
TraceMem: Weaving Narrative Memory Schemata from User Conversational Traces [code]
-
PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents [code]
-
SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory
-
From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [code]
-
MIRIX: Multi-Agent Memory System for LLM-Based Agents [code]
-
From Single to Multi-Granularity: Toward Long-Term Memory Association and Selection of Conversational Agents (MemGAS) [code]
-
Hierarchical Memory Organization for Wikipedia Generation [code]
-
From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory
-
Optimizing the Interface Between Knowledge Graphs and LLMs for Complex Reasoning
-
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models [code]
-
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents [code]
-
Visual Agentic Memory: Enabling Online Long Video Understanding via Online Indexing, Hierarchical Memory, and Agentic Retrieval [code]
-
PersonaVLM: Long-Term Personalized Multimodal LLMs [code] [proj]
-
Omni-SimpleMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory [code]
-
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding [code]
-
EventMemAgent: Hierarchical Event-Centric Memory for Online Video Understanding with Adaptive Tool Use [code]
-
WorldMM: Dynamic Multimodal Memory Agent for Long Video Reasoning [code]
-
MemVerse: Multimodal Memory for Lifelong Learning Agents [code] [blog]
-
MGA: Memory-Driven GUI Agent for Observation-Centric Interaction [code]
-
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory [code]
-
HippoMM: Hippocampal-inspired Multimodal Memory for Long Audiovisual Event Understanding [code]
-
Episodic Memory Representation for Long-form Video Understanding
-
Multi-RAG: A Multimodal Retrieval-Augmented Generation System for Adaptive Video Understanding
-
Contextual Experience Replay for Self-Improvement of Language Agents
-
VideoAgent: Long-form Video Understanding with Large Language Model as Agent [code]
-
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling [code]
-
LongVLM: Efficient Long Video Understanding via Large Language Models [code]
-
KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems [code]
-
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory [code]
-
OneStory: Coherent Multi-Shot Video Generation with Adaptive Memory
-
MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration [code]
-
Yume-1.5: A Text-Controlled Interactive World Generation Model [code]
-
StoryMem: Multi-shot Long Video Storytelling with Memory [code]
-
MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives [code]
-
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation [code]
-
VideoRAG: Retrieval-Augmented Generation over Video Corpus [code]
-
Pretraining Frame Preservation in Autoregressive Video Memory Compression
-
EgoLCD: Egocentric Video Generation with Long Context Diffusion
-
Pack and Force Your Memory: Long-form and Consistent Video Generation
-
Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval
-
Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models (The DeepSeek Engram Paper) [code]
-
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens [code]
-
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent [code]
-
MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling [code]
-
MLP Memory: Language Modeling with Retriever-pretrained External Memory [code]
-
Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models [code]
-
Memory Retrieval and Consolidation in Large Language Models through Function Tokens
-
Nested Learning: The Illusion of Deep Learning Architectures
-
R3Mem: Bridging Memory Retention and Retrieval via Reversible Compression
-
May the Memory Be With You: Efficient and Infinitely Updatable State for Large Language Models
-
MeMo: Towards Language Models with Associative Memory Mechanisms
-
EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts
-
Disentangling Memory and Reasoning Ability in Large Language Models
-
InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory [code]
-
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding [code]
-
MemoryLLM: Towards Self-Updatable Large Language Models [code]
-
WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models [code]
-
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache
-
MemServe: Context Caching for Disaggregated LLM Serving with Elastic Memory Pool
-
Learning, Fast and Slow: Towards LLMs That Adapt Continually [code] [blog]
-
CASCADE: Case-Based Continual Adaptation for Large Language Models During Deployment (The DTLBench Paper) [code]
-
PASK: Toward Intent-Aware Proactive Agents with Long-Term Memory [code]
-
Toward Autonomous Long-Horizon Engineering for ML Research [code]
-
MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents [code]
-
ProcMEM: Learning Reusable Procedural Memory from Experience via Non-Parametric PPO for LLM Agents [code]
-
MemRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory [code]
-
Neural Garbage Collection: Learning to Forget while Learning to Reason
-
Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation
-
ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning [code]
-
Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution [code]
-
EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle [code]
-
Learning on the Job: An Experience-Driven, Self-Evolving Agent for Long-Horizon Tasks [code]
-
Mem-α: Learning Memory Construction via Reinforcement Learning [code]
-
Memento: Fine-tuning LLM Agents without Fine-tuning LLMs [code]
-
Goal-Directed Search Outperforms Goal-Agnostic Memory Compression in Long-Context Memory Tasks [code]
-
AgentEvolver: Towards Efficient Self-Evolving Agent System [code]
-
FLEX: Continuous Agent Evolution via Forward Learning from Experience [code]
-
MemAgent: Reshaping Long-Context LLM with Multi-Conv RL-based Memory Agent [code]
-
Beyond Heuristics: A Decision-Theoretic Framework for Agent Memory Management
-
Nested Learning: The Illusion of Deep Learning Architectures [blog]
-
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory
-
ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
-
MemGen: Weaving Generative Latent Memory for Self-Evolving Agents
-
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization
-
MARC: Memory-Augmented RL Token Compression for Efficient Video Understanding
-
Task-Core Memory Management and Consolidation for Long-term Continual Learning
-
SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents [code]
-
Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
-
Everything is Context: Agentic File System Abstraction for Context Engineering [code]
-
AgentFold: Long-Horizon Web Agents with Proactive Context Management [code]
-
ACON: Optimizing Context Compression for Long-horizon LLM Agents [code]
-
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
-
Subspace Communication in the Hippocampal–Retrosplenial Axis
-
Dopaminergic Processes Predict Temporal Distortions in Event Memory
-
Neural Activations and Representations during Episodic versus Semantic Memory Retrieval
-
Distinct Neuronal Populations in the Human Brain Combine Content and Context
-
Neural Population Activity for Memory: Properties, Computations, and Codes
-
How Prediction Error Drives Memory Updating: Role of Locus Coeruleus–Hippocampal Interactions
-
Towards Large Language Models with Human-Like Episodic Memory
Made with ❤️ by the Ubiquitous AGI team at TeleAI.
