bug: memory overview cache key mismatch causes full LLM reprocessing on every semantic refresh #1261

@yc111233

Summary

_process_memory_directory() has a cache mechanism intended to skip unchanged files, but the cache never hits because the lookup key (the UUID filename) never matches the stored key (an LLM-generated descriptive title). As a result, every semantic refresh sends all 1600+ memory files back through the LLM, at a very high token cost.

Root Cause

Cache lookup (semantic_processor.py:467-474):

file_name = file_path.split("/")[-1]  # e.g. "mem_c4a0edcf-11b8-47fc-9c3b-c18fe0d38fb6.md"

if file_path not in changed_files and file_name in existing_summaries:
    # cache hit — reuse existing summary

Cache population (_parse_overview_md, semantic_processor.py:917-968):

header_match = re.match(r"^###\s+(.+?)\s*$", line)
# Extracts H3 heading text as key, e.g. "Session Context Management"

The LLM generates descriptive H3 headings like ### Session Context Management, but the lookup uses the actual filename mem_c4a0edcf-...md. The two never match, so no cached summary is ever reused and all files are reprocessed on every run.
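The mismatch can be reproduced in isolation. The sketch below is not the project's code: only the regex and the lookup condition come from the excerpts above, while parse_overview_keys, the sample overview text, and the memories/ path prefix are hypothetical stand-ins.

```python
import re


def parse_overview_keys(overview_md: str) -> dict[str, str]:
    """Hypothetical stand-in for _parse_overview_md: key summaries by H3 text."""
    summaries: dict[str, str] = {}
    current = None
    for line in overview_md.splitlines():
        # Regex copied from the issue; it captures the heading text verbatim.
        header_match = re.match(r"^###\s+(.+?)\s*$", line)
        if header_match:
            current = header_match.group(1)
            summaries[current] = ""
        elif current is not None and line.strip():
            summaries[current] = (summaries[current] + " " + line.strip()).strip()
    return summaries


# Assumed shape of an LLM-written overview with a descriptive heading.
overview = """## Memories

### Session Context Management
Describes how session context is compressed.
"""

existing_summaries = parse_overview_keys(overview)

file_path = "memories/mem_c4a0edcf-11b8-47fc-9c3b-c18fe0d38fb6.md"  # assumed prefix
file_name = file_path.split("/")[-1]
changed_files: set[str] = set()  # the file is unchanged, so a hit is expected

# Same condition as the cache lookup quoted above.
cache_hit = file_path not in changed_files and file_name in existing_summaries

print(sorted(existing_summaries))  # ['Session Context Management']
print(cache_hit)                   # False
```

Even for an unchanged file, the condition is false because the dictionary is keyed by heading text rather than by filename.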

Contributing Factor

overview_generation.yaml section 4 says to create "One H3 subsection for each file/subdirectory" but does not require using the exact filename as the H3 heading. The LLM is free to write descriptive titles.

Impact

  • Every _process_memory_directory invocation generates LLM summaries for ALL files (O(n) LLM calls where n = total memory files)
  • Combined with the 45s dedupe window (_MEMORY_PARENT_SEMANTIC_DEDUPE_SEC), any active conversation creates an effectively infinite reprocessing loop:
    • Processing 1600 files takes 10-30 minutes
    • During that time, new memories are written by the compressor
    • 45s window expires → new SemanticMsg enqueued
    • Previous run finishes → next full run starts immediately
  • This was the root cause of the ~20B token consumption incident on 2026-04-05
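The loop described above can be illustrated with a toy timeline. This is a rough model, not a measurement: the 45s window and the 10-minute run time (the lower bound of the reported range) come from the issue, while the 2-minute write interval, the single-slot queue, and the simulate function itself are assumptions made for illustration.

```python
DEDUPE_SEC = 45           # _MEMORY_PARENT_SEMANTIC_DEDUPE_SEC, per the issue
RUN_SEC = 10 * 60         # lower bound of the reported 10-30 minute full run
WRITE_INTERVAL_SEC = 120  # assumed: the compressor writes a memory every 2 minutes


def simulate(conversation_sec: int) -> tuple[int, int]:
    """Return (full runs started, seconds the processor sat idle)."""
    runs = 0
    idle = 0
    busy_until = 0              # time at which the current run finishes
    pending = False             # a SemanticMsg is waiting in the queue
    last_enqueue = -DEDUPE_SEC  # so the very first write is not deduped
    for t in range(conversation_sec):
        # A memory write enqueues a refresh unless it falls inside the window.
        if t % WRITE_INTERVAL_SEC == 0 and t - last_enqueue >= DEDUPE_SEC:
            pending = True
            last_enqueue = t
        # The processor picks up a pending message as soon as it is free.
        if t >= busy_until:
            if pending:
                pending = False
                runs += 1
                busy_until = t + RUN_SEC
            else:
                idle += 1
    return runs, idle


runs, idle = simulate(60 * 60)
print(runs, idle)  # 6 0
```

Under these assumptions, a one-hour conversation produces six back-to-back full runs with no idle time. At 1600 files per run, that is 9600 per-file summary calls for a directory in which only a few dozen files actually changed.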

Suggested Fix

Option A: Sidecar cache file — use an independent .summary_cache.json per directory mapping {filename: {cache_key, summary}}, bypassing the unreliable .overview.md parsing entirely.
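A minimal sketch of Option A, under stated assumptions: the cache_key is taken to be a SHA-256 hash of the file contents, and the function names (content_key, load_cache, refresh_summaries) and the summarize callback are hypothetical. Only the .summary_cache.json name and the {filename: {cache_key, summary}} shape come from the proposal.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable

CACHE_NAME = ".summary_cache.json"  # file name from Option A


def content_key(path: Path) -> str:
    # Assumed cache key: SHA-256 of the file bytes.
    return hashlib.sha256(path.read_bytes()).hexdigest()


def load_cache(directory: Path) -> dict:
    cache_path = directory / CACHE_NAME
    if not cache_path.exists():
        return {}
    try:
        data = json.loads(cache_path.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return {}  # a corrupt cache only costs one full rebuild
    return data if isinstance(data, dict) else {}


def refresh_summaries(
    directory: Path, summarize: Callable[[Path], str]
) -> tuple[dict, int]:
    """Return (cache contents, number of summarize calls made)."""
    cache = load_cache(directory)
    fresh: dict = {}
    llm_calls = 0
    for path in sorted(directory.glob("*.md")):
        if path.name.startswith("."):
            continue  # skip .overview.md and other hidden files
        key = content_key(path)
        entry = cache.get(path.name)
        if isinstance(entry, dict) and entry.get("cache_key") == key:
            fresh[path.name] = entry  # unchanged file: reuse the summary
        else:
            fresh[path.name] = {"cache_key": key, "summary": summarize(path)}
            llm_calls += 1
    # Write to a temp file, then rename, so readers never see a partial cache.
    tmp = directory / (CACHE_NAME + ".tmp")
    tmp.write_text(json.dumps(fresh, indent=2), encoding="utf-8")
    tmp.replace(directory / CACHE_NAME)
    return fresh, llm_calls
```

Because the key is the real filename and the value never passes through the LLM, heading wording in .overview.md no longer affects cache hits. Entries for deleted files drop out naturally, since the cache is rebuilt from the files currently on disk.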

Option B: Fix the prompt + parser — require exact filenames in H3 headings in overview_generation.yaml and update _parse_overview_md to extract them reliably.
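For Option B, the parser side might look like the sketch below. It assumes the prompt has been changed so that each H3 heading is the exact filename, optionally wrapped in backticks; H3_RE, parse_overview_by_filename, and the known_names parameter are hypothetical, not the existing _parse_overview_md signature.

```python
import re

# Assumed heading form after the prompt fix: "### mem_x.md" or "### `mem_x.md`".
H3_RE = re.compile(r"^###\s+`?(?P<name>[^`]+?)`?\s*$")


def parse_overview_by_filename(
    overview_md: str, known_names: set[str]
) -> dict[str, str]:
    """Map filename -> summary, ignoring headings that are not real filenames."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in overview_md.splitlines():
        match = H3_RE.match(line)
        if match:
            name = match.group("name").strip()
            # Descriptive titles are dropped rather than cached under a bad key.
            current = name if name in known_names else None
            if current is not None:
                sections[current] = []
        elif line.startswith("#"):
            current = None  # any other heading level ends the section
        elif current is not None and line.strip():
            sections[current].append(line.strip())
    return {name: " ".join(parts) for name, parts in sections.items()}


overview = """## Files

### `mem_a.md`
Summary A.

### Session Context Management
Summary with a descriptive title.
"""

print(parse_overview_by_filename(overview, {"mem_a.md", "mem_b.md"}))
# {'mem_a.md': 'Summary A.'}
```

Validating headings against the directory listing means that a heading the LLM rewrites still costs one reprocessed file, but it can no longer poison the cache or produce a false hit.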

Option A is more robust as it decouples the cache from LLM output formatting.
