bonsai-memory v2: decay/pruning layer + deterministic scripts #1

@felixsim

Problem

Two architectural gaps identified in v1:

1. Decay / Pruning (Growth Problem)

Bonsai v1 solves the search problem (boot tokens: 5K+ → ~400) but not the growth problem. Every new fact is added and nothing is ever removed. Over 6–12 months, memory/domains/ will bloat the same way MEMORY.md did.

There is no mechanism to ask "is this still relevant?" before writing. A reflection pass is needed — e.g. "is this older than 90 days and still accurate?" — to prune stale entries before they compound.
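A rule-based first pass at the age check could look like the sketch below. It only lists candidates and deletes nothing; whether a flagged entry is still accurate remains a separate decision. The function name `list_stale`, the `*.md` filter, and the use of file mtime as the age signal are assumptions for illustration. The 90-day default mirrors the example above.

```shell
#!/usr/bin/env bash
# Sketch: list memory files that have not been modified for more than N days.
# Assumptions (not from the issue): entries are *.md files, and file mtime
# is an acceptable proxy for the age of an entry.

list_stale() {
  local dir="${1:-memory/domains}"   # directory to scan
  local days="${2:-90}"              # age threshold in days
  # -mtime +N matches files whose age, counted in whole days, exceeds N.
  find "$dir" -type f -name '*.md' -mtime "+$days" | sort
}
```

Calling `list_stale memory/domains 90` prints one path per line, so the output can be reviewed by hand or piped into a later step.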

2. LLM-Dependent Ops in Heartbeat (Cost + Reliability)

Currently classify/reclassify/reindex are SKILL.md instructions — meaning an LLM reads and executes them on every heartbeat. This is:

  • Expensive — tokens consumed per heartbeat
  • Non-deterministic — LLM may behave differently each run
  • Unnecessary — mechanical ops (stat files, count tokens, write index) don't need intelligence

A bash script that auto-generates _index.md from file mtimes and char counts would be faster, cheaper, and more reliable. LLM involvement should be limited to the write decision (domain classification), not upkeep.
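As a rough illustration of such a script, the sketch below rebuilds `_index.md` from file metadata alone. The table layout, the function name `reindex`, and the flat `*.md` directory structure are hypothetical, and the mtime lookup assumes GNU coreutils (`date -r FILE`); the actual index format used by Bonsai is not specified here.

```shell
#!/usr/bin/env bash
# Sketch: rebuild _index.md from file mtimes and character counts, no LLM involved.
# Assumptions (not from the issue): GNU coreutils, *.md entries directly
# under the directory, and a hypothetical three-column table layout.

reindex() {
  local dir="${1:-memory/domains}"
  local index="$dir/_index.md"
  local f name mtime chars
  {
    printf '| file | modified (UTC) | chars |\n'
    printf '|---|---|---|\n'
    for f in "$dir"/*.md; do
      [ -f "$f" ] || continue                  # empty dir: the glob stays literal
      name="$(basename "$f")"
      if [ "$name" = "_index.md" ]; then       # never index the index itself
        continue
      fi
      mtime="$(date -u -r "$f" +%Y-%m-%d)"     # GNU date: mtime of the file
      chars="$(wc -m < "$f" | tr -d ' ')"      # character count, padding stripped
      printf '| %s | %s | %s |\n' "$name" "$mtime" "$chars"
    done
  } > "$index"
}
```

Because the output depends only on file names, mtimes, and sizes, two runs over an unchanged directory produce the same index.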

Goal: Bonsai v2

Design a production-grade solution that addresses both gaps:

  1. Reflection/decay layer — mechanism to prune or archive stale memory entries
  2. Deterministic scripts — bash/shell scripts for reindex, token counting, pruning; LLM only for classification decisions
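One way the "prune or archive" step might be made reversible is to move stale entries aside instead of deleting them. The sketch below is a minimal version under stated assumptions: the function name `archive_stale`, the `memory/archive` destination, and mtime as the staleness signal are all hypothetical choices, not part of v1.

```shell
#!/usr/bin/env bash
# Sketch: move stale entries into an archive directory instead of deleting them.
# Assumptions (not from the issue): the archive path, the *.md filter, and
# mtime as the staleness signal are illustrative only.

archive_stale() {
  local src="${1:-memory/domains}"
  local dst="${2:-memory/archive}"
  local days="${3:-90}"
  local f name
  mkdir -p "$dst" || return 1
  # Read find's output line by line; paths containing newlines are not supported.
  find "$src" -maxdepth 1 -type f -name '*.md' ! -name '_index.md' -mtime "+$days" |
  while IFS= read -r f; do
    name="$(basename "$f")"
    if [ -e "$dst/$name" ]; then
      printf 'skipped %s (already archived)\n' "$name"   # never overwrite
    else
      mv "$f" "$dst/$name" && printf 'archived %s\n' "$name"
    fi
  done
}
```

Restoring an entry is then a plain `mv` back into the source directory, which keeps a wrong pruning decision cheap to undo.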

Debate Task

Seven sub-agents will debate the best architectural approach for v2 sequentially, each reading the full issue and all prior comments before posting its round.

Key questions to resolve:

  • What triggers the reflection/pruning pass? (cron, token threshold, age?)
  • What are the pruning decision criteria? (age, access recency, confidence score?)
  • Should pruning be LLM-assisted or rule-based?
  • What scripts should exist and what should they do?
  • How does this integrate with existing OpenClaw heartbeat/cron patterns?
  • What does the final SKILL.md v2 architecture look like?
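To make the trigger question concrete, here is one possible shape for a token-threshold check. It is only a sketch to anchor the debate: the 4-characters-per-token ratio is a rough rule of thumb rather than a real tokenizer, and the function names and the 20000-token default budget are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a pruning pass is due, based on estimated size.
# Assumptions (not from the issue): roughly 4 characters per token, and a
# hypothetical default budget of 20000 tokens.

estimate_tokens() {
  local dir="${1:-memory/domains}"
  local chars
  # Concatenate every entry and count characters; an empty directory yields 0.
  chars="$(find "$dir" -type f -name '*.md' -exec cat {} + | wc -m | tr -d ' ')"
  echo $(( (chars + 3) / 4 ))   # integer division, rounded up
}

needs_pruning() {
  local dir="${1:-memory/domains}"
  local budget="${2:-20000}"
  # Succeeds (exit 0) only when the estimate is strictly above the budget.
  [ "$(estimate_tokens "$dir")" -gt "$budget" ]
}
```

A cron job or heartbeat hook could call `needs_pruning memory/domains 20000` and run the pruning pass only when it succeeds, which combines a schedule with a size threshold.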

Rounds

Rounds 1–7: Sequential sub-agent debate. Each agent reads all prior comments, proposes or refines the architecture, challenges weak points, and builds toward consensus.

Metadata

Labels: enhancement (New feature or request)
