fix(memory): harden TQMemory for client installs (stable project_id, index guard, timeout, preload, upgrade)#485

Merged
Lexus2016 merged 1 commit into main from evolution/issue-client-memory-hardening on Jun 23, 2026
@Lexus2016

Owner

Why

A diagnosis on the owner server (osoba) found that TQMemory had silently stopped persisting on 2026-06-07. The root causes were not in the memory engine itself but in how the fork configures and uses it, and all of them reproduce on fresh client installs.

Fixes (5)

  1. Stable project_id (tqmemory_setup.py): set TQMEMORY_PROJECT_ROOT = HERMES_HOME (fallback ~/.hermes) so the project bucket no longer tracks the process cwd. On prod, memory had fragmented across two buckets (/root vs /root/.hermes). Back-filled on the repair path too, so hermes update heals existing installs, not just fresh ones.

  2. index_paths bloat guard (prompt_builder.py / TQMEMORY_GUIDANCE): instruct the agent never to index_paths huge/system trees (/root, /home, /Users/admin, /tmp, whole repos). On prod an unbounded index of /root produced ~10k chunks that crashed the LanceDB re-sync and timed out the MCP.

  3. Per-server timeout 600s (tqmemory_setup.py entry): first semantic_search loads a ~600MB embedding model and re-syncs can be slow; the global MCP default (300s) is left untouched.

  4. Embedding model preload (setup-hermes.sh): best-effort pre-cache of the sentence-transformers model at install, so the first semantic_search doesn't time out pulling ~600MB from HuggingFace at runtime.

  5. Reliable upgrade past rev-pin (tqmemory_setup.py): uv tool upgrade re-resolves a rev-pinned receipt to the same rev and never advances (observed stuck at v0.17.0 on prod). Fall back to uv tool install --reinstall against the unpinned spec when upgrade reports no change.
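
A rough sketch of the logic behind fixes 1 and 5, for illustration only; it is not the code in tqmemory_setup.py. The function names, the injected `run` and `get_version` callables, and the tool/spec strings are hypothetical. Detecting "no change" by comparing the installed version before and after the upgrade is also an assumption about how that check could be done.

```python
from pathlib import Path
from typing import Callable, Mapping, Sequence


def resolve_project_root(env: Mapping[str, str]) -> str:
    """Pick a project root that does not depend on the process cwd.

    Uses HERMES_HOME when it is set and non-empty, otherwise ~/.hermes.
    """
    home = env.get("HERMES_HOME", "").strip()
    if home:
        return home
    return str(Path.home() / ".hermes")


def upgrade_with_fallback(
    tool: str,
    unpinned_spec: str,
    run: Callable[[Sequence[str]], None],
    get_version: Callable[[], str],
) -> str:
    """Try `uv tool upgrade`; reinstall from the unpinned spec if nothing moved.

    `run` executes a command and `get_version` reports the installed version.
    Both are injected (hypothetical helpers) so the sketch never shells out.
    """
    before = get_version()
    run(["uv", "tool", "upgrade", tool])
    if get_version() != before:
        return "upgraded"
    # A rev-pinned receipt re-resolves to the same rev, so force a reinstall
    # against the unpinned spec instead.
    run(["uv", "tool", "install", "--reinstall", unpinned_spec])
    return "reinstalled"
```

Injecting the command runner keeps the fallback decision testable without a real uv installation; in practice `run` could wrap `subprocess.run`.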

Tests

tests/hermes_cli/test_tqmemory_setup.py updated for the new env/timeout contract — 22 passed; prompt-builder tests green.
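
For readers unfamiliar with the env/timeout contract, the following is a minimal sketch of what such a check might look like. It is not taken from tests/hermes_cli/test_tqmemory_setup.py. The entry layout (`command`, `env`, `timeout` keys), the `tqmemory` command name, and the `build_entry` / `repair_entry` helpers are assumptions made for illustration; only `TQMEMORY_PROJECT_ROOT` and the 600s value come from the description above.

```python
PER_SERVER_TIMEOUT_S = 600  # per-server value from fix 3; global default untouched


def build_entry(project_root: str) -> dict:
    """Build a fresh MCP server entry (hypothetical layout)."""
    return {
        "command": "tqmemory",  # assumed command name
        "env": {"TQMEMORY_PROJECT_ROOT": project_root},
        "timeout": PER_SERVER_TIMEOUT_S,
    }


def repair_entry(entry: dict, project_root: str) -> dict:
    """Back-fill missing keys on an existing entry without mutating it.

    Existing values are kept (an assumption); only absent keys are added.
    """
    repaired = dict(entry)
    env = dict(repaired.get("env") or {})
    env.setdefault("TQMEMORY_PROJECT_ROOT", project_root)
    repaired["env"] = env
    repaired.setdefault("timeout", PER_SERVER_TIMEOUT_S)
    return repaired


def test_fresh_entry_contract() -> None:
    entry = build_entry("/root/.hermes")
    assert entry["env"]["TQMEMORY_PROJECT_ROOT"] == "/root/.hermes"
    assert entry["timeout"] == 600


def test_repair_backfills_old_entry() -> None:
    old = {"command": "tqmemory", "env": {"OTHER": "1"}}
    fixed = repair_entry(old, "/root/.hermes")
    assert fixed["env"] == {"OTHER": "1", "TQMEMORY_PROJECT_ROOT": "/root/.hermes"}
    assert fixed["timeout"] == 600
    assert "timeout" not in old  # the original entry is left untouched
```

The two test functions mirror the two paths named in fix 1: a fresh install and the repair path that heals an existing install.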

Generated from a live osoba diagnosis + fix session.

… index_paths guard, timeout, model preload, upgrade reliability
@github-actions

Contributor

🔎 Lint report: evolution/issue-client-memory-hardening vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 11215 on HEAD, 11213 on base (🆕 +2)

🆕 New issues (2):

Rule Count
unresolved-attribute 2
First entries
tests/run_agent/test_credits_notices_toggle.py:76: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_credits_session_start_micros` on type `AIAgent`
run_agent.py:3223: [unresolved-attribute] unresolved-attribute: Object of type `Self@get_credits_spent_micros` has no attribute `_credits_session_start_micros`

✅ Fixed issues (1):

Rule Count
invalid-assignment 1
First entries
tests/run_agent/test_credits_notices_toggle.py:76: [invalid-assignment] invalid-assignment: Object of type `None` is not assignable to attribute `_credits_session_start_micros` of type `int`

Unchanged: 5896 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

@Lexus2016 Lexus2016 merged commit b4c4224 into main Jun 23, 2026
39 checks passed