[FIX] introspection_extract.py ignores SessionDB (state.db) — blind to >90% of real sessions #399

Description

@acipat

Problem (observed in real usage)

The introspection_extract.py pre-extract script scans only on-disk files (*.jsonl transcripts and request_dump_*.json snapshots) under ~/.hermes/sessions/. However, on installations that persist sessions via the SQLite SessionDB (state.db), the actual conversation history lives in the messages table — NOT in on-disk files. No *.jsonl files exist, and request_dump_*.json files are only error snapshots (written when a provider request fails).

Evidence (aggregated, anonymized)

  • Frequency: 100% of runs on SessionDB-backed installs. Last 7 days: the extractor reported sessions_scanned: 11 (from request dumps only), while the SessionDB holds 108 active sessions (4,502 messages) for the same window. The pipeline sees <10% of real sessions, and that 10% is biased toward error cases.
  • Tool/area involved: scripts/introspection_extract.py → build_digest() → sessions_dir.glob("*.jsonl") and sessions_dir.glob("request_dump_*.json")
  • Failure shape: sessions_dir contains zero .jsonl files and only error-snapshot request_dump_*.json; state.db table messages (schema: id, session_id, role, content, tool_call_id, tool_calls, tool_name, timestamp, ...) is never queried. Every prior introspection cycle operated on this biased 10% sample.
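The gap described in the evidence can be checked on any install with a short diagnostic. The sketch below is illustrative only: `coverage_report` is a hypothetical helper, not part of the existing script, and the location of state.db is passed in as a parameter because this report does not state where it lives. It assumes only the `messages` table and `session_id` column named above.

```python
import sqlite3
from pathlib import Path


def coverage_report(sessions_dir: Path, db_path: Path) -> dict:
    """Compare what the file-based scan can see with what state.db holds.

    Hypothetical helper: the table and column names come from the schema
    quoted in this issue; everything else is an assumption.
    """
    report = {
        # The two patterns build_digest() currently globs for.
        "jsonl_files": len(list(sessions_dir.glob("*.jsonl"))),
        "request_dumps": len(list(sessions_dir.glob("request_dump_*.json"))),
        "db_sessions": 0,
        "db_messages": 0,
    }
    if db_path.exists():
        # Open read-only so the check cannot modify a live database.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        conn = sqlite3.connect(uri, uri=True)
        try:
            sessions, messages = conn.execute(
                "SELECT COUNT(DISTINCT session_id), COUNT(*) FROM messages"
            ).fetchone()
            report["db_sessions"] = sessions
            report["db_messages"] = messages
        finally:
            conn.close()
    return report
```

A large `db_sessions` count next to zero `jsonl_files` would reproduce the failure shape above. The sketch does not filter by time window, since the storage format of `timestamp` is not specified here.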

Impact on real tasks

The entire self-improvement loop is working on a sample of <10% of sessions, and that sample is error-biased (request dumps are only written on failures). Successful sessions — where the agent works well and where patterns of efficiency, misunderstanding, and capability gaps actually live — are invisible. Introspection cannot find what it cannot see. Every pattern detection, issue filing, and realized-impact verification based on this data is working from a non-representative sample.

Proposed direction

Add a third scan path to build_digest() that reads from the SessionDB SQLite database (the messages table in state.db) when it exists. The table already carries role, content, tool_calls, tool_name, tool_call_id, and timestamp, the same fields scan_messages() already consumes. Group by session_id, order by timestamp, and pass each session's messages through the existing scan_messages(). This makes the extractor see 100% of sessions instead of <10%.
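A minimal sketch of that third scan path, under stated assumptions: `iter_db_sessions` and `_row_to_message` are hypothetical names, the column list is taken from the schema quoted earlier in this issue, and `tool_calls` is assumed (not confirmed) to be stored as JSON text. The exact signature of scan_messages() is not shown in this report, so the sketch stops at yielding each session's ordered messages.

```python
import json
import sqlite3
from itertools import groupby
from pathlib import Path
from typing import Any, Iterator


def _row_to_message(row: sqlite3.Row) -> dict[str, Any]:
    """Convert a DB row to a plain dict, decoding tool_calls if it is JSON."""
    msg = dict(row)
    raw = msg.get("tool_calls")
    if isinstance(raw, str) and raw:
        try:
            # Assumption: tool_calls is serialized as JSON text.
            msg["tool_calls"] = json.loads(raw)
        except json.JSONDecodeError:
            pass  # Not JSON after all: keep the raw string.
    return msg


def iter_db_sessions(db_path: Path) -> Iterator[tuple[Any, list[dict[str, Any]]]]:
    """Yield (session_id, messages) pairs, messages ordered by timestamp."""
    if not db_path.exists():
        return  # Installs without a SessionDB simply yield nothing.
    # Read-only connection so extraction cannot modify session state.
    conn = sqlite3.connect(db_path.resolve().as_uri() + "?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute(
            "SELECT session_id, role, content, tool_call_id, tool_calls, "
            "tool_name, timestamp FROM messages "
            "ORDER BY session_id, timestamp, id"
        )
        # Rows arrive sorted by session_id, so groupby sees each session once.
        for session_id, group in groupby(rows, key=lambda r: r["session_id"]):
            yield session_id, [_row_to_message(r) for r in group]
    finally:
        conn.close()
```

Inside build_digest(), each yielded message list could then be handed to scan_messages() in the same way as messages parsed from a transcript file. Whether sessions that also appear on disk need de-duplicating is left open here.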

Value

  • Impact: 1.0 — the entire introspection pipeline is blind to >90% of real sessions
  • Effort: 0.4 — add a SQLite cursor path reusing the existing scan_messages() function
  • Priority Score: 1.68

Metadata

Labels: accepted (Accepted by evolution; sent to a PR / implemented)
