Problem (observed in real usage)
The introspection_extract.py pre-extract script scans only on-disk files (*.jsonl transcripts and request_dump_*.json snapshots) under ~/.hermes/sessions/. However, on installations that persist sessions via the SQLite SessionDB (state.db), the actual conversation history lives in the messages table — NOT in on-disk files. No *.jsonl files exist, and request_dump_*.json files are only error snapshots (written when a provider request fails).
Evidence (aggregated, anonymized)
- Frequency: 100% of runs on SessionDB-backed installs. Over the last 7 days the extractor reported sessions_scanned: 11 (from request dumps only), while the SessionDB holds 108 active sessions (4,502 messages) for the same window. The pipeline sees roughly 10% of real sessions (11 of 108), and that sample is biased toward error cases.
- Tool/area involved: scripts/introspection_extract.py → build_digest() → sessions_dir.glob("*.jsonl") and sessions_dir.glob("request_dump_*.json")
- Failure shape: sessions_dir contains zero .jsonl files and only error-snapshot request_dump_*.json files; the messages table in state.db (schema: id, session_id, role, content, tool_call_id, tool_calls, tool_name, timestamp, ...) is never queried. Every prior introspection cycle operated on this biased 10% sample.
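The gap described in these bullets can be checked with a small diagnostic. This is only a sketch: the function name coverage_report and its db_path argument are hypothetical, and it assumes state.db has a messages table with a session_id column, as in the schema listed above. It counts all rows rather than a 7-day window, because the encoding of the timestamp column is not specified here.

```python
import sqlite3
from pathlib import Path


def coverage_report(sessions_dir: Path, db_path: Path) -> dict[str, int]:
    """Compare what the file globs can see with what the SessionDB holds."""
    report = {
        # The two patterns build_digest() currently globs for.
        "jsonl_files": len(list(sessions_dir.glob("*.jsonl"))),
        "request_dumps": len(list(sessions_dir.glob("request_dump_*.json"))),
        "db_sessions": 0,
        "db_messages": 0,
    }
    if db_path.exists():
        conn = sqlite3.connect(db_path)
        try:
            # Assumption: table and column names match the schema in this report.
            sessions, messages = conn.execute(
                "SELECT COUNT(DISTINCT session_id), COUNT(*) FROM messages"
            ).fetchone()
        finally:
            conn.close()
        report["db_sessions"] = sessions
        report["db_messages"] = messages
    return report
```

On an affected install, jsonl_files would be 0 while db_sessions is far larger than request_dumps.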
Impact on real tasks
The entire self-improvement loop runs on roughly 10% of sessions, and that sample is error-biased because request dumps are only written on failures. Successful sessions, where the agent works well and where patterns of efficiency, misunderstanding, and capability gaps actually live, are invisible. Every pattern detection, issue filing, and realized-impact verification based on this data therefore draws on a non-representative sample.
Proposed direction
Add a third scan path to build_digest() that reads from the SessionDB SQLite database (state.db → messages table) when it exists. The table already carries role, content, tool_calls, tool_name, tool_call_id, and timestamp, the same fields scan_messages() already consumes. Group rows by session_id, order them by timestamp, and pass each session's messages through the existing scan_messages(). This lets the extractor see every session stored in the database instead of roughly 10%.
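A minimal sketch of that scan path follows. The helper names iter_db_sessions and scan_db_sessions are hypothetical, and two things are assumed rather than confirmed: that scan_messages() accepts a list of message dicts, and that tool_calls is stored as JSON text. The database is opened read-only so the extractor cannot interfere with a live SessionDB.

```python
import json
import sqlite3
from contextlib import closing
from itertools import groupby
from pathlib import Path
from typing import Any, Callable, Iterator

# Columns named in this report as the ones scan_messages() consumes.
FIELDS = ("role", "content", "tool_call_id", "tool_calls", "tool_name", "timestamp")


def _decode(raw: Any) -> Any:
    # Assumption: tool_calls is JSON text; any other value passes through unchanged.
    if isinstance(raw, str) and raw:
        try:
            return json.loads(raw)
        except json.JSONDecodeError:
            return raw
    return raw


def iter_db_sessions(db_path: Path) -> Iterator[tuple[str, list[dict[str, Any]]]]:
    """Yield (session_id, messages) pairs, messages ordered by timestamp."""
    if not db_path.exists():
        return  # file-only installs: nothing to scan here
    uri = db_path.resolve().as_uri() + "?mode=ro"  # read-only connection
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        conn.row_factory = sqlite3.Row
        rows = conn.execute(
            f"SELECT session_id, {', '.join(FIELDS)} FROM messages "
            "ORDER BY session_id, timestamp, id"
        )
        # Rows arrive sorted by session_id, so groupby yields one group per session.
        for session_id, group in groupby(rows, key=lambda r: r["session_id"]):
            messages = []
            for row in group:
                msg = {name: row[name] for name in FIELDS}
                msg["tool_calls"] = _decode(msg["tool_calls"])
                messages.append(msg)
            yield session_id, messages


def scan_db_sessions(db_path: Path, scan_messages: Callable[[list], Any]) -> int:
    """Feed every stored session through scan_messages; return the session count."""
    count = 0
    for _session_id, messages in iter_db_sessions(db_path):
        scan_messages(messages)
        count += 1
    return count
```

Inside build_digest(), something like scan_db_sessions(db_path, scan_messages) could run after the two existing glob loops, with the returned count added to sessions_scanned. Where state.db lives relative to sessions_dir is left to the caller, since this report does not specify it.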
Value
- Impact: 1.0 (the entire introspection pipeline is blind to roughly 90% of real sessions)
- Effort: 0.4 (add a SQLite cursor path that reuses the existing scan_messages() function)
- Priority Score: 1.68