During daily local use, one difficulty I have repeatedly run into is observability: when a memory item does not appear as expected, it is hard to tell quickly where the problem lies.
Typical questions are:
- did this session enter L0 successfully?
- has L1 extraction already run for this session?
- if L1 is missing, is it because:
  - the trigger threshold was not reached yet,
  - the pipeline is still pending,
  - extraction produced no storable result,
  - dedup/merge removed it,
  - or an actual error occurred?
- if a session is behind, is it temporary lag or a real failure?
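
To make the questions above concrete, here is a minimal sketch of how a per-session snapshot could be mapped to a single status. Every field name, the `L1Status` labels, and the ordering of the checks are my own assumptions for illustration; they do not reflect the project's actual schema or pipeline internals.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class L1Status(Enum):
    """Hypothetical status labels, one per question in the list above."""
    PRESENT = "present"
    NOT_IN_L0 = "not_in_l0"
    BELOW_THRESHOLD = "below_threshold"
    PENDING = "pending"
    EMPTY_RESULT = "empty_result"
    DEDUPED = "deduped"
    ERROR = "error"


@dataclass
class SessionSnapshot:
    # All fields are assumed for illustration, not taken from the real storage layout.
    l0_messages: int                  # messages recorded in L0 for this session
    l1_items: int                     # L1 memory items currently stored
    trigger_threshold: int            # L0 messages needed before extraction runs
    extraction_runs: int              # completed L1 extraction runs
    candidates_extracted: int         # items the extractor proposed
    candidates_dropped_by_dedup: int  # proposed items removed by dedup/merge
    last_error: Optional[str] = None  # most recent pipeline error, if any


def classify(s: SessionSnapshot) -> L1Status:
    """Return the most likely explanation for the session's L1 state."""
    if s.l1_items > 0:
        return L1Status.PRESENT
    if s.last_error:
        return L1Status.ERROR
    if s.l0_messages == 0:
        return L1Status.NOT_IN_L0
    if s.extraction_runs == 0:
        # Extraction has never run: either too early, or queued but not done.
        if s.l0_messages < s.trigger_threshold:
            return L1Status.BELOW_THRESHOLD
        return L1Status.PENDING
    if s.candidates_extracted == 0:
        return L1Status.EMPTY_RESULT
    if s.candidates_dropped_by_dedup >= s.candidates_extracted:
        return L1Status.DEDUPED
    # Candidates survived dedup but are not stored yet.
    return L1Status.PENDING
```

Even a coarse classification like this would turn "L1 is missing" into one of a handful of named states, which is most of what I currently reconstruct by hand.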
At the moment, this can often be inferred by checking SQLite data, checkpoint files, and gateway logs together, but in practice that is still quite difficult during long-running use.
A few improvements that would help a lot:
- clearer per-session status
- structured L0/L1 progress visibility
- grouped failure reasons instead of only raw logs
- optional read-only diagnostics/debug endpoints
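
As a rough illustration of "grouped failure reasons", the sketch below folds per-session records into a small report that a read-only diagnostics endpoint could return as JSON. The record shape (`session_id`, `reason`) and the reason labels are hypothetical, chosen only for this example.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_report(sessions: List[Dict[str, str]]) -> dict:
    """Group session ids by their (hypothetical) reason label."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for record in sessions:
        groups[record["reason"]].append(record["session_id"])
    return {
        "total_sessions": len(sessions),
        # Sort reasons and ids so the output is stable between calls.
        "by_reason": {reason: sorted(ids) for reason, ids in sorted(groups.items())},
    }


if __name__ == "__main__":
    # Example records; in practice these would come from the stored state.
    sample = [
        {"session_id": "s-003", "reason": "pending"},
        {"session_id": "s-001", "reason": "below_threshold"},
        {"session_id": "s-002", "reason": "pending"},
    ]
    print(json.dumps(build_report(sample), indent=2))
```

The point is only the shape of the output: a count plus session ids per reason, rather than raw log lines.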
To better understand these cases in my own setup, I have been experimenting with a small read-only observability UI as a reference prototype:
https://github.com/sirenexcelsior/tdai-memory-observatory
I would be very grateful for any ideas or guidance you could offer.
Thank you again.