fix: strip NUL bytes from content before PostgreSQL search indexing
rclone preallocation on virtual filesystems (e.g. Google Drive File Stream)
pads markdown files with \x00 bytes (rclone/rclone#6801), which PostgreSQL
rejects with CharacterNotInRepertoireError during search indexing.
Three-pronged fix:
- 🛡️ Primary: _strip_nul() in SearchService.index_entity_markdown() sanitizes
content_stems, content_snippet, and observation/relation content before
building SearchIndexRow objects
- 🛡️ Secondary: _strip_nul_from_row() in PostgresSearchRepository.bulk_index_items()
as a safety net before INSERT
- 🔧 Prevention: --local-no-preallocate flag added to rclone sync and bisync
commands to prevent the padding at the source
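The two sanitizing layers above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `strip_nul`, `strip_nul_from_row`, and the `Row` dataclass are hypothetical stand-ins for `_strip_nul()`, `_strip_nul_from_row()`, and `SearchIndexRow`, whose real signatures and fields are not shown here.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional

NUL = "\x00"


def strip_nul(value: Optional[str]) -> Optional[str]:
    """Remove NUL bytes from a string; pass None through unchanged."""
    if value is None:
        return None
    return value.replace(NUL, "")


@dataclass(frozen=True)
class Row:
    # Hypothetical stand-in for SearchIndexRow; field names are assumptions.
    id: int
    title: str
    content_stems: Optional[str] = None
    content_snippet: Optional[str] = None


def strip_nul_from_row(row: Row) -> Row:
    """Return a copy of the row with NUL bytes removed from every str field."""
    cleaned = {
        f.name: strip_nul(getattr(row, f.name))
        for f in fields(row)
        if isinstance(getattr(row, f.name), str)
    }
    return replace(row, **cleaned)


# Content padded the way a preallocated file might be.
padded = Row(id=1, title="note", content_stems="hello\x00\x00", content_snippet=None)
clean = strip_nul_from_row(padded)
```

Cleaning at both the service and repository layers is redundant by design: the second pass only matters when a row reaches the INSERT without going through the first.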
Fixes #548
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>