fix: Handle YAML parsing errors and missing entity_type in markdown files#368
Conversation
…iles (#184, #185) ## Issues Fixed ### Issue #184: NOT NULL constraint on entity_type **Problem**: Sync was failing with database constraint violations when markdown files were missing the `type` field in frontmatter or had it explicitly set to `null`. **Error**: ``` NOT NULL constraint failed: entity.entity_type ``` **Solution**: Added explicit check for None/null entity_type values and always provide "note" as the default type. ### Issue #185: YAML parsing errors causing sync failures **Problem**: Files with malformed YAML frontmatter caused sync to fail completely, leading to: - 12+ minute sync operations (for just 2 files) - Infinite retry loops - Keepalive ping failures - Complete blocking of sync for entire knowledge base **Error**: ``` block sequence entries are not allowed in this context in "<unicode string>", line 5, column 7 ``` **Solution**: Wrapped YAML parsing in try/catch block to gracefully handle malformed frontmatter. When YAML parsing fails: - Log warning with file path and error details - Treat file as plain markdown without frontmatter - Continue sync with remaining files ## Changes **src/basic_memory/markdown/entity_parser.py**: - Added `yaml` and `logger` imports - Wrapped `frontmatter.loads()` in try/catch for YAMLError - Enhanced entity_type handling to catch None values explicitly - Added warning logs for failed YAML parsing **tests/markdown/test_entity_parser_error_handling.py** (new): - 7 comprehensive tests covering both issues - Tests for malformed YAML, invalid YAML, missing type, null type - Tests ensure valid files still parse correctly ## Impact - **Robustness**: Files with YAML errors no longer block entire sync - **Performance**: Eliminates 12+ minute sync hangs from single bad files - **Data integrity**: All files get synced with appropriate defaults - **User experience**: Better error messages, no more silent failures - **Coverage**: Markdown parsing now at 100% test coverage ## Testing - All 43 markdown tests pass (7 new tests added) - All existing tests pass with no regressions - Entity parser now has 100% code coverage Signed-off-by: Claude <noreply@anthropic.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
|
Claude finished @phernandez's task —— View job Code Review for PR #368I've completed a comprehensive review of this PR against the Basic Memory team checklist. Overall, this is an excellent fix that addresses two critical production issues with robust error handling and comprehensive testing. ✅ Code Quality & StandardsAll criteria met:
Highlights:
✅ Testing & DocumentationAll criteria met:
Test quality observations:
✅ Basic Memory ArchitectureAll criteria met:
Architectural strengths:
✅ Security & PerformanceAll criteria met:
Performance impact:
🎯 Specific Code Review Notesentity_parser.py:116-126 (YAML error handling)Excellent implementation: try:
post = frontmatter.loads(file_content)
except yaml.YAMLError as e:
logger.warning(
f"Failed to parse YAML frontmatter in {absolute_path}: {e}. "
f"Treating file as plain markdown without frontmatter."
)
post = frontmatter.Post(file_content, metadata={})
entity_parser.py:135-136 (entity_type handling)Robust null handling: entity_type = post.metadata.get("type")
metadata["type"] = entity_type if entity_type is not None else "note"
test_entity_parser_error_handling.pyComprehensive test coverage:
📊 CI StatusTests are currently running. Based on the PR description, all 43 markdown tests pass locally with 100% coverage for the markdown module. Current CI status:
✅ Final VerdictThis PR is production-ready and addresses critical issues effectively. Strengths:
No issues found - all checklist items satisfied. Recommendation: ✅ APPROVE - Ready to merge once CI passes. Todo List
--- | |
Fixes #378 - Changed error handling in update_frontmatter() to match PR #368 pattern - YAML parsing errors now logged as WARNING instead of ERROR - Files with malformed frontmatter treated as plain markdown - Added test for malformed YAML case (KB: title format) This resolves 1,112 errors/3hrs in production logs from files with colons in titles that confuse the YAML parser. Co-authored-by: Paul Hernandez <phernandez@users.noreply.github.com>
The update_frontmatter() function was logging ERROR (level 17) for malformed YAML frontmatter instead of handling it gracefully. This caused 1,112+ errors per 3 hours in production from files with titles like "KB: Something" where the colon breaks YAML parsing. This fix applies the same error handling pattern from PR #368's entity_parser.py: - Catch ParseError and yaml.YAMLError when parsing frontmatter - Log as WARNING (level 13) instead of ERROR - Treat file as having no frontmatter and proceed with update - Only log ERROR for actual file operation failures Files with malformed frontmatter now get updated successfully with valid frontmatter instead of spamming error logs. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>
Issues Fixed
Fixes basicmachines-co/basic-memory-cloud#184
Fixes basicmachines-co/basic-memory-cloud#185
Issue #184: NOT NULL constraint on entity_type
Problem: Sync was failing with database constraint violations when markdown files were missing the
typefield in frontmatter or had it explicitly set tonull.Error from production (41 occurrences in 24 hours):
Root cause: Files without
typefield or withtype: nullresulted in None being passed to the database, violating the NOT NULL constraint.Issue #185: YAML parsing errors causing sync failures
Problem: Files with malformed YAML frontmatter caused sync to fail completely, leading to:
Error from production:
Root cause: Malformed YAML syntax (e.g., unexpected list syntax) caused python-frontmatter to raise YAMLError, which wasn't being caught, causing the entire sync to fail.
Solution
Changes to
src/basic_memory/markdown/entity_parser.pyAdded YAML error handling (Issue fix: enhance character conflict detection and error handling for sync operations #185):
frontmatter.loads()in try/catch foryaml.YAMLErrorEnhanced entity_type defaults (Issue fix: preserve permalink when editing notes without frontmatter permalink #184):
Nonevalues (not just missing keys)type:(missing) andtype: null(explicit null) casesNew comprehensive tests
Created
tests/markdown/test_entity_parser_error_handling.pywith 7 tests:type: null(edge case)Impact
Robustness
Performance
Data Integrity
Code Quality
Testing
Related
🤖 Generated with Claude Code