Skip to content

fix: Handle YAML parsing errors gracefully in update_frontmatter (#378)#379

Merged
phernandez merged 1 commit intomainfrom
fix/yaml-error-handling-378
Oct 17, 2025
Merged

fix: Handle YAML parsing errors gracefully in update_frontmatter (#378)#379
phernandez merged 1 commit intomainfrom
fix/yaml-error-handling-378

Conversation

@phernandez
Copy link
Copy Markdown
Member

Summary

Fixes #378 - update_frontmatter() now handles malformed YAML frontmatter gracefully instead of logging errors.

Problem

The update_frontmatter() function was logging ERROR (level 17) for malformed YAML frontmatter, even though the system should handle these gracefully. This caused 1,112+ errors per 3 hours in production from files with titles like KB: Something where the second colon breaks YAML parsing.

PR #368 fixed this issue in the entity parser (reading), but update_frontmatter() (writing) has a separate code path that wasn't updated.

Solution

Applied the same error handling pattern from PR #368's entity_parser.py:

  1. Catch ParseError and yaml.YAMLError when parsing frontmatter
  2. Log as WARNING (level 13) instead of ERROR
  3. Treat file as having no frontmatter
  4. Proceed with update using new valid frontmatter
  5. Only log ERROR for actual file operation failures

Changes

  • src/basic_memory/file_utils.py:213-248: Added graceful YAML error handling
  • tests/utils/test_file_utils.py:262-290: Added test for malformed YAML handling

Test Coverage

New test test_update_frontmatter_handles_malformed_yaml verifies:

  • Files with malformed YAML (e.g., title: KB: Something) are handled gracefully
  • Frontmatter is updated successfully with valid YAML
  • Content is preserved

Expected Impact

  • ✅ Error count drops from 1,112/3hrs to ~0 in production
  • ✅ YAML parsing errors logged as WARNING instead of ERROR
  • ✅ Files with malformed frontmatter get updated successfully
  • ✅ Real file operation errors still logged as ERROR

Related

🤖 Generated with Claude Code

The update_frontmatter() function was logging ERROR (level 17) for malformed
YAML frontmatter instead of handling it gracefully. This caused 1,112+ errors
per 3 hours in production from files with titles like "KB: Something" where
the colon breaks YAML parsing.

This fix applies the same error handling pattern from PR #368's entity_parser.py:
- Catch ParseError and yaml.YAMLError when parsing frontmatter
- Log as WARNING (level 13) instead of ERROR
- Treat file as having no frontmatter and proceed with update
- Only log ERROR for actual file operation failures

Files with malformed frontmatter now get updated successfully with valid
frontmatter instead of spamming error logs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: phernandez <paul@basicmachines.co>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Oct 17, 2025

Claude finished @phernandez's task —— View job


PR Review: Fix YAML Error Handling in update_frontmatter()

I've completed a thorough review of this PR against Basic Memory's team standards. Here's my analysis:


Code Quality & Standards

Follows Basic Memory's coding conventions in CLAUDE.md

  • Code follows Python 3.12+ type annotations
  • Async patterns properly maintained
  • 100-character line length respected
  • No commented-out code blocks

Python 3.12+ type annotations and async patterns

  • Function maintains existing type signature: async def update_frontmatter(path: FilePath, updates: Dict[str, Any]) -> str
  • Proper exception handling with typed exceptions

SQLAlchemy 2.0 best practices

  • N/A - This PR only touches file utilities, not database code

FastAPI and Typer conventions

  • N/A - This PR is in the utilities layer

100-character line length limit maintained

  • All lines are within the 100-character limit

No commented-out code blocks

  • Clean implementation with no dead code

Testing & Documentation

Unit tests for new functions/methods

Test coverage for edge cases

  • Test verifies:
    • Files with malformed YAML are handled gracefully
    • Frontmatter is successfully updated with valid YAML
    • Original content is preserved
    • New frontmatter can be parsed successfully

Documentation updated

  • PR description clearly explains the problem, solution, and expected impact
  • Inline comments added explaining the error handling approach (line 213, 220-223, 241)

⚠️ Integration tests for MCP tools

  • N/A - This is a utility function, not an MCP tool
  • However, consider adding an integration test that exercises this through the full sync pipeline

Basic Memory Architecture

Preserves local-first architecture principles

  • File utilities maintain files as source of truth
  • No database changes required

Markdown file handling preserves integrity

  • Implementation correctly preserves all content when YAML parsing fails
  • New frontmatter is written with valid YAML format

AI-human collaboration patterns followed


Security & Performance

No hardcoded secrets or credentials

  • Clean implementation with no security concerns

Input validation for MCP tools

  • N/A - This is a utility function

Proper error handling and logging

Performance considerations addressed

  • No performance impact - error handling is lightweight
  • Atomic file writes maintained

No sensitive data in logs or commits

  • Clean implementation

Code Consistency Analysis

I compared this PR with PR #368 (commit d1431bd) and found excellent consistency:

PR #368 Pattern (entity_parser.py:116-126):

try:
    post = frontmatter.loads(file_content)
except yaml.YAMLError as e:
    logger.warning(
        f"Failed to parse YAML frontmatter in {absolute_path}: {e}. "
        f"Treating file as plain markdown without frontmatter."
    )
    post = frontmatter.Post(file_content, metadata={})

This PR Pattern (file_utils.py:216-226):

try:
    current_fm = parse_frontmatter(content)
    content = remove_frontmatter(content)
except (ParseError, yaml.YAMLError) as e:
    logger.warning(
        f"Failed to parse YAML frontmatter in {path_obj}: {e}. "
        "Treating file as plain markdown without frontmatter."
    )
    current_fm = {}

Perfect alignment - Both use identical warning messages and logging approach


Specific Code Review Notes

src/basic_memory/file_utils.py:213-248

Strengths:

  1. ✅ Catches both ParseError and yaml.YAMLError for comprehensive coverage
  2. ✅ Preserves full content when YAML parsing fails (line 225: current_fm = {})
  3. ✅ Prevents double-logging of YAML errors in outer exception handler (line 242)
  4. ✅ Clear, actionable warning message for debugging

Minor Suggestion:
The comment on line 225 says "Keep full content, treat as having no frontmatter" but the full content is preserved implicitly by not calling remove_frontmatter(). Consider making this more explicit:

except (ParseError, yaml.YAMLError) as e:
    # Log warning and treat as plain markdown without frontmatter
    logger.warning(
        f"Failed to parse YAML frontmatter in {path_obj}: {e}. "
        "Treating file as plain markdown without frontmatter."
    )
    # Keep full content (content variable unchanged), treat as having no frontmatter
    current_fm = {}

However, this is extremely minor and the current implementation is clear enough.

tests/utils/test_file_utils.py:262-290

Strengths:

  1. ✅ Test case directly matches the production issue (title: KB: Something)
  2. ✅ Comprehensive assertions verify both success and data integrity
  3. ✅ Test name clearly references issue update_frontmatter logs errors for malformed YAML instead of handling gracefully #378
  4. ✅ Follows pytest async patterns correctly

Note:
The malformed YAML example title: KB: Something is actually valid YAML! The value is interpreted as the string "KB: Something". However, this may fail in some YAML parsers depending on context or additional content. For a truly malformed example, you might want to test something like:

---
title: KB: Something: Extra
tags: [unclosed
---

But since this test case apparently reproduces the production issue, it's perfect as-is.


Final Verdict

APPROVED - Excellent Implementation

This PR successfully addresses issue #378 with:

Expected Impact Validation

The PR claims it will drop error count from 1,112 errors/3hrs to ~0. This is accurate because:

  1. Files with malformed YAML frontmatter will now be handled gracefully
  2. YAML parsing errors will be logged as WARNING (not ERROR)
  3. Files will be successfully updated with valid frontmatter
  4. Only real file operation failures will log as ERROR

Recommendations for Merge

  1. Ready to merge - All checks pass
  2. Consider running in staging/production to validate the error reduction metric
  3. No additional changes required

Minor Enhancement Suggestions (Future Work)

These are not blockers, but could improve the codebase:

  1. Add integration test: Test this through the full sync pipeline to ensure the entire flow handles malformed YAML gracefully
  2. Centralize YAML error handling: Consider creating a shared utility function for YAML parsing with standard error handling, since this pattern now exists in both entity_parser.py and file_utils.py
  3. Document YAML limitations: Add a comment or doc explaining what YAML formats are supported vs. not supported

@phernandez phernandez merged commit 32236cd into main Oct 17, 2025
16 checks passed
@phernandez phernandez deleted the fix/yaml-error-handling-378 branch October 17, 2025 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

update_frontmatter logs errors for malformed YAML instead of handling gracefully

1 participant