Normalize malformed DOI month fields in bibbot ingestion#1098

Merged
pancetta merged 2 commits into source from copilot/fix-pipeline-error-june
Jun 8, 2026

Conversation

Copilot AI commented Jun 8, 2026

Contributor

Bibbot failed while ingesting the new DOI entry (10.2514/6.2026-4026) because the DOI BibTeX payload contained a malformed month value (e.g., 'june). This change hardens DOI parsing so that malformed or bare month tokens no longer break issue_to_bibtex.py.

  • Problem addressed

    • DOI-provided BibTeX can contain month fields in non-parseable forms (for example, month = june, or month = 'june,), which caused bibtexparser.loads(...) to raise and stop ingestion.
  • Code changes

    • In bin/issue_to_bibtex.py (DOI branch), added a pre-parse normalization step that rewrites month assignments to brace-wrapped BibTeX values before bibtexparser.loads(...).
    • Scope is intentionally narrow: only month = ... lines are normalized; no other fields or workflow behavior are changed.
  • Normalization example

    import re

    # Rewrite bare or quote-damaged month values (june, 'june, "june") as {june}.
    bib = re.sub(
        r'(^\s*month\s*=\s*)(?:[\'"]?)([A-Za-z]+)(?:[\'"]?)\s*,',
        r'\1{\2},',
        bib,
        flags=re.MULTILINE,
    )

This keeps bibbot resilient to DOI metadata formatting variance while preserving existing entry generation flow.
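
To make the effect of the normalization concrete, here is a minimal sketch that applies the pattern from the description to a sample entry. The function name normalize_month_fields and the sample entry are assumptions made for illustration; they are not taken from bin/issue_to_bibtex.py or from the actual DOI payload.

```python
import re

# Pattern copied from the PR description; the compiled-constant and
# function names are hypothetical.
MONTH_RE = re.compile(
    r'(^\s*month\s*=\s*)(?:[\'"]?)([A-Za-z]+)(?:[\'"]?)\s*,',
    flags=re.MULTILINE,
)


def normalize_month_fields(bib: str) -> str:
    """Wrap bare or quote-damaged month values in braces."""
    return MONTH_RE.sub(r'\1{\2},', bib)


# Made-up entry for illustration only, not the real DOI payload.
raw = """@inproceedings{example2026,
  title = {An example title},
  month = 'june,
  year = {2026}
}"""

fixed = normalize_month_fields(raw)
print(fixed)
```

Only the month line changes: the stray quote is dropped and the value is wrapped in braces, so a BibTeX parser sees an ordinary string value rather than an undefined macro or an unbalanced quote.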

Copilot AI linked an issue Jun 8, 2026 that may be closed by this pull request
Copilot AI changed the title from "[WIP] Fix error in pipeline for bibbot" to "Normalize malformed DOI month fields in bibbot ingestion" Jun 8, 2026
Copilot AI requested a review from pancetta June 8, 2026 06:52
@pancetta pancetta marked this pull request as ready for review June 8, 2026 06:53
@pancetta pancetta merged commit d58509c into source Jun 8, 2026
1 check passed

tlunet commented Jun 8, 2026

Member

Wouldn't we want at least a small unit test to check a regex that is getting more and more complex (and that maybe no one understands...)?
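
A minimal sketch of what such a test could look like, using only the standard library. It assumes the pattern is copied verbatim from the PR description; in the repository the test would import the real helper from bin/issue_to_bibtex.py instead, whose name is not given here. The names MONTH_PATTERN, normalize, and MonthNormalizationTest are hypothetical.

```python
import re
import unittest

# Copy of the pattern from the PR description (assumption: unchanged in the
# merged code). A real test would import the helper rather than duplicate it.
MONTH_PATTERN = r'(^\s*month\s*=\s*)(?:[\'"]?)([A-Za-z]+)(?:[\'"]?)\s*,'


def normalize(bib):
    return re.sub(MONTH_PATTERN, r'\1{\2},', bib, flags=re.MULTILINE)


class MonthNormalizationTest(unittest.TestCase):
    def test_malformed_values_are_braced(self):
        cases = {
            "month = june,": "month = {june},",
            "month = 'june,": "month = {june},",
            'month = "june",': "month = {june},",
            "month=june ,": "month={june},",
        }
        for raw, expected in cases.items():
            with self.subTest(raw=raw):
                self.assertEqual(normalize(raw), expected)

    def test_other_lines_are_untouched(self):
        # Already-braced values, numeric months, and "month = ..." text
        # inside another field must pass through unchanged.
        for line in [
            "  month = {june},",
            "  month = 6,",
            "  title = {The month = june, issue},",
        ]:
            with self.subTest(line=line):
                self.assertEqual(normalize(line), line)

    def test_multiline_entry(self):
        raw = "@article{k,\n  month = june,\n  year = 2026,\n}"
        expected = "@article{k,\n  month = {june},\n  year = 2026,\n}"
        self.assertEqual(normalize(raw), expected)
```

Saved as a test module, this can be run with python -m unittest. The table-driven cases double as documentation of which inputs the regex is meant to rewrite and which it must leave alone.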

Development

Successfully merging this pull request may close these issues.

New Papers for bibbot!

3 participants