@@ -67,19 +67,17 @@ as a deprecated alias for this release cycle.

 ### Removed
 - `IdentifierModule._check_doi_content_consistency` and the
-  `consistency_score` / `low_consistency` warning path. The fuzzy
-  string-similarity score was empirically unable to detect subtle
-  LLM-hallucinated references (scored 85/100 on author-only
-  hallucinations against a real DOI) and was only surfaced as a
-  `logger.warning` that downstream tools could not observe, producing
-  false reassurance. Citation authenticity verification belongs at the
-  abstract-vs-claim semantic layer in the consuming tool (e.g. the
-  `sci` skill), not at the bibliographic-string layer here.
+  `consistency_score` / `low_consistency` warning path. A fuzzy
+  string-similarity score on bibliographic fields is not a reliable
+  signal for detecting fabricated references, and it was only emitted
+  as a `logger.warning` that downstream tools could not act on.
+  Citation-authenticity verification belongs at the abstract-vs-claim
+  semantic layer in the consuming tool, not at the bibliographic-string
+  layer here.

 ## [0.1.0] - 2026-04-17

-First formal PyPI release since `0.0.12`. Incorporates the complete
-pyOpenSci review pass (issues #3, #5–#34, #36) plus follow-up cleanup.
+First formal PyPI release since `0.0.12`.

 ### Added
 - RST documentation using Sphinx
@@ -94,25 +92,25 @@ pyOpenSci review pass (issues #3, #5–#34, #36) plus follow-up cleanup.
 - **Split monolithic `pipeline.py` (~3000 lines)** into a proper
   `onecite/pipeline/` package with one module per stage
   (`parser.py` / `identifier.py` / `enricher.py` / `formatter.py`)
-  plus a `_utils.py` for shared helpers (#17). Public imports
+  plus a `_utils.py` for shared helpers. Public imports
   (`from onecite.pipeline import IdentifierModule`) and mocking targets
   (`patch("onecite.pipeline.requests.get", ...)`) continue to work
   unchanged because `__init__.py` re-exports every public symbol and
   keeps `requests` at the package level.
-- Unify CrossRef request and parsing methods (#26); all CrossRef calls
+- Unify CrossRef request and parsing methods; all CrossRef calls
   now go through a single helper with a proper `User-Agent` header and
-  `mailto` query-string parameter (#21).
+  `mailto` query-string parameter.
 - Rewrite fuzzy-search scoring as a weighted title / author / year /
   venue model with three confidence tiers (auto-adopt / interactive /
-  cautious) and a unified low-confidence threshold (#3, #23, #27).
+  cautious) and a unified low-confidence threshold.
 - Simplify identifier routing; CrossRef and Semantic Scholar are always
   consulted for text queries, with signal-based additional queries to
-  PubMed / Google Books / OpenAIRE / BASE (#8, #23).
-- Use `bibtexparser.dumps()` for BibTeX rendering (#30).
+  PubMed / Google Books / OpenAIRE / BASE.
+- Use `bibtexparser.dumps()` for BibTeX rendering.
 - Expose `use_google_scholar` as a real CLI flag and API parameter
-  instead of a hard-coded `False` (#10).
+  instead of a hard-coded `False`.
 - Clarify that templates define metadata-field requirements and a
-  fallback BibTeX entry type, not output formatting (#16, #29).
+  fallback BibTeX entry type, not output formatting.
 - Refactored exception hierarchy
 - Added type hints to Python API
 - Updated README examples
@@ -125,42 +123,40 @@ pyOpenSci review pass (issues #3, #5–#34, #36) plus follow-up cleanup.
 - APA and MLA output renderers; they produced inconsistent output and
   the CLI now rejects anything other than `--output-format bibtex`.
   Users wanting APA/MLA should post-process the BibTeX through pandoc
-  or citeproc-py (#31, #32).
+  or citeproc-py.
 - Hard-coded "well-known paper" shortcut that masked failures on the
-  main example input (#19).
+  main example input.
 - MCP integration page and all related references
 - `.readthedocs.yml` (docs now hosted on GitHub Pages)
 - `docs/_build/` build artifacts from repository

 ### Fixed
 - README / `docs/index.rst` / `docs/faq.rst` no longer advertise
   OpenAlex or dblp as data sources — they were never wired into the
-  code (#6).
+  code.
 - README quick-start example now shows `booktitle` (NeurIPS) instead
-  of `journal = "arXiv preprint"` for the `@inproceedings` sample
-  (#28).
+  of `journal = "arXiv preprint"` for the `@inproceedings` sample.
 - `docs/api/pipeline.rst` rewritten to match the actual module
   structure; removed references to classes and methods that never
   existed (`Validator` / `Identifier` / `Completer` / `Formatter`,
-  `set_source_priority`, `set_timeout`, `add_template_path`) (#11).
+  `set_source_priority`, `set_timeout`, `add_template_path`).
 - `docs/output_formats.rst`, `docs/faq.rst`, `docs/quick_start.rst`,
   `docs/python_api.rst`, `docs/templates.rst`, `docs/index.rst` and
   docstrings in `core.py` / `formatter.py` no longer advertise APA /
-  MLA output (#31, #32).
+  MLA output.
 - Crossref author names parsed as `given family` instead of mangled
-  concatenations (#22).
+  concatenations.
 - Semantic Scholar HTTP 429 responses return an empty candidate list
-  cleanly instead of bubbling up (#25).
+  cleanly instead of bubbling up.
 - Previously-unused exception classes (`ParseError`, `ValidationError`,
-  `FormatError`) are now actually raised in the right places (#13).
+  `FormatError`) are now actually raised in the right places.
 - `CONTRIBUTING.md` no longer tells developers to use a `requirements.txt`
-  that does not exist; the documented install is `pip install -e .[dev]`
-  (#12).
+  that does not exist; the documented install is `pip install -e .[dev]`.
 - `black` formatting is enforced via `pyproject.toml` `[tool.black]`
-  plus a pre-commit hook (#15).
-- URL-bearing entries are no longer queried twice (#20).
+  plus a pre-commit hook.
+- URL-bearing entries are no longer queried twice.
 - Fallback paths mark entries as `identification_failed` rather than
-  fabricating plausible-looking but invented metadata (#24).
+  fabricating plausible-looking but invented metadata.
 - CrossRef and Semantic Scholar response parsing edge cases
 - API documentation using incorrect return value fields (`output_content` -> `results`)
 - Version number inconsistencies across metadata files
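
For readers unfamiliar with the CrossRef entry in the 0.1.0 notes above, here is a minimal sketch of what routing every call through one helper with a `User-Agent` header and a `mailto` query-string parameter might look like. It is not onecite's code: the function name `build_crossref_request`, the contact address, and the agent string are hypothetical. It uses only the standard library and builds the request without sending it.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical values for illustration only; none of these come from onecite.
CROSSREF_WORKS_URL = "https://api.crossref.org/works"
CONTACT_EMAIL = "maintainer@example.org"
USER_AGENT = f"example-client/0.1 (mailto:{CONTACT_EMAIL})"


def build_crossref_request(query: str, rows: int = 5) -> Request:
    """Build (but do not send) a CrossRef works query.

    Because every call site goes through this one function, the
    User-Agent header and the mailto parameter are always attached.
    """
    params = {
        "query.bibliographic": query,  # free-text bibliographic search
        "rows": rows,                  # maximum number of candidates
        "mailto": CONTACT_EMAIL,       # contact address sent with the query
    }
    url = f"{CROSSREF_WORKS_URL}?{urlencode(params)}"
    return Request(url, headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    req = build_crossref_request("Attention is all you need")
    # Inspect what would be sent, without touching the network.
    print(parse_qs(urlsplit(req.full_url).query))
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Keeping request construction separate from sending also makes the helper easy to test, which fits the changelog's note that `patch("onecite.pipeline.requests.get", ...)` remains a supported mocking target.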