Commit 9be0085
Ang
refactor: split pipeline.py into per-stage package (#17)
The monolithic onecite/pipeline.py (~3000 lines) is replaced with a proper package:
onecite/pipeline/
    __init__.py   - re-exports + `requests` at package level
    _utils.py     - _safe_year helper
    parser.py     - ParserModule
    identifier.py - IdentifierModule (largest stage, kept as one file)
    enricher.py   - EnricherModule
    formatter.py  - FormatterModule
Backward-compat is preserved:
* `from onecite.pipeline import IdentifierModule` still works
* `patch('onecite.pipeline.requests.get', ...)` still works because
  __init__.py keeps `import requests` at package level, and Python's
  module cache (`sys.modules`) means every child module that runs
  `import requests` gets that same `requests` module object.
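
The shared-module argument above can be checked with a small, self-contained sketch. Everything in it is a stand-in rather than code from this commit: `demo_pkg` is a hypothetical package playing the role of `onecite.pipeline`, `demo_pkg.child` plays the role of a stage module such as identifier.py, and the stdlib `json` module stands in for `requests` so the example runs without third-party dependencies.

```python
import json
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for onecite/pipeline/__init__.py.
pkg = types.ModuleType("demo_pkg")
pkg.__path__ = []   # mark the module as a package
pkg.json = json     # same effect as `import json` at package level

# Hypothetical stand-in for a stage module that does its own `import json`.
child = types.ModuleType("demo_pkg.child")
exec(
    "import json\n"
    "def encode(value):\n"
    "    return json.dumps(value)\n",
    child.__dict__,
)
pkg.child = child
sys.modules["demo_pkg"] = pkg
sys.modules["demo_pkg.child"] = child

# Both names are bound to the single cached module object.
assert child.json is pkg.json

# Patching through the package path replaces the attribute on that shared
# module object, so the child module sees the patched function as well.
with patch("demo_pkg.json.dumps", return_value="patched"):
    patched_result = child.encode({"a": 1})

# Outside the context manager the original function is restored.
restored_result = child.encode({"a": 1})
```

This only holds while child modules look the function up through the module at call time (`json.dumps(...)` here, `requests.get(...)` in the real package). A child that did `from requests import get` would keep its own reference, and a patch applied via the package path would not reach it.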
Tests:
* 3 tests in test_pipeline_unit.py that used `patch.object(pipeline_mod, 'scholarly', ...)` now patch the concrete submodule (identifier / enricher) where `scholarly` is imported.
* test_integration.py gained a missing `import pytest` so pytest.skip() works when the mocked first pass returns no results.
1 parent 682e5d7 commit 9be0085
8 files changed
Lines changed: 1072 additions & 976 deletions
File tree
- onecite/pipeline
- tests