|
1 | | -## Unreleased |
| 1 | +# Unreleased |
| 2 | +- New `value_error_thresholds` parameter added to both `evaluate_semantic()` and `evaluate_agentic()` for range-based absolute error tolerances on numeric property value comparisons: |
| 3 | + |
| 4 | + - Accepts a dict mapping `(min, max)` tuples to absolute error thresholds. When a ground-truth value falls inside a range, the extracted value is accepted if `|extracted - ground_truth| ≤ threshold`. Values outside all configured ranges fall back to exact comparison. |
| 5 | + |
| 6 | + - **Semantic evaluation**: handled inside `_is_value_in_range()` via the new `_get_error_threshold()` helper in `MaterialsDataSemanticEvaluator`. |
| 7 | + |
| 8 | + - **Agentic evaluation**: a new `GetValueErrorThresholdTool` (CrewAI `BaseTool`) is added to the composition evaluator agent when thresholds are configured. The agent calls this tool with the reference value to retrieve the tolerance before deciding on each numeric match. No tool is added and no prompt changes are made when no thresholds are provided. |
| 9 | + |
| 10 | +- Exposed `value_error_thresholds` in public evaluation methods: `ComProScanner.evaluate_semantic()`, `ComProScanner.evaluate_agentic()`, `comproscanner.evaluate_semantic()`, and `comproscanner.evaluate_agentic()`. |
| 11 | + |
| 12 | +- VLM-based graph data extraction added across all publishers and PDF processors: |
| 13 | + |
| 14 | + - New `GraphExtractorTool` — a CrewAI agent tool that reads saved figures for a given DOI and uses a vision LLM to extract composition-property value pairs from graphs and charts. Default VLM: `gemini/gemini-3-flash-preview`. |
| 15 | + |
| 16 | + - New `FigureExtractor` utility — shared helper for caption keyword-based figure filtering and saving, used by all article processors. |
| 17 | + |
| 18 | + - New `caption_keywords` parameter in `process_articles()` and `extract_composition_property_data()`, and new `vlm_model` and `related_figures_base_path` parameters in `extract_composition_property_data()`. |
| 19 | + |
| 20 | +- New unit tests added for all three agent tools in `tests/test_agent_tools/`. |
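
The range-based tolerance described for `value_error_thresholds` above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names `get_error_threshold` and `values_match` are hypothetical, and the sketch assumes inclusive range bounds and that the first matching range wins when ranges overlap (the changelog entry does not specify either).

```python
from typing import Dict, Optional, Tuple

# Maps (min, max) ranges of the ground-truth value to an absolute tolerance.
Thresholds = Dict[Tuple[float, float], float]


def get_error_threshold(ground_truth: float, thresholds: Thresholds) -> Optional[float]:
    """Return the tolerance of the first range containing ground_truth, else None.

    Assumption: bounds are inclusive and the first matching range wins.
    """
    for (low, high), tolerance in thresholds.items():
        if low <= ground_truth <= high:
            return tolerance
    return None


def values_match(extracted: float, ground_truth: float, thresholds: Thresholds) -> bool:
    """Accept extracted if |extracted - ground_truth| <= threshold for the range.

    Values whose ground truth lies outside every range use exact comparison.
    """
    tolerance = get_error_threshold(ground_truth, thresholds)
    if tolerance is None:
        return extracted == ground_truth
    return abs(extracted - ground_truth) <= tolerance


# Hypothetical configuration: tighter tolerance for small values.
thresholds = {(0.0, 100.0): 2.0, (100.0, 1000.0): 20.0}

print(values_match(51.5, 50.0, thresholds))      # True: error 1.5 is within 2.0
print(values_match(53.0, 50.0, thresholds))      # False: error 3.0 exceeds 2.0
print(values_match(5000.5, 5000.0, thresholds))  # False: no range, exact comparison
```

Keying the dict on the ground-truth value rather than the extracted one means the tolerance does not depend on how far off the extraction is.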
| 21 | + |
| 22 | +### Fixed |
| 23 | + |
| 24 | +- `process_articles()` now routes user-provided `doi_list` by `general_publisher` from metadata and sends each DOI only to its matching source processor. |
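
The routing behaviour in this fix can be pictured with a small sketch. Everything here is illustrative: `route_dois_by_publisher`, the metadata mapping, and the placeholder DOIs are hypothetical, and how the real `process_articles()` treats DOIs with no metadata entry is not stated in the changelog (this sketch simply collects them separately).

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping, Tuple


def route_dois_by_publisher(
    doi_list: Iterable[str],
    publisher_by_doi: Mapping[str, str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Group DOIs by their publisher so each goes to exactly one processor.

    Returns (routed, unmatched), where unmatched holds DOIs that have no
    publisher entry in the metadata (an assumption of this sketch).
    """
    routed: Dict[str, List[str]] = defaultdict(list)
    unmatched: List[str] = []
    for doi in doi_list:
        publisher = publisher_by_doi.get(doi)
        if publisher is None:
            unmatched.append(doi)
        else:
            routed[publisher].append(doi)
    return dict(routed), unmatched


# Placeholder metadata: DOI -> general_publisher value.
metadata = {
    "10.0000/example-a": "elsevier",
    "10.0000/example-b": "wiley",
    "10.0000/example-c": "elsevier",
}

routed, unmatched = route_dois_by_publisher(
    ["10.0000/example-a", "10.0000/example-b", "10.0000/example-c", "10.0000/example-d"],
    metadata,
)
print(routed)
print(unmatched)
```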
| 25 | + |
| 26 | +--- |
| 27 | +## [0.1.6] - 2026-04-02 |
| 28 | +### Changed |
| 29 | +- Updated [README.md](README.md), [CITATION.cff](CITATION.cff) and docs with the published version (advance article) of the ComProScanner paper in _Digital Discovery_ as fully open access: |
| 30 | + - [ComProScanner: a multi-agent based framework for composition-property structured data extraction from scientific literature](https://doi.org/10.1039/D5DD00521C) |
| 31 | + |
| 32 | +### Added |
| 33 | +- Guide for API key creation for various LLM providers and publisher APIs added to the documentation at `docs/getting-started/api-key-guide.md` with detailed instructions for each provider. |
| 34 | + |
| 35 | +--- |
| 36 | +## [0.1.5] - 2026-02-08 |
2 | 37 |
|
3 | 38 | ### Added |
| 39 | +- Data related to comparison with other agentic data extraction frameworks added for the ComProScanner paper in the `examples/piezo_test/comparing_existing_frameworks` folder. |
4 | 40 |
|
5 | 41 | - New parameter `apply_advanced_cleaning` added to data cleaning methods in `data_cleaner.py`. When set to `True`, it triggers the advanced cleaning pipeline. |
6 | 42 |
|
|
38 | 74 | - [CITATION.cff](https://github.com/slimeslab/ComProScanner/blob/main/CITATION.cff) added for standardized citation information based on the latest release and arXiv preprint. |
39 | 75 |
|
40 | 76 | ### Fixed |
| 77 | +- OAWorks API replaced with the OpenAlex API because OAWorks is no longer available.
| 78 | + |
| 79 | +- Empty or corrupted PDFs are now handled in `pdf_processor.py` and `wiley_processor.py` to avoid GLYPH errors during text extraction.
| 80 | + |
| 81 | +- Fixed data extraction failures that occurred when the composition-property text data is empty.
41 | 82 |
|
42 | 83 | - CSV progress tracking in `elsevier_processor.py`: |
43 | 84 |
|
|
63 | 104 |
|
64 | 105 | - README badges section converted from HTML to markdown format for better compatibility across platforms. |
65 | 106 |
|
66 | | -## [0.1.4] - 02-12-2025 |
| 107 | +--- |
| 108 | +## [0.1.4] - 2025-12-02 |
67 | 109 |
|
68 | 110 | ### Added |
69 | 111 |
|
|
94 | 136 |
|
95 | 137 | ### Changed |
96 | 138 |
|
97 | | -- README images updated with raw GitHub links for better reliability: [ComProScanner Logo](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/comproscanner_logo.png), [ComProScanner Workflow](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/overall_workflow.png) |
| 139 | +- README images updated with raw GitHub links for better reliability: |
| 140 | + - [ComProScanner Logo](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/comproscanner_logo.png) |
| 141 | + - [ComProScanner Workflow](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/overall_workflow.png) |
98 | 142 |
|
99 | | -## [0.1.3] - 04-11-2025 |
| 143 | +--- |
| 144 | +## [0.1.3] - 2025-11-04 |
100 | 145 |
|
101 | 146 | ### Fixed |
102 | 147 |
|
103 | 148 | - **RecursiveCharacterTextSplitter** import updated for the latest _langchain_ version to avoid import errors:
104 | 149 | - Changed from `from langchain.text_splitter import RecursiveCharacterTextSplitter` |
105 | 150 | - To `from langchain.text_splitter.recursive_character import RecursiveCharacterTextSplitter` |
106 | 151 |
|
107 | | -## [0.1.2] - 24-10-2025 |
| 152 | +--- |
| 153 | +## [0.1.2] - 2025-10-24 |
108 | 154 |
|
109 | 155 | ### Added |
110 | 156 |
|
111 | | -- Link to ComProScanner preprint on arXiv in the documentation index page and README.md: [arXiv:2510.20362](https://arxiv.org/abs/2510.20362) |
| 157 | +- Link to ComProScanner preprint on arXiv in the documentation index page and README.md: |
| 158 | + - [arXiv:2510.20362](https://arxiv.org/abs/2510.20362) |
112 | 159 |
|
113 | | -## [0.1.1] - 22-10-2025 |
| 160 | +--- |
| 161 | +## [0.1.1] - 2025-10-22 |
114 | 162 |
|
115 | 163 | ### Fixed |
116 | 164 |
|
117 | | -- README images updated with external image link to fix PyPI rendering issue. [ComProScanner Logo](https://i.ibb.co/whHSbGvT/comproscanner-logo.png), [ComProScanner Workflow](https://i.ibb.co/QWd2qd3/overall-workflow.png) |
| 165 | +- README images updated with external image links to fix a PyPI rendering issue:
| 166 | + - [ComProScanner Logo](https://i.ibb.co/whHSbGvT/comproscanner-logo.png) |
| 167 | + - [ComProScanner Workflow](https://i.ibb.co/QWd2qd3/overall-workflow.png) |
118 | 168 |
|
119 | | -## [0.1.0] - 22-10-2025 |
| 169 | +--- |
| 170 | +## [0.1.0] - 2025-10-22 |
120 | 171 |
|
121 | 172 | ### Added |
122 | 173 |
|
|