|
1 | | -## Unreleased |
| 1 | +## [0.1.6] - 2026-04-02 |
| 2 | + |
| 3 | +### Changed |
| 4 | + |
| 5 | +- Updated [README.md](README.md), [CITATION.cff](CITATION.cff), and docs to reference the published version (advance article) of the ComProScanner paper in _Digital Discovery_, which is fully open access: |
| 6 | + - [ComProScanner: a multi-agent based framework for composition-property structured data extraction from scientific literature](https://doi.org/10.1039/D5DD00521C) |
2 | 7 |
|
3 | 8 | ### Added |
4 | 9 |
|
| 10 | +- Guide for API key creation for various LLM providers and publisher APIs added to the documentation at `docs/getting-started/api-key-guide.md` with detailed instructions for each provider. |
| 11 | + |
| 12 | +### Fixed |
| 13 | + |
| 14 | +- Model prefix handling in `rag_tool.py` standardized to reflect the docs. |
| 15 | +- `HF_TOKEN` documentation clarified as optional — only required for gated or private Hugging Face models. |
| 16 | + |
| 17 | +--- |
| 18 | + |
| 19 | +## [0.1.5] - 2026-02-08 |
| 20 | + |
| 21 | +### Added |
| 22 | + |
| 23 | +- Data related to comparison with other agentic data extraction frameworks added for the ComProScanner paper in the `examples/piezo_test/comparing_existing_frameworks` folder. |
| 24 | + |
5 | 25 | - New parameter `apply_advanced_cleaning` added to data cleaning methods in `data_cleaner.py`. When set to `True`, it triggers the advanced cleaning pipeline. |
6 | 26 |
|
7 | 27 | - Advanced composition cleaning methods in `data_cleaner.py`: |
8 | | - |
9 | 28 | - `_remove_miller_indices()` - Removes crystal plane notations from chemical formulas |
10 | 29 | - `_remove_zero_coefficient_elements()` - Removes elements with zero coefficients |
11 | 30 | - `_normalize_coefficients()` - Removes trailing zeros from coefficients |
12 | 31 | - `_expand_leading_and_trailing_coefficients()` - Expands leading/trailing coefficient patterns |
13 | 32 | - `_expand_parenthetical_coefficients()` - Expands nested bracket coefficients |
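The transformations listed above can be illustrated with a minimal sketch. The function names and regular expressions below are hypothetical stand-ins written for this example; they are not the private methods in `data_cleaner.py`, whose actual rules may differ.

```python
import re

# Hypothetical stand-ins for illustration only; not the code in data_cleaner.py.
_ELEMENT_WITH_COEFF = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")
_MILLER_INDEX = re.compile(r"\s*[(\[]\d{3}[)\]]")


def remove_miller_indices(formula: str) -> str:
    """Drop three-digit plane notations such as (001) or [110]."""
    return _MILLER_INDEX.sub("", formula)


def remove_zero_coefficient_elements(formula: str) -> str:
    """Drop element tokens whose coefficient is numerically zero."""
    def keep_or_drop(match: re.Match) -> str:
        coeff = match.group(2)
        if coeff is not None and float(coeff) == 0.0:
            return ""
        return match.group(0)

    return _ELEMENT_WITH_COEFF.sub(keep_or_drop, formula)


def normalize_coefficients(formula: str) -> str:
    """Strip trailing zeros from decimal coefficients (0.50 -> 0.5, 3.0 -> 3)."""
    def trim(match: re.Match) -> str:
        return match.group(0).rstrip("0").rstrip(".")

    return re.sub(r"\d+\.\d+", trim, formula)


def advanced_clean(formula: str) -> str:
    """Chain the three steps in one assumed order."""
    formula = remove_miller_indices(formula)
    formula = remove_zero_coefficient_elements(formula)
    return normalize_coefficients(formula)


cleaned = advanced_clean("Ba0.90Ca0TiO3.0 (001)")
```

In this sketch `cleaned` is `"Ba0.9TiO3"`: the plane notation is removed first, then the zero-coefficient `Ca0` token, then the trailing zeros.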
14 | 33 |
|
15 | 34 | - Enhanced documentation in `docs/usage/data-cleaning.md`: |
16 | | - |
17 | 35 | - Added `apply_advanced_cleaning` parameter documentation |
18 | 36 | - Added Mermaid process flow diagram showing cleaning stages |
19 | 37 | - Added advanced cleaning examples with tables for each transformation type |
20 | 38 |
|
21 | 39 | - Template for GitHub issues added to [.github/ISSUE_TEMPLATE](https://github.com/slimeslab/ComProScanner/tree/main/.github/ISSUE_TEMPLATE) for the following topics: |
22 | | - |
23 | 40 | - bug reports |
24 | 41 | - feature requests |
25 | 42 | - documentation improvements |
|
28 | 45 | - [Changelog page](https://slimeslab.github.io/ComProScanner/about/changelog/) added in the documentation. Also, [CHANGELOG.md](https://github.com/slimeslab/ComProScanner/blob/main/CHANGELOG.md) linked in [README.md](https://github.com/slimeslab/ComProScanner/blob/main/README.md). |
29 | 46 |
|
30 | 47 | - DeepWiki integration badge added to README.md for community Q&A support: |
31 | | - |
32 | 48 | - [Ask DeepWiki](https://deepwiki.com/slimeslab/ComProScanner) |
33 | 49 |
|
34 | 50 | - arXiv preprint badge added to README.md: |
35 | | - |
36 | 51 | - [arXiv:2510.20362](https://arxiv.org/abs/2510.20362) |
37 | 52 |
|
38 | 53 | - [CITATION.cff](https://github.com/slimeslab/ComProScanner/blob/main/CITATION.cff) added for standardized citation information based on the latest release and arXiv preprint. |
39 | 54 |
|
40 | 55 | ### Fixed |
41 | 56 |
|
42 | | -- CSV progress tracking in `elsevier_processor.py`: |
| 57 | +- OAWorks API replaced with the OpenAlex API because OAWorks is no longer available. |
| 58 | + |
| 59 | +- Empty or corrupted PDFs handled in `pdf_processor.py` and `wiley_processor.py` to avoid GLYPH errors during text extraction. |
43 | 60 |
|
| 61 | +- Data extraction failures fixed when composition-property text data is empty. |
| 62 | + |
| 63 | +- CSV progress tracking in `elsevier_processor.py`: |
44 | 64 | - DtypeWarning resolved by adding `dtype=str, low_memory=False` to `pd.read_csv()` |
45 | 65 | - Data loss issue fixed with immediate CSV persistence for processed articles |
46 | 66 | - Sleep delays optimized for batch writes |
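As a rough illustration of the `pd.read_csv()` change noted above, the sketch below reads a small in-memory CSV with `dtype=str, low_memory=False`. The column names and values are invented for this example and are not taken from `elsevier_processor.py`.

```python
import io

import pandas as pd

# Hypothetical progress file; columns and values are invented for illustration.
csv_text = (
    "doi,article_id,is_processed\n"
    "10.1016/j.example.2024.01,00123,yes\n"
    "10.1016/j.example.2024.02,00456,\n"
)

# dtype=str reads every column as text, so pandas does not guess per-column
# types and identifiers keep their leading zeros. low_memory=False parses the
# file as a whole rather than in chunks, which is where mixed-type guesses
# (and the resulting DtypeWarning) typically come from.
progress = pd.read_csv(io.StringIO(csv_text), dtype=str, low_memory=False)
first_article_id = progress.loc[0, "article_id"]
```

Without `dtype=str`, the `article_id` column here would be parsed as integers and lose its leading zeros.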
|
63 | 83 |
|
64 | 84 | - README badges section converted from HTML to markdown format for better compatibility across platforms. |
65 | 85 |
|
66 | | -## [0.1.4] - 02-12-2025 |
| 86 | +--- |
| 87 | + |
| 88 | +## [0.1.4] - 2025-12-02 |
67 | 89 |
|
68 | 90 | ### Added |
69 | 91 |
|
70 | 92 | - New function `clean_data()` added so that data cleaning and preprocessing run as a separate step instead of being integrated into the data extraction function. |
71 | 93 |
|
72 | 94 | - New documentation page for Data Cleaning added: |
73 | | - |
74 | 95 | - docs/usage/data-cleaning.md |
75 | 96 | - Added to mkdocs.yml navigation. |
76 | 97 |
|
77 | 98 | - New API overview documentation page added: |
78 | | - |
79 | 99 | - docs/api.md |
80 | 100 | - Added to mkdocs.yml navigation. |
81 | 101 | - New mkdocstrings configuration added to mkdocs.yml for automatic API documentation generation. |
|
94 | 114 |
|
95 | 115 | ### Changed |
96 | 116 |
|
97 | | -- README images updated with raw GitHub links for better reliability: [ComProScanner Logo](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/comproscanner_logo.png), [ComProScanner Workflow](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/overall_workflow.png) |
| 117 | +- README images updated with raw GitHub links for better reliability: |
| 118 | + - [ComProScanner Logo](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/comproscanner_logo.png) |
| 119 | + - [ComProScanner Workflow](https://raw.githubusercontent.com/aritraroy24/ComProScanner/main/assets/overall_workflow.png) |
98 | 120 |
|
99 | | -## [0.1.3] - 04-11-2025 |
| 121 | +--- |
| 122 | + |
| 123 | +## [0.1.3] - 2025-11-04 |
100 | 124 |
|
101 | 125 | ### Fixed |
102 | 126 |
|
103 | 127 | - **RecursiveCharacterTextSplitter** import updated for the latest _langchain_ version to avoid import errors: |
104 | 128 | - Changed from `from langchain.text_splitter import RecursiveCharacterTextSplitter` |
105 | 129 | - To `from langchain.text_splitter.recursive_character import RecursiveCharacterTextSplitter` |
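Where a project has to run against more than one version of a dependency, an import path change like the one above is sometimes handled with a fallback. The helper below is a hypothetical sketch of that pattern, not code from ComProScanner, and it is demonstrated with standard-library modules so that it runs without _langchain_ installed.

```python
import importlib


def import_first_available(attr_name, module_paths):
    """Return attr_name from the first module path that imports cleanly."""
    errors = []
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{path}: {exc}")
    raise ImportError(f"Could not import {attr_name}; tried: " + "; ".join(errors))


# Demonstration: the first path does not exist, so the second one is used.
OrderedDict = import_first_available(
    "OrderedDict", ["not_a_real_module_xyz", "collections"]
)

# The same call shape would apply to the two paths named in this entry
# (left commented out because it requires langchain):
# RecursiveCharacterTextSplitter = import_first_available(
#     "RecursiveCharacterTextSplitter",
#     ["langchain.text_splitter.recursive_character", "langchain.text_splitter"],
# )
```

Pinning a single supported version, as this release does, is simpler; a fallback is only worth the extra code when several versions must be supported at once.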
106 | 130 |
|
107 | | -## [0.1.2] - 24-10-2025 |
| 131 | +--- |
| 132 | + |
| 133 | +## [0.1.2] - 2025-10-24 |
108 | 134 |
|
109 | 135 | ### Added |
110 | 136 |
|
111 | | -- Link to ComProScanner preprint on arXiv in the documentation index page and README.md: [arXiv:2510.20362](https://arxiv.org/abs/2510.20362) |
| 137 | +- Link to ComProScanner preprint on arXiv in the documentation index page and README.md: |
| 138 | + - [arXiv:2510.20362](https://arxiv.org/abs/2510.20362) |
| 139 | + |
| 140 | +--- |
112 | 141 |
|
113 | | -## [0.1.1] - 22-10-2025 |
| 142 | +## [0.1.1] - 2025-10-22 |
114 | 143 |
|
115 | 144 | ### Fixed |
116 | 145 |
|
117 | | -- README images updated with external image link to fix PyPI rendering issue. [ComProScanner Logo](https://i.ibb.co/whHSbGvT/comproscanner-logo.png), [ComProScanner Workflow](https://i.ibb.co/QWd2qd3/overall-workflow.png) |
| 146 | +- README images updated with external image link to fix PyPI rendering issue. |
| 147 | + - [ComProScanner Logo](https://i.ibb.co/whHSbGvT/comproscanner-logo.png) |
| 148 | + - [ComProScanner Workflow](https://i.ibb.co/QWd2qd3/overall-workflow.png) |
| 149 | + |
| 150 | +--- |
118 | 151 |
|
119 | | -## [0.1.0] - 22-10-2025 |
| 152 | +## [0.1.0] - 2025-10-22 |
120 | 153 |
|
121 | 154 | ### Added |
122 | 155 |
|
|