|
12 | 12 | [](https://huggingface.co/collections/SciKnowOrg/) |
13 | 13 | [](https://github.com/pre-commit/pre-commit) |
14 | 14 | [](https://ontolearner.readthedocs.io/) |
15 | | -[](MAINTANANCE.md) |
| 15 | +[](MAINTENANCE.md) |
16 | 16 | [](https://doi.org/10.5281/zenodo.15399773) |
17 | 17 |
|
18 | | - |
19 | 18 | </div> |
20 | 19 |
|
21 | | -**OntoLearner** is a modular and extensible architecture designed to support ontology learning and reuse. The conceptual and functional architecture of OntoLearner is shown as following. The framework comprises three core components—**Ontologizers**, **Learning Tasks**, and **Learner Models**—structured to enable reusable and customizable ontology engineering workflows. |
| 20 | +--- |
| 21 | + |
| 22 | +**OntoLearner** is a modular and extensible Python library for **ontology learning** powered by Large Language Models (LLMs). It provides a unified framework covering the full workflow — from loading and modularizing ontologies to training, predicting, and evaluating learner models across multiple ontology learning tasks. |
| 23 | + |
| 24 | +The framework is built around three core components: |
| 25 | + |
| 26 | +- 🧩 **Ontologizers** — load, parse, and modularize ontologies from 150+ ready-to-use sources across 20+ domains. |
| 27 | +- 📋 **Learning Tasks** — support for Term Typing, Taxonomy Discovery, Non-Taxonomic Relation Extraction, and Text2Onto. |
| 28 | +- 🤖 **Learner Models** — plug-and-play LLM, Retriever, and RAG-based learners with a consistent `fit → predict → evaluate` interface. |
| 29 | + |
| 30 | +--- |
22 | 31 |
|
23 | 32 | ## 🧪 Installation |
24 | 33 |
|
25 | | -OntoLearner is available on [PyPI](https://pypi.org/project/OntoLearner/) and you can install using `pip`: |
| 34 | +OntoLearner is available on [PyPI](https://pypi.org/project/OntoLearner/) and can be installed with `pip`: |
26 | 35 |
|
27 | 36 | ```bash |
28 | 37 | pip install ontolearner |
29 | 38 | ``` |
30 | 39 |
|
31 | | -Next, verify the installation: |
| 40 | +Verify the installation: |
| 41 | + |
32 | 42 | ```python |
33 | 43 | import ontolearner |
34 | 44 |
|
35 | 45 | print(ontolearner.__version__) |
36 | 46 | ``` |
37 | 47 |
|
38 | | -Please refer to [Installation](https://ontolearner.readthedocs.io/installation.html) page for further options. |
| 48 | +> For additional installation options (e.g., from source, with optional dependencies), see the [Installation Guide](https://ontolearner.readthedocs.io/installation.html). |
| 49 | +
|
| 50 | +--- |
39 | 51 |
|
40 | 52 | ## 🔗 Essential Resources |
41 | 53 |
|
42 | | -| Resource | Info | |
43 | | -|:-----------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------| |
44 | | -| **[📚 OntoLearner Documentation](https://ontolearner.readthedocs.io/)** | OntoLearner's extensive documentation website. | |
45 | | -| **[🤗 Datasets on Hugging Face](https://huggingface.co/collections/SciKnowOrg/ontolearner-benchmarking-6823bcd051300c210b7ef68a)** | Access curated, machine-readable ontologies. | |
46 | | -| **[🚀 Quickstart](https://ontolearner.readthedocs.io/quickstart.html)** | Get started quickly with OntoLearner’s main features and workflow. | |
47 | | -| **[🕸️ Learning Tasks](https://ontolearner.readthedocs.io/learning_tasks/learning_tasks.html)** | Explore supported ontology learning tasks like LLMs4OL Paradigm tasks and Text2Onto. | | |
48 | | -| **[🧠 Learner Models](https://ontolearner.readthedocs.io/learners/llm.html)** | Browse and configure various learner models, including LLMs, Retrieval, or RAG approaches. | |
49 | | -| **[📚 Ontologies Documentations](https://ontolearner.readthedocs.io/benchmarking/benchmark.html)** | Review benchmark ontologies and datasets used for evaluation and training. | |
50 | | -| **[🧩 How to work with Ontologizer?](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html)** | Learn how to modularize and preprocess ontologies using the Ontologizer module. | |
51 | | -| **[🤗 Ontology Metrics Dashboard](https://huggingface.co/spaces/SciKnowOrg/OntoLearner-Benchmark-Metrics)** | Benchmark ontologies with their metrics and complexity scores. | |
| 54 | +| Resource | Description | |
| 55 | +|:---------|:------------| |
| 56 | +| **[📚 Documentation](https://ontolearner.readthedocs.io/)** | Full documentation website. | |
| 57 | +| **[🤗 Datasets on Hugging Face](https://huggingface.co/collections/SciKnowOrg/ontolearner-benchmarking-6823bcd051300c210b7ef68a)** | Curated, machine-readable ontology datasets. | |
| 58 | +| **[🚀 Quickstart](https://ontolearner.readthedocs.io/quickstart.html)** | Get started in minutes. | |
| 59 | +| **[🕸️ Learning Tasks](https://ontolearner.readthedocs.io/learning_tasks/learning_tasks.html)** | Term Typing, Taxonomy Discovery, Relation Extraction, and Text2Onto. | |
| 60 | +| **[🧠 Learner Models](https://ontolearner.readthedocs.io/learners/llm.html)** | LLM, Retriever, and RAG-based learner models. | |
| 61 | +| **[📖 Ontologies Documentation](https://ontolearner.readthedocs.io/benchmarking/benchmark.html)** | Browse 150+ benchmark ontologies across 20+ domains. | |
| 62 | +| **[🧩 Ontologizer Guide](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html)** | How to modularize and preprocess ontologies. | |
| 63 | +| **[📊 Metrics Dashboard](https://huggingface.co/spaces/SciKnowOrg/OntoLearner-Benchmark-Metrics)** | Explore benchmark ontology metrics and complexity scores. | |
| 64 | + |
| 65 | +--- |
| 66 | + |
| 67 | +## ✨ Key Features |
| 68 | + |
| 69 | +- **150+ Ontologizers** across 20+ domains (biology, medicine, agriculture, chemistry, law, finance, and more). |
| 70 | +- **Multiple learning tasks**: Term Typing, Taxonomy Discovery, Non-Taxonomic Relation Extraction, and Text2Onto. |
| 71 | +- **Three learner paradigms**: LLM-based, Retriever-based, and Retrieval-Augmented Generation (RAG). |
| 72 | +- **Hugging Face integration**: auto-download ontologies and models directly from the Hub. |
| 73 | +- **Unified API**: consistent `fit → predict → evaluate` interface across all learners. |
| 74 | +- **LearnerPipeline**: end-to-end pipeline in a single call. |
| 75 | +- **Extensible**: easily plug in custom ontologies, learners, or retrievers. |
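
The `fit → predict → evaluate` interface mentioned above can be pictured with a toy learner. This is only a minimal sketch and not OntoLearner's implementation: `MajorityTypeLearner`, its method signatures, and the sample term/type pairs are hypothetical names chosen for illustration.

```python
from collections import Counter


class MajorityTypeLearner:
    """Hypothetical toy learner: predicts the most frequent type seen for a term."""

    def __init__(self):
        self.term_types = {}      # term -> most frequent type seen in training
        self.default_type = None  # fallback for terms never seen in training

    def fit(self, train_pairs):
        # Count type occurrences per term and overall.
        per_term = {}
        overall = Counter()
        for term, type_ in train_pairs:
            per_term.setdefault(term, Counter())[type_] += 1
            overall[type_] += 1
        self.term_types = {t: c.most_common(1)[0][0] for t, c in per_term.items()}
        self.default_type = overall.most_common(1)[0][0] if overall else None
        return self

    def predict(self, terms):
        # Unseen terms fall back to the overall majority type.
        return [self.term_types.get(t, self.default_type) for t in terms]

    def evaluate(self, terms, gold_types):
        predictions = self.predict(terms)
        correct = sum(p == g for p, g in zip(predictions, gold_types))
        return {"accuracy": correct / len(gold_types) if gold_types else 0.0}


train = [("merlot", "RedWine"), ("chardonnay", "WhiteWine"), ("syrah", "RedWine")]
learner = MajorityTypeLearner().fit(train)
predictions = learner.predict(["merlot", "riesling"])
report = learner.evaluate(["merlot", "riesling"], ["RedWine", "WhiteWine"])
print(predictions, report)
```

Keeping the same three method names across learners is what lets calling code swap one learner for another without other changes.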
| 76 | + |
| 77 | +--- |
52 | 78 |
|
53 | 79 | ## 🚀 Quick Tour |
54 | | -Get started with OntoLearner in just a few lines of code. This guide demonstrates how to initialize ontologies, load datasets, and train an LLM-assisted learner for ontology engineering tasks. |
55 | 80 |
|
56 | | -**Basic Usage - Automatic Download from Hugging Face**: |
| 81 | +### Loading an Ontology |
| 82 | + |
| 83 | +Load any of the 150+ built-in ontologies and extract task datasets in just a few lines: |
| 84 | + |
57 | 85 | ```python |
58 | 86 | from ontolearner import Wine |
59 | 87 |
|
60 | | -# 1. Initialize an ontologizer from OntoLearner |
| 88 | +# Initialize an ontologizer |
61 | 89 | ontology = Wine() |
62 | 90 |
|
63 | | -# 2. Load the ontology automatically from HuggingFace |
| 91 | +# Auto-download from Hugging Face and load |
64 | 92 | ontology.load() |
65 | 93 |
|
66 | | -# 3. Extract the learning task dataset |
| 94 | +# Extract learning task datasets |
67 | 95 | data = ontology.extract() |
68 | | -``` |
69 | 96 |
|
70 | | -To see the ontology metadata you can print the ontology: |
71 | | -```python |
| 97 | +# Inspect ontology metadata |
72 | 98 | print(ontology) |
73 | 99 | ``` |
74 | 100 |
|
75 | | -Now, explore [150+ ready-to-use ontologies](https://ontolearner.readthedocs.io/benchmarking/benchmark.html) or read on [how to work with ontologizers](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html). |
| 101 | +> Explore [150+ ready-to-use ontologies](https://ontolearner.readthedocs.io/benchmarking/benchmark.html) or learn [how to work with ontologizers](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html). |
76 | 102 |
|
77 | | -**Learner Models**: |
| 103 | +--- |
| 104 | + |
| 105 | +### Retriever-Based Learner |
| 106 | + |
| 107 | +Use a dense retriever model to perform non-taxonomic relation extraction: |
78 | 108 |
|
79 | 109 | ```python |
80 | 110 | from ontolearner import AutoRetrieverLearner, AgrO, train_test_split, evaluation_report |
81 | 111 |
|
82 | | -# 1. Programmatic import of an ontology |
| 112 | +# Load and extract ontology data |
83 | 113 | ontology = AgrO() |
84 | 114 | ontology.load() |
85 | | - |
86 | | -# 2. Load tasks datasets |
87 | 115 | ontological_data = ontology.extract() |
88 | 116 |
|
89 | | -# 3. Split into train and test sets |
| 117 | +# Split into train and test sets |
90 | 118 | train_data, test_data = train_test_split(ontological_data, test_size=0.2, random_state=42) |
91 | 119 |
|
92 | | -# 4. Initialize Learner |
| 120 | +# Initialize and load a retriever-based learner |
93 | 121 | task = 'non-taxonomic-re' |
94 | 122 | ret_learner = AutoRetrieverLearner(top_k=5) |
95 | 123 | ret_learner.load(model_id='sentence-transformers/all-MiniLM-L6-v2') |
96 | 124 |
|
97 | | -# 5. Fit the model to training data and then predict over the test data |
| 125 | +# Fit on training data and predict on test data |
98 | 126 | ret_learner.fit(train_data, task=task) |
99 | 127 | predicts = ret_learner.predict(test_data, task=task) |
100 | 128 |
|
101 | | -# 6. Evaluation |
| 129 | +# Evaluate predictions |
102 | 130 | truth = ret_learner.tasks_ground_truth_former(data=test_data, task=task) |
103 | 131 | metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task) |
104 | 132 | print(metrics) |
105 | 133 | ``` |
106 | | -Other learners: |
107 | | -* [LLM-Based Learner](https://ontolearner.readthedocs.io/learners/llm.html) |
108 | | -* [RAG-Based Learner](https://ontolearner.readthedocs.io/learners/rag.html) |
109 | 134 |
|
110 | | -**LearnerPipeline**: The OntoLearner also offers a streamlined `LearnerPipeline` class that simplifies the entire process of initializing, training, predicting, and evaluating a RAG setup into a single call. |
| 135 | +Other available learners: |
| 136 | +- [LLM-Based Learner](https://ontolearner.readthedocs.io/learners/llm.html) |
| 137 | +- [RAG-Based Learner](https://ontolearner.readthedocs.io/learners/rag.html) |
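
To give some intuition for the `top_k` parameter used with the retriever above, here is a minimal sketch of top-k ranking. It assumes cosine similarity over precomputed embedding vectors, which may differ from what `AutoRetrieverLearner` does internally. The `top_k_by_cosine` helper is hypothetical, and the three-dimensional vectors are hand-written toy values, not output of `all-MiniLM-L6-v2`.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_by_cosine(query, candidates, top_k=5):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy embeddings (assumed values for illustration only).
toy_vectors = {
    "irrigation": [0.9, 0.1, 0.0],
    "fertilizer": [0.7, 0.3, 0.1],
    "tractor": [0.1, 0.9, 0.2],
}
ranked = top_k_by_cosine([1.0, 0.0, 0.0], toy_vectors, top_k=2)
print([name for name, _ in ranked])
```

A larger `top_k` returns more candidates per query, which tends to raise recall at the cost of precision.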
111 | 138 |
|
| 139 | +--- |
112 | 140 |
|
| 141 | +### LearnerPipeline |
| 142 | + |
| 143 | +`LearnerPipeline` consolidates the entire workflow — initialization, training, prediction, and evaluation — into a single call: |
113 | 144 |
|
114 | 145 | ```python |
115 | | -# Import core components from the OntoLearner library |
116 | 146 | from ontolearner import LearnerPipeline, AgrO, train_test_split |
117 | 147 |
|
118 | | -# Load the AgrO ontology, which includes structured agricultural knowledge |
| 148 | +# Load ontology and extract data |
119 | 149 | ontology = AgrO() |
120 | | -ontology.load() # Load ontology data (e.g., entities, relations, metadata) |
| 150 | +ontology.load() |
121 | 151 |
|
122 | | -# Extract relation instances from the ontology and split them into training and test sets |
123 | 152 | train_data, test_data = train_test_split( |
124 | | - ontology.extract(), # Extract annotated (head, tail, relation) triples |
125 | | - test_size=0.2, # 20% for evaluation |
126 | | - random_state=42 # Ensures reproducible splits |
| 153 | + ontology.extract(), |
| 154 | + test_size=0.2, |
| 155 | + random_state=42 |
127 | 156 | ) |
128 | 157 |
|
129 | | -# Initialize the learning pipeline using a dense retriever |
| 158 | +# Initialize the pipeline with a dense retriever |
130 | 159 | pipeline = LearnerPipeline( |
131 | | - retriever_id='sentence-transformers/all-MiniLM-L6-v2', # Hugging Face model ID for retrieval |
132 | | - batch_size=10, # Number of samples to process per batch (if batching is enabled internally) |
133 | | - top_k=5 # Retrieve top-5 most relevant support instance per query |
| 160 | + retriever_id='sentence-transformers/all-MiniLM-L6-v2', |
| 161 | + batch_size=10, |
| 162 | + top_k=5 |
134 | 163 | ) |
135 | 164 |
|
136 | | -# Run the pipeline on the training and test data |
137 | | -# The pipeline performs: fit() → predict() → evaluate() in sequence |
| 165 | +# Run: fit → predict → evaluate |
138 | 166 | outputs = pipeline( |
139 | 167 | train_data=train_data, |
140 | 168 | test_data=test_data, |
141 | | - evaluate=True, # If True, computes precision, recall, and F1-score |
142 | | - task='non-taxonomic-re' # Specifies that we are doing non-taxonomic relation prediction |
| 169 | + evaluate=True, |
| 170 | + task='non-taxonomic-re' |
143 | 171 | ) |
144 | 172 |
|
145 | | -# Print the evaluation metrics (precision, recall, F1) |
146 | 173 | print("Metrics:", outputs['metrics']) |
147 | | - |
148 | | -# Print the total elapsed time for training and evaluation |
149 | 174 | print("Elapsed time:", outputs['elapsed_time']) |
150 | | - |
151 | | -# Print the full output dictionary (includes predictions) |
152 | | -print(outputs) |
153 | 175 | ``` |
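
The removed inline comments describe the reported metrics as precision, recall, and F1. The sketch below shows how such scores could be computed, assuming set-based exact matching of `(head, tail, relation)` triples. How `evaluation_report` or `LearnerPipeline` actually match predictions is not stated here, and `precision_recall_f1` and the sample triples are hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Set-based precision, recall, and F1 over hashable items such as triples."""
    true_set, pred_set = set(y_true), set(y_pred)
    true_positives = len(true_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(true_set) if true_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy gold and predicted triples (assumed values for illustration only).
gold = [
    ("wheat", "soil", "grows_in"),
    ("tractor", "field", "used_on"),
    ("compost", "soil", "improves"),
]
predicted = [
    ("wheat", "soil", "grows_in"),
    ("tractor", "soil", "used_on"),
]
scores = precision_recall_f1(gold, predicted)
print(scores)
```

With one of two predictions correct and one of three gold triples recovered, precision is 0.5 and recall is one third.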
154 | 176 |
|
| 177 | +--- |
| 178 | + |
155 | 179 | ## ⭐ Contribution |
156 | 180 |
|
157 | | -We welcome contributions to enhance OntoLearner and make it even better! Please review our contribution guidelines in [CONTRIBUTING.md](CONTRIBUTING.md) before getting started. You are also welcome to assist with the ongoing maintenance by referring to [MAINTENANCE.md](MAINTENANCE.md). Your support is greatly appreciated. |
| 181 | +We welcome contributions of all kinds — bug reports, new features, documentation improvements, or new ontologies! |
158 | 182 |
|
| 183 | +Please review our guidelines before getting started: |
| 184 | +- [CONTRIBUTING.md](CONTRIBUTING.md) — contribution guidelines |
| 185 | +- [MAINTENANCE.md](MAINTENANCE.md) — ongoing maintenance notes |
159 | 186 |
|
160 | | -If you encounter any issues or have questions, please submit them in the [GitHub issues tracker](https://github.com/sciknoworg/OntoLearner/issues). |
| 187 | +For bugs or questions, please open an issue in the [GitHub Issue Tracker](https://github.com/sciknoworg/OntoLearner/issues). |
161 | 188 |
|
| 189 | +--- |
162 | 190 |
|
163 | 191 | ## 💡 Acknowledgements |
164 | 192 |
|
165 | | -If you find this repository helpful or use OntoLearner in your work or research, feel free to cite our publication: |
| 193 | +If OntoLearner is useful in your research or work, please consider citing one of our publications: |
166 | 194 |
|
167 | 195 | ```bibtex |
168 | 196 | @inproceedings{babaei2023llms4ol, |
169 | | - title={LLMs4OL: Large language models for ontology learning}, |
170 | | - author={Babaei Giglou, Hamed and D’Souza, Jennifer and Auer, S{\"o}ren}, |
171 | | - booktitle={International Semantic Web Conference}, |
172 | | - pages={408--427}, |
173 | | - year={2023}, |
174 | | - organization={Springer} |
| 197 | + title = {LLMs4OL: Large Language Models for Ontology Learning}, |
| 198 | + author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Auer, S{\"o}ren}, |
| 199 | + booktitle = {International Semantic Web Conference}, |
| 200 | + pages = {408--427}, |
| 201 | + year = {2023}, |
| 202 | + organization = {Springer} |
175 | 203 | } |
176 | 204 | ``` |
177 | | -or: |
| 205 | + |
178 | 206 | ```bibtex |
179 | 207 | @software{babaei_giglou_2025_15399783, |
180 | | - author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Aioanei, Andrei and Mihindukulasooriya, Nandana and Auer, Sören}, |
181 | | - title = {OntoLearner: A Modular Python Library for Ontology Learning with LLMs}, |
182 | | - month = may, |
183 | | - year = 2025, |
184 | | - publisher = {Zenodo}, |
185 | | - version = {v1.3.0}, |
186 | | - doi = {10.5281/zenodo.15399783}, |
187 | | - url = {https://doi.org/10.5281/zenodo.15399783}, |
| 208 | + author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Aioanei, Andrei |
| 209 | + and Mihindukulasooriya, Nandana and Auer, Sören}, |
| 210 | + title = {OntoLearner: A Modular Python Library for Ontology Learning with LLMs}, |
| 211 | + month = may, |
| 212 | + year = 2025, |
| 213 | + publisher = {Zenodo}, |
| 214 | + version = {v1.3.0}, |
| 215 | + doi = {10.5281/zenodo.15399783}, |
| 216 | + url = {https://doi.org/10.5281/zenodo.15399783} |
188 | 217 | } |
189 | 218 | ``` |
190 | 219 |
|
191 | | -*** |
| 220 | +--- |
192 | 221 |
|
193 | | -This software is archived in Zenodo under the DOI [10.5281/zenodo.15399773](https://doi.org/10.5281/zenodo.15399773) and is licensed under the [MIT License](https://opensource.org/licenses/MIT). |
| 222 | +This software is archived on Zenodo under DOI [10.5281/zenodo.15399773](https://doi.org/10.5281/zenodo.15399773) and is licensed under the [MIT License](https://opensource.org/licenses/MIT). |