Commit 25397a4

📝 cosmetic fix to README.md
1 parent 0193275 commit 25397a4

1 file changed

Lines changed: 107 additions & 78 deletions

README.md

@@ -12,182 +12,211 @@
 [![Hugging Face Collection](https://img.shields.io/badge/🤗HuggingFace-Collection-blue)](https://huggingface.co/collections/SciKnowOrg/)
 [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)
 [![Documentation Status](https://app.readthedocs.org/projects/ontolearner/badge/)](https://ontolearner.readthedocs.io/)
-[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](MAINTANANCE.md)
+[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](MAINTENANCE.md)
 [![DOI](https://zenodo.org/badge/913867999.svg)](https://doi.org/10.5281/zenodo.15399773)
 
-
 </div>
 
-**OntoLearner** is a modular and extensible architecture designed to support ontology learning and reuse. The conceptual and functional architecture of OntoLearner is shown as following. The framework comprises three core components—**Ontologizers**, **Learning Tasks**, and **Learner Models**—structured to enable reusable and customizable ontology engineering workflows.
+---
+
+**OntoLearner** is a modular and extensible Python library for **ontology learning** powered by Large Language Models (LLMs). It provides a unified framework covering the full workflow — from loading and modularizing ontologies to training, predicting, and evaluating learner models across multiple ontology learning tasks.
+
+The framework is built around three core components:
+
+- 🧩 **Ontologizers** — load, parse, and modularize ontologies from 150+ ready-to-use sources across 20+ domains.
+- 📋 **Learning Tasks** — support for Term Typing, Taxonomy Discovery, Non-Taxonomic Relation Extraction, and Text2Onto.
+- 🤖 **Learner Models** — plug-and-play LLM, Retriever, and RAG-based learners with a consistent `fit → predict → evaluate` interface.
+
+---
 
 ## 🧪 Installation
 
-OntoLearner is available on [PyPI](https://pypi.org/project/OntoLearner/) and you can install using `pip`:
+OntoLearner is available on [PyPI](https://pypi.org/project/OntoLearner/) and can be installed with `pip`:
 
 ```bash
 pip install ontolearner
 ```
 
-Next, verify the installation:
+Verify the installation:
+
 ```python
 import ontolearner
 
 print(ontolearner.__version__)
 ```
 
-Please refer to [Installation](https://ontolearner.readthedocs.io/installation.html) page for further options.
+> For additional installation options (e.g., from source, with optional dependencies), see the [Installation Guide](https://ontolearner.readthedocs.io/installation.html).
+
+---
 
 ## 🔗 Essential Resources
 
-| Resource | Info |
-|:-----------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------|
-| **[📚 OntoLearner Documentation](https://ontolearner.readthedocs.io/)** | OntoLearner's extensive documentation website. |
-| **[🤗 Datasets on Hugging Face](https://huggingface.co/collections/SciKnowOrg/ontolearner-benchmarking-6823bcd051300c210b7ef68a)** | Access curated, machine-readable ontologies. |
-| **[🚀 Quickstart](https://ontolearner.readthedocs.io/quickstart.html)** | Get started quickly with OntoLearner’s main features and workflow. |
-| **[🕸️ Learning Tasks](https://ontolearner.readthedocs.io/learning_tasks/learning_tasks.html)** | Explore supported ontology learning tasks like LLMs4OL Paradigm tasks and Text2Onto. | |
-| **[🧠 Learner Models](https://ontolearner.readthedocs.io/learners/llm.html)** | Browse and configure various learner models, including LLMs, Retrieval, or RAG approaches. |
-| **[📚 Ontologies Documentations](https://ontolearner.readthedocs.io/benchmarking/benchmark.html)** | Review benchmark ontologies and datasets used for evaluation and training. |
-| **[🧩 How to work with Ontologizer?](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html)** | Learn how to modularize and preprocess ontologies using the Ontologizer module. |
-| **[🤗 Ontology Metrics Dashboard](https://huggingface.co/spaces/SciKnowOrg/OntoLearner-Benchmark-Metrics)** | Benchmark ontologies with their metrics and complexity scores. |
+| Resource | Description |
+|:---------|:------------|
+| **[📚 Documentation](https://ontolearner.readthedocs.io/)** | Full documentation website. |
+| **[🤗 Datasets on Hugging Face](https://huggingface.co/collections/SciKnowOrg/ontolearner-benchmarking-6823bcd051300c210b7ef68a)** | Curated, machine-readable ontology datasets. |
+| **[🚀 Quickstart](https://ontolearner.readthedocs.io/quickstart.html)** | Get started in minutes. |
+| **[🕸️ Learning Tasks](https://ontolearner.readthedocs.io/learning_tasks/learning_tasks.html)** | Term Typing, Taxonomy Discovery, Relation Extraction, and Text2Onto. |
+| **[🧠 Learner Models](https://ontolearner.readthedocs.io/learners/llm.html)** | LLM, Retriever, and RAG-based learner models. |
+| **[📖 Ontologies Documentation](https://ontolearner.readthedocs.io/benchmarking/benchmark.html)** | Browse 150+ benchmark ontologies across 20+ domains. |
+| **[🧩 Ontologizer Guide](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html)** | How to modularize and preprocess ontologies. |
+| **[📊 Metrics Dashboard](https://huggingface.co/spaces/SciKnowOrg/OntoLearner-Benchmark-Metrics)** | Explore benchmark ontology metrics and complexity scores. |
+
+---
+
+## ✨ Key Features
+
+- **150+ Ontologizers** across 20+ domains (biology, medicine, agriculture, chemistry, law, finance, and more).
+- **Multiple learning tasks**: Term Typing, Taxonomy Discovery, Non-Taxonomic Relation Extraction, and Text2Onto.
+- **Three learner paradigms**: LLM-based, Retriever-based, and Retrieval-Augmented Generation (RAG).
+- **Hugging Face integration**: auto-download ontologies and models directly from the Hub.
+- **Unified API**: consistent `fit → predict → evaluate` interface across all learners.
+- **LearnerPipeline**: end-to-end pipeline in a single call.
+- **Extensible**: easily plug in custom ontologies, learners, or retrievers.
+
+---
 
 ## 🚀 Quick Tour
-Get started with OntoLearner in just a few lines of code. This guide demonstrates how to initialize ontologies, load datasets, and train an LLM-assisted learner for ontology engineering tasks.
 
-**Basic Usage - Automatic Download from Hugging Face**:
+### Loading an Ontology
+
+Load any of the 150+ built-in ontologies and extract task datasets in just a few lines:
+
 ```python
 from ontolearner import Wine
 
-# 1. Initialize an ontologizer from OntoLearner
+# Initialize an ontologizer
 ontology = Wine()
 
-# 2. Load the ontology automatically from HuggingFace
+# Auto-download from Hugging Face and load
 ontology.load()
 
-# 3. Extract the learning task dataset
+# Extract learning task datasets
 data = ontology.extract()
-```
 
-To see the ontology metadata you can print the ontology:
-```python
+# Inspect ontology metadata
 print(ontology)
 ```
 
-Now, explore [150+ ready-to-use ontologies](https://ontolearner.readthedocs.io/benchmarking/benchmark.html) or read on [how to work with ontologizers](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html).
+> Explore [150+ ready-to-use ontologies](https://ontolearner.readthedocs.io/benchmarking/benchmark.html) or learn [how to work with ontologizers](https://ontolearner.readthedocs.io/ontologizer/ontology_modularization.html).
 
-**Learner Models**:
+---
+
+### Retriever-Based Learner
+
+Use a dense retriever model to perform non-taxonomic relation extraction:
 
 ```python
 from ontolearner import AutoRetrieverLearner, AgrO, train_test_split, evaluation_report
 
-# 1. Programmatic import of an ontology
+# Load and extract ontology data
 ontology = AgrO()
 ontology.load()
-
-# 2. Load tasks datasets
 ontological_data = ontology.extract()
 
-# 3. Split into train and test sets
+# Split into train and test sets
 train_data, test_data = train_test_split(ontological_data, test_size=0.2, random_state=42)
 
-# 4. Initialize Learner
+# Initialize and load a retriever-based learner
 task = 'non-taxonomic-re'
 ret_learner = AutoRetrieverLearner(top_k=5)
 ret_learner.load(model_id='sentence-transformers/all-MiniLM-L6-v2')
 
-# 5. Fit the model to training data and then predict over the test data
+# Fit on training data and predict on test data
 ret_learner.fit(train_data, task=task)
 predicts = ret_learner.predict(test_data, task=task)
 
-# 6. Evaluation
+# Evaluate predictions
 truth = ret_learner.tasks_ground_truth_former(data=test_data, task=task)
 metrics = evaluation_report(y_true=truth, y_pred=predicts, task=task)
 print(metrics)
 ```
-Other learners:
-* [LLM-Based Learner](https://ontolearner.readthedocs.io/learners/llm.html)
-* [RAG-Based Learner](https://ontolearner.readthedocs.io/learners/rag.html)
 
-**LearnerPipeline**: The OntoLearner also offers a streamlined `LearnerPipeline` class that simplifies the entire process of initializing, training, predicting, and evaluating a RAG setup into a single call.
+Other available learners:
+- [LLM-Based Learner](https://ontolearner.readthedocs.io/learners/llm.html)
+- [RAG-Based Learner](https://ontolearner.readthedocs.io/learners/rag.html)
 
+---
 
+### LearnerPipeline
+
+`LearnerPipeline` consolidates the entire workflow — initialization, training, prediction, and evaluation — into a single call:
 
 ```python
-# Import core components from the OntoLearner library
 from ontolearner import LearnerPipeline, AgrO, train_test_split
 
-# Load the AgrO ontology, which includes structured agricultural knowledge
+# Load ontology and extract data
 ontology = AgrO()
-ontology.load() # Load ontology data (e.g., entities, relations, metadata)
+ontology.load()
 
-# Extract relation instances from the ontology and split them into training and test sets
 train_data, test_data = train_test_split(
-ontology.extract(), # Extract annotated (head, tail, relation) triples
-test_size=0.2, # 20% for evaluation
-random_state=42 # Ensures reproducible splits
+ontology.extract(),
+test_size=0.2,
+random_state=42
 )
 
-# Initialize the learning pipeline using a dense retriever
+# Initialize the pipeline with a dense retriever
 pipeline = LearnerPipeline(
-retriever_id='sentence-transformers/all-MiniLM-L6-v2', # Hugging Face model ID for retrieval
-batch_size=10, # Number of samples to process per batch (if batching is enabled internally)
-top_k=5 # Retrieve top-5 most relevant support instance per query
+retriever_id='sentence-transformers/all-MiniLM-L6-v2',
+batch_size=10,
+top_k=5
 )
 
-# Run the pipeline on the training and test data
-# The pipeline performs: fit() → predict() → evaluate() in sequence
+# Run: fit → predict → evaluate
 outputs = pipeline(
 train_data=train_data,
 test_data=test_data,
-evaluate=True, # If True, computes precision, recall, and F1-score
-task='non-taxonomic-re' # Specifies that we are doing non-taxonomic relation prediction
+evaluate=True,
+task='non-taxonomic-re'
 )
 
-# Print the evaluation metrics (precision, recall, F1)
 print("Metrics:", outputs['metrics'])
-
-# Print the total elapsed time for training and evaluation
 print("Elapsed time:", outputs['elapsed_time'])
-
-# Print the full output dictionary (includes predictions)
-print(outputs)
 ```
 
+---
+
 ## ⭐ Contribution
 
-We welcome contributions to enhance OntoLearner and make it even better! Please review our contribution guidelines in [CONTRIBUTING.md](CONTRIBUTING.md) before getting started. You are also welcome to assist with the ongoing maintenance by referring to [MAINTENANCE.md](MAINTENANCE.md). Your support is greatly appreciated.
+We welcome contributions of all kinds — bug reports, new features, documentation improvements, or new ontologies!
 
+Please review our guidelines before getting started:
+- [CONTRIBUTING.md](CONTRIBUTING.md) — contribution guidelines
+- [MAINTENANCE.md](MAINTENANCE.md) — ongoing maintenance notes
 
-If you encounter any issues or have questions, please submit them in the [GitHub issues tracker](https://github.com/sciknoworg/OntoLearner/issues).
+For bugs or questions, please open an issue in the [GitHub Issue Tracker](https://github.com/sciknoworg/OntoLearner/issues).
 
+---
 
 ## 💡 Acknowledgements
 
-If you find this repository helpful or use OntoLearner in your work or research, feel free to cite our publication:
+If OntoLearner is useful in your research or work, please consider citing one of our publications:
 
 ```bibtex
 @inproceedings{babaei2023llms4ol,
-title={LLMs4OL: Large language models for ontology learning},
-author={Babaei Giglou, Hamed and DSouza, Jennifer and Auer, S{\"o}ren},
-booktitle={International Semantic Web Conference},
-pages={408--427},
-year={2023},
-organization={Springer}
+title = {LLMs4OL: Large Language Models for Ontology Learning},
+author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Auer, S{\"o}ren},
+booktitle = {International Semantic Web Conference},
+pages = {408--427},
+year = {2023},
+organization = {Springer}
 }
 ```
-or:
+
 ```bibtex
 @software{babaei_giglou_2025_15399783,
-author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Aioanei, Andrei and Mihindukulasooriya, Nandana and Auer, Sören},
-title = {OntoLearner: A Modular Python Library for Ontology Learning with LLMs},
-month = may,
-year = 2025,
-publisher = {Zenodo},
-version = {v1.3.0},
-doi = {10.5281/zenodo.15399783},
-url = {https://doi.org/10.5281/zenodo.15399783},
+author = {Babaei Giglou, Hamed and D'Souza, Jennifer and Aioanei, Andrei
+and Mihindukulasooriya, Nandana and Auer, Sören},
+title = {OntoLearner: A Modular Python Library for Ontology Learning with LLMs},
+month = may,
+year = 2025,
+publisher = {Zenodo},
+version = {v1.3.0},
+doi = {10.5281/zenodo.15399783},
+url = {https://doi.org/10.5281/zenodo.15399783}
 }
 ```
 
-***
+---
 
-This software is archived in Zenodo under the DOI [![DOI](https://zenodo.org/badge/913867999.svg)](https://doi.org/10.5281/zenodo.15399773) and is licensed under [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT).
+This software is archived on Zenodo under [![DOI](https://zenodo.org/badge/913867999.svg)](https://doi.org/10.5281/zenodo.15399773) and is licensed under [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT).
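
The `fit → predict → evaluate` interface that the updated README advertises can be illustrated with a small, library-independent sketch. Everything below is hypothetical: `ToyTermTyper`, `evaluate`, and the sample data are made up for illustration and are not part of OntoLearner, whose real learners are the ones shown in the README snippets in the diff.

```python
from collections import Counter


class ToyTermTyper:
    """Hypothetical learner (not OntoLearner's API): types a term by token votes."""

    def fit(self, train_pairs):
        # For each token, count how often it appeared under each type label.
        self.token_types = {}
        for term, label in train_pairs:
            for token in term.lower().split():
                self.token_types.setdefault(token, Counter())[label] += 1
        # Fallback label for terms whose tokens were never seen in training.
        self.default = Counter(label for _, label in train_pairs).most_common(1)[0][0]
        return self

    def predict(self, terms):
        predictions = []
        for term in terms:
            votes = Counter()
            for token in term.lower().split():
                votes.update(self.token_types.get(token, {}))
            predictions.append(votes.most_common(1)[0][0] if votes else self.default)
        return predictions


def evaluate(y_true, y_pred):
    """Accuracy over two aligned label lists (a stand-in for a fuller report)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return {"accuracy": correct / len(y_true)}


# Made-up term typing data: (term, type) pairs.
train = [
    ("red wine", "Wine"),
    ("white wine", "Wine"),
    ("wine region", "Region"),
    ("napa region", "Region"),
]
test_terms = ["rose wine", "sonoma region"]
test_labels = ["Wine", "Region"]

learner = ToyTermTyper().fit(train)           # fit
predictions = learner.predict(test_terms)     # predict
metrics = evaluate(test_labels, predictions)  # evaluate
print(predictions, metrics)
```

Keeping the three steps as separate calls is what lets a wrapper such as the README's `LearnerPipeline` chain them into a single invocation.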