91 changes: 58 additions & 33 deletions README.md
@@ -1,45 +1,62 @@
![Logo](https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png)

[![Source Code](https://img.shields.io/badge/lucasimi-tda--mapper--python-blue?logo=github&logoColor=silver)](https://github.com/lucasimi/tda-mapper-python)
[![PyPI version](https://img.shields.io/pypi/v/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
[![downloads](https://img.shields.io/pypi/dm/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
[![test](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/test-unit.yml?logo=github&logoColor=silver&branch=main&label=test)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/test-unit.yml)
[![publish](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/publish-pypi.yml?logo=github&logoColor=silver&label=publish)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/publish-pypi.yml)
[![docs](https://img.shields.io/readthedocs/tda-mapper/main?logo=readthedocs&logoColor=silver)](https://tda-mapper.readthedocs.io/en/main/)
[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
[![DOI](https://img.shields.io/badge/DOI-10.5281/zenodo.10642381-blue?logo=doi&logoColor=silver)](https://doi.org/10.5281/zenodo.10642381)
[![Streamlit App](https://img.shields.io/badge/Streamlit-App-blue?logo=streamlit&logoColor=silver)](https://tda-mapper-app.streamlit.app/)

# tda-mapper

**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
Topological Data Analysis (TDA). Designed for efficient computations and backed
by advanced spatial search techniques, it scales seamlessly to high dimensional
data, making it suitable for applications in machine learning, data mining, and
exploratory data analysis.
**tda-mapper** is a Python library built around the Mapper algorithm, a core
technique in Topological Data Analysis (TDA) for extracting topological
structure from complex data. Designed for computational efficiency and
scalability, it leverages optimized spatial search methods to support
high-dimensional datasets. The library is well-suited for integration into
machine learning pipelines, unsupervised learning tasks, and exploratory data
analysis.

Further details are available in the
[documentation](https://tda-mapper.readthedocs.io/en/main/)
and in the
[paper](https://openreview.net/pdf?id=lTX4bYREAZ).

## Main Features
### Core Features

- **Efficient construction**

Leverages optimized spatial search techniques and parallelization to
accelerate the construction of Mapper graphs, supporting the analysis of
high-dimensional datasets.

- **Scikit-learn integration**

- **Fast Mapper graph construction**: Accelerates computations with efficient spatial search, enabling analysis of large, high-dimensional datasets.
Provides custom estimators that are fully compatible with scikit-learn's
API, enabling seamless integration into scikit-learn pipelines for tasks
such as dimensionality reduction, clustering, and feature extraction.

- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your machine learning workflows.
- **Flexible visualization**

- **Flexible visualization options**: Visualize Mapper graphs with multiple supported backends, tailored to your needs.
Supports multiple visualization backends (e.g., Plotly, Matplotlib) for
generating high-quality Mapper graph representations with adjustable
layouts and styling.

- **Interactive exploration**: Explore data interactively through a user-friendly app.
- **Interactive app**

Provides an interactive web-based interface (via Streamlit) for dynamic
exploration of Mapper graph structures, offering real-time adjustments to
parameters and visualizations.

## Background

The Mapper algorithm transforms complex datasets into graph representations
that highlight clusters, transitions, and topological features. These insights
reveal hidden patterns in data, applicable across fields like social sciences,
biology, and machine learning. For an in-depth coverage of Mapper, including
its mathematical foundations and applications, read the
The Mapper algorithm extracts topological features from complex datasets,
representing them as graphs that highlight clusters, transitions, and key
structural patterns. These insights reveal hidden data relationships and are
applicable across diverse fields, including social sciences, biology, and
machine learning. For an in-depth overview of Mapper, including its
mathematical foundations and practical applications, read
[the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
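
As a rough illustration of the steps Mapper performs (apply a lens, cover the
lens range with overlapping intervals, cluster each preimage, connect clusters
that share points), here is a minimal pure-Python sketch. It is not the
**tda-mapper** API: every function name below is hypothetical, the lens is
simply the x-coordinate rather than `PCA`, and the clustering is a plain
epsilon-neighbor grouping (comparable to `DBSCAN` with `min_samples=1`), chosen
only to keep the example free of dependencies.

```python
import math
from itertools import combinations


def make_circles(n=60, radii=(1.0, 0.4)):
    """Noise-free points on concentric circles (n points per circle)."""
    return [
        (r * math.cos(2 * math.pi * k / n), r * math.sin(2 * math.pi * k / n))
        for r in radii
        for k in range(n)
    ]


def interval_cover(values, n_intervals, overlap):
    """Split the lens range into equal intervals, widened on both sides."""
    lo, hi = min(values), max(values)
    length = (hi - lo) / n_intervals
    pad = overlap * length
    return [
        (lo + i * length - pad, lo + (i + 1) * length + pad)
        for i in range(n_intervals)
    ]


def cluster(indices, points, eps):
    """Group indices into connected components of the eps-neighbor graph."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            i = frontier.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, lens, n_intervals=8, overlap=0.3, eps=0.2):
    """Hypothetical helper: nodes are clusters, edges join overlapping ones."""
    values = [lens(p) for p in points]
    nodes = []
    for low, high in interval_cover(values, n_intervals, overlap):
        members = [i for i, v in enumerate(values) if low <= v <= high]
        nodes.extend(cluster(members, points, eps))
    edges = {
        (a, b)
        for a, b in combinations(range(len(nodes)), 2)
        if nodes[a] & nodes[b]  # the two clusters share at least one point
    }
    return nodes, edges


def count_components(n_nodes, edges):
    """Count connected components of the graph with a small union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(x) for x in range(n_nodes)})


points = make_circles()
nodes, edges = mapper_graph(points, lens=lambda p: p[0])  # lens: x-coordinate
n_components = count_components(len(nodes), edges)
print(len(nodes), "nodes,", len(edges), "edges,", n_components, "components")
```

With these parameters the sketch should yield one connected component per
circle, since `eps` is smaller than the gap between the two radii. The library
itself relies on optimized spatial search rather than the quadratic neighbor
scan used here.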

| Step 1 | Step 2 | Step 3 | Step 4 |
@@ -52,13 +69,11 @@ its mathematical foundations and applications, read the
If you use **tda-mapper** in your work, please consider citing both the
[library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent
Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ),
which provides a broader methodological overview.
We recommend citing the specific version of the library used in your research,
as well as the paper.
For citation examples, refer to the
which provides a broader methodological overview. We recommend citing the
specific version of the library used in your research, along with the paper.
For citation examples, please refer to the
[documentation](https://tda-mapper.readthedocs.io/en/main/#citations).


## Quick Start

### Installation
@@ -71,12 +86,11 @@ pip install tda-mapper

### How to Use

Here's a minimal example using the **circles dataset** from `scikit-learn` to demonstrate how to use **tda-mapper**.
We start by generating the data and visualizing it.
The dataset consists of two concentric circles.
The goal is to compute a Mapper graph that summarizes this structure while preserving topological features.
We proceed as follows:

Here's a minimal example that applies the Mapper algorithm to the synthetic
**circles dataset** (two concentric circles) from `scikit-learn`. The goal is
to extract a topological graph representation using `PCA` as a lens and
`DBSCAN` for clustering. We proceed as follows:

```python
import matplotlib.pyplot as plt
@@ -114,7 +128,10 @@ fig.show(config={"scrollZoom": True})
| ---------------- | ------------ |
| ![Original Dataset](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png) | ![Mapper Graph](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png) |

Left: the original dataset consisting of two concentric circles with noise, colored by class label. Right: the resulting Mapper graph, built from the PCA projection and clustered using DBSCAN. The two concentric circles are well identified by the connected components in the Mapper graph.
Left: the original dataset consisting of two concentric circles with noise,
colored by class label. Right: the resulting Mapper graph, built from the PCA
projection and clustered using DBSCAN. The two concentric circles are well
identified by the connected components in the Mapper graph.

More examples can be found in the
[documentation](https://tda-mapper.readthedocs.io/en/main/examples.html).
@@ -124,10 +141,18 @@ More examples can be found in the
Use our Streamlit app to visualize and explore your data without writing code.
You can run a live demo directly on
[Streamlit Cloud](https://tda-mapper-app.streamlit.app/),
or locally on your machine using the following:
or locally on your machine. The first time you run the app locally, you may
need to install the required dependencies from the `requirements.txt` file by
running

```
```bash
pip install -r app/requirements.txt
```

then run the app locally with

```bash
streamlit run app/streamlit_app.py
```
![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)

![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)
61 changes: 35 additions & 26 deletions docs/source/index.rst
@@ -25,47 +25,56 @@
https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png
:alt: Logo

|Source Code| |PyPI version| |downloads| |test| |publish| |docs| |codecov| |DOI|
|Streamlit App|
|PyPI version| |downloads| |codecov| |test| |publish| |docs| |DOI|

|Source Code|

tda-mapper
==========

**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
Topological Data Analysis (TDA). Designed for efficient computations and backed
by advanced spatial search techniques, it scales seamlessly to high dimensional
data, making it suitable for applications in machine learning, data mining, and
exploratory data analysis.
**tda-mapper** is a Python library built around the Mapper algorithm, a core
technique in Topological Data Analysis (TDA) for extracting topological
structure from complex data. Designed for computational efficiency and
scalability, it leverages optimized spatial search methods to support
high-dimensional datasets. The library is well-suited for integration into
machine learning pipelines, unsupervised learning tasks, and exploratory data
analysis.

Further details are available in the
`documentation <https://tda-mapper.readthedocs.io/en/main/>`__
and in the
`paper <https://openreview.net/pdf?id=lTX4bYREAZ>`__.

Main features
Core features
-------------

- **Fast Mapper graph construction**: Accelerates computations with efficient
spatial search, enabling analysis of large, high-dimensional datasets.
- **Efficient construction**

Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.

- **Scikit-learn integration**

Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.

- **Flexible visualization**

Supports multiple visualization backends (e.g., Plotly, Matplotlib) for generating high-quality Mapper graph representations with adjustable layouts and styling.

- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your
machine learning workflows.
- **Interactive app**

- **Flexible visualization options**: Visualize Mapper graphs with multiple
supported backends, tailored to your needs.
Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.

- **Interactive exploration**: Explore data interactively through a
user-friendly app.

Background
----------

The Mapper algorithm transforms complex datasets into graph representations
that highlight clusters, transitions, and topological features. These insights
reveal hidden patterns in data, applicable across fields like social sciences,
biology, and machine learning. For an in-depth coverage of Mapper, including
its mathematical foundations and applications, read the
`original paper <https://research.math.osu.edu/tgda/mapperPBG.pdf>`__.
The Mapper algorithm extracts topological features from complex datasets,
representing them as graphs that highlight clusters, transitions, and key
structural patterns. These insights reveal hidden data relationships and are
applicable across diverse fields, including social sciences, biology, and
machine learning. For an in-depth overview of Mapper, including its
mathematical foundations and practical applications, read
`the original paper <https://research.math.osu.edu/tgda/mapperPBG.pdf>`__.

+-----------------+-----------------+-----------------+-----------------+
| Step 1 | Step 2 | Step 3 | Step 4 |
@@ -78,12 +87,12 @@ its mathematical foundations and applications, read the
Citations
---------

If you use **tda-mapper** in your work, please consider citing both the
If you use **tda-mapper** in your work, please consider citing both the
`library <https://doi.org/10.5281/zenodo.10642381>`__,
archived in a permanent Zenodo record, and the
archived in a permanent Zenodo record, and the
`paper <https://openreview.net/pdf?id=lTX4bYREAZ>`__,
which provides a broader methodological overview.
We recommend citing the specific version of the library used in your research, as well as the paper.
which provides a broader methodological overview. We recommend citing the
specific version of the library used in your research, along with the paper.

- **tda-mapper**: For example, to cite version 0.8.0, you can use:

26 changes: 23 additions & 3 deletions docs/source/quickstart.rst
@@ -37,8 +37,11 @@ Development
How To Use
----------

Here's a minimal example using the **circles dataset** from
``scikit-learn`` to demonstrate how to use **tda-mapper**:
Here's a minimal example that applies the Mapper algorithm to the synthetic
**circles dataset** (two concentric circles) from ``scikit-learn``. The goal is
to extract a topological graph representation using ``PCA`` as a lens and
``DBSCAN`` for clustering. We proceed as follows:

.. code:: python

@@ -78,6 +81,11 @@ Here's a minimal example using the **circles dataset** from
| |Original Dataset| | |Mapper Graph| |
+----------------------------------------+-----------------------------+

Left: the original dataset consisting of two concentric circles with noise,
colored by class label. Right: the resulting Mapper graph, built from the ``PCA``
projection and clustered using ``DBSCAN``. The two concentric circles are well
identified by the connected components in the Mapper graph.

More examples can be found in the
`documentation <https://tda-mapper.readthedocs.io/en/main/>`__.

@@ -89,13 +97,25 @@ You can run a live demo directly on
`Streamlit Cloud <https://tda-mapper-app.streamlit.app/>`__,
or locally on your machine using the following:

Use our Streamlit app to visualize and explore your data without writing code.
You can run a live demo directly on
`Streamlit Cloud <https://tda-mapper-app.streamlit.app/>`__,
or locally on your machine. The first time you run the app locally, you may
need to install the required dependencies from the ``requirements.txt`` file by
running

.. code:: bash

pip install -r app/requirements.txt

then run the app locally with

.. code:: bash

streamlit run app/streamlit_app.py

|Interactive App|

.. |Original Dataset| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png
.. |Mapper Graph| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png
.. |Interactive App| image :: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png
.. |Interactive App| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png