diff --git a/README.md b/README.md
index 7536149..047b8b1 100644
--- a/README.md
+++ b/README.md
@@ -1,45 +1,62 @@
 ![Logo](https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png)
 
-[![Source Code](https://img.shields.io/badge/lucasimi-tda--mapper--python-blue?logo=github&logoColor=silver)](https://github.com/lucasimi/tda-mapper-python)
 [![PyPI version](https://img.shields.io/pypi/v/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
 [![downloads](https://img.shields.io/pypi/dm/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
+[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
 [![test](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/test-unit.yml?logo=github&logoColor=silver&branch=main&label=test)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/test-unit.yml)
 [![publish](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/publish-pypi.yml?logo=github&logoColor=silver&label=publish)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/publish-pypi.yml)
 [![docs](https://img.shields.io/readthedocs/tda-mapper/main?logo=readthedocs&logoColor=silver)](https://tda-mapper.readthedocs.io/en/main/)
-[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
 [![DOI](https://img.shields.io/badge/DOI-10.5281/zenodo.10642381-blue?logo=doi&logoColor=silver)](https://doi.org/10.5281/zenodo.10642381)
-[![Streamlit App](https://img.shields.io/badge/Streamlit-App-blue?logo=streamlit&logoColor=silver)](https://tda-mapper-app.streamlit.app/)
 
 # tda-mapper
 
-**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
-Topological Data Analysis (TDA). Designed for efficient computations and backed
-by advanced spatial search techniques, it scales seamlessly to high dimensional
-data, making it suitable for applications in machine learning, data mining, and
-exploratory data analysis.
+**tda-mapper** is a Python library built around the Mapper algorithm, a core
+technique in Topological Data Analysis (TDA) for extracting topological
+structure from complex data. Designed for computational efficiency and
+scalability, it leverages optimized spatial search methods to support
+high-dimensional datasets. The library is well-suited for integration into
+machine learning pipelines, unsupervised learning tasks, and exploratory data
+analysis.
 
 Further details in the
 [documentation](https://tda-mapper.readthedocs.io/en/main/)
 and in the
 [paper](https://openreview.net/pdf?id=lTX4bYREAZ).
 
-## Main Features
+### Core Features
+
+- **Efficient construction**
+
+    Leverages optimized spatial search techniques and parallelization to
+    accelerate the construction of Mapper graphs, supporting the analysis of
+    high-dimensional datasets.
+
+- **Scikit-learn integration**
 
-- **Fast Mapper graph construction**: Accelerates computations with efficient spatial search, enabling analysis of large, high-dimensional datasets.
+    Provides custom estimators that are fully compatible with scikit-learn's
+    API, enabling seamless integration into scikit-learn pipelines for tasks
+    such as dimensionality reduction, clustering, and feature extraction.
 
-- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your machine learning workflows.
+- **Flexible visualization**
 
-- **Flexible visualization options**: Visualize Mapper graphs with multiple supported backends, tailored to your needs.
+    Multiple visualization backends supported (e.g., Plotly, Matplotlib) for
+    generating high-quality Mapper graph representations with adjustable
+    layouts and styling.
 
-- **Interactive exploration**: Explore data interactively through a user-friendly app.
+- **Interactive app**
+
+    Provides an interactive web-based interface (via Streamlit) for dynamic
+    exploration of Mapper graph structures, offering real-time adjustments to
+    parameters and visualizations.
 
 ## Background
 
-The Mapper algorithm transforms complex datasets into graph representations
-that highlight clusters, transitions, and topological features. These insights
-reveal hidden patterns in data, applicable across fields like social sciences,
-biology, and machine learning. For an in-depth coverage of Mapper, including
-its mathematical foundations and applications, read the
+The Mapper algorithm extracts topological features from complex datasets,
+representing them as graphs that highlight clusters, transitions, and key
+structural patterns. These insights reveal hidden data relationships and are
+applicable across diverse fields, including social sciences, biology, and
+machine learning. For an in-depth overview of Mapper, including its
+mathematical foundations and practical applications, read
 [the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
 
 | Step 1 | Step 2 | Step 3 | Step 4 |
@@ -52,13 +69,11 @@ its mathematical foundations and applications, read the
 If you use **tda-mapper** in your work, please consider citing both the
 [library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent
 Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ),
-which provides a broader methodological overview.
-We recommend citing the specific version of the library used in your research,
-as well as the paper.
-For citation examples, refer to the
+which provides a broader methodological overview. We recommend citing the
+specific version of the library used in your research, along with the paper.
+For citation examples, please refer to the
 [documentation](https://tda-mapper.readthedocs.io/en/main/#citations).
 
-
 ## Quick Start
 
 ### Installation
@@ -71,12 +86,11 @@ pip install tda-mapper
 
 ### How to Use
 
-Here's a minimal example using the **circles dataset** from `scikit-learn` to demonstrate how to use **tda-mapper**.
-We start by generating the data and visualizing it.
-The dataset consists of two concentric circles.
-The goal is to compute a Mapper graph that summarizes this structure while preserving topological features.
-We proceed as follows:
-
+Here's a minimal example using the **circles dataset** from `scikit-learn` to
+demonstrate how to use **tda-mapper**. This example demonstrates how to apply
+the Mapper algorithm on a synthetic dataset (concentric circles). The goal is
+to extract a topological graph representation using `PCA` as a lens and
+`DBSCAN` for clustering. We proceed as follows:
 
 ```python
 import matplotlib.pyplot as plt
@@ -114,7 +128,10 @@ fig.show(config={"scrollZoom": True})
 | ---------------- | ------------ |
 | ![Original Dataset](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png) | ![Mapper Graph](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png) |
 
-Left: the original dataset consisting of two concentric circles with noise, colored by class label. Right: the resulting Mapper graph, built from the PCA projection and clustered using DBSCAN. The two concentric circles are well identified by the connected components in the Mapper graph.
+Left: the original dataset consisting of two concentric circles with noise,
+colored by class label. Right: the resulting Mapper graph, built from the PCA
+projection and clustered using DBSCAN. The two concentric circles are well
+identified by the connected components in the Mapper graph.
 
 More examples can be found in the
 [documentation](https://tda-mapper.readthedocs.io/en/main/examples.html).
@@ -124,10 +141,18 @@ More examples can be found in the
 Use our Streamlit app to visualize and explore your data without writing code.
 You can run a live demo directly on
 [Streamlit Cloud](https://tda-mapper-app.streamlit.app/),
-or locally on your machine using the following:
+or locally on your machine. The first time you run the app locally, you may
+need to install the required dependencies from the `requirements.txt` file by
+running
 
-```
+```bash
 pip install -r app/requirements.txt
+```
+
+then run the app locally with
+
+```bash
 streamlit run app/streamlit_app.py
 ```
-![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)
\ No newline at end of file
+
+![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)
diff --git a/docs/source/index.rst b/docs/source/index.rst
index a4f0715..c46bd93 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -25,47 +25,56 @@
    https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png
    :alt: Logo
 
-|Source Code| |PyPI version| |downloads| |test| |publish| |docs| |codecov| |DOI|
-|Streamlit App|
+|PyPI version| |downloads| |codecov| |test| |publish| |docs| |DOI|
+
+|Source Code|
 
 tda-mapper
 ==========
 
-**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
-Topological Data Analysis (TDA). Designed for efficient computations and backed
-by advanced spatial search techniques, it scales seamlessly to high dimensional
-data, making it suitable for applications in machine learning, data mining, and
-exploratory data analysis.
+**tda-mapper** is a Python library built around the Mapper algorithm, a core
+technique in Topological Data Analysis (TDA) for extracting topological
+structure from complex data. Designed for computational efficiency and
+scalability, it leverages optimized spatial search methods to support
+high-dimensional datasets. The library is well-suited for integration into
+machine learning pipelines, unsupervised learning tasks, and exploratory data
+analysis.
 
 Further details in the
 `documentation `__
 and in the
 `paper `__.
 
-Main features
+Core features
 -------------
 
-- **Fast Mapper graph construction**: Accelerates computations with efficient
-  spatial search, enabling analysis of large, high-dimensional datasets.
+- **Efficient construction**
+
+  Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.
+
+- **Scikit-learn integration**
+
+  Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.
+
+- **Flexible visualization**
+
+  Multiple visualization backends supported (e.g., Plotly, Matplotlib) for generating high-quality Mapper graph representations with adjustable layouts and styling.
 
-- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your
-  machine learning workflows.
+- **Interactive app**
 
-- **Flexible visualization options**: Visualize Mapper graphs with multiple
-  supported backends, tailored to your needs.
+  Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.
 
-- **Interactive exploration**: Explore data interactively through a
-  user-friendly app.
 
 Background
 ----------
 
-The Mapper algorithm transforms complex datasets into graph representations
-that highlight clusters, transitions, and topological features. These insights
-reveal hidden patterns in data, applicable across fields like social sciences,
-biology, and machine learning. For an in-depth coverage of Mapper, including
-its mathematical foundations and applications, read the
-`original paper `__.
+The Mapper algorithm extracts topological features from complex datasets,
+representing them as graphs that highlight clusters, transitions, and key
+structural patterns. These insights reveal hidden data relationships and are
+applicable across diverse fields, including social sciences, biology, and
+machine learning. For an in-depth overview of Mapper, including its
+mathematical foundations and practical applications, read
+`the original paper `__.
 
 +-----------------+-----------------+-----------------+-----------------+
 | Step 1 | Step 2 | Step 3 | Step 4 |
@@ -78,12 +87,12 @@ its mathematical foundations and applications, read the
 Citations
 ---------
 
-If you use **tda-mapper** in your work, please consider citing both the
+If you use **tda-mapper** in your work, please consider citing both the
 `library `__,
-archived in a permanent Zenodo record, and the
+archived in a permanent Zenodo record, and the
 `paper `__,
-which provides a broader methodological overview.
-We recommend citing the specific version of the library used in your research, as well as the paper.
+which provides a broader methodological overview. We recommend citing the
+specific version of the library used in your research, along with the paper.
 
 - **tda-mapper**: For example to cite version 0.8.0 you can use:
 
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
index b885553..8912d65 100644
--- a/docs/source/quickstart.rst
+++ b/docs/source/quickstart.rst
@@ -37,8 +37,11 @@ Development
 How To Use
 ----------
 
-Here's a minimal example using the **circles dataset** from
-``scikit-learn`` to demonstrate how to use **tda-mapper**:
+Here's a minimal example using the **circles dataset** from `scikit-learn` to
+demonstrate how to use **tda-mapper**. This example demonstrates how to apply
+the Mapper algorithm on a synthetic dataset (concentric circles). The goal is
+to extract a topological graph representation using `PCA` as a lens and
+`DBSCAN` for clustering. We proceed as follows:
 
 .. code:: python
 
@@ -78,6 +81,11 @@ Here's a minimal example using the **circles dataset** from
 | |Original Dataset| | |Mapper Graph| |
 +----------------------------------------+-----------------------------+
 
+Left: the original dataset consisting of two concentric circles with noise,
+colored by class label. Right: the resulting Mapper graph, built from the `PCA`
+projection and clustered using `DBSCAN`. The two concentric circles are well
+identified by the connected components in the Mapper graph.
+
 More examples can be found in the
 `documentation `__.
 
@@ -89,13 +97,25 @@ You can run a live demo directly on
 `Streamlit Cloud `__,
 or locally on your machine using the following:
 
+Use our Streamlit app to visualize and explore your data without writing code.
+You can run a live demo directly on
+`Streamlit Cloud `__,
+or locally on your machine. The first time you run the app locally, you may
+need to install the required dependencies from the `requirements.txt` file by
+running
+
 .. code:: bash
 
    pip install -r app/requirements.txt
+
+then run the app locally with
+
+.. code:: bash
+
    streamlit run app/streamlit_app.py
 
 |Interactive App|
 
 .. |Original Dataset| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png
 .. |Mapper Graph| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png
-.. |Interactive App| image :: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png
\ No newline at end of file
+.. |Interactive App| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png
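
The revised "How to Use" text in this diff describes the workflow only in words: apply a lens, cluster, and read the result as a graph. As a rough illustration of those steps, here is a minimal standard-library sketch. It is not tda-mapper's API and not the code elided from the hunks above. All helper names (`two_circles`, `threshold_clusters`, `mapper_graph`, `count_components`) are hypothetical, the x-coordinate stands in for the `PCA` lens, and a plain distance-threshold grouping stands in for `DBSCAN`.

```python
import math
from itertools import combinations


def two_circles(n_outer=100, n_inner=40, factor=0.3):
    """Noise-free concentric circles centred at the origin (illustrative data)."""
    outer = [(math.cos(2 * math.pi * k / n_outer),
              math.sin(2 * math.pi * k / n_outer)) for k in range(n_outer)]
    inner = [(factor * math.cos(2 * math.pi * k / n_inner),
              factor * math.sin(2 * math.pi * k / n_inner)) for k in range(n_inner)]
    return outer + inner


def threshold_clusters(points, idx, eps):
    """Stand-in for DBSCAN: points joined by a chain of hops <= eps share a cluster."""
    unvisited = set(idx)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited
                    if math.dist(points[i], points[j]) <= eps}
            unvisited -= near
            cluster |= near
            stack.extend(near)
        clusters.append(frozenset(cluster))
    return clusters


def mapper_graph(points, lens, n_intervals=10, overlap=0.25, eps=0.1):
    """Cover the lens range with overlapping intervals, cluster each preimage,
    and link clusters that share at least one point."""
    values = [lens(p) for p in points]
    lo, hi = min(values), max(values)
    step = (hi - lo) / n_intervals
    nodes = []
    for i in range(n_intervals):
        a = lo + (i - overlap) * step
        b = lo + (i + 1 + overlap) * step
        idx = [j for j, v in enumerate(values) if a <= v <= b]
        nodes.extend(threshold_clusters(points, idx, eps))
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


def count_components(n_nodes, edges):
    """Count connected components with a small union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(x) for x in range(n_nodes)})


points = two_circles()
# Assumption: the x-coordinate is used as a 1-D lens in place of PCA.
nodes, edges = mapper_graph(points, lens=lambda p: p[0])
n_components = count_components(len(nodes), edges)
print(n_components)
```

With these noise-free parameters the graph splits into two connected components, one per circle, which is the behaviour the new caption text describes. Real data with noise would need the lens, cover, and clustering parameters tuned, and the library's own estimators should be preferred over this sketch.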