From 114a64e834bb2da4b62e57c6f9b9d991fc189a3f Mon Sep 17 00:00:00 2001
From: Luca Simi
Date: Tue, 13 May 2025 08:26:08 +0200
Subject: [PATCH 1/2] Improved docs content and formatting

---
 README.md                  | 64 ++++++++++++++++----------------------
 docs/source/index.rst      | 61 ++++++++++++++++++++----------------
 docs/source/quickstart.rst | 26 ++++++++++++++--
 3 files changed, 85 insertions(+), 66 deletions(-)

diff --git a/README.md b/README.md
index 75361490..9bff149e 100644
--- a/README.md
+++ b/README.md
@@ -1,46 +1,43 @@
 ![Logo](https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png)
 
-[![Source Code](https://img.shields.io/badge/lucasimi-tda--mapper--python-blue?logo=github&logoColor=silver)](https://github.com/lucasimi/tda-mapper-python)
 [![PyPI version](https://img.shields.io/pypi/v/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
 [![downloads](https://img.shields.io/pypi/dm/tda-mapper?logo=python&logoColor=silver)](https://pypi.python.org/pypi/tda-mapper)
+[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
 [![test](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/test-unit.yml?logo=github&logoColor=silver&branch=main&label=test)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/test-unit.yml)
 [![publish](https://img.shields.io/github/actions/workflow/status/lucasimi/tda-mapper-python/publish-pypi.yml?logo=github&logoColor=silver&label=publish)](https://github.com/lucasimi/tda-mapper-python/actions/workflows/publish-pypi.yml)
 [![docs](https://img.shields.io/readthedocs/tda-mapper/main?logo=readthedocs&logoColor=silver)](https://tda-mapper.readthedocs.io/en/main/)
-[![codecov](https://img.shields.io/codecov/c/github/lucasimi/tda-mapper-python?logo=codecov&logoColor=silver)](https://codecov.io/github/lucasimi/tda-mapper-python)
 [![DOI](https://img.shields.io/badge/DOI-10.5281/zenodo.10642381-blue?logo=doi&logoColor=silver)](https://doi.org/10.5281/zenodo.10642381)
-[![Streamlit App](https://img.shields.io/badge/Streamlit-App-blue?logo=streamlit&logoColor=silver)](https://tda-mapper-app.streamlit.app/)
 
 # tda-mapper
 
-**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
-Topological Data Analysis (TDA). Designed for efficient computations and backed
-by advanced spatial search techniques, it scales seamlessly to high dimensional
-data, making it suitable for applications in machine learning, data mining, and
-exploratory data analysis.
+**tda-mapper** is a Python library built around the Mapper algorithm, a core technique in Topological Data Analysis (TDA) for extracting topological structure from complex data. Designed for computational efficiency and scalability, it leverages optimized spatial search methods to support high-dimensional datasets. The library is well-suited for integration into machine learning pipelines, unsupervised learning tasks, and exploratory data analysis.
 
 Further details in the
 [documentation](https://tda-mapper.readthedocs.io/en/main/)
 and in the
 [paper](https://openreview.net/pdf?id=lTX4bYREAZ).
 
-## Main Features
+### Core Features
+
+- **Efficient construction**
+
+  Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.
+
+- **Scikit-learn integration**
 
-- **Fast Mapper graph construction**: Accelerates computations with efficient spatial search, enabling analysis of large, high-dimensional datasets.
+  Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.
 
-- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your machine learning workflows.
+- **Flexible visualization**
 
-- **Flexible visualization options**: Visualize Mapper graphs with multiple supported backends, tailored to your needs.
+  Multiple visualization backends supported (e.g., Plotly, Matplotlib) for generating high-quality Mapper graph representations with adjustable layouts and styling.
 
-- **Interactive exploration**: Explore data interactively through a user-friendly app.
+- **Interactive app**
+
+  Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.
 
 ## Background
 
-The Mapper algorithm transforms complex datasets into graph representations
-that highlight clusters, transitions, and topological features. These insights
-reveal hidden patterns in data, applicable across fields like social sciences,
-biology, and machine learning. For an in-depth coverage of Mapper, including
-its mathematical foundations and applications, read the
-[the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
+The Mapper algorithm extracts topological features from complex datasets, representing them as graphs that highlight clusters, transitions, and key structural patterns. These insights reveal hidden data relationships and are applicable across diverse fields, including social sciences, biology, and machine learning. For an in-depth overview of Mapper, including its mathematical foundations and practical applications, read [the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
 
 | Step 1 | Step 2 | Step 3 | Step 4 |
 | ------ | ------ | ------ | ------ |
@@ -49,15 +46,7 @@ its mathematical foundations and applications, read the
 
 ## Citations
 
-If you use **tda-mapper** in your work, please consider citing both the
-[library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent
-Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ),
-which provides a broader methodological overview.
-We recommend citing the specific version of the library used in your research,
-as well as the paper.
-For citation examples, refer to the
-[documentation](https://tda-mapper.readthedocs.io/en/main/#citations).
-
+If you use **tda-mapper** in your work, please consider citing both the [library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ), which provides a broader methodological overview. We recommend citing the specific version of the library used in your research, along with the paper. For citation examples, please refer to the [documentation](https://tda-mapper.readthedocs.io/en/main/#citations).
 
 ## Quick Start
 
@@ -71,12 +60,7 @@ pip install tda-mapper
 
 ### How to Use
 
-Here's a minimal example using the **circles dataset** from `scikit-learn` to demonstrate how to use **tda-mapper**.
-We start by generating the data and visualizing it.
-The dataset consists of two concentric circles.
-The goal is to compute a Mapper graph that summarizes this structure while preserving topological features.
-We proceed as follows:
-
+Here's a minimal example using the **circles dataset** from `scikit-learn` to demonstrate how to use **tda-mapper**. This example demonstrates how to apply the Mapper algorithm on a synthetic dataset (concentric circles). The goal is to extract a topological graph representation using `PCA` as a lens and `DBSCAN` for clustering. We proceed as follows:
 
 ```python
 import matplotlib.pyplot as plt
@@ -124,10 +108,16 @@ More examples can be found in the
 Use our Streamlit app to visualize and explore your data without writing code.
 You can run a live demo directly on
 [Streamlit Cloud](https://tda-mapper-app.streamlit.app/),
-or locally on your machine using the following:
+or locally on your machine. The first time you run the app locally, you may need to install the required dependencies from the `requirements.txt` file by running
 
-```
+```bash
 pip install -r app/requirements.txt
+```
+
+then run the app locally with
+
+```bash
 streamlit run app/streamlit_app.py
 ```
-![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)
\ No newline at end of file
+
+![tda-mapper-app](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png)
diff --git a/docs/source/index.rst b/docs/source/index.rst
index a4f0715e..c46bd93a 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -25,47 +25,56 @@
    https://github.com/lucasimi/tda-mapper-python/raw/main/docs/source/logos/tda-mapper-logo-horizontal.png
    :alt: Logo
 
-|Source Code| |PyPI version| |downloads| |test| |publish| |docs| |codecov| |DOI|
-|Streamlit App|
+|PyPI version| |downloads| |codecov| |test| |publish| |docs| |DOI|
+
+|Source Code|
 
 tda-mapper
 ==========
 
-**tda-mapper** is a Python library based on the Mapper algorithm, a key tool in
-Topological Data Analysis (TDA). Designed for efficient computations and backed
-by advanced spatial search techniques, it scales seamlessly to high dimensional
-data, making it suitable for applications in machine learning, data mining, and
-exploratory data analysis.
+**tda-mapper** is a Python library built around the Mapper algorithm, a core
+technique in Topological Data Analysis (TDA) for extracting topological
+structure from complex data. Designed for computational efficiency and
+scalability, it leverages optimized spatial search methods to support
+high-dimensional datasets. The library is well-suited for integration into
+machine learning pipelines, unsupervised learning tasks, and exploratory data
+analysis.
 
 Further details in the
 `documentation `__
 and in the
 `paper `__.
 
-Main features
+Core features
 -------------
 
-- **Fast Mapper graph construction**: Accelerates computations with efficient
-  spatial search, enabling analysis of large, high-dimensional datasets.
+- **Efficient construction**
+
+  Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.
+
+- **Scikit-learn integration**
+
+  Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.
+
+- **Flexible visualization**
+
+  Multiple visualization backends supported (e.g., Plotly, Matplotlib) for generating high-quality Mapper graph representations with adjustable layouts and styling.
 
-- **Scikit-learn compatibility**: Easily integrate Mapper as a part of your
-  machine learning workflows.
+- **Interactive app**
 
-- **Flexible visualization options**: Visualize Mapper graphs with multiple
-  supported backends, tailored to your needs.
+  Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.
 
-- **Interactive exploration**: Explore data interactively through a
-  user-friendly app.
 
 Background
 ----------
 
-The Mapper algorithm transforms complex datasets into graph representations
-that highlight clusters, transitions, and topological features. These insights
-reveal hidden patterns in data, applicable across fields like social sciences,
-biology, and machine learning. For an in-depth coverage of Mapper, including
-its mathematical foundations and applications, read the
-`original paper `__.
+The Mapper algorithm extracts topological features from complex datasets,
+representing them as graphs that highlight clusters, transitions, and key
+structural patterns. These insights reveal hidden data relationships and are
+applicable across diverse fields, including social sciences, biology, and
+machine learning. For an in-depth overview of Mapper, including its
+mathematical foundations and practical applications, read
+`the original paper `__.
 
 +-----------------+-----------------+-----------------+-----------------+
 | Step 1 | Step 2 | Step 3 | Step 4 |
@@ -78,12 +87,12 @@
 Citations
 ---------
 
-If you use **tda-mapper** in your work, please consider citing both the
+If you use **tda-mapper** in your work, please consider citing both the
 `library `__,
-archived in a permanent Zenodo record, and the
+archived in a permanent Zenodo record, and the
 `paper `__,
-which provides a broader methodological overview.
-We recommend citing the specific version of the library used in your research, as well as the paper.
+which provides a broader methodological overview. We recommend citing the
+specific version of the library used in your research, along with the paper.
 
 - **tda-mapper**: For example to cite version 0.8.0 you can use:
 
diff --git a/docs/source/quickstart.rst b/docs/source/quickstart.rst
index b8855537..8912d650 100644
--- a/docs/source/quickstart.rst
+++ b/docs/source/quickstart.rst
@@ -37,8 +37,11 @@ Development
 How To Use
 ----------
 
-Here's a minimal example using the **circles dataset** from
-``scikit-learn`` to demonstrate how to use **tda-mapper**:
+Here's a minimal example using the **circles dataset** from `scikit-learn` to
+demonstrate how to use **tda-mapper**. This example demonstrates how to apply
+the Mapper algorithm on a synthetic dataset (concentric circles). The goal is
+to extract a topological graph representation using `PCA` as a lens and
+`DBSCAN` for clustering. We proceed as follows:
 
 .. code:: python
 
@@ -78,6 +81,11 @@ Here's a minimal example using the **circles dataset** from
 | |Original Dataset| | |Mapper Graph| |
 +----------------------------------------+-----------------------------+
 
+Left: the original dataset consisting of two concentric circles with noise,
+colored by class label. Right: the resulting Mapper graph, built from the `PCA`
+projection and clustered using `DBSCAN`. The two concentric circles are well
+identified by the connected components in the Mapper graph.
+
 More examples can be found in the
 `documentation `__.
 
@@ -89,13 +97,25 @@ You can run a live demo directly on
 `Streamlit Cloud `__,
 or locally on your machine using the following:
 
+Use our Streamlit app to visualize and explore your data without writing code.
+You can run a live demo directly on
+`Streamlit Cloud `__,
+or locally on your machine. The first time you run the app locally, you may
+need to install the required dependencies from the `requirements.txt` file by
+running
+
 .. code:: bash
 
    pip install -r app/requirements.txt
+
+then run the app locally with
+
+.. code:: bash
+
    streamlit run app/streamlit_app.py
 
 |Interactive App|
 
 .. |Original Dataset| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png
 .. |Mapper Graph| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png
-.. |Interactive App| image :: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png
\ No newline at end of file
+.. |Interactive App| image:: https://github.com/lucasimi/tda-mapper-python/raw/main/resources/tda-mapper-app.png
From b17a093764f5f7e8ed1fd22f051478b8c4ee896d Mon Sep 17 00:00:00 2001
From: Luca Simi
Date: Tue, 13 May 2025 22:35:28 +0200
Subject: [PATCH 2/2] Fixed formatting

---
 README.md | 55 +++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 45 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 9bff149e..047b8b11 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,13 @@
 
 # tda-mapper
 
-**tda-mapper** is a Python library built around the Mapper algorithm, a core technique in Topological Data Analysis (TDA) for extracting topological structure from complex data. Designed for computational efficiency and scalability, it leverages optimized spatial search methods to support high-dimensional datasets. The library is well-suited for integration into machine learning pipelines, unsupervised learning tasks, and exploratory data analysis.
+**tda-mapper** is a Python library built around the Mapper algorithm, a core
+technique in Topological Data Analysis (TDA) for extracting topological
+structure from complex data. Designed for computational efficiency and
+scalability, it leverages optimized spatial search methods to support
+high-dimensional datasets. The library is well-suited for integration into
+machine learning pipelines, unsupervised learning tasks, and exploratory data
+analysis.
 
 Further details in the
 [documentation](https://tda-mapper.readthedocs.io/en/main/)
@@ -21,23 +27,37 @@ and in the
 
 - **Efficient construction**
 
-  Leverages optimized spatial search techniques and parallelization to accelerate the construction of Mapper graphs, supporting the analysis of high-dimensional datasets.
+  Leverages optimized spatial search techniques and parallelization to
+  accelerate the construction of Mapper graphs, supporting the analysis of
+  high-dimensional datasets.
 
 - **Scikit-learn integration**
 
-  Provides custom estimators that are fully compatible with scikit-learn's API, enabling seamless integration into scikit-learn pipelines for tasks such as dimensionality reduction, clustering, and feature extraction.
+  Provides custom estimators that are fully compatible with scikit-learn's
+  API, enabling seamless integration into scikit-learn pipelines for tasks
+  such as dimensionality reduction, clustering, and feature extraction.
 
 - **Flexible visualization**
 
-  Multiple visualization backends supported (e.g., Plotly, Matplotlib) for generating high-quality Mapper graph representations with adjustable layouts and styling.
+  Multiple visualization backends supported (e.g., Plotly, Matplotlib) for
+  generating high-quality Mapper graph representations with adjustable
+  layouts and styling.
 
 - **Interactive app**
 
-  Provides an interactive web-based interface (via Streamlit) for dynamic exploration of Mapper graph structures, offering real-time adjustments to parameters and visualizations.
+  Provides an interactive web-based interface (via Streamlit) for dynamic
+  exploration of Mapper graph structures, offering real-time adjustments to
+  parameters and visualizations.
 
 ## Background
 
-The Mapper algorithm extracts topological features from complex datasets, representing them as graphs that highlight clusters, transitions, and key structural patterns. These insights reveal hidden data relationships and are applicable across diverse fields, including social sciences, biology, and machine learning. For an in-depth overview of Mapper, including its mathematical foundations and practical applications, read [the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
+The Mapper algorithm extracts topological features from complex datasets,
+representing them as graphs that highlight clusters, transitions, and key
+structural patterns. These insights reveal hidden data relationships and are
+applicable across diverse fields, including social sciences, biology, and
+machine learning. For an in-depth overview of Mapper, including its
+mathematical foundations and practical applications, read
+[the original paper](https://research.math.osu.edu/tgda/mapperPBG.pdf).
 
 | Step 1 | Step 2 | Step 3 | Step 4 |
 | ------ | ------ | ------ | ------ |
@@ -46,7 +66,13 @@ The Mapper algorithm extracts topological features from complex datasets, repres
 
 ## Citations
 
-If you use **tda-mapper** in your work, please consider citing both the [library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ), which provides a broader methodological overview. We recommend citing the specific version of the library used in your research, along with the paper. For citation examples, please refer to the [documentation](https://tda-mapper.readthedocs.io/en/main/#citations).
+If you use **tda-mapper** in your work, please consider citing both the
+[library](https://doi.org/10.5281/zenodo.10642381), archived in a permanent
+Zenodo record, and the [paper](https://openreview.net/pdf?id=lTX4bYREAZ),
+which provides a broader methodological overview. We recommend citing the
+specific version of the library used in your research, along with the paper.
+For citation examples, please refer to the
+[documentation](https://tda-mapper.readthedocs.io/en/main/#citations).
 
 ## Quick Start
 
@@ -60,7 +86,11 @@ pip install tda-mapper
 
 ### How to Use
 
-Here's a minimal example using the **circles dataset** from `scikit-learn` to demonstrate how to use **tda-mapper**. This example demonstrates how to apply the Mapper algorithm on a synthetic dataset (concentric circles). The goal is to extract a topological graph representation using `PCA` as a lens and `DBSCAN` for clustering. We proceed as follows:
+Here's a minimal example using the **circles dataset** from `scikit-learn` to
+demonstrate how to use **tda-mapper**. This example demonstrates how to apply
+the Mapper algorithm on a synthetic dataset (concentric circles). The goal is
+to extract a topological graph representation using `PCA` as a lens and
+`DBSCAN` for clustering. We proceed as follows:
 
 ```python
 import matplotlib.pyplot as plt
@@ -98,7 +128,10 @@ fig.show(config={"scrollZoom": True})
 | ---------------- | ------------ |
 | ![Original Dataset](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_dataset_v2.png) | ![Mapper Graph](https://github.com/lucasimi/tda-mapper-python/raw/main/resources/circles_mean_v2.png) |
 
-Left: the original dataset consisting of two concentric circles with noise, colored by class label. Right: the resulting Mapper graph, built from the PCA projection and clustered using DBSCAN. The two concentric circles are well identified by the connected components in the Mapper graph.
+Left: the original dataset consisting of two concentric circles with noise,
+colored by class label. Right: the resulting Mapper graph, built from the PCA
+projection and clustered using DBSCAN. The two concentric circles are well
+identified by the connected components in the Mapper graph.
 
 More examples can be found in the
 [documentation](https://tda-mapper.readthedocs.io/en/main/examples.html).
@@ -108,7 +141,9 @@ More examples can be found in the
 Use our Streamlit app to visualize and explore your data without writing code.
 You can run a live demo directly on
 [Streamlit Cloud](https://tda-mapper-app.streamlit.app/),
-or locally on your machine. The first time you run the app locally, you may need to install the required dependencies from the `requirements.txt` file by running
+or locally on your machine. The first time you run the app locally, you may
+need to install the required dependencies from the `requirements.txt` file by
+running
 
 ```bash
 pip install -r app/requirements.txt