Commit 7156e0f

docs: add pre-release changes
1 parent 2ed696e commit 7156e0f

8 files changed

Lines changed: 370 additions & 58 deletions

.gitignore

Lines changed: 8 additions & 0 deletions
@@ -55,3 +55,11 @@ dmypy.json
 
 # Pre-commit
 .pre-commit-config.yaml.bak
+.clinerules/byterover-rules.md
+.kilocode/rules/byterover-rules.md
+.roo/rules/byterover-rules.md
+.windsurf/rules/byterover-rules.md
+.cursor/rules/byterover-rules.mdc
+.kiro/steering/byterover-rules.md
+.qoder/rules/byterover-rules.md
+.augment/rules/byterover-rules.md

CHANGELOG.md

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+# Changelog
+
+All notable changes to AtlasPatch will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [1.0.0] - 2025-02-03
+
+### Added
+
+#### Core Pipeline
+- SAM2-based tissue segmentation finetuned on ~35,000 diverse WSI thumbnails
+- Four-checkpoint pipeline architecture:
+- `detect-tissue`: tissue detection with mask visualization
+- `segment-and-get-coords`: segmentation + patch coordinate extraction
+- `process`: full pipeline with feature embedding
+- `--save-images`: optional patch image export
+- HDF5 output format with coordinates (`coords`) and features (`features/<encoder>`)
+- Slide passport metadata in H5 files (vendor, MPP, staining info)
+- Per-slide lock files for safe parallel job execution
+- MPP override via CSV for slides with missing/incorrect metadata
+
+#### Feature Extractors (66)
+- **Natural image backbones**: ResNet (18/34/50/101/152), ConvNeXt (tiny/small/base/large), ViT (B/L/H)
+- **DINOv2**: small, base, large, giant
+- **DINOv3**: vits16, vitb16, vitl16, vith16_plus, vit7b16 (+ SAT variants)
+- **Pathology encoders**: UNI v1/v2, Phikon v1/v2, Virchow v1/v2, GigaPath, CHIEF-CTransPath, Midnight, OpenMidnight, MUSK, PathOrchestra, H-optimus-0/1, H0-mini, CONCH v1/v1.5, Hibou B/L
+- **Lunit models**: ResNet50 (BT/SwAV/MoCov2), ViT-S (patch16/patch8 DINO)
+- **CLIP variants**: RN50/101/50x4/50x16/50x64, ViT-B/L
+- **Medical CLIP**: PLIP, MedSigLIP, Quilt (B-32/B-16/B-16-PMB), BiomedCLIP, OmiCLIP
+- Custom encoder plugin system via `--feature-plugin`
+
+#### Visualization
+- Tissue mask overlays
+- Contour visualization
+- Patch grid overlays
+- Configurable output directory structure
+
+#### CLI & Configuration
+- `atlaspatch` CLI with subcommands
+- Configurable patch size, target magnification, step size
+- Device selection for segmentation and feature extraction
+- Batch size controls for segmentation and feature extraction
+- Precision options (float32, float16, bfloat16)
+- Fast mode (default) for high-throughput processing
+- Content filtering options (white/black thresholds)
+- Recursive directory processing
+- Skip existing / force overwrite modes
+
+#### HPC Support
+- SLURM job script templates (`jobs/atlaspatch_patch.slurm.sh`, `jobs/atlaspatch_features.slurm.sh`)
+- Multi-GPU support via separate device flags
+- Configurable worker counts for parallel processing
+
+#### Documentation
+- Comprehensive README with usage examples
+- Pipeline checkpoint diagrams
+- Feature extractor reference table
+- FAQ section for common issues
+- Issue templates (bug report, feature request)
+- Pull request template
+
+### Fixed
+- Contour scaling from mask resolution to thumbnail
+- H-optimus transformation pipeline
+- MUSK model registration
+
+[1.0.0]: https://github.com/AtlasAnalyticsLab/AtlasPatch/releases/tag/v1.0.0
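
The changelog lists per-slide lock files for safe parallel job execution but does not show how they work. A minimal sketch of one common way to build such a lock is given here, assuming an atomic "create only if absent" file open. The function names and the `.lock` naming are hypothetical and are not taken from the AtlasPatch code base.

```python
import os
import tempfile
from pathlib import Path


def try_acquire_lock(lock_path: Path) -> bool:
    """Atomically create lock_path; return False if another job already holds it."""
    try:
        # O_CREAT | O_EXCL makes the open fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(lock_path: Path) -> None:
    """Remove the lock file so that another job may claim the slide."""
    lock_path.unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lock = Path(tmp) / "slide_001.svs.lock"  # hypothetical naming scheme
        print(try_acquire_lock(lock))  # first job claims the slide
        print(try_acquire_lock(lock))  # second job is turned away
        release_lock(lock)
```

A job that fails to acquire the lock would simply move on to the next slide. A real implementation would also need a policy for stale locks left behind by crashed jobs, which this sketch does not attempt.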

README.md

Lines changed: 196 additions & 45 deletions
@@ -4,11 +4,24 @@
 
 # AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology
 
+<p align="center">
+<a href="https://pypi.org/project/atlas-patch/"><img alt="PyPI" src="https://img.shields.io/pypi/v/atlas-patch"></a>
+<a href="https://pypi.org/project/atlas-patch/"><img alt="Python" src="https://img.shields.io/pypi/pyversions/atlas-patch"></a>
+<a href="LICENSE"><img alt="License" src="https://img.shields.io/badge/License-CC--BY--NC--SA--4.0-blue"></a>
+</p>
+
+<!-- TODO: Update paper link (XXXX.XXXXX) once published on arXiv -->
+<p align="center">
+<a href="https://atlasanalyticslab.github.io/AtlasPatch/"><b>Project Page</b></a> |
+<a href="https://arxiv.org/abs/XXXX.XXXXX"><b>Paper</b></a> |
+<a href="https://github.com/AtlasAnalyticsLab/AtlasPatch"><b>GitHub</b></a>
+</p>
+
 ## Table of Contents
 - [Installation](#installation)
-- [Using Conda (Recommended)](#using-conda-recommended)
-- [Using uv (pip-compatible, faster installs)](#using-uv-pip-compatible-faster-installs)
-- [Using venv](#using-venv)
+- [Quick Install (Recommended)](#quick-install-recommended)
+- [OpenSlide Prerequisites](#openslide-prerequisites)
+- [Alternative Installation Methods](#alternative-installation-methods)
 - [Usage Guide](#usage-guide)
 - [Pipeline Checkpoints](#pipeline-checkpoints)
 - [A - Tissue Detection](#a-tissue-detection)
@@ -37,6 +50,7 @@
 - [Medical- and Pathology-Specific CLIP](#medical--and-pathology-specific-clip)
 - [Bring Your Own Encoder](#bring-your-own-encoder)
 - [SLURM job scripts](#slurm-job-scripts)
+- [Frequently Asked Questions (FAQ)](#frequently-asked-questions-faq)
 - [Feedback](#feedback)
 - [Citation](#citation)
 - [License](#license)
@@ -45,59 +59,69 @@
 
 ## Installation
 
-### Using Conda (Recommended)
+### Quick Install (Recommended)
 
-1. Create a conda environment:
 ```bash
-conda create -n atlas_patch python=3.10
-conda activate atlas_patch
+pip install atlas-patch
 ```
 
-2. Install the OpenSlide system library (required for WSI processing):
+> **Note:** AtlasPatch requires the OpenSlide system library for WSI processing. See [OpenSlide Prerequisites](#openslide-prerequisites) below.
+
+### OpenSlide Prerequisites
+
+Before installing AtlasPatch, you need the OpenSlide system library:
+
+- **Using Conda (Recommended)**:
+```bash
+conda install -c conda-forge openslide
+```
+
+- **Ubuntu/Debian**:
+```bash
+sudo apt-get install openslide-tools
+```
+
+- **macOS**:
+```bash
+brew install openslide
+```
+
+- **Other systems**: Visit [OpenSlide Documentation](https://openslide.org/)
+
+### Alternative Installation Methods
+
+<details>
+<summary><b>Using Conda Environment</b></summary>
+
 ```bash
+# Create and activate environment
+conda create -n atlas_patch python=3.10
+conda activate atlas_patch
+
+# Install OpenSlide
 conda install -c conda-forge openslide
-```
 
-3. Install the package in development mode:
-```bash
-pip install -e .
+# Install AtlasPatch
+pip install atlas-patch
 ```
+</details>
 
-### Using uv (pip-compatible, faster installs)
+<details>
+<summary><b>Using uv (faster installs)</b></summary>
 
-1. Install uv if not already available (see [uv docs](https://docs.astral.sh/uv/getting-started/)):
 ```bash
+# Install uv (see https://docs.astral.sh/uv/getting-started/)
 curl -LsSf https://astral.sh/uv/install.sh | sh
-```
 
-2. Create and activate a virtual environment (UV_VENV defaults to `.venv`):
-```bash
+# Create and activate environment
 uv venv
 source .venv/bin/activate # On Windows: .venv\Scripts\activate
-```
 
-3. Install in development mode with uv:
-```bash
-uv pip install -e .
-```
-
-### Using venv
-
-1. Create a virtual environment:
-```bash
-python -m venv venv
-source venv/bin/activate # On Windows: venv\Scripts\activate
+# Install AtlasPatch
+uv pip install atlas-patch
 ```
+</details>
 
-2. Install the OpenSlide system library:
-- **Ubuntu/Debian**: `sudo apt-get install openslide-tools`
-- **macOS**: `brew install openslide`
-- **Other systems**: Visit [OpenSlide Documentation](https://openslide.org/)
-
-3. Install the package in development mode:
-```bash
-pip install -e .
-```
 
 ## Usage Guide

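The installation hunk above makes the OpenSlide system library a prerequisite. A minimal sketch of a pre-flight check for it follows, using only the standard library's `ctypes.util.find_library`. The function name is hypothetical, and the check can return a false negative when the library lives only inside a Conda environment that is not on the system linker path.

```python
from ctypes.util import find_library


def openslide_available() -> bool:
    """Return True if the system linker can locate a library named 'openslide'."""
    return find_library("openslide") is not None


if __name__ == "__main__":
    if openslide_available():
        print("OpenSlide shared library found.")
    else:
        print("OpenSlide not found; see the OpenSlide Prerequisites section.")
```
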
@@ -373,7 +397,7 @@ with h5py.File("output/patches/sample.h5", "r") as f:
 | [`midnight`](https://huggingface.co/kaiko-ai/midnight) ([Training state-of-the-art pathology foundation models with orders of magnitude less data](https://arxiv.org/abs/2504.05186)) | 3072 |
 | [`musk`](https://github.com/lilab-stanford/MUSK) ([MUSK: A Vision-Language Foundation Model for Precision Oncology](https://www.nature.com/articles/s41586-024-08378-w)) | 1024 |
 | [`openmidnight`](https://sophontai.com/blog/openmidnight) ([How to Train a State-of-the-Art Pathology Foundation Model with $1.6k](https://sophontai.com/blog/openmidnight)) | 1536 |
-| [`pathorchestra`](https://huggingface.co/AI4Pathology/PathOrchestra) ([PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks](https://arxiv.org/abs/2503.24345)) | 512 |
+| [`pathorchestra`](https://huggingface.co/AI4Pathology/PathOrchestra) ([PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks](https://arxiv.org/abs/2503.24345)) | 1024 |
 | [`h_optimus_0`](https://huggingface.co/bioptimus/H-optimus-0) | 1536 |
 | [`h_optimus_1`](https://huggingface.co/bioptimus/H-optimus-1) | 1536 |
 | [`h0_mini`](https://huggingface.co/bioptimus/H0-mini) ([Distilling foundation models for robust and efficient models in digital pathology](https://doi.org/10.48550/arXiv.2501.16239)) | 1536 |
@@ -480,6 +504,132 @@ We prepared ready-to-run SLURM templates under `jobs/`:
 - Submit with `sbatch jobs/atlaspatch_features.slurm.sh`.
 - Running multiple jobs: you can submit several jobs in a loop (e.g., 50 jobs using `for i in {1..50}; do sbatch jobs/atlaspatch_features.slurm.sh; done`). AtlasPatch uses per-slide lock files to avoid overlapping work on the same slide.
 
+## Frequently Asked Questions (FAQ)
+
+<details>
+<summary><b>I'm facing an out of memory (OOM) error</b></summary>
+
+This usually happens when too many WSI files are open simultaneously. Try reducing the `--max-open-slides` parameter:
+
+```bash
+atlaspatch process /path/to/slides --output ./output --max-open-slides 50
+```
+
+The default is 200. Lower this value if you're processing many large slides or have limited system memory.
+</details>
+
+<details>
+<summary><b>I'm getting a CUDA out of memory error</b></summary>
+
+Try one or more of the following:
+
+1. **Reduce feature extraction batch size**:
+```bash
+--feature-batch-size 16 # Default is 32
+```
+
+2. **Reduce segmentation batch size**:
+```bash
+--seg-batch-size 1 # Default is 1; only relevant if you raised it
+```
+
+3. **Use lower precision**:
+```bash
+--feature-precision float16 # or bfloat16
+```
+
+4. **Use a smaller patch size**:
+```bash
+--patch-size 224 # Instead of 256
+```
+</details>
+
+<details>
+<summary><b>OpenSlide library not found</b></summary>
+
+AtlasPatch requires the OpenSlide system library. Install it based on your system:
+
+- **Conda**: `conda install -c conda-forge openslide`
+- **Ubuntu/Debian**: `sudo apt-get install openslide-tools`
+- **macOS**: `brew install openslide`
+
+See [OpenSlide Prerequisites](#openslide-prerequisites) for more details.
+</details>
+
+<details>
+<summary><b>Access denied for gated models (UNI, Virchow, etc.)</b></summary>
+
+Some encoders require Hugging Face access approval:
+
+1. Request access on the model's Hugging Face page (e.g., [UNI](https://huggingface.co/MahmoodLab/UNI))
+2. Once approved, set your token:
+```bash
+export HF_TOKEN=your_huggingface_token
+```
+3. Run AtlasPatch again
+</details>
+
+<details>
+<summary><b>Missing microns-per-pixel (MPP) metadata</b></summary>
+
+Some slides lack MPP metadata. You can provide it via a CSV file:
+
+```bash
+atlaspatch process /path/to/slides --output ./output --mpp-csv /path/to/mpp.csv
+```
+
+The CSV should have columns `wsi` (filename) and `mpp` (microns per pixel value).
+</details>
+
+<details>
+<summary><b>Processing is slow</b></summary>
+
+Try these optimizations:
+
+1. **Enable fast mode** (skips content filtering, enabled by default):
+```bash
+--fast-mode
+```
+
+2. **Increase parallel workers**:
+```bash
+--patch-workers 16 # Match your CPU cores
+--feature-num-workers 8
+```
+
+3. **Increase batch sizes** (if GPU memory allows):
+```bash
+--feature-batch-size 64
+--seg-batch-size 4
+```
+
+4. **Use multiple GPUs** by running separate jobs on different GPU devices.
+</details>
+
+<details>
+<summary><b>My file format is not supported</b></summary>
+
+AtlasPatch supports most common formats via OpenSlide and Pillow:
+- **WSIs**: `.svs`, `.tif`, `.tiff`, `.ndpi`, `.vms`, `.vmu`, `.scn`, `.mrxs`, `.bif`, `.dcm`
+- **Images**: `.png`, `.jpg`, `.jpeg`, `.bmp`, `.webp`, `.gif`
+
+If your format isn't supported, consider converting it to a supported format or [open an issue](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new?template=feature_request.md).
+</details>
+
+<details>
+<summary><b>How do I skip already processed slides?</b></summary>
+
+Use the `--skip-existing` flag to skip slides that already have an output H5 file:
+
+```bash
+atlaspatch process /path/to/slides --output ./output --skip-existing
+```
+</details>
+
+---
+
+Have a question not covered here? Feel free to [open an issue](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new) and ask!
+
 ## Feedback
 
 - Report problems via the [bug report template](https://github.com/AtlasAnalyticsLab/AtlasPatch/issues/new?template=bug_report.md) so we can reproduce and fix them quickly.
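
The FAQ entry on missing MPP metadata says the CSV passed to `--mpp-csv` should have the columns `wsi` and `mpp`. A minimal sketch of producing and sanity-checking such a file with the standard `csv` module follows. The helper names and sample values are hypothetical, and whether AtlasPatch expects bare filenames or full paths in `wsi` is an assumption to verify against the README.

```python
import csv
import tempfile
from pathlib import Path


def write_mpp_csv(csv_path: Path, mpp_by_slide: dict[str, float]) -> None:
    """Write one row per slide using the `wsi` and `mpp` column names."""
    with csv_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["wsi", "mpp"])
        writer.writeheader()
        for filename, mpp in sorted(mpp_by_slide.items()):
            writer.writerow({"wsi": filename, "mpp": mpp})


def read_mpp_csv(csv_path: Path) -> dict[str, float]:
    """Read the file back, failing early on missing columns or non-numeric values."""
    with csv_path.open(newline="") as handle:
        reader = csv.DictReader(handle)
        missing = {"wsi", "mpp"} - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"missing columns: {sorted(missing)}")
        return {row["wsi"]: float(row["mpp"]) for row in reader}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "mpp.csv"
        # Hypothetical slide names and MPP values, for illustration only.
        write_mpp_csv(path, {"slide_a.svs": 0.25, "slide_b.ndpi": 0.5})
        print(read_mpp_csv(path))
```

Validating the file before launching a long job is cheaper than discovering a malformed row partway through a batch.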
@@ -488,14 +638,15 @@ We prepared ready-to-run SLURM templates under `jobs/`:
 
 ## Citation
 
-If you use AtlasPatch in your research, please cite it:
+If you use AtlasPatch in your research, please cite our paper:
 
-```
-@software{atlaspatch,
-author = {Atlas Analytics Lab},
-title = {AtlasPatch},
+```bibtex
+@article{atlaspatch2025,
+title = {AtlasPatch: An Efficient and Scalable Tool for Whole Slide Image Preprocessing in Computational Pathology},
+author = {Alagha, Ahmed and Leclerc, Christopher and Kotp, Yousef and Abdelwahed, Omar and Moras, Calvin and Rentopoulos, Peter and Rostami, Rose and Nguyen, Bich Ngoc and Baig, Jumanah and Khellaf, Abdelhakim and Trinh, Vincent Quoc-Huy and Mizouni, Rabeb and Otrok, Hadi and Bentahar, Jamal and Hosseini, Mahdi S.},
+journal = {arXiv},
 year = {2025},
-url = {https://github.com/AtlasAnalyticsLab/AtlasPatch}
+url = {TODO: coming soon}
 }
 ```

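The README FAQ lists the supported WSI extensions and describes `--skip-existing` as skipping slides that already have an output H5 file. A minimal sketch of previewing which slides would still need processing follows. It assumes outputs are named `<slide stem>.h5` inside a single directory (suggested by the `output/patches/sample.h5` path in the README's h5py example), and the function name is hypothetical.

```python
import tempfile
from pathlib import Path

# WSI extensions copied from the README FAQ entry on supported formats.
WSI_EXTENSIONS = {
    ".svs", ".tif", ".tiff", ".ndpi", ".vms",
    ".vmu", ".scn", ".mrxs", ".bif", ".dcm",
}


def pending_slides(slide_dir: Path, h5_dir: Path) -> list[Path]:
    """Return slides in slide_dir that have no matching <stem>.h5 file in h5_dir."""
    slides = sorted(
        path
        for path in slide_dir.iterdir()
        if path.is_file() and path.suffix.lower() in WSI_EXTENSIONS
    )
    return [path for path in slides if not (h5_dir / f"{path.stem}.h5").exists()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        slides, outputs = Path(tmp) / "slides", Path(tmp) / "patches"
        slides.mkdir()
        outputs.mkdir()
        for name in ("done.svs", "todo.ndpi", "notes.txt"):
            (slides / name).touch()
        (outputs / "done.h5").touch()
        print([path.name for path in pending_slides(slides, outputs)])
```

This only mirrors the documented behaviour from the outside; the tool's own skip logic may apply additional checks, such as whether the requested features are already present in the H5 file.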
atlas_patch/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -2,4 +2,5 @@
 
 from . import core, services
 
-__all__ = ["core", "services"]
+__version__ = "1.0.0"
+__all__ = ["core", "services", "__version__"]
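
With `__version__` exported and the changelog committing to Semantic Versioning, downstream scripts can guard against incompatible releases. A minimal sketch of such a guard follows, assuming plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffixes. Both helper names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain 'MAJOR.MINOR.PATCH' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """True when the major versions match and installed is at least required."""
    have, need = parse_version(installed), parse_version(required)
    return have[0] == need[0] and have >= need


if __name__ == "__main__":
    # In practice the first argument would be atlas_patch.__version__.
    print(is_compatible("1.0.0", "1.0.0"))
    print(is_compatible("2.0.0", "1.0.0"))
```

For version strings with pre-release tags, a dedicated parser such as the one in the `packaging` library would be a better fit than this hand-rolled split.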
