Commit 49d9fe2 (author: Donglai Wei)
Add untracked analysis scripts and reference files
1 parent 392e019 · 6 files changed · 1824 additions & 0 deletions

.claude/reference/pytc-deploy.md (167 additions & 0 deletions)

# pytc-deploy

**Location:** `/projects/weilab/weidf/lib/pytc-deploy`
**License:** MIT (2024, donglai)
**Purpose:** Deployment/workflow management for EM connectomics data-processing pipelines using PyTorch Connectomics. Orchestrates large-scale segmentation, instance merging, and visualization on SLURM clusters.

## Repository Structure
```
pytc-deploy/
├── util/                 # Shared utility modules
│   ├── __init__.py
│   ├── args.py           # CLI argument parsing
│   └── task.py           # Core segmentation algorithms
├── mito-h01/             # H01 dataset mitochondria processing
│   ├── main.py           # Pipeline orchestration (305 lines)
│   ├── const.py          # Dataset constants
│   └── param.yml         # SLURM/path configuration
├── nuc-worm/             # C. elegans nucleus/worm processing
│   ├── main.py           # Pipeline orchestration (302 lines, mirrors mito-h01)
│   ├── const.py          # Dataset constants
│   └── param.yml         # SLURM/path configuration
└── syn-alzhemier/        # Alzheimer's synapse analysis
    └── main.py           # Multi-step pipeline (858 lines)
```

## CLI Entry Point

All projects use:

```bash
python main.py -t <task> [flags]
```

## util/args.py — `get_parser()`

Returns an `ArgumentParser` with these flags:

| Flag | Default | Purpose |
|------|---------|---------|
| `-t, --task` | `""` | Task name to execute |
| `-s, --cmd` | `""` | SLURM command |
| `-e, --env` | `"imu"` | Conda environment name |
| `-ji, --job-id` | `0` | Job ID for parallel processing |
| `-jn, --job-num` | `1` | Total number of jobs |
| `-cn, --chunk-num` | `1` | Number of chunks |
| `-n, --neuron` | `""` | Neuron IDs (comma-separated) |
| `-r, --ratio` | `"1,1,1"` | Downsample ratio (Z,Y,X) |
| `-cp, --partition` | `"lichtman"` | SLURM partition |
| `-cm, --memory` | `"50GB"` | Memory allocation |
| `-ct, --run-time` | `"0-12:00"` | Job runtime |
| `-cg, --num_gpu` | `-1` | Number of GPUs |
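Based on the flag table, `get_parser()` presumably looks roughly like the following argparse sketch (the description and help strings are assumptions, not the actual source):

```python
import argparse

def get_parser():
    """Shared CLI parser used by every main.py (sketch reconstructed from the flag table)."""
    p = argparse.ArgumentParser(description="pytc-deploy pipeline runner")
    p.add_argument("-t", "--task", type=str, default="", help="task name to execute")
    p.add_argument("-s", "--cmd", type=str, default="", help="SLURM command")
    p.add_argument("-e", "--env", type=str, default="imu", help="conda environment name")
    p.add_argument("-ji", "--job-id", type=int, default=0, help="job ID for parallel processing")
    p.add_argument("-jn", "--job-num", type=int, default=1, help="total number of jobs")
    p.add_argument("-cn", "--chunk-num", type=int, default=1, help="number of chunks")
    p.add_argument("-n", "--neuron", type=str, default="", help="comma-separated neuron IDs")
    p.add_argument("-r", "--ratio", type=str, default="1,1,1", help="downsample ratio (Z,Y,X)")
    p.add_argument("-cp", "--partition", type=str, default="lichtman", help="SLURM partition")
    p.add_argument("-cm", "--memory", type=str, default="50GB", help="memory allocation")
    p.add_argument("-ct", "--run-time", type=str, default="0-12:00", help="job runtime")
    p.add_argument("-cg", "--num_gpu", type=int, default=-1, help="number of GPUs")
    return p

# Example: run task "seg-bbox" as one of 8 parallel jobs
args = get_parser().parse_args(["-t", "seg-bbox", "-jn", "8"])
```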

## util/task.py — Core Algorithms

### `generate_jobs_dl(conf, neuron, job_num=1, mem='50GB', run_time='1-00:00', job_order=1)`
Generates SLURM batch scripts for deep-learning inference with PyTorch Connectomics.

### `neuron_to_tile(neuron, zid, zran, f_box, f_seg)`
Maps neuron IDs to tile coordinates. Returns the neuron's bounding box and the bounding boxes of its tiles.

### `seg_zran_merge(f_zran_p, job_num)`
Merges Z-range (min/max Z) data from parallel jobs. Returns a merged ID array and a Z-range array.

### `seg_zran_p(f_box, job_id, job_num)`
Computes the Z-range of each segmentation ID in parallel. Returns an array of `[ID, min_z, max_z]` rows.

### `seg_bbox_p(f_seg, f_box, job_id, job_num)`
Computes bounding boxes for all segmented objects in parallel, slice by slice. Writes results to HDF5.

### `remove_small_instances(segm, thres_small=128, mode='background')`
Removes spurious small instances from a segmentation.
- **Modes:** `none`, `background` (3D), `background_2d`, `neighbor` (merge with nearest instance, 3D), `neighbor_2d`
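The `background` mode plausibly amounts to zeroing out undersized labels; a minimal sketch of that idea (the `neighbor` modes, which reassign small instances rather than delete them, are not shown, and the function name here is illustrative):

```python
import numpy as np

def remove_small_instances_sketch(segm, thres_small=128):
    """'background' mode sketch: set instances smaller than thres_small voxels to 0."""
    ids, counts = np.unique(segm, return_counts=True)
    # Labels below the size threshold (excluding background id 0)
    small = ids[(counts < thres_small) & (ids > 0)]
    out = segm.copy()
    out[np.isin(out, small)] = 0
    return out
```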

### `bc_watershed(volume, thres1=0.9, thres2=0.8, thres3=0.85, thres_small=128, scale_factors=(1.0,1.0,1.0), remove_small_mode='background', seed_thres=32, precomputed_seed=None)`
Converts binary foreground-probability and instance-contour maps into instance masks via watershed.
- `volume`: shape `(C, Z, Y, X)` with 2 channels (foreground, boundary)
- `thres1`: seed threshold (0.9)
- `thres2`: contour threshold (0.8)
- `thres3`: foreground threshold (0.85)

### `mito_watershed_iou(f_mito_ws_func, arr_mito)`
Computes IoU between adjacent tiles for instance matching along the X, Y, and Z directions.

### `mito_neuron_sid(f_mito_ws, arr_mito, ratio=0.6)`
Finds mitochondrial instance IDs inside a neuron mask, filtered by overlap ratio (default 60%).
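The decoding idea behind `bc_watershed` can be sketched as: seed where the foreground is confident and the contour is low, drop tiny seeds, then flood over the boundary map restricted to the foreground mask. This is a simplified sketch, not the actual implementation (which also applies `scale_factors` and `remove_small_mode`):

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

def bc_watershed_sketch(volume, thres1=0.9, thres2=0.8, thres3=0.85, seed_thres=32):
    """Decode a 2-channel (foreground, boundary) prediction into instance labels."""
    fg, bd = volume[0], volume[1]
    # Seeds: confidently foreground and away from instance boundaries.
    seeds, _ = ndimage.label((fg > thres1) & (bd < thres2))
    # Discard seeds smaller than seed_thres voxels.
    ids, counts = np.unique(seeds, return_counts=True)
    seeds[np.isin(seeds, ids[(ids > 0) & (counts < seed_thres)])] = 0
    # Flood from the seeds over the boundary map, restricted to the foreground mask.
    return watershed(bd, markers=seeds, mask=fg > thres3)
```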
## mito-h01/main.py — Tasks

| Task | Description |
|------|-------------|
| `seg-bbox` | Compute bounding boxes per segmentation slice |
| `seg-zran_p` | Compute Z-ranges in parallel |
| `seg-zran` | Merge Z-range data from parallel jobs |
| `neuron-tile` | Map neuron IDs to tile coordinates |
| `mito-folder` | Create the output directory structure |
| `mito-ts` | Write TensorStore config pickle |
| `mito-neuron-watershed` | Decode U-Net predictions to instances via watershed |
| `mito-neuron-watershed-iou` | Compute IoU between adjacent tiles |
| `mito-neuron-check` | Verify file completeness |
| `mito-neuron-sid` | Extract mito instance IDs within the neuron mask |
| `mito-neuron-sid-count` | Cumulative count of instance IDs |
| `mito-neuron-sid-iou` | Merge instances across tiles using IoU + UnionFind |
| `mito-neuron-export` | Generate final HDF5 with instance relabeling |
| `mito-neuron-export-ds` | Downsample the exported segmentation |
| `mito-neuron-ng` | Create Neuroglancer-compatible tiles |
| `mito-neuron-mesh` | Generate 3D meshes from the segmentation |
| `mito-neuron-test` | Debugging/testing |
| `slurm` | Generate and submit SLURM batch jobs |

## mito-h01/const.py — Dataset Constants

- `neuron_volume_size = [1324, 15552, 27072]` (Z, Y, X voxels)
- `neuron_volume_offset = [0, 2560, 3520]`
- `neuron_tile_size = [25, 128, 128]`
- `mito_volume_ratio = [4, 16, 16]` (mito resolution vs neuron)
- `mito_tile_size = [100, 2048, 2048]`
- `neuron_id = [590612150, 36750893213]`

## nuc-worm/ — C. elegans Nucleus Processing

Code-identical to `mito-h01/` (same pipeline structure, different dataset parameters).

## syn-alzhemier/main.py — Alzheimer's Synapse Pipeline

Multi-step pipeline driven by numeric option codes:

| Option | Description |
|--------|-------------|
| `0.x` | Image preprocessing: frame extraction, VAST-to-HDF5 conversion, downsampling |
| `2.x` | Vesicle processing: extraction, annotation processing, mito-mask application |
| `3.x` | Data export & validation: range checks, consistency checks, bbox fixes |
| `4.x` | Vesicle classification: patch extraction, Laplacian quality scores, sorting |
| `5.0-5.2` | Load TIF stacks, convert to HDF5 |
| `5.3-5.4` | Tissue sample preparation and decoding |
| `5.5` | Generate test file list (72x9x8 = 5184 tiles) + SLURM jobs |
| `5.6x` | Instance merging: extract IDs, merge across tiles (IoU + UnionFind), relabel |
| `5.63` | TensorStore upload to Google Cloud (multi-scale pyramid: 1x, 4x, 8x) |
| `6.x` | Cell segmentation visualization, Neuroglancer setup |

140+
### Key Functions in syn-alzhemier:
141+
- **`merge_syn_ins()`**: Merges pre/post-synaptic instances across tile boundaries using UnionFind
142+
143+
## Data Flow (Mito-h01 Pipeline)
144+
145+
```
146+
Raw segmentation → [seg-bbox] → [seg-zran_p] → [seg-zran]
147+
→ [neuron-tile] → U-Net inference → [mito-neuron-watershed]
148+
→ [mito-neuron-sid] → [mito-neuron-sid-iou]
149+
→ [mito-neuron-export] → [mito-neuron-export-ds]
150+
→ [mito-neuron-ng/mesh]
151+
```
152+
153+
## Key Algorithms

1. **Watershed Segmentation**: Seed detection + watershed flooding for pixel-to-instance conversion
2. **UnionFind**: Disjoint-set union for tracking/merging connected instances across tiles
3. **IoU-Based Merging**: Matches instances across tile boundaries by overlap threshold
4. **Tile-Based Parallelism**: SLURM job arrays for memory-efficient large-volume processing
5. **Multi-Scale Pyramids**: Downsampled representations for TensorStore/Neuroglancer visualization
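Algorithms 2 and 3 combine as follows in a toy sketch, operating on matched label arrays from a shared tile face (the class and function here are illustrative, not the project's actual code):

```python
import numpy as np

class UnionFind:
    """Minimal disjoint-set union with path halving."""
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def merge_by_iou(face_a, face_b, iou_thres=0.3):
    """Union instance IDs whose overlap on a shared tile face exceeds iou_thres."""
    uf = UnionFind()
    for ida in np.unique(face_a[face_a > 0]):
        for idb in np.unique(face_b[face_b > 0]):
            inter = np.sum((face_a == ida) & (face_b == idb))
            union = np.sum((face_a == ida) | (face_b == idb))
            if union and inter / union > iou_thres:
                # Tag IDs by side so label 3 in tile A stays distinct from label 3 in tile B.
                uf.union(("A", ida), ("B", idb))
    return uf
```

After running this over every adjacent tile pair, each UnionFind root becomes one global instance ID, which is what a relabeling/export step can then apply.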

## Dependencies

- **Core:** numpy, scipy, h5py, cv2, scikit-image, imageio
- **EM Utilities:** em_util (I/O, clustering, Neuroglancer helpers)
- **Segmentation:** cc3d, fastremap
- **Cloud Storage:** tensorstore (Google Cloud)
- **Custom:** T_util, T_util_seg
