Commit 69c315c

Donglai Wei and claude committed
Add waterz region graph + merge_id, branch_merge with singleton, IoU module, oracle ARE, bbox cleanup
waterz (lib/waterz):
- Add get_region_graph() using JIT scoring functions (any merge_function) with channel masking (all/z/xy) via affinity zeroing
- Add merge_id() C++/Cython: union-find merge on edge lists (4 modes: ID-only, +affinity, +count, +both), ported from waterz_dw
- Clean up: remove standalone C++ region graph scan, single get_region_graph entry point for all scoring functions
- frontend_merge.cpp now contains only merge_segments

connectomics:
- Add data/process/iou.py: bbox-accelerated seg_to_iou and segs_to_iou ported from em_util
- Add data/process/bbox.py: replace find_objects with em_util row/col scan
- Add metrics/segmentation_numpy.py: adapted_rand_oracle for incremental per-GT oracle ARE (O(nnz_per_row) instead of O(volume))
- Add decoding/decoders/branch_merge.py: singleton_size_ratio parameter for merging single-slice fragments via size ratio
- Update decode_waterz: dust_merge_size/dust_merge_affinity/dust_remove_size renamed parameters

Reference docs: em_util.md, waterz_dw.md updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 69bfa31 commit 69c315c

24 files changed

Lines changed: 2415 additions & 741 deletions

.claude/reference/em_util.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
# em_util Reference

**Location:** `/projects/weilab/weidf/lib/em_util`
**GitHub:** https://github.com/PytorchConnectomics/em_util
**License:** MIT

Utility library for EM connectomics: volume I/O, segmentation operations, evaluation metrics, neuroglancer visualization, WebKnossos integration, and SLURM job management.

## Module Overview

| Module | Purpose |
|--------|---------|
| `em_util.io` | Universal I/O (HDF5, TIFF, PNG, Zarr, CloudVolume), bbox, skeleton, chunked ops, UnionFind |
| `em_util.seg` | Segmentation ops: relabel, remove small, morphology, connected components, IoU |
| `em_util.eval` | Metrics: adapted_rand, VOI, confusion matrix |
| `em_util.ng` | Neuroglancer 3D visualization |
| `em_util.wk` | WebKnossos remote dataset management |
| `em_util.vast` | VAST annotation format support |
| `em_util.cluster` | SLURM job submission utilities |

## Conventions

- **Volume axis order:** ZYX (depth, height, width)
- **Bounding boxes:** 2D: `[seg_id, y0, y1, x0, x1, count]`, 3D: `[seg_id, z0, z1, y0, y1, x0, x1, count]`
- **Segmentation:** Integer arrays, 0 = background
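
To make the 3D bounding-box row layout concrete, here is a minimal NumPy sketch that builds one row for a single label. It is not em_util's implementation: the helper name `bbox_row_3d` is hypothetical, and treating the max coordinates (`z1`, `y1`, `x1`) as inclusive is an assumption this reference does not confirm.

```python
import numpy as np

def bbox_row_3d(seg, seg_id):
    """Hypothetical helper: [seg_id, z0, z1, y0, y1, x0, x1, count] for one label.

    Assumption: the max coordinates (z1, y1, x1) are inclusive.
    """
    zz, yy, xx = np.nonzero(seg == seg_id)  # ZYX axis order
    if zz.size == 0:
        return None  # label not present
    return [seg_id, zz.min(), zz.max(), yy.min(), yy.max(),
            xx.min(), xx.max(), zz.size]

seg = np.zeros((4, 5, 6), dtype=np.uint32)  # (depth, height, width); 0 = background
seg[1:3, 2:4, 0:5] = 7
row = [int(v) for v in bbox_row_3d(seg, 7)]
print(row)
```

The last entry is the voxel count, so a row carries both the extent and the size of a segment.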

## Key Functions

### I/O (`em_util.io`)

```python
read_vol(filename, dataset=None)          # Universal reader (.h5, .tif, .npy, .zarr, etc.)
read_h5(filename, dataset=None)           # HDF5 reader
write_h5(filename, data, dataset="main")  # HDF5 writer (gzip compressed)
read_image(filename, image_type="image")  # Single image reader
read_image_folder(filename, ...)          # Image stack reader (glob patterns)
compute_bbox_all(seg, do_count=False)     # Bounding boxes for all segments
vol_to_skel(labels, res=(32,32,30))       # Kimimaro skeletonization
```

### Segmentation (`em_util.seg`)

```python
seg_to_count(seg, do_sort=True)       # Segment sizes (sorted)
seg_relabel(seg, do_sort=True)        # Relabel by size (largest=1)
seg_remove_small(seg, threshold=100)  # Remove small segments
seg_to_cc(seg)                        # Connected component relabeling
seg_biggest_cc(seg)                   # Keep only largest CC per label
```
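
As an illustration of what "relabel by size (largest=1)" means, the following NumPy sketch relabels a segmentation through a lookup table. The function name `relabel_by_size` is hypothetical and the code does not reproduce `seg_relabel` itself. It assumes integer labels with 0 as background.

```python
import numpy as np

def relabel_by_size(seg):
    """Hypothetical sketch: largest segment becomes 1, next largest 2, ...; 0 stays 0."""
    ids, counts = np.unique(seg, return_counts=True)
    keep = ids > 0                               # drop background from the ranking
    ids, counts = ids[keep], counts[keep]
    order = np.argsort(-counts, kind="stable")   # descending by size
    lut = np.zeros(int(seg.max()) + 1, dtype=seg.dtype)
    lut[ids[order]] = np.arange(1, ids.size + 1)
    return lut[seg]                              # apply the lookup table voxel-wise

seg = np.array([[5, 5, 5, 0],
                [9, 9, 0, 0],
                [2, 0, 0, 0]], dtype=np.uint32)
out = relabel_by_size(seg)
print(out)
```

A lookup table sized by the maximum label is simple but memory-hungry for very large IDs; a dictionary or `np.searchsorted` mapping would be the usual alternative in that case.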

### IoU (`em_util.seg.iou`): key for branch merge

```python
seg_to_iou(seg0, seg1, uid0=None, bb0=None, uid1=None, uc1=None, th_iou=0)
```
Computes IoU between two segmentations (2D or 3D) using bounding-box-accelerated overlap counting: for each segment in `seg0`, only its bbox region is scanned and masked against `seg1`.

Returns an `(N, 5)` array with rows `[seg_id, best_match_id, count0, count1, overlap_count]`.
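
The row layout can be illustrated with a small NumPy sketch. This is not the em_util code: `seg_overlap_rows` is a hypothetical name, choosing the best match by largest overlap count is an assumption, and the `th_iou` filter is omitted. Bounding boxes are recomputed per segment here for brevity, whereas an accelerated version would precompute them once.

```python
import numpy as np

def seg_overlap_rows(seg0, seg1):
    """Hypothetical sketch: rows of [seg_id, best_match_id, count0, count1, overlap]."""
    ids1, cnt1 = np.unique(seg1, return_counts=True)
    size1 = dict(zip(ids1.tolist(), cnt1.tolist()))
    rows = []
    for sid in np.unique(seg0):
        if sid == 0:
            continue                                  # 0 = background
        coords = np.nonzero(seg0 == sid)              # a real version precomputes bboxes
        box = tuple(slice(c.min(), c.max() + 1) for c in coords)
        mask = seg0[box] == sid                       # segment mask inside its bbox
        count0 = int(mask.sum())
        hits = seg1[box][mask]                        # seg1 labels under the mask
        hits = hits[hits > 0]
        if hits.size == 0:
            rows.append([int(sid), 0, count0, 0, 0])  # no foreground overlap
            continue
        cand, overlap = np.unique(hits, return_counts=True)
        k = int(np.argmax(overlap))                   # assumption: best = largest overlap
        rows.append([int(sid), int(cand[k]), count0,
                     size1[int(cand[k])], int(overlap[k])])
    return np.array(rows, dtype=np.int64).reshape(-1, 5)

seg0 = np.array([[1, 1, 0, 0], [1, 1, 2, 2]])
seg1 = np.array([[3, 3, 3, 0], [0, 3, 4, 4]])
rows = seg_overlap_rows(seg0, seg1)
# IoU follows from the three counts: overlap / (count0 + count1 - overlap)
iou = rows[:, 4] / (rows[:, 2] + rows[:, 3] - rows[:, 4])
print(rows, iou)
```

Storing the raw counts rather than the ratio lets a caller derive IoU, or a size ratio, without rescanning the volume.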

```python
segs_to_iou(get_seg, index, th_iou=0)
```
Tracks segments across z-slices. `get_seg(i)` returns the 2D segmentation at slice `i`. The function iterates over consecutive slice pairs, computing IoU with bbox acceleration, and returns a list of overlap matrices, one per boundary between consecutive slices.

**Performance:** Scanning only the pixels within each segment's bounding box makes this fast for sparse segmentations, where segments occupy a small fraction of the image.
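
The consecutive-pair iteration pattern can be sketched as follows. The names `best_overlap` and `track_slices` are hypothetical, `index` is assumed to be an ordered list of slice indices, and the matching uses plain full-slice overlap counting rather than the bbox-accelerated routine described above.

```python
import numpy as np

def best_overlap(seg_a, seg_b):
    """Rows [id_a, id_b, overlap]: for each label in seg_a, its largest-overlap label in seg_b."""
    fg = (seg_a > 0) & (seg_b > 0)
    if not fg.any():
        return []
    pairs, counts = np.unique(np.stack([seg_a[fg], seg_b[fg]], axis=1),
                              axis=0, return_counts=True)
    best = {}
    for (a, b), c in zip(pairs.tolist(), counts.tolist()):
        if c > best.get(a, (0, 0))[1]:
            best[a] = (b, c)
    return [[a, b, c] for a, (b, c) in sorted(best.items())]

def track_slices(get_seg, index):
    """Hypothetical sketch: one match list per boundary between consecutive slices."""
    out = []
    prev = get_seg(index[0])
    for i in index[1:]:
        cur = get_seg(i)
        out.append(best_overlap(prev, cur))
        prev = cur                      # each slice is loaded only once
    return out

vol = np.zeros((3, 2, 4), dtype=np.uint32)   # ZYX toy volume
vol[0, :, :2] = 1
vol[1, :, 1:3] = 5
vol[2, :, 2:] = 8
matches = track_slices(lambda z: vol[z], [0, 1, 2])
print(matches)
```

Passing a callable instead of a full volume means slices can be read lazily from disk, which is presumably why `segs_to_iou` takes `get_seg` rather than an array.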
66+
67+
### Evaluation (`em_util.eval`)
68+
69+
```python
70+
adapted_rand(seg, gt, all_stats=False) # SNEMI3D adapted rand error
71+
voi(seg, gt) # VOI (split, merge)
72+
```
73+
74+
### Chunked Processing (large volumes)
75+
76+
```python
77+
vol_func_chunk(input_file, vol_func, ...) # Apply function chunk-by-chunk
78+
vol_downsample_chunk(input_file, ratio=[1,2,2]) # Downsample large HDF5
79+
compute_bbox_all_chunk(seg_file, chunk_num=1) # Bbox from large file
80+
```
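
The chunk-by-chunk idea can be sketched with plain NumPy slicing. This is a hedged illustration only: `downsample_by_chunk` and its `chunk_z` parameter are hypothetical, downsampling is assumed to be decimation (strided sampling) rather than averaging, and an in-memory array stands in for an on-disk dataset that supports the same slicing.

```python
import numpy as np

def downsample_by_chunk(vol, ratio=(1, 2, 2), chunk_z=2):
    """Hypothetical sketch: decimate a ZYX volume slab-by-slab along z."""
    rz, ry, rx = ratio
    step = chunk_z * rz                 # slab starts stay aligned with the z stride
    pieces = []
    for z0 in range(0, vol.shape[0], step):
        slab = np.asarray(vol[z0:z0 + step])   # only this slab is read at a time
        pieces.append(slab[::rz, ::ry, ::rx])
    # A real pipeline would write each piece to an output dataset instead of
    # collecting all pieces in memory.
    return np.concatenate(pieces, axis=0)

vol = np.arange(5 * 4 * 6).reshape(5, 4, 6)
out = downsample_by_chunk(vol, ratio=(1, 2, 2), chunk_z=2)
out_z = downsample_by_chunk(vol, ratio=(2, 2, 2), chunk_z=2)
print(out.shape, out_z.shape)
```

Choosing slab boundaries as multiples of the z ratio keeps the chunked result identical to decimating the whole volume at once.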

### UnionFind (`em_util.io`)

```python
uf = UnionFind(elements)
uf.union(a, b)               # Merge two sets
uf.find(a)                   # Find root
uf.components()              # List all components
uf.component_relabel_arr()   # Numpy relabel array
```
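
For readers unfamiliar with the data structure, a minimal union-find with path compression looks like the sketch below. The class `SimpleUnionFind` is hypothetical and only mirrors the `union`/`find`/`components` method names listed above; it makes no claim about how em_util implements them.

```python
class SimpleUnionFind:
    """Hypothetical minimal union-find over hashable elements."""

    def __init__(self, elements):
        self.parent = {e: e for e in elements}   # every element starts as its own root

    def find(self, a):
        root = a
        while self.parent[root] != root:         # walk up to the root
            root = self.parent[root]
        while self.parent[a] != root:            # path compression
            nxt = self.parent[a]
            self.parent[a] = root
            a = nxt
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra                 # attach one root under the other

    def components(self):
        groups = {}
        for e in self.parent:
            groups.setdefault(self.find(e), []).append(e)
        return list(groups.values())

uf = SimpleUnionFind([1, 2, 3, 4, 5])
uf.union(1, 2)
uf.union(4, 5)
uf.union(2, 5)
comps = sorted(sorted(c) for c in uf.components())
print(comps)
```

Merging segment IDs from an edge list amounts to calling `union` once per edge and then relabeling each ID by its root.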

## Dependencies

**Core:** numpy, scipy, h5py, imageio, scikit-image, tqdm, cc3d, pyyaml, networkx
**Optional:** cloudvolume, zarr, kimimaro, neuroglancer, webknossos, igneous
