orthophoto merge: parallelize the block loop (in-order writer) by Chouffe · Pull Request #2035 · OpenDroneMap/ODM

Chouffe · 2026-06-19T16:20:15Z

Summary

Parallelizes the per-block blend loop in merge(), behind a max_workers parameter (default 1 = unchanged serial behavior), wired from stages/splitmerge.py as args.max_concurrency.

Stacked on #2036 (gated reads). This PR currently shows both commits; once #2036 merges it reduces to just the parallelization. The gating is a prerequisite — see below.

Why it needs the gated reads (#2036)

The block loop is embarrassingly parallel, but a naive thread pool stalls. Every rasterio read with boundless=True builds a VRT and serializes it (_serialize_xml). Under concurrency that serialization dominates the runtime and effectively hangs the merge, with workers parked in _serialize_xml. #2036 replaces those reads with gated reads (no VRT for in-bounds windows), which makes a parallel loop viable.

Parallelization

Parallel compute + single in-order writer. Blocks are computed in a ThreadPoolExecutor; one thread writes them in strict block order with a bounded look-ahead (cap = 2 * max_workers). A naive "write each block as it finishes" version can't flush incrementally: GDAL needs writes in row-major (block) order, so out-of-order dirty blocks accumulate in RAM until the process runs out of memory. The in-order writer keeps writes sequential and flushable, and keeps memory use small.
Bounded GDAL block cache during the merge (restored on exit) so dirty-tile eviction/flush under GDAL's global lock stays prompt.
Per-thread source handles — GDAL/rasterio datasets are not thread-safe.
Preserves --merge-skip-blending.
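
The in-order writer with a bounded look-ahead can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `merge_blocks_in_order`, `compute`, and `write` are hypothetical names standing in for the merge loop, the per-block blend, and the destination write. Only the cap of 2 * max_workers and the serial fallback come from the description above.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def merge_blocks_in_order(blocks, compute, write, max_workers=1):
    """Compute blocks in parallel, write them strictly in input order.

    `compute(block)` and `write(block, result)` are hypothetical callables
    standing in for the per-block blend and the destination write.
    """
    if max_workers <= 1:
        # Serial path: compute, then write, one block at a time.
        for block in blocks:
            write(block, compute(block))
        return

    cap = 2 * max_workers  # bounded look-ahead, as described in the PR
    pending = deque()      # (block, future) pairs in submission order

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for block in blocks:
            pending.append((block, pool.submit(compute, block)))
            if len(pending) >= cap:
                # Wait for the oldest block so writes stay in order and
                # no more than `cap` results are held at any time.
                oldest, future = pending.popleft()
                write(oldest, future.result())
        # Drain whatever is still in flight, again oldest first.
        while pending:
            oldest, future = pending.popleft()
            write(oldest, future.result())
```

Because only the calling thread ever invokes `write`, the destination sees the same write sequence regardless of how many workers compute blocks, which is the property the determinism claim below relies on.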

max_workers <= 1 is a plain serial compute-then-write path, byte-for-byte identical to the original loop.

Correctness

Deterministic across worker counts. On a real 15.9 Gpx survey (3 submodels, ~61k blocks), the merged orthophoto is byte-for-byte identical at max_workers=1 (serial, ~28 min) and max_workers=16 (~8 min). The in-order writer makes the result independent of worker count, and the serial/default path completes cleanly with no read↔write stall.
End-to-end. Verified on the same survey: the merge completes with zero stalled blocks and produces a valid orthophoto, pixel-identical to the serial baseline (confirmed for both LZW-compressed and uncompressed output).
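
One way to reproduce a byte-for-byte check like the one above is to hash both output files and compare the digests. The sketch below is an assumption about how such a check could be done, not the procedure actually used for this PR. `file_sha256` and `outputs_identical` are hypothetical helper names.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large rasters never sit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outputs_identical(path_a, path_b):
    """True when both files have the same SHA-256 digest."""
    return file_sha256(path_a) == file_sha256(path_b)
```

A byte-level match is stricter than a pixel-level one: it also covers headers, tile layout, and compressed block contents.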

Notes

No new dependencies.
Default behavior unchanged (max_workers=1).

merge() reads each source window with rasterio boundless=True, which builds an in-memory VRT and serializes it via Python's ElementTree (_serialize_xml) on every read. That is a large per-read overhead on big merges (tens of thousands of blocks x 3 passes x N submodels).

_read_window_gated() keeps identical output but avoids the VRT for the common cases: a plain non-boundless read when the window is fully inside the source; zeros when it is fully outside (the same 0 nodata fill that boundless produces there); and boundless only for the rare partial-edge windows.

Pixel-identical: hundreds of fully-in-bounds windows across a real merge grid were compared between boundless and plain reads, with 0 mismatches. The change is serial, with no behavior change beyond the speedup. It is also a prerequisite for parallelizing the merge, because boundless's per-read VRT serialization is pathological under concurrency.
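
The three-way gating decision can be illustrated with plain integer arithmetic. This sketch is not the body of `_read_window_gated()`, which is not shown here. `classify_window` and its parameters are hypothetical, and windows are assumed to be given as pixel offsets and sizes relative to the source raster.

```python
def classify_window(col_off, row_off, width, height, src_width, src_height):
    """Return 'inside', 'outside', or 'partial' for a window vs. a source.

    Offsets and sizes are in pixels; the source spans
    [0, src_width) x [0, src_height).
    """
    col_end = col_off + width
    row_end = row_off + height

    # No overlap with the source at all: the caller can return zeros.
    if col_end <= 0 or row_end <= 0 or col_off >= src_width or row_off >= src_height:
        return "outside"

    # Entirely within the source: a plain (non-boundless) read suffices.
    if col_off >= 0 and row_off >= 0 and col_end <= src_width and row_end <= src_height:
        return "inside"

    # Straddles an edge: fall back to the boundless read.
    return "partial"
```

With rasterio, the three branches would presumably map to a zero-filled array, `src.read(window=...)`, and `src.read(window=..., boundless=True)` respectively, so only the "partial" case pays the VRT cost.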

…orkers)

With boundless reads gated (previous commit), parallelize the per-block blend loop. Blocks are computed in a ThreadPoolExecutor and written from a single thread in strict block order with a bounded look-ahead (cap = 2 * max_workers), so writes to the compressed, tiled GeoTIFF stay sequential and incrementally flushable and memory stays small.

GDAL's block cache is bounded during the merge (restored on exit). Per-thread source handles (GDAL/rasterio datasets are not thread-safe). Preserves --merge-skip-blending. Wired from stages/splitmerge.py as max_workers=args.max_concurrency. max_workers<=1 is byte-for-byte identical to the original serial loop.
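
Per-thread source handles are commonly implemented with `threading.local`. The sketch below shows that pattern under stated assumptions: `PerThreadHandles` is a hypothetical helper, and `opener` is any callable that returns an object with a `close()` method (for example `rasterio.open`). The PR's actual handle management may differ.

```python
import threading


class PerThreadHandles:
    """Lazily open one handle per (thread, path) and track all for cleanup."""

    def __init__(self, opener):
        self._opener = opener            # hypothetical: e.g. rasterio.open
        self._local = threading.local()  # per-thread cache of open handles
        self._lock = threading.Lock()
        self._all = []                   # every handle opened by any thread

    def get(self, path):
        cache = getattr(self._local, "handles", None)
        if cache is None:
            cache = self._local.handles = {}
        if path not in cache:
            handle = self._opener(path)
            cache[path] = handle
            with self._lock:
                self._all.append(handle)
        return cache[path]

    def close_all(self):
        # Call once all worker threads are done with their handles.
        with self._lock:
            for handle in self._all:
                handle.close()
            self._all.clear()
```

Each worker calls `get(path)` and never shares the returned object with another thread, at the cost of one open dataset per worker per source.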

Chouffe changed the title ~~orthophoto merge: parallelize block processing with parallel_map~~ orthophoto merge: parallelize block processing (in-order writer + gated reads) Jun 23, 2026

Chouffe force-pushed the parallel-orthophoto-merge-2034 branch from 4fdec37 to bbd0295 June 23, 2026 21:34

Chouffe added 2 commits June 23, 2026 23:54

Chouffe force-pushed the parallel-orthophoto-merge-2034 branch from bbd0295 to fa47f06 June 23, 2026 21:56

Chouffe changed the title ~~orthophoto merge: parallelize block processing (in-order writer + gated reads)~~ orthophoto merge: parallelize the block loop (in-order writer) Jun 23, 2026

smathermather requested a review from spwoodcock June 25, 2026 02:16

Chouffe wants to merge 2 commits into OpenDroneMap:master from Chouffe:parallel-orthophoto-merge-2034
