[AI] Fix object mask concurrency and shape-lifetime bugs #20815

Merged
TurboGit merged 2 commits into darktable-org:master from andriiryzhkov:seg_fix
Apr 15, 2026

@andriiryzhkov
Contributor

Two open PRs by @sjuxax have been awaiting a response from the author for a while. I don't consider everything in those PRs meaningful for merging, but a couple of fixes are genuinely worth having. This PR ports only those.

The originals:

  • #20774 – Fix object mask ASan races and scratch overflows

    • Ported: the _run_decoder thread-join fix. Real concurrency bug.
    • Skipped: per-iteration malloc/free replacements of dt_get_perthread() in box_filters.cc and guided_filter.c. These patch symptoms rather than the root cause (likely a pool-sizing issue in dt_alloc_perthread_float), and the hot-path allocator churn would regress performance for every user, not just those running ASan.
  • #20775 – Fix SAM2 segmentation ASan crashes

    • Ported: the decoder shape-array lifetime fix. Genuine undefined behavior per the C standard, fixed at zero runtime cost.
    • Skipped: DT_AI_OPT_DISABLED for the SAM2.1 encoder. This disables ORT graph optimization for all SAM2 users to work around an ORT 1.24.x UAF that may be ASan-only, at a measurable inference-performance cost. Worth revisiting if confirmed to crash in release builds.

What's in this PR

  1. Wait for the encode thread before decoding. ENCODE_READY is published before the background warmup finishes, so a fast user click could race the warmup on the shared dt_seg_context_t. _run_decoder() now joins the thread first – an immediate no-op if the warmup has already completed, a brief pause otherwise. The related comment at the ready-signal site was updated to reflect the new behavior.

  2. Keep decoder output shape arrays alive through dt_ai_run. In dt_seg_warmup_decoder and dt_seg_compute_mask, iou_shape / lr_shape / low_res_shape were declared inside an inner if(is_sam) block while pointers to them were stored in an outputs[] array from the outer scope. Per C, their lifetime ended at the closing } of the inner block – the dt_ai_run() call afterward read through dangling pointers. Works in release today because compilers rarely recycle stack slots between sibling scopes, but caught by ASan and a latent time bomb for future compiler versions. Moved to the outer scope; zero runtime cost.
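
The join-before-decode pattern from item 1 can be sketched as follows. This is a minimal illustration, not the darktable code: it assumes POSIX threads as a stand-in for whatever thread API the module actually uses, and `seg_ctx_t`, `worker_t`, `encode_and_warmup`, `join_worker`, `run_decoder`, and `demo_click_during_warmup` are all hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared segmentation context. */
typedef struct seg_ctx_t
{
  int warm_state; /* written by the warmup, read by the decoder */
} seg_ctx_t;

typedef struct worker_t
{
  pthread_t thread;
  bool joinable;            /* true until the thread has been joined */
  atomic_bool encode_ready; /* plays the role of ENCODE_READY */
  seg_ctx_t ctx;
} worker_t;

/* Background job: publishes readiness, then keeps touching the context. */
static void *encode_and_warmup(void *arg)
{
  worker_t *w = arg;
  atomic_store(&w->encode_ready, true); /* ready is signalled early */
  for(int i = 0; i < 1000; i++) sched_yield(); /* simulated warmup work */
  w->ctx.warm_state = 1; /* still writing shared state after the signal */
  return NULL;
}

/* Join once; later calls return immediately. */
static void join_worker(worker_t *w)
{
  if(w->joinable)
  {
    pthread_join(w->thread, NULL);
    w->joinable = false;
  }
}

/* The decoder waits for the background thread before reading the context. */
static int run_decoder(worker_t *w)
{
  join_worker(w);
  return w->ctx.warm_state;
}

/* Simulates a click that arrives as soon as readiness is published. */
int demo_click_during_warmup(void)
{
  worker_t w = { .joinable = false, .ctx = { 0 } };
  atomic_init(&w.encode_ready, false);
  int rc = pthread_create(&w.thread, NULL, encode_and_warmup, &w);
  assert(rc == 0);
  w.joinable = true;
  while(!atomic_load(&w.encode_ready)) sched_yield();
  int first = run_decoder(&w);  /* may pause briefly for the warmup */
  int second = run_decoder(&w); /* thread already joined: no wait */
  return first == 1 && second == 1;
}
```

Without the `join_worker()` call, `run_decoder()` could read `warm_state` while the worker is still writing it, which is the kind of race described above. Tracking `joinable` keeps repeated decoder calls cheap once the thread has been joined.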

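The lifetime issue from item 2 can be illustrated with a small sketch. It is not the code from this PR: `tensor_out_t`, `run_model` (standing in for dt_ai_run), `compute_mask_elements`, and all shape values are hypothetical. Only the scoping of the arrays mirrors the fix.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical output descriptor: stores a pointer to the shape, not a copy. */
typedef struct tensor_out_t
{
  const int64_t *shape;
  int ndim;
} tensor_out_t;

/* Hypothetical stand-in for dt_ai_run(): reads every shape via its pointer. */
static int64_t run_model(const tensor_out_t *outputs, int n_outputs)
{
  int64_t total = 0;
  for(int i = 0; i < n_outputs; i++)
  {
    int64_t count = 1;
    for(int d = 0; d < outputs[i].ndim; d++) count *= outputs[i].shape[d];
    total += count;
  }
  return total;
}

int64_t compute_mask_elements(int is_sam)
{
  tensor_out_t outputs[2];
  int n_outputs = 0;

  /* Declared in the outer scope so they are still alive when run_model()
     dereferences the pointers stored in outputs[]. Declaring them inside
     the if-block below would end their lifetime at its closing brace and
     leave outputs[] holding dangling pointers. */
  int64_t iou_shape[2];
  int64_t low_res_shape[4];
  int64_t mask_shape[4];

  if(is_sam)
  {
    iou_shape[0] = 1;
    iou_shape[1] = 3;
    low_res_shape[0] = 1;
    low_res_shape[1] = 3;
    low_res_shape[2] = 256;
    low_res_shape[3] = 256;
    outputs[n_outputs++] = (tensor_out_t){ iou_shape, 2 };
    outputs[n_outputs++] = (tensor_out_t){ low_res_shape, 4 };
  }
  else
  {
    mask_shape[0] = 1;
    mask_shape[1] = 1;
    mask_shape[2] = 64;
    mask_shape[3] = 64;
    outputs[n_outputs++] = (tensor_out_t){ mask_shape, 4 };
  }

  assert(n_outputs <= 2);
  return run_model(outputs, n_outputs); /* all shapes are still in scope */
}
```

Hoisting the declarations only changes where the storage is named, not how much work is done, which is why the fix has no runtime cost.
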
@TurboGit added this to the 5.6 milestone Apr 15, 2026
@TurboGit added the "bugfix" (pull request fixing a bug) and "priority: high" (core features are broken and not usable at all, software crashes) labels Apr 15, 2026
Member

@TurboGit left a comment

Thanks!

@TurboGit merged commit 4d36770 into darktable-org:master Apr 15, 2026
5 checks passed
@andriiryzhkov deleted the seg_fix branch April 15, 2026 16:25