[Autotuner] Add long-lived benchmark worker pool#2289

Closed
choijon5 wants to merge 1 commit into main from choijon5/stack/46

@choijon5 choijon5 commented May 5, 2026

Stacked PRs:


Add long-lived benchmark worker pool

This splits up #2128 so that it is reviewable. It extends the idea of long-lived processes used during benchmarking (#2111) to precompilation. Each worker owns its own CUDA context and can be timed out, killed, and respawned independently, so CUDA sticky errors and hung jobs are contained in the worker instead of poisoning the autotune parent process.
Local benchmarking on an H100 (3 runs each) shows that the new mode, "pool", is faster than "fork" and on par in terms of perf.
image

Variance in perf and compile time is also slightly lower than with fork:
image

This PR adds the low-level primitive needed by later PRs to parallelize autotune precompile/benchmark work while keeping CUDA failures isolated to worker processes.
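
For readers unfamiliar with the pattern, here is a minimal sketch of a long-lived worker pool in which each worker can be timed out, killed, and respawned without affecting the parent. It is not this PR's implementation: `Worker`, `WorkerPool`, `WORKER_SOURCE`, and the JSON-over-pipes protocol are all hypothetical names and choices made for illustration. The workers are plain Python subprocesses with no CUDA involved, jobs are dispatched sequentially, and the `"hang"` and `"crash"` ops only simulate a stuck job and a fatal error. It assumes a Unix-like system, since it uses `select` on pipes.

```python
import json
import select
import subprocess
import sys

# Hypothetical worker body, run in a fresh interpreter. It reads one JSON job
# per line and answers with one JSON line. "hang" and "crash" simulate a stuck
# job and a fatal error; any other op doubles the argument.
WORKER_SOURCE = r"""
import json, os, sys, time
while True:
    line = sys.stdin.readline()
    if not line:            # parent closed the pipe
        break
    job = json.loads(line)
    if job["op"] == "hang":
        time.sleep(3600)
    elif job["op"] == "crash":
        os._exit(1)
    sys.stdout.write(json.dumps({"result": job["arg"] * 2}) + "\n")
    sys.stdout.flush()
"""


class Worker:
    """One long-lived child process that can be killed and respawned."""

    def __init__(self):
        self._spawn()

    def _spawn(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            bufsize=0,  # unbuffered pipes so select() sees replies immediately
        )

    def kill(self):
        self.proc.kill()
        self.proc.wait()
        self.proc.stdin.close()
        self.proc.stdout.close()

    def run(self, job, timeout):
        """Return the job result, or None if the worker hung or died."""
        try:
            self.proc.stdin.write((json.dumps(job) + "\n").encode())
            ready, _, _ = select.select([self.proc.stdout], [], [], timeout)
            line = self.proc.stdout.readline() if ready else b""
        except BrokenPipeError:
            line = b""
        if not line:  # timeout, EOF, or broken pipe: replace this worker only
            self.kill()
            self._spawn()
            return None
        return json.loads(line)["result"]


class WorkerPool:
    """Fixed-size pool that hands jobs to workers in round-robin order."""

    def __init__(self, size):
        self.workers = [Worker() for _ in range(size)]
        self._next = 0

    def run(self, job, timeout):
        worker = self.workers[self._next]
        self._next = (self._next + 1) % len(self.workers)
        return worker.run(job, timeout)

    def close(self):
        for worker in self.workers:
            worker.kill()
```

With a pool created as `pool = WorkerPool(2)`, a call such as `pool.run({"op": "hang", "arg": 0}, timeout=0.5)` returns `None` and replaces only the worker that hung. The parent process and the other worker keep running, which is the containment property described above.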

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 5, 2026
@choijon5 choijon5 force-pushed the choijon5/stack/46 branch from 6a4339b to 45b1709 Compare May 5, 2026 22:00
@choijon5 choijon5 force-pushed the choijon5/stack/46 branch from 45b1709 to 5103b79 Compare May 6, 2026 00:26
@choijon5 choijon5 changed the title [Autotuner] Add long-lived benchmark worker pool Add long-lived benchmark worker pool May 6, 2026
@choijon5 choijon5 force-pushed the choijon5/stack/46 branch 3 times, most recently from 023f042 to 39330c7 Compare May 6, 2026 07:03
@choijon5 choijon5 changed the title Add long-lived benchmark worker pool [Precompile] Add long-lived benchmark worker pool May 6, 2026
@choijon5 choijon5 force-pushed the choijon5/stack/46 branch from 39330c7 to 2bcd8e4 Compare May 8, 2026 02:52
@choijon5 choijon5 changed the title [Precompile] Add long-lived benchmark worker pool [Autotuner] Add long-lived benchmark worker pool May 8, 2026
stack-info: PR: #2289, branch: choijon5/stack/46
