[Autotuner] Add long-lived benchmark worker pool by choijon5 · Pull Request #2289 · pytorch/helion

choijon5 · 2026-05-05T22:00:24Z

Stacked PRs:

Add long-lived benchmark worker pool

Splitting up #2128 so that it's reviewable. This extends the idea of long-lived processes used during benchmarking (#2111) to precompilation. Each worker owns its own CUDA context and can be timed out, killed, and respawned independently, so CUDA sticky errors or hung jobs are contained to the worker instead of poisoning the autotune parent process.
Local benchmarking on an H100 (3 runs each) shows that the new mode, "pool", is faster than "fork" and on par in terms of perf.

Variance in perf and compile time is also slightly lower than with "fork".

This PR adds the low-level primitive needed by later PRs to parallelize autotune precompile/benchmark work while keeping CUDA failures isolated to worker processes.
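
The timeout, kill, and respawn behavior described above can be sketched roughly as follows. This is not the code from this PR: `IsolatedWorker`, `_WORKER_SRC`, and the JSON-over-pipes protocol are hypothetical names and choices made for illustration, the "job" is a stand-in that doubles a number rather than compiling or benchmarking a kernel, and no CUDA is involved. It only shows the general pattern of a long-lived child process that serves many jobs and is replaced when one of them hangs.

```python
import json
import queue
import subprocess
import sys
import threading

# Hypothetical worker loop: one JSON job per input line, one JSON reply per job.
# In a real autotuner the worker would own a CUDA context; here it only sleeps
# (to simulate a hang) and doubles a number.
_WORKER_SRC = r"""
import json, os, sys, time
for line in sys.stdin:
    job = json.loads(line)
    time.sleep(job.get("sleep", 0))
    print(json.dumps({"pid": os.getpid(), "result": job["x"] * 2}), flush=True)
"""


class IsolatedWorker:
    """A long-lived child process that is killed and respawned on timeout."""

    def __init__(self):
        self._spawn()

    def _spawn(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", _WORKER_SRC],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        # Each process gets its own reply queue, so a late reply from a
        # killed worker can never be mistaken for the new worker's answer.
        self.replies = queue.Queue()
        threading.Thread(
            target=self._pump, args=(self.proc, self.replies), daemon=True
        ).start()

    @staticmethod
    def _pump(proc, replies):
        # Forward reply lines until the child exits and its stdout hits EOF.
        for line in proc.stdout:
            replies.put(json.loads(line))

    def run(self, job, timeout):
        """Send one job; return the reply dict, or None if it timed out."""
        self.proc.stdin.write(json.dumps(job) + "\n")
        self.proc.stdin.flush()
        try:
            return self.replies.get(timeout=timeout)
        except queue.Empty:
            # Hung job: discard the whole process and start a fresh one.
            self.proc.kill()
            self.proc.wait()
            self._spawn()
            return None

    def close(self):
        # Closing stdin ends the worker's read loop, so it exits cleanly.
        self.proc.stdin.close()
        self.proc.wait()
```

A pool in this style would hold several such workers and hand each incoming job to an idle one. The sketch handles only timeouts; a worker that crashes outright would additionally need a check for a broken pipe or a non-`None` `proc.poll()` before writing.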

stack-info: PR: #2289, branch: choijon5/stack/46

meta-cla bot added the CLA Signed label May 5, 2026

choijon5 force-pushed the choijon5/stack/46 branch from 6a4339b to 45b1709 Compare May 5, 2026 22:00

choijon5 force-pushed the choijon5/stack/46 branch from 45b1709 to 5103b79 Compare May 6, 2026 00:26

choijon5 changed the title ~~[Autotuner] Add long-lived benchmark worker pool~~ Add long-lived benchmark worker pool May 6, 2026

choijon5 force-pushed the choijon5/stack/46 branch 3 times, most recently from 023f042 to 39330c7 Compare May 6, 2026 07:03

choijon5 changed the title ~~Add long-lived benchmark worker pool~~ [Precompile] Add long-lived benchmark worker pool May 6, 2026

choijon5 force-pushed the choijon5/stack/46 branch from 39330c7 to 2bcd8e4 Compare May 8, 2026 02:52

choijon5 changed the title ~~[Precompile] Add long-lived benchmark worker pool~~ [Autotuner] Add long-lived benchmark worker pool May 8, 2026

choijon5 mentioned this pull request May 8, 2026

[Autotuner] Add pool benchmark subprocess mode #2359

Closed

[Autotuner] Add long-lived benchmark worker pool

5a7590e

stack-info: PR: #2289, branch: choijon5/stack/46

choijon5 force-pushed the choijon5/stack/46 branch from 2bcd8e4 to 5a7590e Compare May 8, 2026 02:59

choijon5 mentioned this pull request May 8, 2026

[Autotuner] Add pool benchmark subprocess mode #2360

Closed

choijon5 closed this May 10, 2026

choijon5 wants to merge 1 commit into main from choijon5/stack/46
