Auto-spread large WebGPU compute dispatches #8696
aashu2006 wants to merge 2 commits into processing:dev-2.0 from
Conversation
davepagurek left a comment:
Thanks for looking into this, it's looking good!
```js
const isLarge1D = totalIterations > 1024 && y === 1 && z === 1;

if (exceedsLimits || isLarge1D) {
  if (totalIterations > 1000000) {
```
davepagurek commented on the diff:
Out of curiosity is there any benefit to spreading across dimensions like this for lower iteration counts too? e.g. if you're doing a big for loop inside of each iteration, with a smaller number of iterations, is there any difference?
aashu2006 replied:
Good question! Currently I only auto-spread when count > 1024 to avoid overhead for small dispatches. For lower counts with heavy per-iteration work, manual spreading might still help, but I kept it simple for now. We could test this if you think it's worth optimizing?
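The spreading that this threshold enables can be sketched roughly like this. This is an illustrative sketch, not the actual p5.js implementation: `spreadDispatch` and `maxDimension` are hypothetical names, and the real code in p5.RendererWebGPU.js may choose dimensions differently.

```javascript
// Sketch: split a large 1D dispatch count into a 2D (x, y) grid so each
// dimension stays under a per-dimension GPU limit (65535 is the WebGPU
// default maxComputeWorkgroupsPerDimension).
function spreadDispatch(totalIterations, maxDimension = 65535) {
  if (totalIterations <= maxDimension) {
    return { x: totalIterations, y: 1 };
  }
  // Pick a width near the square root so both dimensions stay small,
  // then round the height up so x * y covers every iteration. The shader
  // must then skip padding invocations whose index >= totalIterations.
  const x = Math.min(maxDimension, Math.ceil(Math.sqrt(totalIterations)));
  const y = Math.ceil(totalIterations / x);
  return { x, y };
}
```

For example, a count of 1,000,000 becomes a 1000 × 1000 grid, while small counts pass through unchanged.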
davepagurek replied:
I think it's worth testing at least to know what kind of difference it makes, and similarly if it's better to spread across 3 dimensions earlier too. A sort of table of performance tests would help us just be a bit more confident about our optimizations.
aashu2006 replied:
I can run some quick tests comparing different spreading approaches across small, medium, and large counts. I'll check 1D, 2D (square/rectangular), and 3D, and share a simple performance table with the results. Should be interesting to see where things start to slow down.
Let me know if there's anything specific you'd like me to test, or if you want to try something on your machine as well 👍
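The comparison matrix described above could be enumerated with a small helper like this. It is a sketch under my own assumptions (`dispatchShapes` is a hypothetical name); actual timings would have to come from running the compute shader with each shape.

```javascript
// For a given iteration count, produce candidate 1D, square-2D, and cube-3D
// dispatch shapes whose product covers the count (padding is skipped in the
// shader). These are the shapes a performance table would compare.
function dispatchShapes(count) {
  const side2 = Math.ceil(Math.sqrt(count));
  const side3 = Math.ceil(Math.cbrt(count));
  return {
    '1D': [count, 1, 1],
    '2D': [side2, Math.ceil(count / side2), 1],
    '3D': [side3, side3, Math.ceil(count / (side3 * side3))]
  };
}

for (const count of [1024, 65536, 1000000]) {
  console.log(count, dispatchShapes(count));
}
```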
```diff
  nodeType: NodeType.STATEMENT,
  statementType: StatementType.EARLY_RETURN,
- dependsOn: [valueNode.id]
+ dependsOn: value !== undefined ? [valueNode.id] : []
```
davepagurek commented on the diff:
Mind elaborating on what these changes are there to handle? Anything we should have more test cases for in the tests?
aashu2006 replied:
These changes fix handling of void return types in compute shaders. Without them, a bare `return;` in a compute hook would crash with "Missing dataType". Most compute shaders return void (they only have side effects), so auto-spread wouldn't work without this fix.
For tests: should I add cases for void hooks with early returns? The main compute functionality already has test coverage.
davepagurek replied:
ah, got it. Right, let's add a test for early returns, since this wasn't a case covered by any tests before. Thanks!
aashu2006 replied:
I've added the test cases for void compute hooks with early returns. Both tests are passing.
Thanks!
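The dependency fix discussed in this thread can be illustrated in isolation. This is a minimal sketch: `makeEarlyReturn` is a hypothetical helper, and the `NodeType`/`StatementType` stubs only mirror the names in the diff, not the real p5.js strands internals.

```javascript
// Stubs mirroring the enum names from the diff (illustrative only).
const NodeType = { STATEMENT: 'statement' };
const StatementType = { EARLY_RETURN: 'earlyReturn' };

// A bare `return;` in a void compute hook has no value node, so the
// statement must depend on nothing rather than reference an undefined
// node id (which previously crashed with "Missing dataType").
function makeEarlyReturn(value, valueNode) {
  return {
    nodeType: NodeType.STATEMENT,
    statementType: StatementType.EARLY_RETURN,
    dependsOn: value !== undefined ? [valueNode.id] : []
  };
}
```

With a value, the statement depends on the value node; without one, the dependency list is simply empty.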
Continuous Release: CDN link, Published Packages. Commit hash: fad5960. (This is an automated message.)
Resolves #8690
Changes
Auto-spreads large compute dispatches across 2D/3D workgroup grids to stay within GPU dispatch limits.
Problem
compute(shader, 1000000) would fail because the count exceeds the GPU's per-dimension dispatch limit, and users had to manually spread the work and reconstruct indices themselves.
Solution
Modified files
- p5.RendererWebGPU.js: spreading logic
- compute.js: index reconstruction

Testing
Tested with 1M+ particle simulations:
- Before the fix, compute(shader, 1000000): black screen (failed)
- After the fix, compute(shader, 1000000): 60 fps
- After the fix, compute(shader, 1000003): 48 fps

Screenshots
PR Checklist
npm run lint passes
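The index reconstruction listed under Modified files amounts to flattening the spread (x, y, z) invocation id back into the original 1D iteration index. A sketch of the formula, with hypothetical names (`linearIndex`, `dims`), not the actual compute.js code:

```javascript
// Recover the original 1D iteration index from a spread (x, y, z)
// invocation id and the dispatch grid dimensions. Invocations whose
// index lands past the requested count are padding and must be skipped.
function linearIndex(id, dims) {
  return id.x + id.y * dims.x + id.z * dims.x * dims.y;
}

const dims = { x: 1000, y: 1000, z: 1 };
console.log(linearIndex({ x: 3, y: 2, z: 0 }, dims)); // 2003
```

In the shader this is the same row-major flattening applied to the global invocation id, followed by an early return for padding invocations.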