Commit 8e281fe
ggml-webgpu: compute pass batching and removing profiling overhead (ggml-org#21873)
* Update register tiling matmul to use f32 accumulation
* fix profiling code
* Fix register tiling matmul for Chrome, I'm blaming Dawn
* Update batch tuning value for iOS
* compile fix
* Fix use of new load function
* Move to a single query set for GPU profiling
* Move to batching compute passes when not profiling
* Refactor build_multi
* remove iOS throttling now that we're batching compute passes
1 parent 6c8449d commit 8e281fe
1 file changed
Lines changed: 348 additions & 451 deletions