Commit b6e1790
Fix GPU broadcast for VectorMPI .* MatrixMPI; remove CPU fallback
- Add _prepare_broadcast_arg for MatrixMPI to extract the underlying .A
  matrix, enabling GPU broadcasts without a non-bitstype wrapper
- Remove n < 256 CPU fallback in Metal extension for unified GPU path
- Fix scalar indexing in _map_rows_gpu_kernel (copy row to CPU first)
Bump version to 0.1.91
Parent: 82c4a9f
3 files changed
Lines changed: 11 additions & 13 deletions
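The first bullet of the commit message describes unwrapping a distributed matrix to its underlying `.A` array before broadcasting, so that the GPU kernel receives only plain array arguments. The diff body is not preserved here, so what follows is only a minimal CPU sketch of that unwrap-then-broadcast pattern. `ToyVector`, `ToyMatrix`, `prepare_arg`, and `toy_broadcast` are hypothetical names chosen for illustration; they are not the package's `VectorMPI`, `MatrixMPI`, or `_prepare_broadcast_arg`, whose actual definitions may differ.

```julia
# Hypothetical stand-ins for the distributed wrappers. The real VectorMPI and
# MatrixMPI types presumably carry more state (partitioning, communicator, ...).
struct ToyVector{T}
    v::Vector{T}   # local entries
end

struct ToyMatrix{T}
    A::Matrix{T}   # local block, mirroring the `.A` field named in the commit message
end

# Unwrap a wrapper to the plain array it holds; pass any other argument through.
prepare_arg(x) = x
prepare_arg(x::ToyVector) = x.v
prepare_arg(x::ToyMatrix) = x.A

# Broadcast over the unwrapped arguments, so the broadcast machinery never
# sees the wrapper structs themselves.
toy_broadcast(f, args...) = broadcast(f, map(prepare_arg, args)...)

v = ToyVector([1.0, 2.0])
M = ToyMatrix([1.0 2.0; 3.0 4.0])
result = toy_broadcast(*, v, M)   # equivalent to v.v .* M.A
```

On a GPU backend the same idea matters because kernel arguments generally need to be bitstype-compatible, which a wrapper struct holding host-side metadata typically is not.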
File 1 of 3 (file name and diff content not preserved), lines 1-6 shown: original line 3 replaced by new line 3 (1 addition, 1 deletion).
File 2 of 3 (file name and diff content not preserved), original lines 228-246 shown: original lines 231-241 replaced by new lines 231-233, and original line 243 replaced by new lines 235-236 (5 additions, 12 deletions).
File 3 of 3 (file name and diff content not preserved), original lines 1841-1843 shown as context: new lines 1844-1848 added after line 1843 (5 additions).