fix: Add stream parameter to cgemm_nvfp4 for CUDA graph support #2335
| Job | Run time |
|---|---|
| | 1m 47s |
| | 15s |
| | 15s |
| | 12s |
| | 45s |
| | 5m 55s |
| | 3m 2s |
| | 3m 15s |
| | 3m 20s |
| | 3m 14s |
| | 3m 0s |
| | 3m 14s |
| | 4m 23s |
| | 4m 24s |
| | 4m 50s |
| | 4m 16s |
| | 4m 32s |
| | 4m 3s |
| | 4m 24s |
| | 4m 33s |
| | 4m 9s |
| | 3m 48s |
| | 4m 23s |
| | 4m 36s |
| | 3m 24s |
| | 3m 52s |
| | 3m 40s |
| | 3m 25s |
| | 5m 59s |
| | 5m 33s |
| | 3m 42s |
| | 5m 48s |
| | 3m 21s |
| | 5m 14s |
| | 4m 34s |
| | 3m 25s |
| | 4m 51s |
| | 4m 58s |
| | 6m 5s |
| | 6m 3s |
| | 5m 12s |
| | 6m 6s |
| | 6m 3s |
| | 6m 5s |
| | 5m 47s |
| | 1s |
| | 1s |
| | 1s |
| | 1s |
| Total | 3h 3m 46s |