Add GPU-side Gumbel-max sampling for CUDA graph compatibility #12862
| Job | Run time |
|---|---|
| | 41m 1s |
| | 43m 41s |
| | 21m 12s |
| | 15m 52s |
| | 16m 41s |
| | 17m 39s |
| | 1h 5m 39s |
| | 22m 2s |
| | 30m 53s |
| | 32m 54s |
| | 31m 0s |
| | 31m 52s |
| | 21m 10s |
| | 22m 36s |
| | 17m 57s |
| | 34m 9s |
| | 1h 31m 5s |
| | 22m 55s |
| | 21m 54s |
| | 32m 49s |
| | 42m 15s |
| | 1h 30m 20s |
| | 3s |
| | 0s |
| | 0s |
| Total | 12h 47m 39s |