Add GradientAccumulation utility for SupervisedTrainer (#8763) #18482
| Job | Run time |
|---|---|
| | 6m 41s |
| | 4m 59s |
| | 14m 14s |
| | 5m 23s |
| | 7m 7s |
| | 8m 53s |
| | 5m 19s |
| | 5m 20s |
| | 6m 39s |
| | 7m 1s |
| | 4m 50s |
| Total | 1h 16m 26s |