In the current implementation of ODM, only batch sizes that are a multiple of num_processes can be used, because the accelerate dataloader is created with split_batches enabled.
Choosing such a batch size might not always be possible, so we should figure out a way to remove this restriction.
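
To make the restriction concrete, here is a minimal sketch in plain Python. It does not use accelerate's own code, and the helper names (`is_valid_for_split_batches`, `per_process_batch_sizes`) are hypothetical, introduced only for illustration. The first helper mirrors the divisibility condition described above; the second shows one possible way a global batch could be divided unevenly across processes, as an assumption about what a relaxed scheme might look like rather than a proposed implementation.

```python
from typing import List


def is_valid_for_split_batches(batch_size: int, num_processes: int) -> bool:
    """Hypothetical check: the global batch must divide evenly across processes."""
    return batch_size % num_processes == 0


def per_process_batch_sizes(batch_size: int, num_processes: int) -> List[int]:
    """Hypothetical helper: split a global batch as evenly as possible.

    The first `batch_size % num_processes` ranks receive one extra sample,
    so the shard sizes always sum to `batch_size`.
    """
    if batch_size < 0 or num_processes < 1:
        raise ValueError("batch_size must be >= 0 and num_processes must be >= 1")
    base, remainder = divmod(batch_size, num_processes)
    return [base + (1 if rank < remainder else 0) for rank in range(num_processes)]


if __name__ == "__main__":
    # A batch size of 10 on 4 processes fails the divisibility condition.
    print(is_valid_for_split_batches(10, 4))
    # An uneven split would still cover all 10 samples.
    print(per_process_batch_sizes(10, 4))
```

Note that uneven per-process batches may interact with gradient averaging and collective operations, so any such change would need to be checked against how ODM aggregates results across processes.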