In the current implementation of ODM, the batch size must be a multiple of num_processes, because the accelerate dataloader is used with split_batches, which divides each batch evenly across processes.
This is not always possible, so we should find a way to remove this restriction.
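
To make the restriction concrete, here is a minimal sketch in plain Python. It does not call accelerate; both function names are hypothetical and only illustrate the arithmetic. The first function mirrors the divisibility condition described above. The second shows one possible way a batch could be divided when the condition does not hold, by giving the first few ranks one extra sample. Whether uneven per-process batches would work with the rest of the ODM training loop is an open assumption, not something this sketch establishes.

```python
from typing import List


def is_valid_split_batch_size(batch_size: int, num_processes: int) -> bool:
    """Hypothetical helper: the condition the current setup requires,
    i.e. the batch divides evenly across all processes."""
    return batch_size % num_processes == 0


def per_process_batch_sizes(batch_size: int, num_processes: int) -> List[int]:
    """Hypothetical helper: split a batch as evenly as possible.

    The first `remainder` ranks receive one extra sample, so the sizes
    differ by at most one and always sum to `batch_size`.
    """
    base, remainder = divmod(batch_size, num_processes)
    return [base + 1 if rank < remainder else base for rank in range(num_processes)]


if __name__ == "__main__":
    # A batch of 8 on 4 processes satisfies the current restriction.
    print(is_valid_split_batch_size(8, 4))
    # A batch of 10 on 4 processes does not, but could be split unevenly.
    print(is_valid_split_batch_size(10, 4))
    print(per_process_batch_sizes(10, 4))
```

An uneven split like this is only one option; padding the batch up to the next multiple of num_processes, or dropping the remainder, are alternatives with different trade-offs.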