Describe the bug
The function that prepares data for DQ (get_tokenized_data in fms_mo/utils/calib_data.py) should be able to provide data processed from different datasets: wiki, c4, ptb, etc.
While the wiki option is functional and well integrated with DQ, the types of objects returned for the other datasets are not consistent with it, which can cause DQ and/or eval to fail.
Expected behavior
Consistent return type of get_tokenized_data across all selected datasets.
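
To make the expected behavior concrete, here is a rough sketch of the kind of check that could flag the inconsistency. It does not call get_tokenized_data itself: the stub loaders and their return shapes are made up for illustration, and describe_type and find_inconsistent are hypothetical helpers, not part of fms_mo.

```python
from typing import Any, Callable, Dict


def describe_type(obj: Any) -> str:
    """Return a short structural description of obj (hypothetical helper)."""
    if isinstance(obj, dict):
        keys = ", ".join(sorted(str(k) for k in obj))
        return f"dict[{keys}]"
    if isinstance(obj, (list, tuple)):
        # Only the first element is inspected; enough for a quick sanity check.
        inner = describe_type(obj[0]) if obj else "empty"
        return f"{type(obj).__name__}[{inner}]"
    return type(obj).__name__


def find_inconsistent(
    loaders: Dict[str, Callable[[], Any]], reference: str
) -> Dict[str, str]:
    """Map each dataset whose return structure differs from the reference."""
    expected = describe_type(loaders[reference]())
    mismatches = {}
    for name, load in loaders.items():
        found = describe_type(load())
        if found != expected:
            mismatches[name] = found
    return mismatches


# Stand-in loaders. In practice each would wrap a call to get_tokenized_data
# for one dataset; the return shapes below are assumptions, not observed output.
stub_loaders = {
    "wiki": lambda: [{"input_ids": [1, 2, 3]}],
    "c4": lambda: [{"input_ids": [4, 5, 6]}],
    "ptb": lambda: ([7, 8, 9], [7, 8, 9]),
}

print(find_inconsistent(stub_loaders, reference="wiki"))
# {'ptb': 'tuple[list[int]]'}
```

With the fix in place, a check like this run against the real datasets, using wiki as the reference, would be expected to return an empty dict.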