NotImplementedError for iterator in IterableData class while debugging CodonTransformer finetuning #17

@Cauwth

Description

I am trying to implement a subclass of IterableData that iterates over a JSON file in order to finetune the model, but I am encountering an error. The IterableData class has an abstract method, iterator, that is supposed to be implemented in subclasses. However, I am unsure how to correctly implement the iterator method in my IterableJSONData class.

I am not running this under SLURM. The dataset is created like this:

train_data = IterableJSONData(args.dataset_dir)

and the error is:

Exception has occurred: NotImplementedError
Caught NotImplementedError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/tianhao/miniconda3/envs/CodonTransformer/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 291, in _worker_loop
    fetcher = _DatasetKind.create_fetcher(
  File "/home/tianhao/miniconda3/envs/CodonTransformer/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 81, in create_fetcher
    return _utils.fetch._IterableDatasetFetcher(
  File "/home/tianhao/miniconda3/envs/CodonTransformer/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 22, in __init__
    self.dataset_iter = iter(dataset)
  File "/data/wth/plant_protein/CodonTransformer/CodonTransformer/CodonUtils.py", line 541, in __iter__
    return itertools.islice(self.iterator, worker_rk, None, worker_nb)
  File "/data/wth/plant_protein/CodonTransformer/CodonTransformer/CodonUtils.py", line 517, in iterator
    raise NotImplementedError
NotImplementedError

How should I implement the iterator method in the IterableJSONData subclass to properly read the JSON file line by line and handle multi-processing environments?
I have tried adding code like this to the IterableJSONData class:
[image: screenshot of the attempted code]
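
For reference, here is a minimal sketch of what such an override might look like. It is not the code from the screenshot and not the CodonTransformer implementation. The traceback shows `self.iterator` being passed to `itertools.islice` without a call, which suggests `iterator` is a property, so the sketch overrides it as one. The `IterableData` class below is a hypothetical stand-in that only mimics the two lines visible in the traceback (the real class gets `worker_rk` and `worker_nb` from the DataLoader worker), and the one-JSON-object-per-line file layout and the `idx` field are assumptions.

```python
import itertools
import json
import os
import tempfile


class IterableData:
    """Hypothetical stand-in for the base class in CodonUtils.py."""

    @property
    def iterator(self):
        # Mirrors the abstract property that raises in the traceback.
        raise NotImplementedError

    def __iter__(self):
        # Assumption: single worker. The real class derives these two
        # values from the DataLoader worker that is iterating.
        worker_rk, worker_nb = 0, 1
        # Each worker keeps every worker_nb-th record, starting at worker_rk.
        return itertools.islice(self.iterator, worker_rk, None, worker_nb)


class IterableJSONData(IterableData):
    """Streams records from a file holding one JSON object per line."""

    def __init__(self, data_path):
        self.data_path = data_path

    @property
    def iterator(self):
        # Return a fresh generator on every access so each epoch
        # (and each worker) starts reading from the top of the file.
        return self._read_records()

    def _read_records(self):
        with open(self.data_path, "r", encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if line:  # skip blank lines
                    yield json.loads(line)


# Small demonstration on a throwaway file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "train.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"idx": 0}\n\n{"idx": 1}\n')
    records = list(IterableJSONData(path))

print(records)
```

Because the slicing in `__iter__` already gives each worker a disjoint subset of lines, the override itself does not need any multi-processing logic in this sketch.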

but then got a different error:

Exception has occurred: ValueError
Expected positive integer total_steps, but got -1
  File "/data/wth/plant_protein/CodonTransformer/finetune.py", line 87, in configure_optimizers
    "scheduler": torch.optim.lr_scheduler.OneCycleLR(
  File "/data/wth/plant_protein/CodonTransformer/finetune.py", line 167, in main
    trainer.fit(harnessed_model, data_loader)
  File "/data/wth/plant_protein/CodonTransformer/finetune.py", line 231, in <module>
    main(args)
ValueError: Expected positive integer total_steps, but got -1
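
`OneCycleLR` needs a positive `total_steps`. The -1 suggests the step count could not be inferred, which can happen when the dataset is a stream with no defined length. This is an assumption, since the arguments passed at line 87 of finetune.py are not shown. One possible workaround is to compute the step count explicitly and pass it to the scheduler. The sketch below shows only the arithmetic. The function name, its parameters, and the example numbers are all hypothetical.

```python
import math


def estimate_total_steps(num_records, batch_size, max_epochs,
                         accumulate_grad_batches=1):
    """Rough count of optimizer steps for a dataset of known size.

    All parameter names are illustrative; they are not taken from finetune.py.
    """
    # Number of batches in one pass over the data (last batch may be partial).
    batches_per_epoch = math.ceil(num_records / batch_size)
    # With gradient accumulation, several batches make up one optimizer step.
    steps_per_epoch = math.ceil(batches_per_epoch / accumulate_grad_batches)
    return steps_per_epoch * max_epochs


# Made-up numbers, for illustration only.
total_steps = estimate_total_steps(num_records=1000, batch_size=6, max_epochs=15)
print(total_steps)
```

With several DataLoader workers, each worker batches its own share of the records, so the real number of batches can be slightly higher than this estimate. Since `OneCycleLR` raises an error when stepped past `total_steps`, rounding the estimate up is the safer direction.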

Labels: bug (Something isn't working)