WeatherBench2 indexes #131
The WeatherBench2 dataset is now used to ease the creation of index classes for each dataset of the WeatherBench2 project. I used `__init_subclass__` to hold the verification code that ensures derived classes define the right class attributes, both to set up variable and level validation in the constructor and to provide the dataset URL. ERA5 is provided as the first example.
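For readers unfamiliar with the pattern, here is a minimal sketch of how `__init_subclass__` can enforce required class attributes. The class names, attribute names (`url`, `variables`, `levels`), and the placeholder URL are assumptions made for illustration and are not taken from the actual code in this PR.

```python
class WeatherBench2Base:
    """Hypothetical parent class; not the actual implementation in this PR."""

    # Class attributes every concrete subclass is expected to define (assumed names).
    _required_attrs = ("url", "variables", "levels")

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once, when the subclass is defined, not when it is instantiated.
        missing = [name for name in cls._required_attrs if not hasattr(cls, name)]
        if missing:
            raise TypeError(
                f"{cls.__name__} is missing class attributes: {', '.join(missing)}"
            )

    def __init__(self, variables, levels=None):
        # Validate the request against the attributes declared by the subclass.
        unknown = sorted(set(variables) - set(self.variables))
        if unknown:
            raise ValueError(f"Unknown variables: {unknown}")
        bad_levels = sorted(set(levels or ()) - set(self.levels))
        if bad_levels:
            raise ValueError(f"Unknown levels: {bad_levels}")
        self.selected_variables = list(variables)
        self.selected_levels = list(levels or self.levels)


class ERA5Example(WeatherBench2Base):
    url = "gs://example-bucket/era5.zarr"  # placeholder, not a real dataset URL
    variables = ("2m_temperature", "geopotential")
    levels = (500, 850)
```

With this layout, forgetting `levels` on a new subclass fails at class definition time rather than on first use.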
This solution is not viable, as it still requires opening the online dataset first. But it is a first draft to improve on.
It is practical for logs to print in a notebook by default, rather than requiring the user to set up a stream handler for the logger in order to see messages about data downloads.
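For context, the manual setup that a default handler makes unnecessary looks roughly like the sketch below. It uses only the standard `logging` module. The helper name `enable_notebook_logging` and the logger name `pyearthtools.data.download` are assumptions for illustration, not names confirmed by this PR.

```python
import logging
import sys


def enable_notebook_logging(logger_name="pyearthtools.data.download", level=logging.INFO):
    """Attach a stdout handler to a logger so its messages show up in a notebook.

    Hypothetical helper; the default logger name is an assumption.
    """
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    # Skip if a stream handler (or a subclass of it) is already attached,
    # so repeated notebook cell runs do not duplicate every message.
    if not any(isinstance(h, logging.StreamHandler) for h in logger.handlers):
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger
```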
|
This seems to be coming along well. I wonder if this is around the right point in the development to merge in the current functionality and then proceed to the next related features. Your thoughts? |
I'd like to convert at least one notebook to use one of these accessors. But if I don't manage to do it today (before catch-up), then yes, please feel free to merge if the PR looks good to go for you :). |
@tennlee this PR is now ready. So for now I'll leave the following for future PRs
I have pointed the Read the Docs documentation of my fork at this branch, in case you want to see how the updated parts render. |
Looks great, thanks for the work. Please do keep going on the future PR work to fully round this out, but I'm happy to merge this increment as a useful step on the journey. |
This PR provides new data accessors for the WeatherBench2 datasets (see complete list).
There is one class per set of datasets (e.g. one for ERA5, one for ERA5 climatology, etc.). These classes do not use the decorator `decorators.check_arguments` on the `__init__` constructor to validate the dataset variable names and levels, as these change for each dataset, even within the same set (e.g. the raw ERA5 dataset does not have the same levels as the down-sampled ones). Instead, I directly used the function `spellcheck.check_prompt` (the same one used in `decorators.check_arguments`) in the parent class `WeatherBench2`, combined with an automatically generated dictionary containing the information (variable names and levels) for all datasets. This dictionary is created using the function `pyearthtools.data.download.weatherbench.create_dataset_mapping` and saved in the private module `pyearthtools.data.download._weatherbench`.
To complete this pull-request, the following elements need to be added:
@tennlee please let me know if you would be OK with me implementing the first 2 points to complete this PR (doc and test) and adding datasets and caching in a subsequent PR.
Also, would it be OK to change some of the example notebooks to use these online datasets?
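To make the per-dataset validation described above more concrete, here is a rough sketch of checking a request against a generated mapping. Everything in it is illustrative: the dataset keys, the mapping contents, and `validate_selection` are hypothetical, and the standard-library `difflib.get_close_matches` stands in for `spellcheck.check_prompt`, whose signature is not shown in this PR description.

```python
from difflib import get_close_matches

# Hypothetical excerpt of a generated mapping: dataset name -> valid variables and levels.
DATASET_MAPPING = {
    "era5_raw": {
        "variables": ["2m_temperature", "geopotential", "temperature"],
        "levels": [50, 500, 850, 1000],
    },
    "era5_downsampled": {
        "variables": ["2m_temperature", "geopotential", "temperature"],
        "levels": [500, 850],
    },
}


def validate_selection(dataset, variables, levels=()):
    """Raise ValueError if a variable or level is not valid for the given dataset."""
    info = DATASET_MAPPING[dataset]
    for var in variables:
        if var not in info["variables"]:
            # Suggest close spellings, as a stand-in for the project's spellcheck helper.
            hints = get_close_matches(var, info["variables"], n=3)
            suggestion = f" Did you mean {hints}?" if hints else ""
            raise ValueError(f"Unknown variable {var!r} for {dataset}.{suggestion}")
    for level in levels:
        if level not in info["levels"]:
            raise ValueError(
                f"Unknown level {level!r} for {dataset}; valid levels: {info['levels']}"
            )
```

Keeping the mapping keyed by dataset is what lets the same level be accepted for one dataset and rejected for another in the same set.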