WeatherBench2 indexes #131

Merged
tennlee merged 28 commits into ACCESS-Community-Hub:develop from jennan:weatherbench on Jul 8, 2025

Conversation

jennan (Collaborator) commented Jun 23, 2025

This PR provides new data accessors for the WeatherBench2 datasets (see complete list).

There is one class per set of datasets (e.g. one for ERA5, one for ERA5 climatology, etc.). These classes do not use the decorators.check_arguments decorator on the __init__ constructor to validate the dataset variable names and levels, because these differ between datasets, even within the same set (e.g. the raw ERA5 dataset does not have the same levels as the down-sampled ones). Instead, I call the function spellcheck.check_prompt directly (the same one used in decorators.check_arguments) in the parent class WeatherBench2, combined with an automatically generated dictionary containing the information (variable names and levels) for all datasets. This dictionary is created using the function pyearthtools.data.download.weatherbench.create_dataset_mapping and saved in the private module pyearthtools.data.download._weatherbench.
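To illustrate the idea of validating against a per-dataset dictionary, here is a minimal standalone sketch. It is not the code from this PR: `DATASET_INFO`, `validate_choice` and `DatasetAccessor` are hypothetical names, the dataset entries are made up, and `difflib.get_close_matches` from the standard library stands in for spellcheck.check_prompt, whose exact signature is not shown in this conversation.

```python
from difflib import get_close_matches

# Hypothetical mapping: dataset name -> valid variable names and levels.
# In the PR this information is generated automatically; the entries
# below are invented purely for illustration.
DATASET_INFO = {
    "era5_raw": {
        "variables": ["temperature", "geopotential", "u_component_of_wind"],
        "levels": [50, 500, 850, 1000],
    },
    "era5_downsampled": {
        "variables": ["temperature", "geopotential"],
        "levels": [500, 850],
    },
}


def validate_choice(value, valid, kind):
    """Return value if it is valid, else raise ValueError with a suggestion."""
    if value in valid:
        return value
    # Stand-in for the library's spellcheck helper (assumption).
    close = get_close_matches(str(value), [str(v) for v in valid], n=1)
    hint = f" Did you mean {close[0]!r}?" if close else ""
    raise ValueError(f"Unknown {kind} {value!r}.{hint}")


class DatasetAccessor:
    """Hypothetical accessor that validates its arguments per dataset."""

    def __init__(self, dataset, variables, levels=None):
        info = DATASET_INFO[dataset]
        self.dataset = dataset
        self.variables = [
            validate_choice(v, info["variables"], "variable") for v in variables
        ]
        self.levels = [
            validate_choice(lev, info["levels"], "level") for lev in (levels or [])
        ]
```

With this sketch, `DatasetAccessor("era5_raw", ["temperature"], [500])` is accepted, while asking the down-sampled dataset for a misspelled variable or for level 1000 raises a `ValueError`, because each dataset is checked against its own entry in the dictionary.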

To complete this pull-request, the following elements need to be added:

  • add documentation to the module
  • add tests to the module
  • add all datasets
  • add a download mechanism to the input data
  • ensure the license is presented to users on their first use of the dataset

@tennlee please let me know if you would be OK with me implementing the first 2 points (docs and tests) to complete this PR, and adding the datasets and caching in a subsequent PR.

Also, would it be OK to change some of the example notebooks to use these online datasets?

jennan added 9 commits May 28, 2025 23:50
The WeatherBench2 dataset is now used to ease creation of index classes for
each dataset of the WeatherBench2 project. I used __init_subclass__ to encode
the verification code to ensure derived classes have the right class attributes
to set up variables and levels validation in the constructor, and provide the
dataset URL.

ERA5 is provided as the first example.
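
The `__init_subclass__` approach described in this commit message can be sketched as follows. This is an illustrative standalone example under stated assumptions, not the PR's implementation: the class names, the attribute names (`dataset_url`, `valid_variables`, `valid_levels`) and the URL are all hypothetical placeholders.

```python
class BaseIndex:
    """Hypothetical parent class that checks subclasses at definition time."""

    # Class attributes every concrete subclass must define (assumed names).
    _required = ("dataset_url", "valid_variables", "valid_levels")

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        missing = [name for name in cls._required if not hasattr(cls, name)]
        if missing:
            raise TypeError(
                f"{cls.__name__} must define class attributes: {', '.join(missing)}"
            )

    def __init__(self, variables, levels=None):
        # The constructor can rely on the class attributes being present.
        unknown_vars = sorted(set(variables) - set(self.valid_variables))
        unknown_levels = sorted(set(levels or []) - set(self.valid_levels))
        if unknown_vars or unknown_levels:
            raise ValueError(
                f"Unknown variables {unknown_vars} or levels {unknown_levels}"
            )
        self.variables = list(variables)
        self.levels = list(levels or [])


class ExampleERA5(BaseIndex):
    # Placeholder values, not the real WeatherBench2 location or contents.
    dataset_url = "gs://example-bucket/era5.zarr"
    valid_variables = ("temperature", "geopotential")
    valid_levels = (500, 850)
```

Because the check runs when the subclass is defined, a derived class that forgets one of the required attributes fails with a `TypeError` at import time rather than on first use.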
jennan self-assigned this Jun 23, 2025
jennan requested a review from tennlee June 23, 2025 22:36
coveralls commented Jun 23, 2025

Pull Request Test Coverage Report for Build 16134469897

Details

  • 43 of 157 (27.39%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.4%) to 60.706%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| packages/data/src/pyearthtools/data/download/weatherbench.py | 42 | 156 | 26.92% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 16114619003 | -0.4% |
| Covered Lines | 10084 |
| Relevant Lines | 16134 |

💛 - Coveralls

tennlee (Collaborator) commented Jul 7, 2025

This seems to be coming along well. I wonder if this is around the right point in the development to merge in the current functionality and then proceed to the next related features. Your thoughts?

jennan (Collaborator, Author) commented Jul 7, 2025

> This seems to be coming along well. I wonder if this is around the right point in the development to merge in the current functionality and then proceed to the next related features. Your thoughts?

I'd like to convert at least one notebook to use one of these accessors. But if I don't manage to do it today (before the catch-up), then yes, please feel free to merge if the PR looks good to you :).

jennan changed the title from "[Draft] WeatherBench2 indexes" to "WeatherBench2 indexes" on Jul 8, 2025

jennan (Collaborator, Author) commented Jul 8, 2025

@tennlee this PR is now ready. For now, I'll leave the following for future PRs:

  • license validation
  • unit tests
  • more datasets

I have pointed my fork's Read the Docs documentation at this branch, in case you want to see how the updated parts render.

tennlee (Collaborator) commented Jul 8, 2025

Looks great, thanks for the work. Please do keep going on the future PR work to fully round this out, but I'm happy to merge this increment as a useful step on the journey.

tennlee merged commit 219ed08 into ACCESS-Community-Hub:develop on Jul 8, 2025
7 checks passed
jennan deleted the weatherbench branch on July 29, 2025 01:34

3 participants