Create met office site archive #133

Merged
tennlee merged 23 commits into ACCESS-Community-Hub:develop from JMP-MO:create-met-office-site-archive
Jul 15, 2025

Conversation

@JMP-MO
Collaborator

@JMP-MO JMP-MO commented Jun 26, 2025

Overview

Create a Met Office site archive module that can be used to access data on disk at the Met Office.
A config-file approach keeps the paths to dataset locations out of the PyEarthTools repo. Met Office staff will need to add the config file to their own home directory for PyEarthTools (PET) to register the locations.

Changes

New Met Office Site Archive files:

  • init - creates the Met Office site archive.
  • Dataset accessor files: MOGlobal, MO UKV, ERA5 low res.
  • Structure, ancillary, utility, gitignore, manifest and pyproject.toml files, similar to those in the NCI archive.
  • New tutorial notebook demonstrating how to use the Met Office site archive.

Changes / updates to existing files:

  • data/archive/extensions - added methods to get root directories, set root directories and load root directories from config files, so that these methods can be reused by future archives and data accessors.
  • data/archive/init - updated to add the new methods.
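
As a rough illustration of the get / set / load pattern described above, here is a minimal sketch. It is not the code from this PR: the function names, the module-level registry and the JSON file format are all assumptions made for the example, and the real methods in data/archive/extensions may look quite different.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical in-memory registry mapping a dataset name to its root directory.
# The real PyEarthTools extension methods may store this differently.
ROOT_DIRECTORIES: dict[str, str] = {}


def set_root_directory(name, path):
    """Register (or override) the root directory for one dataset."""
    ROOT_DIRECTORIES[name] = str(path)


def get_root_directory(name):
    """Return the registered root directory, or raise a descriptive KeyError."""
    try:
        return ROOT_DIRECTORIES[name]
    except KeyError:
        raise KeyError(f"No root directory registered for {name!r}") from None


def load_root_directories(config_path):
    """Load name -> path entries from a config file.

    ASSUMPTION: the file is JSON. The PR does not state the real format.
    Returns False if the file does not exist, so callers can fall back
    to setting paths manually.
    """
    config_path = Path(config_path)
    if not config_path.exists():
        return False
    with config_path.open() as f:
        entries = json.load(f)
    for name, path in entries.items():
        set_root_directory(name, path)
    return True


# Demonstration using a temporary file instead of a real home directory.
with tempfile.TemporaryDirectory() as tmp:
    cfg = Path(tmp) / ".pyearthtoolsconfig"
    cfg.write_text(json.dumps({"moglobal": "/example/moglobal"}))
    loaded = load_root_directories(cfg)

# Paths can also be added by hand when no config file is available.
set_root_directory("era5", "/example/era5")
```

Keeping the registry separate from the accessor classes is what would let other site archives reuse the same three functions with their own config file.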

Testing

Add the .pyearthtoolsconfig file (not sure where to host this yet) to your home directory, then try running the "Using the Met Office site archive" tutorial notebook. You can also use the .set_directory() method, as detailed in the notebook, to add paths manually.

Considerations

It is evident there will be duplication between different site archives. Is it worth considering a 'Universal' site archive that hosts common open-source datasets such as ERA5, to prevent duplication?
This is just a start to the Met Office archive; it will require development and improvement, but it does work with some demo data for now. Although many files are listed as changed, most are new files in the new Met Office archive, with few changes to existing PyEarthTools files.

Closes #128

@JMP-MO JMP-MO requested a review from tennlee June 26, 2025 15:29
@coveralls

coveralls commented Jun 26, 2025

Pull Request Test Coverage Report for Build 16120407929

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 8 of 23 (34.78%) changed or added relevant lines in 4 files are covered.
  • 109 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.08%) to 61.0%
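
For reference, the changed-lines percentage above is simply covered lines divided by changed or added relevant lines. A small sketch of that arithmetic follows; the helper name is illustrative and is not part of Coveralls.

```python
def coverage_percent(covered, relevant):
    """Return line coverage as a percentage rounded to two decimals."""
    if relevant <= 0:
        raise ValueError("relevant must be a positive number of lines")
    return round(100.0 * covered / relevant, 2)


# 8 of 23 changed or added relevant lines are covered.
changed_lines_coverage = coverage_percent(8, 23)
print(f"{changed_lines_coverage}%")
```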

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| packages/data/src/pyearthtools/data/archive/extensions.py | 5 | 20 | 25.0% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| packages/nci_site_archive/tests/test_radar_proj.py | 2 | 99.21% |
| packages/data/src/pyearthtools/data/transforms/normalisation/default.py | 5 | 52.29% |
| packages/data/tests/download/test_arcoera5.py | 25 | 50.0% |
| packages/pipeline/src/pyearthtools/pipeline/controller.py | 77 | 75.06% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 15940767200: | 0.08% |
| Covered Lines: | 10047 |
| Relevant Lines: | 15998 |

💛 - Coveralls

@JMP-MO JMP-MO removed the request for review from tennlee June 26, 2025 16:06
Collaborator

@millerjoel millerjoel left a comment

Looks good!
Just a couple of things:

  1. The path to the era5 data needs to be changed to use the data in dscop.
  2. Best approach for importing the package? I had to pip install (I've left some comments about this next to the relevant lines)
  3. @stevehadd suggested that it might be nice to have some summary stats, perhaps for a single location across multiple times?
  4. Here is some code that can be used to make the plots nicer (add labels and titles where necessary):
    ERA5:
        fig1 = plt.figure(figsize=(16,8))
        ax1 = fig1.add_subplot(1,1,1,projection=ccrs.PlateCarree())
        ax1.coastlines()
        era5_data.t2m.plot(ax=ax1, cmap=cmap, vmin=vmin, vmax=vmax)
    UKV:
        fig1 = plt.figure(figsize=(16,8))
        ax2 = fig1.add_subplot(1,1,1,projection=ccrs.RotatedPole())
        # ax2.coastlines()  # this wasn't working at this scale for some reason
        moukv_data.air_temperature.plot(ax=ax2, cmap=cmap, vmin=270, vmax=300)
    MOGlobal:
        fig1 = plt.figure(figsize=(16,8))
        ax1 = fig1.add_subplot(1,1,1,projection=ccrs.PlateCarree())
        # ax1.coastlines()
        moglobal_data.air_temperature.plot(ax=ax1, cmap=cmap, vmin=vmin, vmax=vmax)
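
The snippets in item 4 rely on the notebook's own datasets and on cartopy axes. As a dependency-light sketch of the "add labels and titles" suggestion, the example below uses plain matplotlib with synthetic data. The function name `plot_temperature`, the label text and the synthetic field are all illustrative assumptions, not taken from the notebook.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_temperature(lon, lat, temp, title, vmin=None, vmax=None, cmap="coolwarm"):
    """Plot a 2D temperature field with a title, axis labels and a labelled colorbar."""
    fig, ax = plt.subplots(figsize=(16, 8))
    mesh = ax.pcolormesh(lon, lat, temp, cmap=cmap, vmin=vmin, vmax=vmax, shading="auto")
    fig.colorbar(mesh, ax=ax, label="Air temperature (K)")
    ax.set_title(title)
    ax.set_xlabel("Longitude (degrees east)")
    ax.set_ylabel("Latitude (degrees north)")
    return fig, ax


# Synthetic field: temperature decreasing with latitude (not real model data).
lon = np.linspace(-10.0, 5.0, 31)
lat = np.linspace(48.0, 62.0, 29)
temp = 285.0 - 0.5 * (lat[:, None] - 48.0) + 0.0 * lon[None, :]

fig, ax = plot_temperature(lon, lat, temp, "Synthetic 2 m temperature", vmin=270, vmax=300)
```

When plotting through a dataset's own `.plot(ax=...)` call as in item 4, the same `ax.set_title(...)` call can be made afterwards on the axes that was passed in.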

@JMP-MO
Collaborator Author

JMP-MO commented Jul 7, 2025

I've updated the charts. Adding statistics may be better handled as a new issue, as it will require a new API method to access descriptive stats from an accessor class.
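
To make the summary-statistics idea concrete (a single location across multiple times), here is a minimal NumPy sketch. It is only an assumption of what such a method might compute: `point_summary` is a hypothetical name, not an existing PyEarthTools API, and the data are synthetic.

```python
import numpy as np


def point_summary(values, lats, lons, lat, lon):
    """Summarise a (time, lat, lon) array at the grid point nearest to (lat, lon).

    Hypothetical helper for illustration only.
    """
    i = int(np.abs(lats - lat).argmin())  # index of nearest latitude
    j = int(np.abs(lons - lon).argmin())  # index of nearest longitude
    series = values[:, i, j]              # time series at that grid point
    return {
        "mean": float(series.mean()),
        "min": float(series.min()),
        "max": float(series.max()),
        "std": float(series.std()),
    }


# Synthetic data: 4 time steps on a 3 x 3 grid.
lats = np.array([50.0, 51.0, 52.0])
lons = np.array([-2.0, -1.0, 0.0])
values = np.arange(4 * 3 * 3, dtype=float).reshape(4, 3, 3)

stats = point_summary(values, lats, lons, lat=51.2, lon=-0.9)
```

An accessor-level method could wrap the same logic once the data for the requested time range has been loaded.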

@JMP-MO JMP-MO self-assigned this Jul 14, 2025
@JMP-MO JMP-MO added the enhancement New feature or request label Jul 14, 2025
@tennlee
Collaborator

tennlee commented Jul 15, 2025

This all looks broadly reasonable. As discussed in our meeting, I have reviewed this to confirm that (a) nothing will pose any problem for general PyEarthTools users and (b) nothing seems egregious. I did read through everything, but I can't perform any manual testing since I am in a different computing environment. Feel free to ask more specific questions if you want me to provide any post-merge feedback on the change.

@tennlee tennlee merged commit 608db1b into ACCESS-Community-Hub:develop Jul 15, 2025
6 of 7 checks passed

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

Create a Met Office Site Archive

4 participants