Skip to content

Comments

Data Accessor - Rainfields3 Mosiac#106

Merged
tennlee merged 11 commits intoACCESS-Community-Hub:developfrom
nikeethr:feature/rainfields3-data-accessor
May 8, 2025
Merged

Data Accessor - Rainfields3 Mosiac#106
tennlee merged 11 commits intoACCESS-Community-Hub:developfrom
nikeethr:feature/rainfields3-data-accessor

Conversation

@nikeethr
Copy link
Collaborator

@nikeethr nikeethr commented May 5, 2025

This is ported from the nci gitlab. Note that mainly "RadarDemo" has been renamed to "Rainfields3"

ISSUE: #107
RELATED PR: #95

Things to consider for reviewer:

  • check renaming has not been missed anywhere.
  • check tutorial still works (MultipleSources.ipynb). May need to be done on NCI.
  • scope: This PR will not address import naming.
    • Currently import site_archive_nci is used to import the archive, whereas pyearthtools imports generally have the import pyearthtools.data as ... format.

Original notes:

Details

Since this is a EXPERIMENTAL accessor, we don't need a magnifying glass on this one - as long as the implementation seems sensible. Any issues will (hopefully) be exposed in the tutorials and when others utilize it.

Notable changes:

  1. Renamed: RadarDemo.py -> _RadarDemo.py - avoids conflict with the class name, also ran black/isort
  2. Added RadarProj namespace containing (static) methods to perform the xy-latlon inverse projection
    • Currently only ALBERS CONICAL EQUAL AREA (+proj=aea) is supported.
    • The methods are not coupled to RadarDemo, the namespace is to guard them from being used generically for now, until they are no longer experimental.
  3. Added an alternate initialiser to RadarDemo: init_with_lonlatproj which "hooks" in the RadarProj.
    • uses __new__ to initialise the hook before __init__ to bypass potential side-effects from inheritance; and adding another derived structure may just make things more confusing.
  4. Usage: calling the alternate initialiser init_with_lonlatproj, i.e. instead of RadarDemo(...), RadarDemo.init_with_lonlatproj(...)
    • by default this will do an inversion only - preserving the x-y coords and inserting new lon-lat coords.
  5. Usage: optionally user can:
    • choose to perform xy interpolation instead of inversion with their own provided latlon grid.
      • interp_lonlat=True
      • lonlat_meshgrid (required) and
      • interp_method (optional - default="linear") Interpolation may be desirable since inversion does not guarantee rectilinear grids, also inversion is not possible with forward-only projections.
    • force their own projection via force_proj (pyproj obj)
    • provide a projection cache via proj_cache (list of pyproj objs) to store and optionally warn via warn_on_different_proj (bool) if there are different projections in the collection of data being loaded. Different projections are not inherently invalid. However they may not be intended to be loaded, hence the user warning.
  6. Added reasonably extensive docstrings and battery of tests for RadarProj which will prove useful, should it be made more generic.
  7. Added some additional dependencies to site_archive_nci package to run pytest without needing the full pyearthtools dependency chain.

Note that for the above options, the user is responsible for managing the inputs they provide - there are no guarantees on expected behaviour, since its hard to account for all possible parse-able inputs. Any warnings and errors that are not handled will be offloaded to dependencies used e.g. pyproj to handle.

@nikeethr nikeethr requested a review from tennlee May 5, 2025 06:41
@nikeethr nikeethr self-assigned this May 5, 2025
@nikeethr nikeethr linked an issue May 5, 2025 that may be closed by this pull request
@coveralls
Copy link

coveralls commented May 5, 2025

Pull Request Test Coverage Report for Build 14896066608

Details

  • 462 of 556 (83.09%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.4%) to 59.975%

Changes Missing Coverage Covered Lines Changed/Added Lines %
packages/nci_site_archive/tests/test_radar_proj.py 246 248 99.19%
packages/nci_site_archive/src/site_archive_nci/_Rainfields3.py 215 307 70.03%
Totals Coverage Status
Change from base Build 14890588815: -0.4%
Covered Lines: 9660
Relevant Lines: 15646

💛 - Coveralls

@nikeethr
Copy link
Collaborator Author

nikeethr commented May 5, 2025

@tennlee tests seem to pass now - not sure why, but removing a __init__.py in nci_site_archive/tests fixed this, eventhough __init__.py exists in pipeline/tests as well...

I noted that some of the other sub-packages didn't have it so I figured I'd remove it and it worked.

If there's a good reason for the import error. I'd be interested to understand. Especially since the error didn't show up anywhere in nci_site_archive but in pipeline instead...

Also I did add cov to site_archive_nci but the only thing tested is RadarProj so it probably tanked the coverage a little.

@nikeethr
Copy link
Collaborator Author

nikeethr commented May 5, 2025

RE: review - I just put a couple of notes for the reviewer on the description above - have a look through, other than the minor changes, its the same PR as in NCI gitlab.

@nikeethr
Copy link
Collaborator Author

nikeethr commented May 5, 2025

RE: "random" test failure

I suspect this has to do with pytest import modes: https://pytest.org/en/7.4.x/explanation/pythonpath.html#import-modes

Hypothesis:

with the default i.e. prepend, having two tests directory in the mono repo with __init__.py specified will cause conflicts. It could be that pipeline package was the only package with __init__.py in its tests folder (needs verification). Introducing __init__.py to site_archive_nci/tests caused two "tests" modules to be recognised, with site_archive_nci taking precedence, since it is installed last (due to prepend - LIFO). This may have caused helper functions in pipeline to not be recognized.

Regardless of precedence, we want the ability for package tests to have utility modules since they are in some instances easier to write and organise than fixtures.

There are a few options to work around this, but its beyond the scope of this issue.

@tennlee
Copy link
Collaborator

tennlee commented May 8, 2025

Thanks. Code review performed, looks like a useful step forward. Very happy with the general coding quality. Notebook tested manually. I'll get a post-merge review from one of our radar experts to double check, but looks good to me.

@tennlee tennlee merged commit ce80505 into ACCESS-Community-Hub:develop May 8, 2025
6 of 7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add data accessor for Rainfields3 Mosiac

4 participants