Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor and simplify testing data fetching #332

Merged
merged 14 commits into from
Jan 8, 2024
Merged

Refactor and simplify testing data fetching #332

merged 14 commits into from
Jan 8, 2024

Conversation

Zeitsperre
Copy link
Member

@Zeitsperre Zeitsperre commented Jan 5, 2024

Changes

  • Added h5netcdf as a core dependency, acting as a backup engine for reading netcdf files.
  • Made use of model_copy(deep=True) when dealing with Raven configs in the testing suite.
  • Updated autodoc-pydantic for compatibility with pydantic v2.0+
  • Modified some session-scoped pytest fixtures that were causing problems in other tests.
  • Removed the redundant emulators-based pytest fixtures; These are now loaded from emulators.py in the tests folder.

@Zeitsperre Zeitsperre added the bug Something isn't working label Jan 5, 2024
@Zeitsperre Zeitsperre self-assigned this Jan 5, 2024
@coveralls
Copy link

coveralls commented Jan 5, 2024

Coverage Status

coverage: 81.5%. first build
when pulling 346cf8f on set_netcdf4
into 021ef31 on master.

@Zeitsperre Zeitsperre changed the title Address macOS bug by setting backend engine manually Refactor and simplify testing data fetching Jan 5, 2024
@Zeitsperre Zeitsperre requested a review from huard January 8, 2024 17:02
@Zeitsperre
Copy link
Member Author

@huard

I've been really digging into this to see what's going on, and I can confidently say the following:

  • There are a few pytest fixtures that are session-scoped (i.e. They are created once when first called and persist between tests). When functions modify the internals of the objects and are then passed to other tests, things get weird. This was causing several problems with threads: likely from reading data from objects that are being written in the course of running other tests. I've fixed this by copying data objects (using model_copy(deep=True) or by copying the file itself to a tmp_dir)
  • There are a few methods in the code for emulators that will modify any NetCDF objects/files passed to them (like from_nc). This is kind of unsafe. Is there a reason for this (are there features necessary on mutating the original NetCDF files passed to them)? If not, dealing with this (via model_copy(deep=True)?) should be the next step.
  • For the session-scoped pytest emulator fixtures, would it be safe to make them test-scoped instead?
  • I've added h5netcdf as a backend for reading datasets in ravenpy.config.commands. I've found it to be much more stable than python-netcdf4. This could probably be set up in other functions that read NetCDF files as well. (Ideally, we shouldn't have too many functions doing IO for NetCDF files; Is there a way to combine these file-loaders into a single utility?)

Let me know what you think about this so far. I see this PR as a bandage for the testing issues (much more stable) but once I can find the time to replace everything with pooch I can rewrite much more of the testing suite.

@Zeitsperre Zeitsperre marked this pull request as ready for review January 8, 2024 17:17
Copy link
Collaborator

@huard huard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything looks good, thanks for looking into this. There is no reason for from_nc to modify the dataset. Where do you see this happening ?

@Zeitsperre
Copy link
Member Author

There is no reason for from_nc to modify the dataset. Where do you see this happening ?

My bad, I was thinking of the *climatology_esp functions. It clearly states that the config objects are modified when passed to them. I take that approach in a few functions used in Miranda (modifying classes via functions, not methods) as well.

I'll update the change log here, and hopefully we should have a (relatively) more stable testing experience going forwards.

@Zeitsperre Zeitsperre merged commit a18d942 into master Jan 8, 2024
16 checks passed
@Zeitsperre Zeitsperre deleted the set_netcdf4 branch January 8, 2024 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants