Increase test coverage #102
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main     #102      +/-   ##
==========================================
+ Coverage   39.30%   45.37%   +6.07%
==========================================
  Files          10       10
  Lines         832      833       +1
==========================================
+ Hits          327      378      +51
+ Misses        505      455      -50
ca11393 to 48d74ae
@jacobbieker As mentioned over in #72: I am baffled by the error in the tests. The error message states that it cannot load the ecCodes library, but according to the installation logs, the installation was successful. I pulled the docker image and there the imports work. Do you have any idea what could cause this? I can't replicate this error locally. Wild guess: does the test setup use a Docker image other than the most recent one, which might contain different lib versions?
I think because the plots workflow runs but the tests do not, we might need to install the test requirements with conda, like we do for the plotting workflow. Otherwise, while eccodes is installed, it doesn't have the binaries to load if it's just pip-installed. So I think the fix would be to copy the setup steps from the plotting workflow into the test workflow, and then it should work. The reason to leave eccodes and some of the plotting-code requirements out of Satip's default requirements is that they can be a bit finicky and are only used for a smaller subset of the overall functionality.
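A quick way to confirm this in a given environment (a diagnostic sketch, not part of Satip or its workflows) is to check whether the ecCodes shared library can actually be found before running pytest:

```python
# Diagnostic sketch: check whether the ecCodes shared library is discoverable.
# A plain pip install of the Python bindings does not necessarily ship the
# binaries, so this lookup can fail unless they were installed separately
# (e.g. via conda).
import ctypes.util

lib = ctypes.util.find_library("eccodes")
if lib is None:
    raise RuntimeError("ecCodes shared library not found on this machine")
print(f"ecCodes shared library found: {lib}")
```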
c183310 to 045bada
@jacobbieker Makes a ton of sense, and you were right: the workflow setups for tests and for plots were different. Now, the weird thing is that even if I make the testing workflow use the same setup steps as the plotting job, such that the machine setup should be identical, the dependency installation still fails. For a side-by-side comparison of the workflow definitions: is there anything I am missing? Any idea why it works in one case but not in the other (I have a hunch I am missing something obvious here)?
The only thing I noticed is that the tests also run on macOS, which is where the installation is failing, while the plots only run on Linux. Now, we could just remove the tests on macOS. I'd prefer not to, but for now it might be easiest to make sure the unit tests you've written pass, and we can add macOS support back in later?
8ee0340 to c17a3eb
@jacobbieker You were right. So ... the macOS test failing during the dependency-installation step led to a cancellation of the Linux dependency-installation step? TIL. Anyway, the removal of the macOS tests is totally acceptable, and I would argue that macOS-specific unit tests aren't really needed. I can see the reasoning behind them, but going down that route effectively means lumping unit tests and system-setup tests into one. If you feel strongly about the macOS tests, we can try to re-add them, as you suggested. There is something off with the pytest call, and I will look into that tomorrow. The road looks clear-ish from here on.
d359d9d to 471e919
- Tests for downloading and zarr conversion of RSS and cloudmask data. Please note that these require an actual EUMETSAT data login and will download data each time they run.
- Also flagged an unused variable in utils.py for later clean-up, if needed.
- Also corrected minor spelling and improved a comment in scripts/generate_plots.py.
- Added a test for save_dataset_to_zarr.
- Added a test for the fit method
- Added a return value to the fit function to allow for function chaining (see the sketch below)
- Added a test for the rescale method
- Added a warning to highlight that variable names have to be very specifically defined for the scalers to work properly
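A minimal sketch of the chaining pattern that this return value enables (the class and attribute names here are illustrative, not Satip's actual scaler API):

```python
import numpy as np

class Scaler:
    """Illustrative min-max scaler; fit() returns self so calls can be chained."""

    def fit(self, data: np.ndarray) -> "Scaler":
        self.min_ = float(data.min())
        self.max_ = float(data.max())
        return self  # returning self enables Scaler().fit(x).rescale(x)

    def rescale(self, data: np.ndarray) -> np.ndarray:
        return (data - self.min_) / (self.max_ - self.min_)

values = np.arange(10.0)
rescaled = Scaler().fit(values).rescale(values)  # chained call in one expression
```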
471e919 to 2c6f79a
@jacobbieker I took the liberty of adding you as a reviewer. Please read the description at the top of the PR for an extensive report. Happy to discuss, if need be even in a video call.
It looks good! Just one minor comment, to reduce randomness in the tests. Also, could you make sure test_generate_test_script passes? Or remove it if it doesn't work; the plot generation fails anyway in that case and so acts as its own form of test. But thanks loads for all this work! Looks great overall.
# we replace nan-values in both arrays with the same but random number.
# If there would not be the exact same nan-values at the exact same places, any comparison
# would thus fail:
nan_replacement = np.random.rand()
This makes sense, but what might be slightly more reproducible or consistent would be to set it to a negative number, like -1 or something, as all values are above 0 in the compressed and decompressed data.
- nan_replacement = np.random.rand()
+ nan_replacement = -1.0
Good point, will do!
As I have no idea what makes this script fail non-reproducibly, I would agree it is best to remove it (see the discussion in the PR description for details). That seems the most stable path and leaves the new user with a usable, intuitive script.
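For reference, a minimal sketch of the fixed-value NaN replacement agreed on above (the array names are illustrative):

```python
import numpy as np

original = np.array([1.0, np.nan, 3.0])
roundtripped = np.array([1.0, np.nan, 3.0])

# Replace NaNs in both arrays with the same fixed value before comparing;
# -1.0 cannot collide with real data, since all values are above 0.
nan_replacement = -1.0
np.testing.assert_allclose(
    np.nan_to_num(original, nan=nan_replacement),
    np.nan_to_num(roundtripped, nan=nan_replacement),
)
```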
- nan-replacement number in tests is now not random anymore
- removed failing plot-test script as it is redundant with the non-failing scripts/generate_test_plots.py
Pull Request
Description
Adds unit tests where applicable.
Fixes #72 and #9, though both only partly. I will provide a lengthy discussion further below, but the TL;DR is: there are two problems which prevent a higher code coverage, or even make it undesirable (see the two "Reason" sections below).
For these reasons (again: details below), I recommend leaving the code coverage at the roughly 57% that we would have after this PR. Also, I would vote to keep generate_test_plots as the main method to run a sort-of system-integration test, and not have any unit tests for the code run there.
How Has This Been Tested?
Well ... it's unit tests ... so the code has been tested. ;)
If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?
Checklist:
Reason no. 1 for lower test coverage: download-heavy nature of code
- satip/downloads.py: missing coverage is mostly due to _download_time_range not being registered as called when dispatched to the pool.
- satip/eumetsat.py's missing coverage is mostly code which governs the case of more than 10,000 search results, which seems infeasible to test on a regular basis.
- satip/intermediate.py mostly deals with splitting data into month-wide zarr files. In order to properly test this, we would need downloaded zarr files covering more than a month, which is prohibitive, since five minutes' worth of data is about 100 MB; a month would be on the order of 0.9 TB (see the quick estimate below). Too much for a recurring test.
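A quick back-of-the-envelope check of that estimate, assuming a 30-day month and the roughly 100 MB per five-minute slot mentioned above:

```python
# Rough storage estimate for one month of 5-minute data at ~100 MB per slot.
slots_per_month = 30 * 24 * 12             # 5-minute slots in a 30-day month = 8640
size_tb = slots_per_month * 100 / 1e6      # MB -> TB
print(f"~{size_tb:.2f} TB per month")      # ~0.86 TB, i.e. roughly 0.9 TB
```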
Conclusion: Testing these parts would create a huge overhead. It seems preferable to leave scripts/generate_test_plots.py as the main source of testing for the downloading part of the Satip code.
Reason no. 2 for lower test coverage: Strange test-setup bug
This is mostly centered around scripts/generate_test_plots.py and satip/utils.py.
- When the plot generation is run via tests/test_plot_generation.py, the code fails due to allegedly not being able to read the file (tested on a local machine). At least sometimes, and seemingly only when invoked through pytest.
- A variant of this error on GitHub: it fails because it cannot import cartopy, which was properly installed in the steps before (check the "install dependencies" step) and does not cause any issues with the parallel test-plot action.
Again, as scripts/generate_test_plots.py is working, adding tests for the utils to the unit tests does not seem to add much value. While it sure would be nice not to have to rely on the script during testing, it is a stable way to verify that the code works.