-
Notifications
You must be signed in to change notification settings - Fork 42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve subsample
across Coreg
subclasses and pipelines
#436
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this and I love you
Just minor comments!
dev-environment.yml
Outdated
@@ -52,4 +52,4 @@ dependencies: | |||
- noisyopt | |||
|
|||
# To run CI against latest GeoUtils | |||
# - git+https://github.com/GlacioHack/GeoUtils.git | |||
- git+https://github.com/GlacioHack/geoutils.git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't there a reason we left pulling from github? I think if we make a breaking change on geoutils, xdem will fail on new PRs without it being the "PR's fault". Any way around this? Otherwise, please make an issue for it so we can revert this asap!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this should only be used to test that CI passes temporarily on the main branch for devs, we don't want to make a release with it! Should we add a reminder in HOW_TO_RELEASE
?
Now that GeoUtils 0.0.15 is published, I can revert it back!
# Check that the estimated biases are similar | ||
assert coreg_sub._meta["coefficients"] == pytest.approx(coreg_full._meta["coefficients"], rel=1e-1) | ||
|
||
def test_subsample__pipeline(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The double underscore is a typo, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No haha it's for subdividing tests more clearly! Many packages do it, and I liked it a lot. I thought it could improve the clarity/organization of our tests that were getting a bit messy.
I think we talked about this in the biascorr PR a bit already! Maybe we should establish guidelines for our tests?
Happy you like it!! Thanks a lot for the quick review 😊 |
All accounted for, merging and focusing on the next steps! 😄 |
This PR makes the
subsample
argument consistent across allCoreg
subclasses, and allows them to function in pipelines independently for each step. It also reworksCoreg.fit()
to be more durable for future chunking routines.All details in the full discussion here: #428
Only difference with the discussion: Now passing directly
inlier_mask
to each subclass'_fit_func
, so that the subsampling mask can be computed on the final mask of valid pixels within the subclass (valid data changes in the subclass because of NaNs in auxiliary data not yet available toCoreg.fit()
: slope, curvature, etc...). This way we will always have the exact number of valid points asked when subsampling 😄!To perform the subsampling consistently on the final valid mask, each
Coreg
subclass calls the methodget_subsample_on_valid_mask
.Tests significantly improved.
As now each class has a default
subsample
argument during instantiation, methods that currently rely on optimization without binning (NuthKaab
,Deramping
)* have5e5
to avoid very long computing time, otherwise1
for all others.Also a small fix to
geoutils.raster.subsampling
was necessary: GlacioHack/geoutils#402.Resolves #428
Resolves #243
Resolves #137