This is based on a brilliant suggestion by @dcherian, which I have used in multiple projects.
I think it would be very nice to document this little 'trick' to efficiently compute a histogram in 'regions' of an array which are encoded in a numbered mask (this is often how e.g. ocean basins are masked in climate models). Instead of masking out one region at a time with something like this:
One can simply add the mask as a second input variable and, with appropriate bins, the results will be segmented into the appropriate regions (saving n-1 computation steps, where n is the number of regions).
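To make the equivalence concrete, here is a minimal sketch in plain NumPy (used here as a stand-in, not xhistogram itself) that compares the one-region-at-a-time loop with a single joint histogram over data and mask. The names `data_bins`, `mask_bins`, `looped` and `joint` are made up for illustration, and the bin edges are assumptions chosen so that each integer mask value falls into its own bin.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch reproducible

# quadrant mask with region labels 1..4
mask = np.concatenate(
    [
        np.concatenate([np.full((2, 2), 1), np.full((2, 2), 2)]),
        np.concatenate([np.full((2, 2), 3), np.full((2, 2), 4)]),
    ],
    axis=1,
)
# data scattered within +/-0.25 of the mask value
data = (rng.random(mask.shape) - 0.5) * 0.5 + mask

data_bins = np.linspace(0, 5, 21)  # assumed edges for the data values
mask_bins = np.arange(0.5, 5.0)    # edges 0.5, 1.5, ..., 4.5: one bin per region

# n separate computations: mask out one region at a time
looped = np.stack(
    [np.histogram(data[mask == r], bins=data_bins)[0] for r in (1, 2, 3, 4)],
    axis=1,
)

# one computation: treat the mask as a second variable
joint, _, _ = np.histogram2d(data.ravel(), mask.ravel(), bins=[data_bins, mask_bins])

# each column of the joint histogram is the histogram of one region
assert np.array_equal(looped, joint)
```

Each column of `joint` corresponds to one region, so the loop over regions is replaced by a single pass over the data.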
I wrote up a quick example:
    import numpy as np
    import xarray as xr
    from xhistogram.xarray import histogram

    # make a quadrant mask
    mask = np.concatenate([
        np.concatenate([np.full((2, 2), 1), np.full((2, 2), 2)]),
        np.concatenate([np.full((2, 2), 3), np.full((2, 2), 4)]),
    ], axis=1)

    # create data that has similar values but with some added noise
    data = (np.random.rand(*mask.shape) - 0.5) * 0.5 + mask
    da = xr.DataArray(data, dims=['x', 'y'], name='data', coords={'mask': xr.DataArray(mask, dims=['x', 'y'], name='mask')})
    da
You can see that this successfully segmented the regions and computed the histogram for each (the data is slightly spread around the actual mask number).
I think this is a brilliant use case for many science applications, but it is not obvious until one tries it (at least it wasn't for me until @dcherian suggested it). So I think it would make a great example for the docs.
I'd be happy to add this if people think it provides value. Maybe @shanicetbailey is interested in collaborating on this, since it might be very helpful in her current project?