divide by zero error #16
Hi @rabernat, I'm starting to get this error coming up often. I'm working with model output with dimension lengths Have you had any luck getting to the bottom of this?
Hello @gmacgilchrist and @rabernat
I think I found that the problem went away if I chunked the arrays with dask. So that is one temporary workaround. Are you using dask arrays, or are you calling this on a "loaded" numpy array?
@rabernat Not a dask array explicitly, but DataArrays from xarray should be more like dask arrays, no?
I just ran into this, and somehow the error is triggered based on the number of dimensions given in

    import xarray as xr
    import numpy as np
    import dask.array as dsa
    from xhistogram.xarray import histogram

    da = xr.DataArray(dsa.random.random((12, 35, 576, 720), chunks=(3, 35, 576, 720)), dims=['time', 'z', 'y', 'x'])
    da

    bins = np.arange(-0.5, 0.5, 0.1)
    count = histogram(da, bins=bins, dim=['x', 'y', 'z'])
    count

Loading the DataArray into memory triggers the error:

    count.load()

Now if I just count over two instead of three dimensions, it works!

    # Now without counting over z
    histogram(da, bins=bins, dim=['x', 'y']).load()

As a user, that completely confuses me. Note that this did NOT go away by chunking the array, as some earlier posts suggested. Do you think this is the same problem? Happy to debug further, but ultimately I just thought this might be another example that could be useful as a test case for future refactors (xarray-contrib/xskillscore#334 (comment)).
This is indeed the same as the issue that I was having, which I found to be sensitive to which dimensions I was performing the histogram over and, critically, how many I left behind. If I remember correctly, this divide by zero error was occurring whenever I was attempting to histogram over all dimensions except 1, and would go away if I counted over more or fewer dimensions, which is consistent with what you're seeing. I didn't yet test the impacts of chunks.
This is really helpful info @jbusecke and @gmacgilchrist! I think this issue should be quite easy to fix once we decide what we want to do with the @TomNicholas, did some testing of the effect of changing this argument here: #63 and there have been suggestions to just hard code it as in numpy. I'd be happy to dive into this further, or @TomNicholas is this something you've already resolved as part of the move of
I had the same problem with a large
Actually, it worked for me even if I have only one chunk.
PS:
Just ran again into this
Has there been any progress or a suggested fix here? I just ran into this again with a large dataset, and it again fails when I want to compute the histogram over all dimensions. I can easily get around this by taking the histogram only along the unchunked dims and then summing in time, but a fix for this would still be very desirable IMO.
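The workaround described above relies on histogram counts being additive: counting each time step separately and summing the results gives the same counts as one histogram over everything. The sketch below illustrates that principle in plain Python rather than with xhistogram; `histogram_counts`, the bin edges, and the toy data are hypothetical and exist only for illustration.

```python
from bisect import bisect_right
import random


def histogram_counts(values, edges):
    """Count values into half-open bins [edges[i], edges[i+1]).

    Values outside [edges[0], edges[-1]) are dropped.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v < edges[-1]:
            counts[bisect_right(edges, v) - 1] += 1
    return counts


random.seed(0)
edges = [i / 10 for i in range(11)]  # 10 bins covering [0, 1)

# Toy dataset: 4 "time" steps, each holding 1000 samples that stand in
# for the flattened non-time dimensions.
data = [[random.random() for _ in range(1000)] for _ in range(4)]

# Direct route: one histogram over every dimension at once.
all_at_once = histogram_counts([v for step in data for v in step], edges)

# Workaround route: one histogram per time step, then sum over time.
per_step = [histogram_counts(step, edges) for step in data]
summed = [sum(column) for column in zip(*per_step)]

print(summed == all_at_once)
```

In xhistogram terms this would correspond to histogramming over the unchunked dimensions only and then summing the result over time, which is the route described in the comment above.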
+1 to fixing this.
As @rabernat mentioned in the original post.

    S = slice(None, None)
    bins = [np.arange(-10, 10, 0.1), np.arange(0, 2000, 2)]
    Ha = histogram(anomaly.isel(casts=S), anomaly.isel(casts=S).z, bins=bins)

If I

    bins = [np.arange(-10, 10, 0.1), np.arange(0, 2000, 2)]
    Ha = histogram(anomaly.chunk(casts=1), anomaly.chunk(casts=1).z, bins=bins).load()

I just noticed that I am not adding anything new to the problem, but will keep this post here anyway. Haha.
Hi everyone! We would absolutely welcome a PR to fix this long-standing bug.
Is anyone working on this right now, or does anyone have an idea of how to fix it?
@iuryt the problem is in the Alternatively one could just remove the whole In the meantime one can just turn off the block-level optimization by passing
- See related comments in hdrake/sectionate@39c3009
- Temporary bug-fix for xhistogram call (see xgcm/xhistogram#16)
- Re-ran all notebooks
Hi, I came across this same problem and have figured out the origin, at least in my case.
I am also having problems with this hard-coded heuristic when using it with the xskillscore package
gives
Smaller sized arrays work fine.
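The pattern reported in this thread (large arrays fail, smaller ones work, and the failure depends on how many dimensions are histogrammed over) is what one would expect if a block-size heuristic rounds down to zero. The sketch below is hypothetical: `MAX_CHUNK_SIZE`, `auto_block_size`, `safe_auto_block_size`, `num_blocks`, and the formula itself are assumptions made for illustration and are not copied from xhistogram.

```python
# Hypothetical sketch of a block-size heuristic; the constant and formula
# are assumptions, not xhistogram's actual code.
MAX_CHUNK_SIZE = 10_000_000  # assumed cap on elements processed per block


def auto_block_size(n_hist_elements: int, n_rows: int) -> int:
    """Choose how many rows (independent histograms) to process per block.

    n_hist_elements: elements entering each histogram, i.e. the product of
        the lengths of the dimensions being histogrammed over.
    n_rows: number of independent histograms, i.e. the product of the
        lengths of the remaining dimensions.
    """
    return min(MAX_CHUNK_SIZE // n_hist_elements, n_rows)


def safe_auto_block_size(n_hist_elements: int, n_rows: int) -> int:
    """Same heuristic, but never fewer than one row per block."""
    return max(1, auto_block_size(n_hist_elements, n_rows))


def num_blocks(n_rows: int, block_size: int) -> int:
    """Number of full blocks; raises ZeroDivisionError when block_size is 0."""
    return n_rows // block_size


# Shapes from the example earlier in this thread:
# (time, z, y, x) = (12, 35, 576, 720)
over_xy = auto_block_size(576 * 720, 12 * 35)    # 24 rows per block
over_xyz = auto_block_size(35 * 576 * 720, 12)   # 0 rows per block

print(over_xy, over_xyz)

try:
    num_blocks(12, over_xyz)
except ZeroDivisionError as err:
    print("fails for the large case:", err)

# Clamping to at least one row per block avoids the division by zero.
print(num_blocks(12, safe_auto_block_size(35 * 576 * 720, 12)))
```

Under these assumptions, histogramming over x and y stays below the cap, while adding z pushes the element count per histogram past it, so the integer division yields zero. Clamping the result to at least 1 is one possible guard; whether that is the right fix for the library is a separate design question.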