-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spei, proper use of loc
parameter in scipy dists
#1653
Conversation
I'm pretty much done here I think, but I would like to test a bit more situations where we have negative values. I need to find realistic cases, not too extreme I suppose, although I don't see why the method would not work with very negative values. I have to see! Shifting the values like:
for
I say error because I assume that the SPI should not change if we apply a constant change to all precipitations |
So I tried to obtain a proper estimation of It becomes a mess. Even with a significantly worse approximation, the expression is quite messy. It can be done, but maybe there is a simple way I'm not seeing.
|
I will implement a very rudimentary estimation of the If we want to use a specific In any cases, we should keep the zero inflated distributions for SPI/precips. I have a good idea of what would be good for many examples, I just need to figure out a proper way to organize things around these ideas. I think adding a |
I had to change some results in test_indices. The most significant changes are for the fisk distribution since I changed significantly the approximations. We now have a more rigorous implementation of the method of moments (I used and <x^2>/^2. The choice of starting parameters can significantly affect even the full-fledged numerical approach (e.g. "ML" method) |
I've been exploring the sensitivity to initial guesses in our procedure. Consider what is done in freq = "D"
window = 1
dist = "fisk"
method = "ML"
ds = open_dataset("sdba/CanESM2_1950-2100.nc", cache=True, cache_dir=Path("/home/eridup1/tank/xclim_open_dataset/")).isel(location=1)
pr = ds.pr.sel(time=slice("1998", "2000"))
pr_cal = ds.pr.sel(time=slice("1950", "1980"))
params = xci.stats.standardized_index_fit_params(
pr_cal, freq=freq, window=window, dist=dist, method=method, zero_inflated=True,
)
spi = xci.standardized_precipitation_index(pr, params=params)
print(spi.isel(time=slice(-11, -1, 2))) I used two ways to estimate # loc=0
>> [-0.12736419 1.30823431 0.67651811 0.10446318 -0.50246943]
# loc = (x1 * xn - x2**2) / (x1 + xn - 2 * x2)
# x1 is the smallest precip in the dataset, x2 the second smallest, xn the largest
>> [-0.15894928 1.30822495 0.44984607 0.14669885 -0.50273746] It makes me think in some cases, the same minimum is found, in other cases no. The difficulty of optimization for 3-parameter They mention |
missing 's' Co-authored-by: Vasco Schiavo <[email protected]>
I did the computation for 20 values of loc0 between 0 and the guess I outlined above. There is some variability in the spi output, and that's true for any day of year: doy=361 (the 0.104 vs. 0.147 spi values above):another random doy I took, 180The worst doy mentioned above, the one with 0.68 vs. 0.45BONUS: using
|
I've been battling with R trying to obtain a confirmation through another package that indeed, MLE is difficult for 3-parameters function. I've had good success working with 2-params distributions that come with library(FAdist)
library(fitdistrplus)
data <- rgamma3(30, shape = 0.5, scale = 1e-4, thres=0) # Using Gamma distribution for sample data
params = fitdistr(data,dgamma3, start=list(shape=0.5,scale=1e-4, thres=0)) Maybe there is something simple I'm missing, I will simply wrap up the PR now |
Pull Request Checklist:
number
) and pull request (:pull:number
) has been addedWhat kind of change does this PR introduce?
gamma, fisk
. Now, instead, we properly use theloc
parameter in those distributions that generalize the distributions and allow negative values. For instance, thegamma
disitribution is a 2-parameter function, but inscipy
it's generalized to 3 parameters, with aloc
parameter. This extends the support ofgamma
fromThe code was already compatible with this idea, just had to change
_fit_start
so it can compute estimates of the parameters in the cases with negatives values.Does this PR introduce a breaking change?
Yes, removing the offset from SPEI.
Other information: