Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPEI : can't figure it out #1564

Closed
1 task done
GiuliaIsNotAvailable opened this issue Dec 18, 2023 · 5 comments
Closed
1 task done

SPEI : can't figure it out #1564

GiuliaIsNotAvailable opened this issue Dec 18, 2023 · 5 comments
Labels
support Questions and help for users/developers

Comments

@GiuliaIsNotAvailable
Copy link

GiuliaIsNotAvailable commented Dec 18, 2023

Setup Information

  • Xclim version: 0.46.0
  • Python version: 3.10.13 (Jupyter notebook)
  • Operating System: Windows 11

Context

Hello, I have been trying to calculate the SPEI for a long time unsuccessfully, because every time I tried to fix an error I got a new one . I've been using agERA5 dataset, rearranged from GRIB to netCDF, adding the "pev" variable, that is the potential evapotranspiration calculated with the Penman-Monteith equation. The netCDF file contains several variables with daily values (for february 2023, as an example only), including those necessary for computing the SPEI: in particular the "ppn" (that is the daily total precipitation [mm]) and the "pev" (namely the daily potential evapotranspiration [mm]) in order to compute the daily hydroclimatic balance [mm] (precipitation minus evapotraspiration) named "bic_daily". Below the first (simplest) version of the python code i wrote:

import xclim.indices as indices
import xarray as xr
from xclim.indicators import atmos
agERA5_202302 = xr.open_dataset('202302_agERA5.nc')
agERA5_202302['bic_daily'] = agERA5_202302['ppn'] - agERA5_202302['pev']
bic_daily_foramonth = agERA5_202302['bic_daily']
spei_feb = indices.standardized_precipitation_evapotranspiration_index(wb = bic_daily_foramonth, freq = None, window = 1, dist = 'gamma', method = 'APP')

I got this error: KeyError: 'units' (same error if I change "indices" with "atmos"). The error des not depends on the parameters inside the SPEI function.

I decided to try with some changes. Below the new python code:

agERA5_202302 = xr.open_dataset('202302_agERA5.nc').to_dataframe().reset_index()
agERA5_202302 = agERA5_202302.rename(columns={'time':'day'})
agERA5_202302 = agERA5_202302.set_index(['day','latitude','longitude'])
agERA5_202302['bic_daily'] = agERA5_202302['ppn'] - agERA5_202302['pev']
agERA5_202302 = agERA5_202302.to_xarray()
bic_daily_foramonth = agERA5_202302['bic_daily']

If I run:
spei_feb = atmos.standardized_precipitation_evapotranspiration_index(wb = bic_daily_foramonth, freq = None, window = 1, dist = 'gamma', method = 'APP')
I got: ValueError: Inputs have different frequencies. Got : [].To mute this, set xclim's option data_validation='log'.
If I run:
spei_feb = atmos.standardized_precipitation_evapotranspiration_index(wb = bic_daily_foramonth, freq = None, window = 1, dist = 'gamma', method = 'APP')
I got: KeyError: 'units'

I decided to try again with some other changes. Below the last python code:

agERA5_202302 = xr.open_dataset('202302_agERA5.nc').to_dataframe().reset_index()
agERA5_202302 = agERA5_202302.rename(columns={'time':'day'})
agERA5_202302 = agERA5_202302.set_index(['day','latitude','longitude'])
agERA5_202302['bic_daily'] = agERA5_202302['ppn'] - agERA5_202302['pev']
agERA5_202302 = agERA5_202302.to_xarray()
bic_daily_foramonth = agERA5_202302['bic_daily']
bic_daily_foramonth.attrs['units'] = 'mm / day'
spei_feb = indices.standardized_precipitation_evapotranspiration_index(wb = bic_daily_foramonth, freq = None, window = 1, dist = 'gamma', method = 'APP')

And I got this: AttributeError: 'DataArray' object has no attribute 'time'

I do not understand if the problem comes from the data, because I have not found examples of code for the SPEI computation
I'm sorry for being wordy. Any suggestions?
Thank you in advance

Giulia

Dataset.zip

Code of Conduct

  • I agree to follow this project's Code of Conduct
@GiuliaIsNotAvailable GiuliaIsNotAvailable added the support Questions and help for users/developers label Dec 18, 2023
@coxipi
Copy link
Contributor

coxipi commented Dec 18, 2023

Hi @GiuliaIsNotAvailable,

Sorry to hear you rushed using SPEI. As you perform an operation on an arithmetic operation on your datasets in xarray, (agERA5_202302['ppn'] - agERA5_202302['pev']), you will lose units, unless you specify to keep them:

# a dataset with units
pr.attrs["units"]
>>> 'kg m-2 s-1'

# arithmetic op kills units
out = 0.5*pr
out.attrs["units"]
>>> KeyError: 'units'

# specify that ops should not change units
with xr.set_options(keep_attrs=True):
    out = (0.5*pr)
out.attrs["units"]
>>> 'kg m-2 s-1'

If you wrap your computation of agERA5_202302['bic_daily'] with this, you should eliminate this problem, but let me know.

As for the AttributeError: 'DataArray' object has no attribute 'time', it comes from the step where you changed time for day, agERA5_202302.rename(columns={'time':'day'}).

I guess if you solve the problem in the first step, you won't be doing this change time -> day that you did when trying to resolve the first error. Nonetheless, let me just explain a bit more this point in all cases. In principle, if it's just a change of name, then many functions in xclim would allow you to specify dim = "day" to specify your working dimension, and it could be the case for SPEI, but it's not currently implemented. In all cases, it expects that coordinates of your time dimension are CFTtime, numpy.datetime, etc, values. You could in principle compute a SPEI-like quantity for a non-time dimension, or just with a time as an integer quantity, but we rely on it being a time (CFTtime, numpy.datetime, etc), for instance we allow resampling frequencies to be either "D" or "M" for a daily or monthly SPEI. In turn, we can also use groupby operations relying on time.month for instance. Even if you don't want resampling and you specify freq=None, the time is still used to ensure we have a daily or monthly dataset.

@GiuliaIsNotAvailable
Copy link
Author

Hi @coxipi, thank for your answer. I did what you said, but I get "KeyError: 'units'". I think I have to set some attributes manually because in the xarray.datarray I use for the SPEI computation there are no attribute at all. I don't understand which kind of attributes I need to set.

@coxipi
Copy link
Contributor

coxipi commented Jan 8, 2024

Hi! Can you show the code you use to get this error? It seems you might have invoked out.attrs["units"] when it's not defined. Perhaps it's not defined in your initial datasets?

@GiuliaIsNotAvailable
Copy link
Author

Yes it's not defined in my initial dataset. How can I define it? And how can I define every attributes I need to compute the SPEI? My code is exactly the code I wrote above, plus the code you advise me to add.

@coxipi
Copy link
Contributor

coxipi commented Jan 8, 2024

water budget is pr - pet. pr and pet both should have units. Since your pr and pet datasets don't have units, be extra careful that the implicit units are indeed the same.

Now, suppose the units are kg m-2 s-1 for both datasets, then you can simply assign units to your water budget

agERA5_202302['bic_daily'] = agERA5_202302['ppn'] - agERA5_202302['pev']
bic_daily_foramonth = agERA5_202302['bic_daily']
bic_daily_foramonth.attrs["units"] = "kg m-2 s-1"

You could also define units for agERA5_202302['ppn'] and agERA5_202302['pev'], then use the wrapper above so that units aren't lost in the subtraction, this would do the same.

agERA5_202302['ppn'].attrs["units"] = "kg m-2 s-1"
agERA5_202302['pev'].attrs["units"] = "kg m-2 s-1"

If you find out that the implicit units aren't the same, you can use our conversion tools to change units of one datasets

agERA5_202302['ppn'].attrs["units"] = "kg m-2 s-1"
agERA5_202302['pev'].attrs["units"] = "kg mm-2 d-1"
agERA5_202302['pev'] = xclim.core.units.convert_units_to(agERA5_202302['pev'], agERA5_202302['ppn']) # 'pev' is now in "kg m-2 s-1"

Cheers

@Ouranosinc Ouranosinc locked and limited conversation to collaborators Jan 10, 2024
@coxipi coxipi converted this issue into discussion #1589 Jan 10, 2024

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
support Questions and help for users/developers
Projects
None yet
Development

No branches or pull requests

2 participants