Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support freq="W" in standardized indices #1952

Merged
merged 10 commits into from
Oct 17, 2024
Merged

Support freq="W" in standardized indices #1952

merged 10 commits into from
Oct 17, 2024

Conversation

coxipi
Copy link
Contributor

@coxipi coxipi commented Oct 11, 2024

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
    • This PR fixes #xyz
  • Tests for the changes have been added (for bug fixes / features)
    • (If applicable) Documentation has been added / updated (for bug fixes / features)
  • CHANGELOG.rst has been updated (with summary of main changes)
    • Link to issue (:issue:number) and pull request (:pull:number) has been added

What kind of change does this PR introduce?

  • Add support for weekly standardized indices

Does this PR introduce a breaking change?

No

Other information:

I realize that xclim and climate_indices treat zero-inflated distributions differently. I will investigate which library does it correctly and include changes if needed

EDIT: I believe xclim has the correct implementation. The probability of zeroes should be determined in the fitting procedure, not by using the full dataset. The idea is when you compute a CDF, a zero value should be mapped to the prob_of_zero in the calibration period. That is the same logic as using the fitting params of the distribution in the calibration period to compute the CDF in the full dataset.

I confirmed this is also how it's done in the case of the R package (SPEI), although I find some steps a bit weird, but I'm not too familiar with R.

@coxipi coxipi changed the title add support for freq="W" Support freq="W" in standardized indices Oct 11, 2024
CHANGELOG.rst Outdated Show resolved Hide resolved
@coxipi
Copy link
Contributor Author

coxipi commented Oct 16, 2024

I could not get "max-lik" to work with the R package, just got NaNs. But I hacked the xclim SPI to be able to use lmoments3 (it should be modified to be able to accept lmoments3) and the results are quite similar
image

This is not the case as I add zeros, the treatment seems different. I really have the impression there are errors in the code, but I'm not too proefficient in R:

      # Calculate CDF on `x` using `f_params`
      cdf_res <- switch(distribution,
        "log-Logistic" = lmom::cdfglo(x[ff], f_params),
        "Gamma" = lmom::cdfgam(x[ff], f_params),
        "PearsonIII" = lmom::cdfpe3(x[ff], f_params)
      )
      # Adjust for `pze` if distribution is Gamma or PearsonIII
      if (distribution == "Gamma" | distribution == "PearsonIII") {
        spei[ff, s] <- qnorm(pze + (1 - pze) * pnorm(spei[ff, s]))
      }

      # Store the standardized values
      spei[ff, s] <- qnorm(cdf_res)

The x above is the full timeseries, and the zeroes were never removed. The zeroes were removed for the calibration that computes f_params. After that, it seems that the #Adjust is just cancelled by the #Store? Or is there something very different with the "<-" relative to "=" when you just append non-null values?

I use the same precip time series, but I artifically added 0's The result I get with xclim seems more heatlhy:
image

@coxipi
Copy link
Contributor Author

coxipi commented Oct 16, 2024

All in all, the weekly resampling seems to work well. On the question of the treatment of zeroes, climate-indices, R package, and xclim, all seem to work differently. I'm starting to think I have the right implementation, but if anyone wants to chip in, I would be glad if someone wants to give a fresh look on this

@github-actions github-actions bot added the indicators Climate indices and indicators label Oct 17, 2024
@coxipi
Copy link
Contributor Author

coxipi commented Oct 17, 2024

If the calendar is not appropriate, there will simply be problems with resampling or groupby ? Is that sufficient for documentation purposes?

@aulemahal
Copy link
Collaborator

I would say yes. Xarray will fail.

The error is ValueError: Invalid frequency string provided. I wish it would say that it's because cftime doesn't support W, but still I don't think we need to add more safeguards.

W (business week) doesn't make sense in non-real calendars, I think users would be aware of that.

Copy link
Collaborator

@aulemahal aulemahal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! But I thought the solution for lmoments3 was the opposite.

CHANGELOG.rst Outdated Show resolved Hide resolved
tests/test_indices.py Outdated Show resolved Hide resolved
xclim/indices/_agro.py Outdated Show resolved Hide resolved
xclim/indices/_agro.py Show resolved Hide resolved
@github-actions github-actions bot added the approved Approved for additional tests label Oct 17, 2024
@coxipi coxipi merged commit 70cbb36 into main Oct 17, 2024
20 of 21 checks passed
@coxipi coxipi deleted the spi_weekly branch October 17, 2024 16:02
@coxipi
Copy link
Contributor Author

coxipi commented Oct 17, 2024

I don't know why #1892 was not closed, I thought it was automatic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Approved for additional tests indicators Climate indices and indicators
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants