Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update dependencies (Jarmo-fix) #50

Merged
merged 38 commits into from
Feb 6, 2024
Merged

Update dependencies (Jarmo-fix) #50

merged 38 commits into from
Feb 6, 2024

Conversation

jkikstra
Copy link
Collaborator

@jkikstra jkikstra commented Nov 30, 2023

Continuation of #47, which itself was a continuation of #46.
It's already (re)based on the current #49. (speaking as 15:50, 30.11.2023)

The idea is that only merging this current branch should do the trick. The other PRs can then be closed.

The PR does the following:

  • update dependencies, most notably pyam, where pyam.IamDataFrame.require_variable() has been replaced by pyam.IamDataFrame.require_data(), noting that it is not a one-on-one replacement. Additionally, there seems to have been some changes in functionality around empty filtered outputs.

  • following this, rewrite reclassify_waste_and_other_co2_ar6(), and delete require_var_allyears which was previously only used in reclassify_waste_and_other_co2_ar6()

  • also delete drop_broken_stuff which does not seem necessary anymore (would be great if @znicholls can check if this is OK)

  • building on Fix if all emissions data starts in 2015, in :func:add_year_historical_percentage_offset #49, there's no issues anymore

  • add a test test_add_completeness_category, which ensures that add_completeness_category() runs

  • rewrite add_completeness_category(). NOTE: this causes a minor change in the functionality, where we don't count the variables anymore (this functionality wasn't really used anyway, and more of a legacy of explorations during an earlier projects: so I deleted the auxiliary function count_variables_very_high() which is not necessary anymore. I am not aware of anyone using this "completeness_category" functionality, and we didn't guarantee its continuation

  • update some test-data expected files, because a new error occurred regarding the "exclude" column in the output meta file (which may have had to do with updated pyam behaviour, too?)

  • Tests added

  • Documentation added

  • Example added (in the documentation, to an existing notebook, or in a new notebook)

  • Description in CHANGELOG.rst added (single line such as: (`#XX <https://github.com/iiasa/climate-assessment/pull/XX>`_) Added feature which does something)

jkikstra and others added 20 commits November 30, 2023 10:56
See if moving away from 3.9 solves the GitHub Actions `black` linter error.
Change in behaviour:
- for the "very-high" meta classification, there's no need anymore to report at least 9 variables. Instead, I added sulfur. The reason for this is (a) the code is easier, and (b) there was also little basis for the original and I don't think anyone used it.
- auxiliary function `count_variables_very_high()` is deleted
This function now actually throws an error, and does not seem to be necessary anymore?

@znicholls probably wrote this bit, so might want to double-check.
@jkikstra
Copy link
Collaborator Author

jkikstra commented Nov 30, 2023

Woops - still got a remaining local test failing - will dive back in - so hold your horses for the review for a second

@jkikstra jkikstra marked this pull request as draft November 30, 2023 14:57
@jkikstra jkikstra marked this pull request as ready for review November 30, 2023 17:04
@phackstock
Copy link
Contributor

@jkikstra and @znicholls are there any updates on this? Would be great to have this resolved before the end of the year. If there's anything I can do to help please just let me know :)

…ith --update-expected-files"

This reverts commit 16a1491.
@jkikstra
Copy link
Collaborator Author

jkikstra commented Dec 11, 2023

Potentially my previous "update-expected-files" (16a1491) may have caused some issues now that you added back in your hack, we can try reverting it?

Ah ok yep I would try reverting that then and see if it behaves. I think we should be able to do this MR without having to update the expected output (or at least I think that should be the goal as upgrading the packages shouldn't change the numbers!).

Indeed, The numbers shouldn't change.
But I don't care much for the exclude column, which we don't use in the output anyway currently. So whether or not it is there, doesn't matter to us, which is what I hoped to resolve initially with the updating.

Now tried with a fresh conda environment (using python 3.10), but still getting these KeyError issues. I'll dig a bit deeper and try also with 3.10, since that's what the GHA workflow was running on.
EDIT: I don't think it's a python version issue.

@jkikstra
Copy link
Collaborator Author

Hmm okay so I keep being stranded at getting this "KeyError" issue on my laptop for tests\integration\test_harmonization.py, which refers back to what I think should be a standard operation, coming just after the drop_broken_stuff() which is why last time I took that out:

src\climate_assessment\harmonization\__init__.py:333: in run_harmonization
    equiv_rows = scenarios_harmonized["variable"].isin(["Emissions|HFC"])

Besides that, there are still a few other test failures too.

So Philip, would gladly take up your offer - let's connect tomorrow if you have time (I'll write you on Teams).

@jkikstra
Copy link
Collaborator Author

jkikstra commented Dec 13, 2023

Okay, @phackstock and I sat down. There's a few things.

Local installation issues on my Windows laptop
The KeyError issue I got above is mysteriously linked to me having done pip install -e . only.
It goes away once I managed to install with pip install -e .[tests,deploy,linter,notebook]
We have no clue why yet. We also have no clue why I could even run pytest without it giving an error, on a freshly installed python 3.10 interpreter, in a fresh conda environment, not having installed tests and not having pytest show up when I do pip freeze. Philip suggested that perhaps conda somehow was using a global version of the package, but we also don't know how that would have worked.
Philip, on Linux, could not reproduce this.

Additionally, I had to do `` instead of pip install -e .[dev] because the latter created a conflict between `sphinx` and `awscli`.
Again, Philip, on Linux, could not reproduce this.

Meta issue: exclude column

  • FAIL: test_postprocess.py::test_postprocess
    In a recent pyam update, exclude changed from a column to an attribute.
    This should have affected only the places where we have a meta, so especially

A quick deleting of the column in one file indeed seems to have resolved this issue.
However, it revealed a new mysterious issue, where the data columns (i.e. 1995, 1996, ..., 2100) in the newly produced data now were not integer anymore (as expected) but rather strings (i.e. '1995', '1996', ..., '2100').

Way to resolve:

  • delete the exclude column in the test-data files where it is present
  • find out and resolve the string/integer issue: follow pyam's integer default

Data issue: different climate outcomes

  • FAIL: test_workflow.py::test_workflow_ciceroscm
  • FAIL: test_workflow.py::test_workflow_fair
  • FAIL: test_clim.py::test_clim

Somehow, it looks like climate output values are different from what is expected. We see large differences in e.g. 2100 temperature in cicero (about 0.8K), and noticeable differences in fair (about 0.05K).

Way to resolve:
Bit unclear still. Here's my thoughts on where to start.
Because none of the climate models have changed, we would first need to check if there's an issue with the emissions input.
It might be worth starting with a look at the "model9"-"1point5" pathway in ex2.csv, for either cicero or fair.
I.e., using the "_alloutput.xlsx" (e.g. "ex2_harmonized_infilled_alloutput.xlsx") test-data files, look at the infilled emissions (e.g. AR6 climate diagnostics|Infilled|Emissions|CO2|Energy and Industrial Processes), and compare to what the testing pipeline produces now, before feeding into the climate models.
If there's no difference in the emissions, we should worry again.

@phackstock will now take over this branch.

@znicholls
Copy link
Collaborator

znicholls commented Dec 13, 2023

Local installation issues on my Windows laptop

Yikes, nice detective work.

Meta issue: exclude column

Weird. Do whatever in terms of removing exclude and changing to int or not (it's easy enough to git diff to check that only that column changes)

If there's no difference in the emissions, we should worry again

Yes. I guess if there is a difference in the emissions, we have to fix that. If there isn't a difference in the emissions, something has changed in the post-processing. That is possible with pandas 2.0. @phackstock if you find that the emissions are in fact identical ping me and I can go diving into emulator land. If only the climate tests are failing (and all the other processing tests e.g. harmonisation and infilling related tests are passing) then this suggests to me already a problem in the emulators/post-processing, but I will wait until the emissions checks are done before diving in just in case my instinct is wrong.

@znicholls znicholls mentioned this pull request Jan 20, 2024
4 tasks
@phackstock
Copy link
Contributor

Fixed the first issue on your list @jkikstra. The issue ended up being both the extra exclude column in the reference data as well as the year column names being read in as numbers rather than strings.
Let's see if I can get the remaining tests to work as well.

@phackstock
Copy link
Contributor

Fixed the pyam.IamDataFrame.reset_index. From what I can see there's now just tow types of errors remaining:

  1. The openscmrunner.run
  2. The actual failing climate model tests

@znicholls you can take over here, fixing 1. should be a matter of seconds for you, fingers crossed for 2.

@znicholls
Copy link
Collaborator

Alrighty we're done. The CICERO test is failing, but that is because we fixed a bug in CICERO's heat uptake handling in Openscm-runner way back (https://github.com/openscm/openscm-runner/blob/main/docs/source/changelog.md#openscm-runner-v093-2022-01-19).

So, if you have a linux machine and want to just re-run the failing test to update it, then go for it. Otherwise, I can do that Tuesday.

As usual, a weird pandas bug was the culprit. This package is also just such a mess though, maintenance is going to be an ongoing headache.

@phackstock
Copy link
Contributor

@znicholls thanks a lot for your work 👍 amazing that we'll now be compatible with the lastest version of the Scenario Explorer processing.
If you could fix the final test tomorrow that would be great. I'll approve the PR now so that you can just go ahead with the merge afterwards.
This update would probably also warrant a new release.

phackstock
phackstock previously approved these changes Feb 5, 2024
@jkikstra jkikstra dismissed phackstock’s stale review February 5, 2024 14:49

The merge-base changed after approval.

znicholls
znicholls previously approved these changes Feb 6, 2024
Copy link
Collaborator

@znicholls znicholls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are green (https://github.com/iiasa/climate-assessment/actions/runs/7797092361/job/21263037772), let's go

This is a breaking change and will mean that we'll have to remove the nightly CICERO tests (as they are now no longer able to be reproduced, which is kind of fine as the results were wrong anyway).

@jkikstra jkikstra dismissed znicholls’s stale review February 6, 2024 09:22

The merge-base changed after approval.

@znicholls znicholls merged commit f353891 into main Feb 6, 2024
3 checks passed
@znicholls znicholls deleted the fix/47 branch February 6, 2024 09:23
@phackstock phackstock mentioned this pull request Feb 15, 2024
4 tasks
@znicholls znicholls mentioned this pull request Apr 4, 2024
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants