Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Raise warning on staging server when we cannot delete variables used in charts #3716

Merged
merged 2 commits into from
Dec 13, 2024

Conversation

Marigold
Copy link
Collaborator

@Marigold Marigold commented Dec 11, 2024

There's an error that sometimes comes up on staging servers

ValueError: Variables used in charts will not be deleted automatically:

This issue occurs when someone renames the short names of indicators and updates some charts in production. If you then merge the master branch to the staging server, which is still using old charts from the time it was created, you'll encounter this error.

This PR adds a check that converts the error into a warning if the changes on the staging server are unrelated to the charts. The error disappears in production after the merge.

@Marigold Marigold marked this pull request as ready for review December 11, 2024 19:10
@owidbot
Copy link
Contributor

owidbot commented Dec 11, 2024

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-warning-charts

chart-diff: ✅ No charts for review.
data-diff: ✅ No differences found
Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2024-12-11 19:12:49 UTC
Execution time: 18.12 seconds

@Marigold Marigold requested a review from lucasrodes December 11, 2024 19:14
@Marigold
Copy link
Collaborator Author

@lucasrodes added those commits 😉

Copy link
Member

@lucasrodes lucasrodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say LGTM, but there is one line I don't fully get.

# It's possible that we merged changes to ETL, but the staging server still uses old charts. In
# that case, we first check that the charts were really modified on our staging server.
modified_charts = ChartDiffsLoader(config.OWID_ENV.get_engine(), production_or_master_engine()).df
return bool(set(modified_charts.index) & set(rows.chartId))
Copy link
Member

@lucasrodes lucasrodes Dec 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried following the logic here, but didn't really come to an understanding. Could you help me understand it?

So, my logic is:

  1. Before we had something like: "raise error if there are ghost variables (i.e. rows)"

  2. Now, you are proposing to evaluate bool(set(modified_charts.index)) in addition. This should return True whenever modified_charts is not empty, right? (In that case wouldn't it be enough to just use not modified_charts.empty?).

  3. Still, if modified_charts is not empty, why is that a potential sign to raise an error? Isn't modified_charts containing a row per each chart that has been modified (regardless of it being approved, etc.)?

I think I'm missing something here. Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you are proposing to evaluate bool(set(modified_charts.index)) in addition. This should return True whenever modified_charts is not empty, right? (In that case wouldn't it be enough to just use not modified_charts.empty?).

set(modified_charts.index) contains chart ids that have been modified on staging (either via config, data or metadata) and set(rows.chartId) contains problematic chart ids. Previously, we raised an error if set(rows.chartId) wasn't empty, but the problem was that sometimes it was empty because of race conditions between staging & prod.

bool(set(modified_charts.index) & set(rows.chartId)) returns True if there are any problematic charts AND those charts have been also modified on the staging server.

I could have also written it as:

if rows and any(row.chartId in modified_charts.index):
  return True

Perhaps it's me who's missing something and I misunderstood your question.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh my god. I miss-read one of the parents. I read bool(set(modified_charts.index)) & set(rows.chartId).

NVM, I got it, makes total sense.

@Marigold Marigold merged commit b50f483 into master Dec 13, 2024
7 of 8 checks passed
@Marigold Marigold deleted the warning-charts branch December 13, 2024 08:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants