-
-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 Raise warning on staging server when we cannot delete variables used in charts #3716
Conversation
Quick links (staging server):
Login: chart-diff: ✅No charts for review.data-diff: ✅ No differences foundLegend: +New ~Modified -Removed =Identical Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included Edited: 2024-12-11 19:12:49 UTC |
@lucasrodes added those commits 😉 |
7e2e24b
to
58c4323
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say LGTM, but there is one line I don't fully get.
# It's possible that we merged changes to ETL, but the staging server still uses old charts. In | ||
# that case, we first check that the charts were really modified on our staging server. | ||
modified_charts = ChartDiffsLoader(config.OWID_ENV.get_engine(), production_or_master_engine()).df | ||
return bool(set(modified_charts.index) & set(rows.chartId)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried following the logic here, but didn't really come to an understanding. Could you help me understand it?
So, my logic is:
-
Before we had something like: "raise error if there are ghost variables (i.e.
rows
)" -
Now, you are proposing to evaluate
bool(set(modified_charts.index))
in addition. This should returnTrue
whenevermodified_charts
is not empty, right? (In that case wouldn't it be enough to just usenot modified_charts.empty
?). -
Still, if
modified_charts
is not empty, why is that a potential sign to raise an error? Isn'tmodified_charts
containing a row per each chart that has been modified (regardless of it being approved, etc.)?
I think I'm missing something here. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you are proposing to evaluate bool(set(modified_charts.index)) in addition. This should return True whenever modified_charts is not empty, right? (In that case wouldn't it be enough to just use not modified_charts.empty?).
set(modified_charts.index)
contains chart ids that have been modified on staging (either via config, data or metadata) and set(rows.chartId)
contains problematic chart ids. Previously, we raised an error if set(rows.chartId)
wasn't empty, but the problem was that sometimes it was empty because of race conditions between staging & prod.
bool(set(modified_charts.index) & set(rows.chartId))
returns True if there are any problematic charts AND those charts have been also modified on the staging server.
I could have also written it as:
if rows and any(row.chartId in modified_charts.index):
return True
Perhaps it's me who's missing something and I misunderstood your question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh my god. I miss-read one of the parents. I read bool(set(modified_charts.index)) & set(rows.chartId)
.
NVM, I got it, makes total sense.
There's an error that sometimes comes up on staging servers
This issue occurs when someone renames the short names of indicators and updates some charts in production. If you then merge the master branch to the staging server, which is still using old charts from the time it was created, you'll encounter this error.
This PR adds a check that converts the error into a warning if the changes on the staging server are unrelated to the charts. The error disappears in production after the merge.