-
-
Notifications
You must be signed in to change notification settings - Fork 229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(data-table): respect tolerance #2859
Conversation
// sanity check: start time should always be <= end time | ||
start.time !== undefined && | ||
end.time !== undefined && | ||
start.time <= end.time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arguably, this could also be <
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the quick fix, @sophiamersmann! A few rambling comments to your question of why this behaviour was different:
- Visually this used to be a lesser problem because in the past the table showed the year the data point was from. With the redesign we made this more subtle and only show the circled i indicator. It used to be that there were two lines in such cases, where the first one was in gray the year and the second was the value. This I think was acceptable but now it is not obvious enough anymore that the data is not from the year the column says above
- The functionality of this was never great. If you sort by values in the first column in a long run time series like population for year -10000, you used to get the highest value for whatever the first datapoint was for any entity (go to https://ourworldindata.org/grapher/population?tab=table&country=CHN~IND~USA~BRA~NGA~IRN~ZAF and sort by first column - West Germany in 1800. Obviously 😵💫 )
- I think the two different code paths where a performance tuning. Computing the relevant values on the first and last year only is obviously better for this case and the current solution might use substantially more RAM for long run time series - but I'd say lets wait and see if this causes any issues (or maybe add a comment that maybe this should be optimized again for the table case where we want to compute changes)
If you compare the 12 most important charts then I think the new behaviour is clearly better in all of them:
Current behaviour:
https://ourworldindata.org/12-key-metrics
Your fixes:
http://staging-site-fix-data-table-tolerance/12-key-metrics
Old grapher (pre redesign):
https://staging.owid.cloud/admin/posts/preview/24902
So all in all:
thank you for taking the time to write all of this down! shipping it 🚀 |
fixes Tables do not respect tolerance and show misleading data #2831
there are two code paths: "range" mode and "point" mode
in "point" mode, all is good
the issue is with the "range" mode
to fix the issue, we simply grab the closest available data point within tolerance for both the start and the end year
I can see that the special code path for "range" mode was a deliberate choice, but I'm not sure why (to show the delta columns as often as possible?). What am I missing?