Tracking issue: Data Pages grapher tasks #2668

Marigold · 2023-09-27T13:44:08Z

Tasks

Bugs to fix

This datapage does not show a chart with data: https://owid.cloud/admin/datapage-preview/814888
This datapage does not render: https://owid.cloud/admin/datapage-preview/520469

Smaller rendering issues

Citation rendering should show the URL of the data page, and the time of access should be the current timestamp

Bigger stretch goals

Switch to a new markdown parser/renderer for proper support of bullet points and similar formatting

Quality of life features

Add a link on the data page that goes to the metadata.yaml file on GitHub, and show the full data path including the column shortname

Open questions

Should description_short be shown always somewhere?
If we have neither description_short nor description_key nor description_producer, should we just show an empty section under the chart or do something else with the design?
The text from description_processing is not being surfaced anywhere in the data pages. This text is crucial to signal that the data being displayed is not the source's data as-is but has been processed by us. Update: the field description_processing does appear in data pages now. It didn't before because of a bug that Mojmir has fixed.

Linking from topic tags to topic page URLs is not watertight at the moment. Decide whether this should be explicit (by adding a new column in the tags table or similar) or implicit (by making the slugify logic better). Initial report text was: Topic tags redirect you to non-existing pages, example redirects to http://staging-site-mojmir/co2-greenhouse-gas-emissions which doesn't exist
Currently, description_key must include the info in description_short. I think this shouldn't be the case. Some overlap should be allowed, but description_short and description_key should be separate fields, and both are important. Both should be shown in the sources tab and in data pages. I'd propose:
(A) (Preferred) They should appear in separate places: Both in the data page and the sources tab, we first show the short description, followed by the bullet points of key info.
(B) The description short is rendered as the first point of the key info.
Regarding processing levels, for me, "processed" and "adapted" are ambiguous, and in fact "adapted" clearly sounds "less processed" than "processed". I'd propose (for minor and major processing):
(A) (Preferred) "With minor processing by Our World in Data" & "With major processing by Our World in Data".
(B) "Imported by Our World in Data" & "Processed by Our World in Data".
What should happen when there are multiple producers?
(A) (Preferred) "[Main source] and other sources - Processed by Our World in Data"
(B) "Various sources - Processed by Our World in Data"?
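
To make the topic tag question above concrete, here is a minimal sketch of the two options side by side. It is not the grapher's actual slugify logic: the tag name, the `explicit_slug` parameter (standing in for a possible new column in the tags table) and the target slug are all hypothetical, chosen only to reproduce the kind of broken URL from the report.

```python
import re
from typing import Optional


def slugify(name: str) -> str:
    """Naive implicit rule: lowercase, then join alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def topic_page_slug(tag_name: str, explicit_slug: Optional[str] = None) -> str:
    """Prefer an explicitly stored slug (hypothetical column); else fall back to slugify."""
    return explicit_slug if explicit_slug else slugify(tag_name)


# Hypothetical tag name: the implicit rule drops the "&" and yields a slug
# that may not match any existing topic page.
implicit = topic_page_slug("CO2 & Greenhouse Gas Emissions")

# With an explicit slug stored next to the tag, the link no longer depends on the rule.
explicit = topic_page_slug(
    "CO2 & Greenhouse Gas Emissions",
    explicit_slug="co2-and-greenhouse-gas-emissions",  # assumed target slug
)
```

The explicit option costs a column and some upkeep but cannot drift from the real page URL; the implicit option needs no schema change but has to anticipate every naming quirk.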

Sources tab items

Origin.description doesn't render line breaks in Sources and processing tab
The combined "Data published by" text from sources and origins might be non-unique. This duplicates the report below: "Names are repeated in This data is based on the following sources, because I used two files from the same author. Should I set this differently?"
In the sources tab, the "link" field shows only links for the first origin, e.g. this chart (that shows only the link for population). Also, the "retrieved" field is also the one for population; in this case, there's an ambiguity, but maybe we should pick the latest date of all origins.
The text from description_key is not being surfaced in the sources tab.
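
For the "link" and "retrieved" item above, one possible behaviour is to show the links of all origins and the latest access date among them. The sketch below assumes each origin is a plain dict with `url_main` and `date_accessed` (ISO date string) keys; those key names, the example URLs and the dates are assumptions for illustration, not the grapher's data model.

```python
from datetime import date
from typing import Dict, List, Optional, Tuple


def summarize_origins(origins: List[Dict[str, str]]) -> Tuple[List[str], Optional[date]]:
    """Collect the unique links of all origins and the most recent access date."""
    links: List[str] = []
    for origin in origins:
        url = origin.get("url_main")
        if url and url not in links:  # keep first-seen order, skip duplicates
            links.append(url)
    dates = [
        date.fromisoformat(origin["date_accessed"])
        for origin in origins
        if origin.get("date_accessed")
    ]
    return links, (max(dates) if dates else None)


# Hypothetical origins for a chart that combines two indicators.
example_origins = [
    {"title": "Population", "url_main": "https://example.org/population", "date_accessed": "2023-03-31"},
    {"title": "GDP", "url_main": "https://example.org/gdp", "date_accessed": "2023-09-01"},
]
links, retrieved = summarize_origins(example_origins)
```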

Obsolete points

Currently, for charts using old indicators with sources, we include the dataset description at the bottom of the sources tab (e.g. this chart). For new charts, we don't show the dataset description anywhere. I think that, while we still don't have the new grapher, we should keep showing the dataset description at the bottom of the sources tab (otherwise we are missing a lot of relevant info that is shown nowhere).

I'll see what tasks I can do myself and where we need some help. cc @pabloarosado


lucasrodes · 2023-09-28T13:29:12Z

On "Origin.description doesn't render line breaks in Sources and processing tab":

I've recently realised that the line breaks are not rendered in the metadata preview in a notebook. Does this mean that the issue is coming from ETL instead? And more specifically, from how origin.description is stored internally?

Marigold · 2023-09-28T15:11:30Z

@lucasrodes nice catch, I fixed it in one of my PRs. It should render markdown as HTML and show in a notebook (it'll fix line breaks too). Grapher rendering still won't work though. It's an issue on the grapher side.

pabloarosado · 2023-09-28T15:44:39Z

In the sources tab, the "link" field shows only links for the first origin, e.g. this chart (that shows only the link for population). Also, the "retrieved" field is also the one for population; in this case, there's an ambiguity, but maybe we should pick the latest date of all origins.

paarriagadap · 2023-09-28T15:45:07Z

I copy from here:
Some issues I've found with the metadata-based data pages:

Charts and data pages are picking the wrong (random?) year for the sources. 1970 in this case.
Bullet points from description_key are not picked up.
I see the text "How the producer of this data - undefined - describe this data?"
Line breaks and bullet points are not respected in the content of the same section.
Names are repeated in "This data is based on the following sources", because I used two files from the same author. Should I set this differently?
Line breaks are not respected there either.
Citations have a similar problem because of the name repetition.

You can take a look here.

lucasrodes · 2023-09-28T17:03:13Z

The text from description_processing is not being surfaced anywhere in the data pages. This text is crucial to signal that the data being displayed is not directly the source's data but with our touch.
The text from description_key is not being surfaced in the sources tab.

Marigold · 2023-09-29T09:44:10Z

Topic tags redirect you to non-existing pages, example redirects to http://staging-site-mojmir/co2-greenhouse-gas-emissions which doesn't exist

danyx23 · 2023-10-02T14:53:38Z

Hi all! I'll edit the main issue description at the top to include all the points that you all added as comments so that this is easier to scan and reply to inline

lucasrodes · 2023-10-05T10:05:58Z

Issue:

Collapse snapshot origins into a single one when they refer to the same data product.

Summary:

Sometimes, we rely on multiple snapshots of the same data product to build a dataset. Take this example: http://staging-site-lucas/admin/datapage-preview/818629. Here, we display life expectancy from two data products: UN WPP and HMD. However, we got the data from UN WPP from three different snapshots.

Therefore, in the "Sources and processing" section of the data page, we list four different entries:

Three of these are equivalent because they refer to the same data product. Why are there three? Because there is a snapshot for "Both Sexes", "Females" and "Males".

We should reduce these three to just one entry, maybe by checking that the origin.description field is equivalent, or the origin.title field, etc.
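
A minimal sketch of that idea, assuming origins are plain dicts and that two origins belong to the same data product when their `title` and `description` match after normalising case and whitespace. The choice of fields is only one of the options mentioned above, and the `snapshot` key in the example is hypothetical.

```python
from typing import Dict, List, Sequence


def collapse_origins(
    origins: List[Dict[str, str]],
    keys: Sequence[str] = ("title", "description"),
) -> List[Dict[str, str]]:
    """Keep the first origin for each distinct combination of the given fields."""
    seen = set()
    collapsed = []
    for origin in origins:
        # Normalise case and whitespace so trivial differences don't split a product.
        signature = tuple(" ".join((origin.get(k) or "").split()).lower() for k in keys)
        if signature not in seen:
            seen.add(signature)
            collapsed.append(origin)
    return collapsed


# Three snapshots of the same data product plus one other product.
example = [
    {"title": "World Population Prospects", "description": "UN WPP", "snapshot": "both_sexes"},
    {"title": "World Population Prospects", "description": "UN WPP", "snapshot": "females"},
    {"title": "World Population Prospects", "description": "UN WPP", "snapshot": "males"},
    {"title": "Human Mortality Database", "description": "HMD", "snapshot": "hmd"},
]
collapsed_example = collapse_origins(example)
```

Keeping the first entry per group is an arbitrary choice; if the snapshots differ in fields such as access date, a merge step would be needed instead.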

Marigold · 2023-10-09T14:07:35Z

Order of FAQs gets lost on insert to MySQL. Table posts_gdocs_variables_faqs uses only gdocId, variableId and fragmentId columns, nothing about ordering. We should probably add a new column order to that table.

Example: http://staging-site-mojmir/admin/datapage-preview/419298#faqs
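
To illustrate the ordering column, here is a small self-contained sketch. It uses an in-memory SQLite database purely as a stand-in for MySQL, and the column name `displayOrder` is an assumption (chosen because `order` is a reserved word in SQL); the fragment ids are made up.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE posts_gdocs_variables_faqs (
        gdocId TEXT NOT NULL,
        variableId INTEGER NOT NULL,
        fragmentId TEXT NOT NULL,
        displayOrder INTEGER NOT NULL DEFAULT 0  -- hypothetical new column
    )
    """
)


def insert_faqs(gdoc_id: str, variable_id: int, fragment_ids: List[str]) -> None:
    """Store each FAQ together with its position in the author's list."""
    conn.executemany(
        "INSERT INTO posts_gdocs_variables_faqs VALUES (?, ?, ?, ?)",
        [(gdoc_id, variable_id, frag, pos) for pos, frag in enumerate(fragment_ids)],
    )


def load_faqs(variable_id: int) -> List[str]:
    """Read FAQs back in the stored order instead of relying on row order."""
    rows = conn.execute(
        "SELECT fragmentId FROM posts_gdocs_variables_faqs "
        "WHERE variableId = ? ORDER BY displayOrder",
        (variable_id,),
    ).fetchall()
    return [row[0] for row in rows]


faqs = ["how-is-it-measured", "why-was-it-adjusted", "what-about-missing-years"]
insert_faqs("gdoc-1", 1, faqs)
```

Without an explicit `ORDER BY` on such a column, the database is free to return rows in any order, which is consistent with the behaviour described above.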

pabloarosado · 2023-10-11T08:32:15Z

danyx23 · 2023-10-12T20:10:38Z

This issue was becoming too unwieldy - I have broken it up into several follow up issues that are split by topic:

github-actions bot added the needs triage label Sep 27, 2023

danyx23 self-assigned this Oct 12, 2023

danyx23 added priority 2 - important and removed needs triage labels Oct 12, 2023

danyx23 closed this as completed Oct 12, 2023
