Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🎉 (entity selector) sort by external indicators #3466

Merged
merged 2 commits into from
May 3, 2024

Conversation

sophiamersmann
Copy link
Member

@sophiamersmann sophiamersmann commented Apr 9, 2024

Cycle 2024.2: Entity selector | Designs

Summary

Offers to sort by "Population" and "GDP per capita", even if the chart doesn't include population or GDP per capita indicators.

Details

  • If the chart has a Population or GDP per capita indicator, then we re-use that data
  • If we need to download additional indicators, then we do that on demand, i.e. selecting "Population" or "GDP per capita" triggers their download
  • Indicator IDs for population and GDP per capita are hard-coded (but the data team might come up with a better solution)
  • "Population" and "GDP per capita" are only offered for selection if entities are detected to include countries or regions
    • This is done by checking whether any of the available entities are listed in the regions.json file
    • Testing the available entities against the regions.json file is not perfect since the default population and GDP per capita indicators that we are using have data for a few entities that are not listed in regions.json (see details below)
    • However, we only need a single matching entity to trigger sorting by population or GDP per capita, so in practice this works well
    • If we wanted to be more correct here, we could also download population and GDP per capita metadata when the entity selector is opened and then check the actual population/GDP per capita entities against the entities that are available for the chart
Entities of the population or GDP per capita indicator that are not included in the `regions.json` file
  • For the population indicator:
    - Africa (UN)
    - Asia (UN)
    - Europe (UN)
    - High-income countries
    - Latin America and the Caribbean (UN)
    - Low-income countries
    - Lower-middle-income countries
    - Northern America (UN)
    - Oceania (UN)
    - Upper-middle-income countries
  • For the GDP per capita indicator:
    - East Asia and Pacific (WB)
    - Europe and Central Asia (WB)
    - High-income countries
    - Latin America and Caribbean (WB)
    - Low-income countries
    - Lower-middle-income countries
    - Middle East and North Africa (WB)
    - Middle-income countries
    - North America (WB)
    - South Asia (WB)
    - Sub-Saharan Africa (WB)
    - Upper-middle-income countries

SVG tester

The SVG tester fails due to the changes in #3373

@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 9b0e9a9 to 33a2d00 Compare April 9, 2024 14:24
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 33a2d00 to 85b3c66 Compare April 9, 2024 14:55
@sophiamersmann sophiamersmann mentioned this pull request Apr 9, 2024
11 tasks
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch 2 times, most recently from 5977c2b to 4fac008 Compare April 10, 2024 10:57
@sophiamersmann sophiamersmann changed the base branch from grapher-sizing to entity-selector-location April 10, 2024 10:57
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 4fac008 to 3071934 Compare April 10, 2024 15:27
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from e779013 to 265c819 Compare April 10, 2024 16:06
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 3071934 to ac6a353 Compare April 10, 2024 16:06
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 265c819 to d1b0586 Compare April 11, 2024 07:46
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from ac6a353 to 9a05e58 Compare April 11, 2024 07:46
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from d1b0586 to 686872e Compare April 11, 2024 07:53
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 9a05e58 to d6a6db6 Compare April 11, 2024 07:53
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 686872e to ae96066 Compare April 11, 2024 08:30
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 34799bd to c4170b8 Compare April 15, 2024 08:20
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 0997d6c to 182bd44 Compare April 15, 2024 08:20
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 182bd44 to 28a40e3 Compare April 15, 2024 09:03
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from c4170b8 to e956f8d Compare April 15, 2024 12:20
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 28a40e3 to 0deb912 Compare April 15, 2024 12:20
Copy link
Member

@marcelgerber marcelgerber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this works well!

const activePopulationColumn = this.numericalChartColumns.find(
(column) =>
isPopulationVariableId(column.slug) ||
!!makeColumnLabel(column).match(/\bpopulation\b/i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this regex will also match "Population density", "Female population", "Share of population in poverty", etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah... but I can't think of a better way to detect population/GDP per capita indicators. Alternatively, we could rely on isPopulationVariableId only and ask d&r for a similar list of GDP per capita indicators. Do you reckon that would be better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For population, I think it is best to rely on isPopulationVariableId only and not use a regex since we'd get many spurious matches (Datasette)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed.
Or otherwise, match only /Population( \(historical estimates\))?/, which currently finds exactly these four.

| CoreColumn
| undefined {
const activeGdpPerCapitaColumn = this.numericalChartColumns.find(
(column) => !!makeColumnLabel(column).match(/\bgdp per capita\b/i)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar here: This'll also match any variable that contains % of GDP per capita, for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GDP per capita is different because we have lots of GDP per capita variations like GDP per capita (constant 2015 US$) or Real GDP per capita (2011$) that are often simply referred to as "GDP per capita" in charts (Datasette; example chart). In such cases, using a second, slightly different GDP per capita version in the entity selector would be confusing. So, I think matching the name makes sense here. But I updated the regex to make sure that we're not matching "GDP per capita" if it's in parens, like Government expenditure per student, primary (% of GDP per capita).

There are some indicators that we'll match incorrectly (like Growth rate of GDP per capita or Affordability of cigarettes: percentage of GDP per capita required to purchase 2000 cigarettes of the most sold brand), which means that we'll fail to offer "GDP per capita" as a sorting option, which I think is acceptable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, agreed. Good on you for pointing out that the outcome of a false positive is simply that we're not offering GDP per capita for sorting in that case, which is totally accepting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I don't quite like here is how set-in-stone these two variables are: There are many methods that have "population" or "gdp" in their name, and all of that makes it quite hard to add any other such indicators in the future.

More ideally, we would have a more configurable set; something like

const INDICATORS_TO_LOAD = [{
    key: "population",
    name: "Population",
    indicatorId: POPULATION_INDICATOR_ID_USED_IN_ENTITY_SELECTOR,
    columnNameRegex: /\bpopulation\b/i
}, // ...
]

could work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had refactored the code at some point to allow for a list of external indicators like that 👆🏻, but it made the code harder to read, and at the same time, I think it's quite unlikely that we'll keep adding external indicators. I don't have a strong opinion though – happy to refactor if you want me to!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier today, I was thinking that we will probably unify the explorer entity selector with the grapher entity selector at some point.
It's a bit weird at the moment that the explorer entity selector is shown on the left, that it is also visible on the map and table tabs, etc.

Explorers currently have their own options for sorting, which can be set in the spreadsheet. Basically, they can be any column of any csv file that's being used in the explorer, including ones that haven't yet been fetched.

CleanShot 2024-04-18 at 12 43 51@2x

I'm not sure what the future of explorers entails, exactly - I think we want to move away a bit from csv's as a data source.
But still, supporting the current explorer feature set would be easier if it was easier to set/extend the set of sort indicators.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not something for now, though, just something to keep in mind for the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! That would def make it easier :)

@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from e956f8d to 9bb9dc9 Compare April 17, 2024 10:35
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 0deb912 to ae2c1d3 Compare April 17, 2024 10:35
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 9bb9dc9 to b719487 Compare April 18, 2024 08:09
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from ae2c1d3 to 4c54ffd Compare April 18, 2024 08:09
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from b719487 to 8d26977 Compare April 26, 2024 12:23
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 4c54ffd to c40d2b4 Compare April 26, 2024 12:23
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 8d26977 to 797f3fd Compare April 26, 2024 17:03
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from c40d2b4 to 705194c Compare April 26, 2024 17:03
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 797f3fd to 0829e33 Compare May 1, 2024 13:06
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 705194c to 4f2c915 Compare May 1, 2024 13:07
@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 0829e33 to 94b6a61 Compare May 2, 2024 14:03
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 4f2c915 to 354fcd8 Compare May 2, 2024 14:03
@owidbot
Copy link
Contributor

owidbot commented May 2, 2024

Quick links (staging server):

Site Admin Wizard

Login: ssh owid@staging-site-entity-selector-external-sort

SVG tester: Number of differences (default views): 1237

Number of differences (all views): 389

Edited: 2024-05-02 14:45:35 UTC
Execution time: 1.18 seconds

Copy link
Member Author

sophiamersmann commented May 3, 2024

Merge activity

@sophiamersmann sophiamersmann force-pushed the entity-selector-location branch from 94b6a61 to a56d68f Compare May 3, 2024 08:15
Base automatically changed from entity-selector-location to master May 3, 2024 08:17
@sophiamersmann sophiamersmann force-pushed the entity-selector-external-sort branch from 354fcd8 to 8004a9d Compare May 3, 2024 08:18
@sophiamersmann sophiamersmann merged commit e186b8e into master May 3, 2024
11 of 17 checks passed
@sophiamersmann sophiamersmann deleted the entity-selector-external-sort branch May 3, 2024 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants