This repository has been archived by the owner on Sep 1, 2020. It is now read-only.

Encode absolute count to circle size #12

Open
wants to merge 1 commit into
base: master

Conversation

trvrb
Member

@trvrb trvrb commented May 25, 2019

@jameshadfield, @jotasolano ---

I believe the choropleths were not presenting the core thing they needed to present, that is, the count of encounters in a region. Incidence is the primary data variable we want to display. Sex, age, vaccination, etc. are secondary colorings. There is no easy way in the existing app framework to display a very simple thing:

  • subset to H3N2 and look at counts of H3N2 cases across the map

The map always displays the "secondary" data variable (vaccination, etc...).
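For illustration, the per-region count described above could be computed along these lines. This is only a sketch: the record layout and the field names `region` and `pathogen` are assumptions made for the example, not the app's actual data model.

```python
from collections import Counter

# Hypothetical encounter records; the field names are assumed for illustration.
encounters = [
    {"region": "region_a", "pathogen": "H3N2", "vaccinated": True},
    {"region": "region_a", "pathogen": "H3N2", "vaccinated": False},
    {"region": "region_a", "pathogen": "H1N1", "vaccinated": False},
    {"region": "region_b", "pathogen": "H3N2", "vaccinated": True},
]

def count_by_region(records, pathogen=None):
    """Count encounters per region, optionally restricted to one pathogen."""
    return Counter(
        r["region"]
        for r in records
        if pathogen is None or r["pathogen"] == pathogen
    )

# Subset to H3N2, then count per region.
h3n2_counts = count_by_region(encounters, pathogen="H3N2")
```

The count is independent of whichever secondary variable (vaccination, sex, age) is used for coloring, which is the point being made here.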

This PR addresses this by swapping the choropleth for a circle sized according to the absolute total count from each region. There is certainly more to do here in terms of pie charts vs circles and color embeddings, but I believe this is separable from choropleth vs circle.

Here are two examples of changes in app behavior:

Neighborhood / vaccination via choropleth:

1-old

Neighborhood / vaccination via circle:

1-new

Census tract / sex via choropleth:

2-old

Census tract / sex via circle:

2-new

You can especially see in the census tract version that many of the choropleth colorings are based on a single data point and are overly emphasized. Drawing these as tiny circles properly de-emphasizes those census tracts. I can see meaning in the latter, but not in the former.

I believe the choropleths were not presenting the core thing they needed to present, that is, the count of encounters in a region. Incidence is the primary data variable we want to display. Sex, age, vaccination, etc. are secondary colorings.

This keeps the same color encoding as before, but rather than coloring the full GeoJSON path, we color a circle instead. This circle is sized according to the absolute total count from the region.
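As a rough sketch of the sizing step, one common choice is to scale the radius with the square root of the count so that circle area, not radius, is proportional to the count. Whether this PR scales area or radius is not stated here, so treat that choice, along with the function name and the `max_radius` default, as assumptions.

```python
import math

def circle_radius(count, max_count, max_radius=30.0):
    """Return a radius such that circle area is proportional to count.

    max_radius (in pixels) is an assumed value for the largest region.
    """
    if count <= 0 or max_count <= 0:
        return 0.0
    # Area grows with radius squared, so take the square root of the share.
    return max_radius * math.sqrt(count / max_count)

# A region with a quarter of the largest count gets half the radius.
radius_small = circle_radius(25, 100)
radius_large = circle_radius(100, 100)
```

With this scaling, a region with a single encounter is drawn as a very small circle, which matches the de-emphasis of sparse census tracts described above.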

@trvrb changed the title from "Encoding absolute count to circle size" to "Encode absolute count to circle size" on May 25, 2019

@jotasolano
Contributor

Hi Trevor, please see my answers below:

I believe the choropleths were not presenting the core thing they needed to present, that is the count of encounters in a region. Incidence is the primary data variable we want to display.

I agree with you! When we look at our boards for "competitive analysis", incidence is the primary metric that is shown (or at least the most general). In the InVision mockups, this is shown in the first slide, and you could get to it by selecting the "prevalence" variable under the "modeling" mode (https://invis.io/J3QDSNMT2D9#/353703459_1_Dynamics_modeling_prevalence). The missing table there is the one in Observable:

image

There is no easy way in the existing app framework to display a very simple thing[...]

Maybe what we did in that InVision mockup is the way we could implement this ☝️, although I'm still curious to test this flow and see whether people consider that "incidence" belongs to "modeling" (semantics/mental-model wise).

This PR addresses this by swapping the choropleth for a circle sized according to absolute total count from this region.

I think this is a step in the right direction! However, I do have concerns about multiple variable encoding and the cognitive load required for a good-enough interpretation of the charts: when I look at the last two pictures (incidence in choropleth vs as radius of circles) I have a hard time interpreting the latter. This is probably because the circles are encoding two variables (size of the population as radius and incidence as color). It's also likely that the color scale is not ideal, as a single color ramp would be more adequate for this type of data (e.g. light red to dark red).
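To make the single-ramp suggestion concrete, here is a minimal sketch of a light-red to dark-red sequential scale. The two endpoint colors and the function name are assumptions chosen for illustration, and plain linear interpolation in RGB is not perceptually uniform, so a real implementation would probably use an established sequential palette instead.

```python
def sequential_color(value, vmin, vmax,
                     light=(254, 224, 210), dark=(165, 15, 21)):
    """Map value in [vmin, vmax] onto a single-hue ramp and return a hex color.

    The light and dark endpoints are assumed example colors.
    """
    if vmax == vmin:
        t = 0.0
    else:
        # Clamp the normalized position to [0, 1].
        t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    rgb = tuple(round(a + t * (b - a)) for a, b in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*rgb)

lowest = sequential_color(0, 0, 1)
highest = sequential_color(1, 0, 1)
```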

It's also known that color perception is affected by its area and neighboring colors, so having very small circles can also pose a problem for the interpretation of the data. I think most incidence visualizations solve this by not using raw values and instead normalizing the numbers. I guess the question we need to ask here is: am I interested in looking at raw values and total incidence, or am I interested in being able to compare incidence across every deme/region? Perhaps we could provide a toggle between a normalized choropleth and a bubble map with absolute values?
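A toggle like this could be backed by something as simple as the sketch below, which switches between raw counts and a per-1,000 rate. The population denominators are hypothetical (this thread does not say whether they are available per region), and the function names are made up for the example.

```python
def incidence_per_1000(counts, populations):
    """Normalize raw counts to cases per 1,000 residents.

    Regions without a known, non-zero population are skipped.
    """
    return {
        region: 1000.0 * n / populations[region]
        for region, n in counts.items()
        if populations.get(region)
    }

def map_values(counts, populations, normalized):
    """Return the values to draw: rates for a choropleth, raw counts for bubbles."""
    if normalized:
        return incidence_per_1000(counts, populations)
    return dict(counts)

# Hypothetical inputs for illustration only.
counts = {"region_a": 5, "region_b": 1, "region_c": 2}
populations = {"region_a": 1000, "region_b": 500}

rates = map_values(counts, populations, normalized=True)
raw = map_values(counts, populations, normalized=False)
```

Note how `region_b` has the fewest cases in absolute terms but a rate comparable to `region_a`, which is the kind of comparison across regions that raw values hide.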

Finally, the heat map I implemented should work almost out of the box with incidence data, and we'd get an incidence table much like the one we have in Observable. This would accompany the map nicely, I think.

@jameshadfield
Member

The map always displays the "secondary" data variable (vaccination, etc...).

Note that this is called the "Primary" variable in the viz. I think all that's missing is the correct options for this variable, i.e. the default option should be "incidence" or "counts".

Would be happy to discuss things in person next week.
