Commit

Update language and charts for better flow
joelostblom committed Oct 7, 2023
1 parent 2e3809b commit 5978764
Showing 2 changed files with 27 additions and 23 deletions.
40 changes: 23 additions & 17 deletions doc/user_guide/encodings/index.rst
@@ -198,44 +198,50 @@ Effect of Data Type on Axis Scales
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Similarly, for x and y axis encodings, the type used for the data will affect
the scales used and the characteristics of the mark. For example, here is the
difference between a ``quantitative``, ``ordinal`` and ``temporal`` scale for a column
difference between an ``ordinal``, ``quantitative``, and ``temporal`` scale for a column
that contains integers specifying a year:

.. altair-plot::

pop = data.population()
# Convert integer to string data type
pop.year = pop.year.astype(str)

base = alt.Chart(pop).mark_line().encode(
alt.Y('mean(people):Q').title('total population')
base = alt.Chart(pop).mark_bar().encode(
alt.Y('mean(people):Q').title('Total population')
).properties(
width=150,
height=200
width=140,
height=140
)

alt.hconcat(
base.encode(x='year:Q').properties(title='year=quantitative'),
base.encode(x='year:O').properties(title='year=ordinal'),
base.encode(x='year:T').properties(title='year=temporal')
base.encode(x='year:O').properties(title='ordinal'),
base.encode(x='year:Q').properties(title='quantitative'),
base.encode(x='year:T').properties(title='temporal')
)

Because quantitative values do not have an inherent width, the bars do not
Because values on quantitative and temporal scales do not have an inherent width, the bars do not
fill the entire space between the values.
This view also makes clear the missing year of data that was not immediately
apparent when we treated the years as categories.
These scales clearly show the missing year of data that was not immediately
apparent when we treated the years as ordinal data,
but the axis formatting is undesirable in both cases.

To plot the year data in four-digit format, i.e. without a thousands separator,
we recommend converting the integers to a string data type first and then storing the data as temporal,
since directly converting integers to a temporal data type would result in an error.
To plot four digit integers as years with proper axis formatting,
i.e. without thousands separator,
we recommend converting the integers to strings first,
and then specifying a temporal data type in Altair.
While it is also possible to change the axis format with ``.axis(format='i')``,
it is preferred to specify the appropriate data type to Altair.

.. altair-plot::

pop['year'] = pop['year'].astype(str)

base.mark_bar().encode(x='year:T').properties(title='temporal')

This kind of behavior is sometimes surprising to new users, but it emphasizes
the importance of thinking carefully about your data types when visualizing
data: a visual encoding that is suitable for categorical data may not be
suitable for quantitative data or temporal data, and vice versa.
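
As a rough illustration of why the string conversion recommended above matters,
the sketch below uses only the Python standard library and does not call Altair.
It rests on an assumption about the renderer: that a bare number in a temporal
field is read as a timestamp in milliseconds since the Unix epoch, whereas a
four-digit string is parsed as a calendar year. The variable names are
hypothetical and are not part of the documentation being changed.

```python
from datetime import datetime, timezone

# Hypothetical sample of integer years, similar in shape to a "year" column.
years = [1850, 1900, 2000]

# Assumption: a number used as a temporal value is treated as milliseconds
# since 1970-01-01 UTC, so every "year" lands within seconds of the epoch.
as_timestamps = [
    datetime.fromtimestamp(y / 1000, tz=timezone.utc) for y in years
]

# Casting to a string first lets the value be parsed as a calendar year.
as_years = [datetime.strptime(str(y), "%Y") for y in years]

for y, ts, yr in zip(years, as_timestamps, as_years):
    print(y, "| read as timestamp:", ts.isoformat(), "| read as year:", yr.date())
```

Under that assumption, the integer interpretation collapses all values into
the first few seconds of 1970, while the string interpretation keeps one
point per year, which is the behavior the temporal axis needs.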


.. _shorthand-description:

Encoding Shorthands
10 changes: 4 additions & 6 deletions doc/user_guide/times_and_dates.rst
@@ -50,14 +50,12 @@ example, we'll limit ourselves to the first two weeks of data:
y='temp:Q'
)

(notice that for date/time values we use the ``T`` to indicate a temporal
Notice that for date/time values we use the ``T`` to indicate a temporal
encoding: while this is optional for pandas datetime input, it is good practice
to specify a type explicitly; see :ref:`encoding-data-types` for more discussion.

If you want to plot integers as four digit year format stored as temporal data,
please see the :ref:`type-axis-scale`).


If you want Altair to plot four-digit integers as years,
you need to cast them as strings before changing the data type to temporal;
please see :ref:`type-axis-scale` for details.

For date-time inputs like these, it can sometimes be useful to extract particular
time units (e.g. hours of the day, dates of the month, etc.).