Bokeh Journal

News and updates for all things Bokeh.

Chartify: A Quick Review

04 December 2018

Chartify is a new plotting library that was recently open-sourced by Spotify Labs. You can read their announcement article here. Chartify is intended to make it easy for Python users to create standard chart types, including line, bar and area charts, and is built on top of Bokeh. As a Bokeh core contributor, I quickly experimented with Chartify to see what it’s like.

tl;dr I’m impressed. Chartify offers a clean API to ingest tidy data and generate a variety of visually pleasing charts, while also exposing the underlying Bokeh figure for further customization. I’m excited about this addition to the Python data visualization ecosystem.

[Figure: horizontal histogram, taken from the Chartify Examples Notebook]

Why does Chartify build on Bokeh?

Bokeh is a tool for creating web-based, interactive visualizations and offers a lot of primitives (like lines and circles) that users combine into highly customized visualizations. However, using primitives means that users may be required to do extra data manipulation to create their desired plot. For example, there’s no from bokeh import StackedBarChart. Users can certainly create such a chart using Bokeh, but doing so requires figuring out how to transform their data into beginning and end positions for the stacked bars. Chartify aims to abstract away this data transformation step for users making standard chart types.
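To make that data-manipulation step concrete, here is a minimal sketch in plain Python of computing the bottom and top edge of each segment in one stacked bar. The helper name `stack_positions` and the sample quantities are hypothetical, made up for illustration. The resulting pairs are the kind of values you would pass as `bottom` and `top` to a Bokeh bar glyph such as `vbar`.

```python
from itertools import accumulate


def stack_positions(values):
    """Return (bottom, top) pairs for segments stacked in the given order."""
    tops = list(accumulate(values))   # running total = top edge of each segment
    bottoms = [0] + tops[:-1]         # each segment starts where the previous one ended
    return list(zip(bottoms, tops))


# Hypothetical quantities for a single bar, one segment per fruit.
quantities = {"Apple": 57, "Banana": 30, "Grape": 54}
positions = dict(zip(quantities, stack_positions(quantities.values())))
```

Every bar in the chart needs this running-total step, which is the bookkeeping a higher-level charting API can take off the user's hands.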

Using Tidy Data:

Chartify consumes tidy data, a data formatting concept that originated in the R ecosystem. You can read the whole explanation here, but the synopsis is that a tidy dataset is structured so that:

  • Each variable forms a column
  • Each observation forms a row
  • Each type of observational unit forms a table

To fully understand, it might be easier to look at an example of each:

Tidy Data Example:

         date country   fruit  unit_price  quantity  total_price
0  2017-10-21      US  Banana    0.303711         4     1.214846
1  2017-05-30      JP  Banana    0.254109         4     1.016436
2  2017-05-21      CA  Banana    0.268635         4     1.074539
3  2017-09-18      BR   Grape    2.215277         2     4.430554
4  2017-12-08      US  Banana    0.308337         5     1.541687

Untidy Data Example:

country   BR   CA   GB   JP   US
fruit
Apple     57  144  177   65  165
Banana    30  222  113  232  479
Grape     54   86   59   52   81
Orange    74  207   97   75  409

You can see that each row in the tidy dataset contains a unique observation, composed of one value for each variable. In the untidy dataset, each row is instead a summary for one type of fruit, with the values of the country variable spread across the column headers.

Data analysis tools like Pandas are generally designed to consume data that matches this standard. Since Chartify is a Python library, you can read about tidying data in Pandas from Pandas core contributor Tom Augspurger here. This is especially relevant because Chartify ingests tidy Pandas DataFrames for plotting, which is hugely valuable because users don’t have to do any special data transformation in order to create visualizations.
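As a rough illustration of tidying in Pandas, the sketch below reshapes a small slice of the untidy table above into tidy form with `DataFrame.melt`. The variable names `wide` and `tidy` are my own, and only two fruits and two countries are included to keep it short.

```python
import pandas as pd

# Untidy (wide) table: one row per fruit, one column per country.
wide = pd.DataFrame(
    {"BR": [57, 30], "CA": [144, 222]},
    index=pd.Index(["Apple", "Banana"], name="fruit"),
)

# Move the fruit index into a regular column, then melt the
# country columns into (country, quantity) rows.
tidy = wide.reset_index().melt(
    id_vars="fruit", var_name="country", value_name="quantity"
)
```

Note that the result has one row per fruit and country total. Melting changes the layout, but it cannot recover the individual sales that were summed to produce the summary table.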

The Chartify API

Chartify users create a chartify.Chart object and specify one of a few enumerated axis types for the x and y axes. The resulting Chart object will contain a set of appropriate plotting methods for your axis pair type. For example, using a "datetime" x-axis and a linear y-axis means that a line chart is a good idea and a bar chart is not, because bar charts are typically intended for categorical data. I think this is great - Bokeh tries very hard to help users make effective visualizations by having nice defaults, and I think these opinionated guardrails are good.

The allowed axis types:

x_axis_type (enum, str):

  • linear
  • log
  • datetime
  • categorical
  • density

y_axis_type (enum, str):

  • linear
  • log
  • categorical
  • density

As of release 2.3.5, Chartify offers the following chart types for the corresponding x and y axis types:

X Axis Below / Y Axis Right   linear/log/datetime          categorical                density
linear/log/datetime           line, scatter, text, area    bar, lollipop, parallel    kde, histogram
categorical                   bar, lollipop, parallel      heatmap                    kde, histogram
density                       kde, histogram               kde, histogram             hexbin

(Note: both area and bar include stacked area and bar charts)
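One way to picture these guardrails is as a lookup from an axis-type pair to the chart types the table above allows. The sketch below is only an illustration of that idea and is not how Chartify implements it. The names `AXIS_GROUP`, `CHARTS`, and `available_charts` are hypothetical, and the entries are transcribed from the table.

```python
# Bucket each axis type into the three groups used by the table.
AXIS_GROUP = {
    "linear": "numeric",
    "log": "numeric",
    "datetime": "numeric",
    "categorical": "categorical",
    "density": "density",
}

# (x group, y group) -> chart types, transcribed from the table.
CHARTS = {
    ("numeric", "numeric"): ["line", "scatter", "text", "area"],
    ("numeric", "categorical"): ["bar", "lollipop", "parallel"],
    ("numeric", "density"): ["kde", "histogram"],
    ("categorical", "numeric"): ["bar", "lollipop", "parallel"],
    ("categorical", "categorical"): ["heatmap"],
    ("categorical", "density"): ["kde", "histogram"],
    ("density", "numeric"): ["kde", "histogram"],
    ("density", "categorical"): ["kde", "histogram"],
    ("density", "density"): ["hexbin"],
}


def available_charts(x_axis_type, y_axis_type):
    """Look up the chart types listed for a given x/y axis type pair."""
    return CHARTS[(AXIS_GROUP[x_axis_type], AXIS_GROUP[y_axis_type])]
```

For instance, `available_charts("datetime", "linear")` contains "line" but not "bar", which matches the example of a datetime x-axis paired with a linear y-axis.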

While there’s endless potential to add more, I think Chartify more than covers the necessary charts for general report generation.

Using the “chartify.Chart.plot” methods

Users pass their tidy dataframe into their chosen plotting method and use keyword arguments to specify which column names correspond to which visualization properties. In this case, I created a grouped bar chart by specifying the "country" and "fruit" columns for the groupings and the "quantity" column for the data value. Additionally, I passed in optional kwargs to set the bar colors and ordering.

import chartify

# tidy_data is the tidy DataFrame from the example above.
# Sum the quantity sold for each fruit/country pair.
quantity_by_fruit_and_country = (tidy_data.groupby(
    ['fruit', 'country'])['quantity'].sum().reset_index())

ch = chartify.Chart(blank_labels=True, x_axis_type='categorical', y_axis_type='linear')
ch.set_title("Fruit by Country")
ch.set_subtitle("Change categorical order with 'categorical_order_by'.")
ch.plot.bar(
    data_frame=quantity_by_fruit_and_country,
    categorical_columns=['country', 'fruit'],
    numeric_column='quantity',
    color_column='country',  # optional
    categorical_order_by='labels',  # optional
    categorical_order_ascending=True  # optional
)
ch.axes.set_xaxis_tick_orientation('vertical')
ch.show()

I’m very excited that Chartify exposes the Bokeh Figure object that it creates on the chart’s .figure property. This means users get the wonderful functionality of a nice charting API while also being able to drop down to Bokeh-level APIs to further customize their plots. In this example, I modified a Chartify scatter plot to add a custom HoverTool and make the figure size responsive. (You can test this by hovering over the plot and dragging the browser window larger and smaller.)

from bokeh.models import HoverTool

ch = chartify.Chart(blank_labels=True, x_axis_type='datetime', y_axis_type='linear')
ch.plot.scatter(
    data_frame=tidy_data,
    x_column="date",
    y_column="total_price",
    size_column='quantity',
    color_column='fruit')

hover = HoverTool(tooltips=[
    ("Total Price (M $)", "@total_price"),
    ("Quantity Sold (M Units)", "@quantity"),
])

# Access the underlying Bokeh Figure object
ch.figure.add_tools(hover)
ch.figure.sizing_mode = 'scale_width'

ch.show()

Beyond the Chart.plot methods and accessing the Bokeh figure via Chart.figure, Chartify also offers interfaces to modify plot styles, add annotations, and format the axes.

The chartify.Chart interfaces:

  • Styling (.style)
  • Plotting (.plot)
  • Callouts (.callout)
  • Axes (.axes)
  • Bokeh figure (.figure)

You can view more demonstrations of these in Chartify’s examples notebook here.

Summation

Chartify offers a pleasant high-level interface for ingesting tidy data and generating a variety of visually pleasing charts, while also exposing the underlying Bokeh object for further customization. I’m excited about this addition to the Python data visualization ecosystem.