Feature radial heatmap (#2155)
mansenfranzen authored and jlstevens committed Mar 19, 2018
1 parent ddc3585 commit 6a127d0
Showing 18 changed files with 1,913 additions and 175 deletions.
Binary file added examples/assets/nyc_taxi.csv.gz
Binary file not shown.
128 changes: 128 additions & 0 deletions examples/gallery/demos/bokeh/nyc_radial_heatmap.ipynb
@@ -0,0 +1,128 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most examples work equivalently across multiple plotting backends; this example is also available for:\n",
"\n",
"* [Matplotlib - radial_heatmap](../matplotlib/radial_heatmap.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"import holoviews as hv\n",
"hv.extension(\"bokeh\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Declaring data\n",
"\n",
"### NYC Taxi Data\n",
"\n",
"Let's dive into a concrete example, namely the New York - Taxi Data ([For-Hire Vehicle (“FHV”) records](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)). The following data contains hourly pickup counts for the entire year of 2016. \n",
"\n",
"**Considerations**: Thinking about taxi pickup counts, we might expect higher taxi usage during business hours. In addition, public holidays should be clearly distinguishable from regular business days. Furthermore, we might expect high taxi pickup counts during Friday and Saturday nights.\n",
"\n",
"**Design**: In order to model the above ideas, we assign days with an hourly split to the *radial segments* and week of year to the *annulars*. This will allow us to detect daily/hourly periodicity and weekly trends. To get more familiar with the mapping of segments and annulars, take a look at the following radial heatmap:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# load example data\n",
"df_nyc = pd.read_csv(\"../../../assets/nyc_taxi.csv.gz\", parse_dates=[\"Pickup_date\"])\n",
"\n",
"# create relevant time columns\n",
"df_nyc[\"Day & Hour\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%A %H:00\")\n",
"df_nyc[\"Week of Year\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"Week %W\")\n",
"df_nyc[\"Date\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%Y-%m-%d\")\n",
"\n",
"heatmap = hv.HeatMap(df_nyc, [\"Day & Hour\", \"Week of Year\"], [\"Pickup_Count\", \"Date\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Plot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**At first glance**: First, let's take a closer look at the mentioned segments and annulars. **Segments** correspond to *hours of a given day* whereas **annulars** represent entire *weeks*. If you use the hover tool, you will quickly get an idea of how segments and annulars are organized. **Color** encodes the pickup values, with blue being low and red being high.\n",
"\n",
"**Plot improvements**: The above plot clearly shows systematic patterns; however, the default plot options make it somewhat hard to read. Therefore, before we dive into the results, let's increase the readability of the given plot:\n",
"\n",
"- **Remove annular ticks**: The information about week of year is not very important. Therefore, we hide it via `yticks=None`.\n",
"- **Custom segment ticks**: Right now, segment labels are given via day and hour. We don't need hourly information and we want every day to be labeled. We can pass a tuple of labels via `xticks=(\"Friday\", ..., \"Thursday\")`.\n",
"- **Add segment markers**: Moreover, we want to help the viewer distinguish each day more clearly. Hence, we provide marker lines via `xmarks=7`.\n",
"- **Rotate heatmap**: The week starts with Monday and ends with Sunday. Accordingly, we want to rotate the plot so that Sunday and Monday are at the top. This can be done via `start_angle=np.pi*19/14`. The default order is defined by the global sort order which is present in the data. The default starting angle is at 12 o'clock.\n",
"\n",
"Let's see the result of these modifications:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%opts HeatMap [radial=True width=600 height=600 yticks=None xmarks=7 ymarks=3 start_angle=np.pi*19/14]\n",
"%%opts HeatMap [xticks=(\"Friday\", \"Saturday\", \"Sunday\", \"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\")]\n",
"\n",
"heatmap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After tweaking the plot defaults, we're comfortable with the given visualization and can focus on the story the plot tells us.\n",
"\n",
"**There are many interesting findings in this visualization:**\n",
"\n",
"1. Taxi pickup counts are high between 7-9am and 5-10pm on weekdays, which matches business hours as expected. In contrast, during weekends, there is not much going on until 11am.\n",
"2. Friday and Saturday nights clearly stand out with the highest pickup densities, as expected.\n",
"3. Public holidays can be easily identified. For example, taxi pickup counts are comparatively low around Christmas and Thanksgiving.\n",
"4. Weather phenomena also influence taxi service. There is a very dark blue stripe at the beginning of the year, starting on Saturday the 23rd and lasting until Sunday the 24th of January. Interestingly, this coincides with one of the [biggest blizzards](https://www.weather.gov/okx/Blizzard_Jan2016) in the history of NYC."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
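The three time columns created in the notebook above are plain `strftime` formats. As a minimal sketch using only the standard library (the helper name `time_keys` is hypothetical and not part of the notebook), a single timestamp maps to its segment key, annular key, and date label roughly like this:

```python
from datetime import datetime


def time_keys(ts):
    """Return (segment, annular, date) labels for one pickup timestamp.

    Hypothetical helper: it mirrors the strftime formats used in the notebook.
    """
    segment = ts.strftime("%A %H:00")  # weekday name plus hour of day
    annular = ts.strftime("Week %W")   # week of year, weeks starting on Monday
    date = ts.strftime("%Y-%m-%d")     # calendar date, kept as an extra value dimension
    return segment, annular, date


# A pickup during the January 2016 blizzard mentioned in the findings.
print(time_keys(datetime(2016, 1, 23, 14, 30)))
```

With `%W`, days before the first Monday of the year fall into week 00, so 2016 begins with a short `Week 00` annular covering Friday, January 1st through Sunday, January 3rd. The weekday names produced by `%A` depend on the active locale.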
128 changes: 128 additions & 0 deletions examples/gallery/demos/matplotlib/nyc_radial_heatmap.ipynb
@@ -0,0 +1,128 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most examples work equivalently across multiple plotting backends; this example is also available for:\n",
"\n",
"* [Bokeh - radial_heatmap](../bokeh/radial_heatmap.ipynb)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"import holoviews as hv\n",
"hv.extension(\"matplotlib\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Declaring data\n",
"\n",
"### NYC Taxi Data\n",
"\n",
"Let's dive into a concrete example, namely the New York - Taxi Data ([For-Hire Vehicle (“FHV”) records](http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml)). The following data contains hourly pickup counts for the entire year of 2016. \n",
"\n",
"**Considerations**: Thinking about taxi pickup counts, we might expect higher taxi usage during business hours. In addition, public holidays should be clearly distinguishable from regular business days. Furthermore, we might expect high taxi pickup counts during Friday and Saturday nights.\n",
"\n",
"**Design**: In order to model the above ideas, we assign days with an hourly split to the *radial segments* and week of year to the *annulars*. This will allow us to detect daily/hourly periodicity and weekly trends. To get more familiar with the mapping of segments and annulars, take a look at the following radial heatmap:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# load example data\n",
"df_nyc = pd.read_csv(\"../../../assets/nyc_taxi.csv.gz\", parse_dates=[\"Pickup_date\"])\n",
"\n",
"# create relevant time columns\n",
"df_nyc[\"Day & Hour\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%A %H:00\")\n",
"df_nyc[\"Week of Year\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"Week %W\")\n",
"df_nyc[\"Date\"] = df_nyc[\"Pickup_date\"].dt.strftime(\"%Y-%m-%d\")\n",
"\n",
"heatmap = hv.HeatMap(df_nyc, [\"Day & Hour\", \"Week of Year\"], [\"Pickup_Count\", \"Date\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Plot"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**At first glance**: First, let's take a closer look at the mentioned segments and annulars. **Segments** correspond to *hours of a given day* whereas **annulars** represent entire *weeks*. If you use the hover tool (available in the Bokeh version of this example), you will quickly get an idea of how segments and annulars are organized. **Color** encodes the pickup values, with blue being low and red being high.\n",
"\n",
"**Plot improvements**: The above plot clearly shows systematic patterns; however, the default plot options make it somewhat hard to read. Therefore, before we dive into the results, let's increase the readability of the given plot:\n",
"\n",
"- **Remove annular ticks**: The information about week of year is not very important. Therefore, we hide it via `yticks=None`.\n",
"- **Custom segment ticks**: Right now, segment labels are given via day and hour. We don't need hourly information and we want every day to be labeled. We can pass a tuple of labels via `xticks=(\"Friday\", ..., \"Thursday\")`.\n",
"- **Add segment markers**: Moreover, we want to help the viewer distinguish each day more clearly. Hence, we provide marker lines via `xmarks=7`.\n",
"- **Rotate heatmap**: The week starts with Monday and ends with Sunday. Accordingly, we want to rotate the plot so that Sunday and Monday are at the top. This can be done via `start_angle=np.pi*19/14`. The default order is defined by the global sort order which is present in the data. The default starting angle is at 12 o'clock.\n",
"\n",
"Let's see the result of these modifications:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%opts HeatMap [radial=True fig_size=300 yticks=None xmarks=7 ymarks=3 start_angle=np.pi*19/14]\n",
"%%opts HeatMap [xticks=(\"Friday\", \"Saturday\", \"Sunday\", \"Monday\", \"Tuesday\", \"Wednesday\", \"Thursday\")]\n",
"\n",
"heatmap"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After tweaking the plot defaults, we're comfortable with the given visualization and can focus on the story the plot tells us.\n",
"\n",
"**There are many interesting findings in this visualization:**\n",
"\n",
"1. Taxi pickup counts are high between 7-9am and 5-10pm on weekdays, which matches business hours as expected. In contrast, during weekends, there is not much going on until 11am.\n",
"2. Friday and Saturday nights clearly stand out with the highest pickup densities, as expected.\n",
"3. Public holidays can be easily identified. For example, taxi pickup counts are comparatively low around Christmas and Thanksgiving.\n",
"4. Weather phenomena also influence taxi service. There is a very dark blue stripe at the beginning of the year, starting on Saturday the 23rd and lasting until Sunday the 24th of January. Interestingly, this coincides with one of the [biggest blizzards](https://www.weather.gov/okx/Blizzard_Jan2016) in the history of NYC."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
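The value `start_angle=np.pi*19/14` used in both gallery notebooks can be checked with a little arithmetic. The sketch below assumes that angles are given in radians and measured counterclockwise from 3 o'clock, so that the 12 o'clock default mentioned in the text equals π/2. That convention and the helper name `offset_in_days` are assumptions for illustration, not something stated in the commit.

```python
import math


def offset_in_days(start_angle, n_days=7):
    """Express the rotation away from 12 o'clock in units of one day-wide sector.

    Assumption: 12 o'clock corresponds to pi/2 radians.
    """
    twelve_oclock = math.pi / 2
    day_width = 2 * math.pi / n_days  # angular width covered by one day
    return (start_angle - twelve_oclock) / day_width


# 19*pi/14 - pi/2 = 12*pi/14 = 3 * (2*pi/7)
print(round(offset_in_days(math.pi * 19 / 14), 6))
```

Under that assumption the plot is rotated by three of the seven day-wide sectors. Since the data begins on a Friday, this is consistent with Friday, Saturday and Sunday being shifted to one side so that the Sunday/Monday boundary lands at the top.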
126 changes: 126 additions & 0 deletions examples/reference/elements/bokeh/RadialHeatMap.ipynb
@@ -0,0 +1,126 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<div class=\"contentcontainer med left\" style=\"margin-left: -50px;\">\n",
"<dl class=\"dl-horizontal\">\n",
" <dt>Title</dt> <dd> HeatMap Element (radial) </dd>\n",
" <dt>Dependencies</dt> <dd>Bokeh</dd>\n",
" <dt>Backends</dt> <dd><a href='./RadialHeatMap.ipynb'>Bokeh</a></dd> <dd><a href='../matplotlib/RadialHeatMap.ipynb'>Matplotlib</a></dd>\n",
"</dl>\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import holoviews as hv\n",
"hv.extension('bokeh')\n",
"\n",
"%opts HeatMap [radial=True width=800 height=800 tools=[\"hover\"]]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A radial ``HeatMap`` is well suited to discover **periodic patterns** and **trends** in **time series** data and other cyclic variables. A radial HeatMap can be plotted simply by activating the ``radial`` plot option on the ``HeatMap`` element. \n",
"\n",
"Here we will create a synthetic dataset of a value varying by the hour of the day and day of the week:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"days = 31\n",
"hours = 24\n",
"size = days*hours\n",
"\n",
"def generate_hourly_periodic_data(x):\n",
"    periodic_weekly = np.sin(x*2*np.pi / (24*7))\n",
"    periodic_daily = np.sin(x*2*np.pi / 24)\n",
"    noise = np.random.random(size=x.size)\n",
"    return periodic_weekly + periodic_daily + noise\n",
"\n",
"x = np.linspace(0, size, size)\n",
"y = generate_hourly_periodic_data(x)\n",
"\n",
"date_index = pd.date_range(start=\"2017-10-01\", freq=\"h\", periods=size)\n",
"kdim_segment = date_index.strftime(\"%H:%M\")\n",
"kdim_annular = date_index.strftime(\"%A %d\")\n",
"\n",
"df = pd.DataFrame({\"values\": y, \"hour\": kdim_segment, \"day\": kdim_annular}, index=date_index)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As with a regular ``HeatMap``, the data should consist of two index variables or key dimensions and one or more value dimensions. Here we declare 'hour' and 'day' as the key dimensions. For a radial HeatMap to make sense, the first key dimension, which is mapped to the radial segments (the angular position around the circle), should be periodic. Here the variable is 'hour', starting at midnight at the top:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%opts HeatMap [radial=True]\n",
"hv.HeatMap(df, [\"hour\", \"day\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The resulting plot is quite bare, so we may want to customize it. There are a number of things we can do to make the plot clearer:\n",
"\n",
"1. Increase the inner padding with the ``radius_inner`` option.\n",
"2. Increase the number of ticks along the radial axis using ``xticks``.\n",
"3. Add radial separator marks with the ``xmarks`` option.\n",
"4. Change the colormap using the ``cmap`` style option."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%opts HeatMap [radial=True xmarks=8 ymarks=4] (cmap='viridis')\n",
"hv.HeatMap(df, [\"hour\", \"day\"])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
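The reference notebook labels the annulars with `"%A %d"` rather than with the weekday name alone. A short standard-library sketch (variable names are illustrative only) suggests why: a heatmap needs one value per pair of key dimensions, and only the combined label keeps all 31 days of October 2017 distinct.

```python
from datetime import datetime, timedelta

days, hours = 31, 24
start = datetime(2017, 10, 1)
stamps = [start + timedelta(hours=h) for h in range(days * hours)]

segments = [ts.strftime("%H:%M") for ts in stamps]  # first key dimension
annulars = [ts.strftime("%A %d") for ts in stamps]  # second key dimension
weekdays = [ts.strftime("%A") for ts in stamps]     # weekday name only

unique_pairs = set(zip(segments, annulars))
weekday_pairs = set(zip(segments, weekdays))

# 24 segments, 31 annulars, and one cell per hourly sample
print(len(set(segments)), len(set(annulars)), len(unique_pairs))
# With weekday names alone, several samples would share the same cell
print(len(weekday_pairs))
```

Here every one of the 744 hourly samples gets its own (segment, annular) cell, whereas keying on the weekday name alone would leave only 168 distinct cells for the same samples.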