Creating a line chart
A line chart often represents trends over time, and typically requires numeric values plotted as lines over date-related fields.
In this recipe, we will create a time series graph (or line chart) that shows the number of floods over time. In addition, we will show the number of fatalities and represent this as the width of the line. This makes it easier to see when the most flood-related fatalities occurred.
Getting ready
To follow this recipe, open B05527_01 – STARTER.twbx. Use the worksheet called Line, and connect to the MasterTable (FlooddataMasterListrev) data source.
How to do it...
The following are the steps to create a time series graph in Tableau:
- From Dimensions, drag Began to Columns. Tableau will automatically choose the YEAR level for this date field.
- Right-click on the Null column heading in YEAR(Began), and select Exclude.
- From Measures, drag Number of Records to Rows.
- From Measures, drag Dead to Size in the Marks card.
- Click on the top-right drop-down arrow of the size legend and choose Edit title….
- Edit the title of the size legend to display Fatalities.
- From Measures, drag Dead to Color in the Marks card.
- Click on the top-right drop-down arrow of the color legend and choose Edit title….
- Edit the title of the color legend to display Fatalities.
- Click on Color in the Marks card. Under the Effects section, select the middle marker.
- Click on Size in the Marks card. Drag the size control to the right to make the marks in your chart bigger.
- Annotate your marks. In this sample, I have chosen to annotate 2003 (the year with the highest number of floods) and 1991, 2004, and 2008 (the years with the highest numbers of fatalities). To annotate the marks, you can right-click on the mark in your chart and select Annotate, and then Mark.
The annotation for the year 2003 looks like the following:
- Hide the Began field label at the top of the chart by right-clicking it, and selecting Hide field label for columns.
- Right-click on the Number of Records axis, and change the title to # Floods.
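Outside Tableau, the aggregation that these steps produce can be sketched in a few lines of Python. This is only an illustration of the idea, not what Tableau runs internally: the field names Began and Dead come from the recipe, while the sample rows and the summarize_by_year helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows; only the field names Began and Dead come from the recipe.
records = [
    {"Began": date(2003, 1, 5), "Dead": 10},
    {"Began": date(2003, 7, 19), "Dead": 4},
    {"Began": date(2004, 3, 2), "Dead": 25},
    {"Began": None, "Dead": 1},  # missing start date, which Tableau shows as Null
]


def summarize_by_year(rows):
    """Return {year: (flood_count, fatalities)}, skipping rows with no Began date."""
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        if row["Began"] is None:  # plays the role of excluding the Null heading
            continue
        year = row["Began"].year
        totals[year][0] += 1            # one record per flood (Number of Records)
        totals[year][1] += row["Dead"]  # SUM(Dead)
    return {year: tuple(pair) for year, pair in sorted(totals.items())}


summary = summarize_by_year(records)
for year, (floods, fatalities) in summary.items():
    print(year, floods, fatalities)
```

Each year ends up with two numbers: the flood count, which drives the height of the line, and the fatality total, which drives its width and color.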
How it works...
By default, Tableau creates a line chart—also called a time series chart—when a Date (or Date & Time) field is placed in either the Rows or Columns shelf. A Date field is presented in the side bar with a small calendar icon, while a Date & Time field will be a small calendar icon with a clock. Line charts are best used when you want to see patterns over time.
In this recipe, when we dragged the Began field onto Columns, the field showed up as YEAR(Began) instead of just Began.
Date fields in Tableau have a natural hierarchy—meaning you can roll up to a higher date level like YEAR, or drill down to a lower level like DAY. In this case, we see a blue YEAR(Began) pill, which indicates Tableau presented the years as a series of discrete headers.
Note
Discrete and continuous fields are discussed in more detail in Appendix C, Working with Tableau 10.
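The difference between a discrete date level and a rolled-up date value can be sketched in Python. The two helpers below are hypothetical and only loosely modeled on the ideas behind Tableau's DATEPART and DATETRUNC functions; they are not Tableau's implementation, and the sample timestamp is made up.

```python
from datetime import datetime


def date_part(value, level):
    """Return one component of the date as an integer, usable as a discrete label."""
    return {"YEAR": value.year, "MONTH": value.month, "DAY": value.day}[level]


def date_trunc(value, level):
    """Return the date rolled back to the start of the given level."""
    if level == "YEAR":
        return datetime(value.year, 1, 1)
    if level == "MONTH":
        return datetime(value.year, value.month, 1)
    if level == "DAY":
        return datetime(value.year, value.month, value.day)
    raise ValueError(f"unsupported level: {level}")


began = datetime(2003, 7, 19, 14, 30)    # hypothetical flood start
year_label = date_part(began, "YEAR")    # discrete header, like YEAR(Began)
month_start = date_trunc(began, "MONTH") # rolled up to the first of the month
print(year_label, month_start)
```

Rolling up to YEAR groups every flood in the same year under one header, while drilling down to MONTH or DAY splits the same records into finer groups.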
When the year field was rendered, the first column value that appeared was Null. A Null means there is no value, that is, the start date is missing for some records. One way to remove it is to right-click the column header and select Exclude. Exclude creates a filter, but it is a negation filter: whatever is checked will be excluded. When you investigate the settings for this filter, you will find that Null is checked, and that the Exclude checkbox at the bottom is checked as well.
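The logic of a negation filter can be sketched as follows. The apply_filter function and its exclude flag are hypothetical names chosen for this illustration, and Python's None stands in for Null.

```python
def apply_filter(values, checked, exclude=False):
    """Keep the checked values, or everything except them when exclude is True."""
    checked = set(checked)
    if exclude:
        return [v for v in values if v not in checked]
    return [v for v in values if v in checked]


years = [None, 1991, 2003, 2004, 2008]  # None stands in for the Null heading

# Null is checked and Exclude is on, so every year except Null is kept.
kept = apply_filter(years, checked=[None], exclude=True)
print(kept)
```

With exclude left off, the same checked list would keep only the Null entry, which is why the Exclude checkbox matters as much as the checked value.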
In this recipe, we also placed SUM(Dead) on both Color and Size. The darker the color and the thicker the line, the more fatalities occurred. Adding this visual cue to the line graph allows us to easily identify when the most fatalities occurred, without having to hover over each point in the graph and look at the details.
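One way to picture how a measure becomes a line width is a simple linear rescaling. The sketch below assumes a straight linear mapping, which is not necessarily how Tableau computes mark sizes. The scale_to_range helper, the width range, and the fatality totals are all hypothetical.

```python
def scale_to_range(values, low, high):
    """Linearly map values onto [low, high]; constant input maps to low."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [low for _ in values]
    return [low + (v - vmin) * (high - low) / (vmax - vmin) for v in values]


fatalities = [100, 550, 1000]  # hypothetical yearly totals
widths = scale_to_range(fatalities, 1.0, 10.0)
print(widths)
```

The same idea applies to color: the measure is mapped onto a range of shades instead of a range of widths, so the largest totals stand out on both channels.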
There's more...
Tableau is a great visualization tool that gives us a lot of flexibility over how we want to display our graphs. However, this flexibility can be a double-edged sword. Nothing stops us from creating all kinds of charts, but just because we can does not mean we should. We should first identify our data points, and then evaluate the best (or most effective) way to represent them.
Take, for example, the following line chart. The line chart has a vertical sales axis. At the bottom of the graph, we can find the discrete categories.
Is the line chart the best way to represent this information? No. In fact, this is incorrect and grossly misleading. When your audience first looks at this graph, the initial perception is that we are showing the fluctuations of sales over time. The connecting line also suggests that neighboring category headers are somehow related. This isn't true. Categories are nominal. There is no inherent order in these classifications. When we move the category labels around, it shouldn't affect the message that is being relayed by the graph. This particular graph is best represented as a bar chart.
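A small sketch makes the problem concrete. The category names and sales figures below are hypothetical. The segment_changes helper computes the rise or fall between neighboring points, which is what a line would draw between them.

```python
# Hypothetical sales totals per category.
sales = {"Furniture": 740, "Office Supplies": 720, "Technology": 830}


def segment_changes(order):
    """Differences between neighboring points, that is, the slopes a line would draw."""
    values = [sales[category] for category in order]
    return [b - a for a, b in zip(values, values[1:])]


alphabetical = segment_changes(["Furniture", "Office Supplies", "Technology"])
shuffled = segment_changes(["Technology", "Furniture", "Office Supplies"])
print(alphabetical, shuffled)
```

The apparent rises and falls change completely when the categories are reordered, even though no sales figure changed. Bar heights, by contrast, depend only on each category's own value.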
See also
- Please refer to the Creating a bar chart recipe in this chapter