The obligatory bar chart example

No introductory chapter on D3 would be complete without a basic bar chart example. Bar charts are to D3 as Hello World is to everything else, and 90% of all data storytelling can be done in its simplest form with an intelligent bar or line chart. For a good example of this, look at the kinds of graphics the Financial Times or The Economist include in their articles--they frequently summarize the entire piece with a simple line chart or histogram. I come from a newsroom development background (full disclosure: I work on the Interactive Graphics desk at the Financial Times), so many of my examples will relate in some way to current events or topics worth visualizing with data. The news development community has been instrumental in creating the environment for D3 to flourish, and it's increasingly important for aspiring journalists to have proficiency in tools such as D3.

The first dataset that we'll use is the UK Electoral Commission's result set from the UK Brexit referendum. We will draw a bar chart depicting voter turnout by region.

The source for this data is at http://www.electoralcommission.org.uk/find-information-by-subject/elections-and-referendums/past-elections-and-referendums/eu-referendum/electorate-and-count-information.

We'll create a bar for each region in the UK. The first step is to get a basic container set up, which can then be populated with all of our delicious new JavaScript code. We can either jump straight into the code, or we can set up some stuff to make life easier for us later on. Let's go with the first route for now. There's a lot of new-fangled JavaScript stuff coming at you really soon, so let's keep it light for the moment.

First open lib/main.js and write your very first line of D3:

const chart = d3.select('body') 
.append('svg') 
.attr('id', 'chart');

This selects the HTML <body> element, appends an <svg> element to it, and gives that element the ID of #chart. We'll be using this pattern a lot throughout the book.

Before we get any further, it's worth pointing out our first new-fangled modern JavaScript feature:
The const keyword is used to define a variable that won't change dramatically. By dramatically I mean that you can still modify it somewhat (for instance, adding elements to an array or modifying an object), but you'll throw an error if you try to reassign it. It will also throw an error if you try to use it before it's declared. Unlike constants in some other languages, JavaScript constants only apply to the block in which they're declared (they're not global unless you declare them at the top level). This is really useful when using a functional programming style, as it prevents weird bugs caused by variable hoisting (an unusual JavaScript language feature, whereby var declarations are treated as if they were written at the top of the enclosing function instead of where you actually define them). For more on const, visit http://mdn.io/const.
Another new way to define variables in JavaScript is using let, which is like var but, like const, has block scope, meaning that it is limited to the block, statement, or expression where it's used. This also helps prevent weird bugs. For more information, visit http://mdn.io/let.
When should you use each? Use const if you're not going to reassign a variable and use let if you will. I try to avoid reassigning variables, so I will usually use const in this book. While you can still use the var keyword to declare variables in modern JavaScript, you really shouldn't--always use let or const instead, defaulting to let if you're not sure which will work in a given situation.
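
To make the difference concrete, here's a minimal sketch you can paste into a browser console or run with Node. The variable names are made up for illustration and have nothing to do with our chart:

```javascript
// const forbids reassignment, but the value it points to can still change.
const regions = ['London'];
regions.push('Wales'); // fine: we're mutating the array, not reassigning it

let reassignError = null;
try {
  regions = []; // not fine: this throws a TypeError at runtime
} catch (err) {
  reassignError = err;
}

// let and const are block scoped: `doubled` only exists inside the loop body.
let total = 0;
for (let i = 0; i < 3; i++) {
  const doubled = i * 2;
  total += doubled;
}
const doubledLeaked = typeof doubled !== 'undefined';

console.log(regions.length, reassignError instanceof TypeError, total, doubledLeaked);
// 2 true 6 false
```

The array ends up with two elements because mutation is allowed, the reassignment is caught as a TypeError, and doubled is invisible outside the loop.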

Yeah! Let's open this up in the browser; ensure that the development server is running (npm start if it isn't) and go to http://localhost:8080

Uncaught Error: Cannot find module "d3"

Whups. Okay, that could have gone better...

You're getting this error because we haven't imported D3 yet. If you've used D3 before, you might be more familiar with it attached to the window global object. This is what happens when you include d3.js via a <script> tag. We're not doing that, however--we're JavaScript rockstarninjaciraptors; we use the new hotness, ES2015 module imports!

Go back to main.js. At the top of the file, type this:

import * as d3 from 'd3';

Let's unpack this a bit. Import statements must appear at the top level of a module, and by convention they go at the top of the file (so no sneaky Node.js-style require() calls inside your functions). This restriction allows for static analysis, which lets new JavaScript tools be more effective. Import statements always start with the import token.

Next is the bit that says what to import. In an ES2015 module, there are two types of exports:

Named: This is where you give the export a title that needs to be imported specifically (though it can be renamed), and it is inside curly brackets.
Default: There can be only one of these per module, and it can be referred to as anything when importing. We'll see this a bit later on.

What we do above is slightly different again: the * as d3 syntax imports all of the named exports from the D3 microlibs under a single namespace, d3.

If you go back to your browser and switch quickly to the Elements tab, you'll notice a new SVG with an ID of #chart at the bottom of the page. There's progress!

Loading in data

Go back to main.js. We need to get our data in somehow, and I'll show you far better ways of doing this later on, but let's work through the pain and do this the bad, old way--using XMLHttpRequest:

const req = new window.XMLHttpRequest(); 
req.addEventListener('load', mungeData); 
req.open('GET', 'data/EU-referendum-result-data.csv'); 
req.send();

This instantiates a new XMLHttpRequest object, tells it to load the CSV file from the data/ directory, and calls the soon-to-be-written mungeData() function once the response has loaded.

Note how we had to use the ugly new keyword to instantiate it? Note how it took four lines and a new variable declaration? Note how we have to handle our response in a callback? Eww! We'll improve upon this in later chapters. The only advantage of doing things this way is that it works in nearly any browser without polyfilling, but there are so many better ways of doing this, all of which we will touch upon in Chapter 4, Making Data Useful.

The CSV file we're loading in has a row for each constituency in the UK, containing everything from what percentage voted for what to what the voter turnout was to how many ballots were invalid or spoiled. What we will do is turn that into an array of objects containing the mean voter turnout percentage for each broader region.

It's time to create our mungeData() function. We will use d3.csvParse() (from the d3-dsv microlib) to parse our CSV data string into an array of objects and then use d3.mean() from the d3-array microlib to summarize that data:

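function mungeData() {
  // `this` is the XMLHttpRequest, so responseText holds the raw CSV string.
  const data = d3.csvParse(this.responseText);
  // Group the rows into an object keyed by region name.
  const regions = data.reduce((last, row) => {
    if (!last[row.Region]) last[row.Region] = [];
    last[row.Region].push(row);
    return last;
  }, {});
  // Summarize each region as its name plus its mean turnout percentage.
  const regionsPctTurnout = Object.entries(regions)
    .map(([region, areas]) => ({
      region,
      meanPctTurnout: d3.mean(areas, d => d.Pct_Turnout),
    }));

  renderChart(regionsPctTurnout);
}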

Hey, there's another ES2015 feature! Instead of typing function() {} endlessly, you can now just put () => {} for anonymous functions. Other than being six keystrokes less, the fat arrow doesn't bind the value of this to something else. This won't impact us very much because we're using a functional style of programming; but if we were using classes, this would be a lifesaver. For more on this, visit http://mdn.io/Arrow_functions.
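
If the point about this feels abstract, here's a small sketch of what it means in practice. The tally object is hypothetical and exists only for this illustration:

```javascript
// Hypothetical object, purely for illustration.
const tally = {
  count: 0,
  addAll(items) {
    // The arrow function has no `this` of its own, so `this` is still `tally`.
    items.forEach(() => {
      this.count += 1;
    });
    return this.count;
  },
};

console.log(tally.addAll(['a', 'b', 'c'])); // 3
```

Had we passed a regular function() {} to forEach instead, this inside it would no longer be tally and the count would not update. The flip side is that mungeData() itself is a regular function on purpose: it relies on this being the XMLHttpRequest that fired the load event.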

We transform our data in three steps here:

First, we convert it into an array of objects using d3.csvParse() and assign the result to data.
Then, we transform the array into an object keyed by the region, such that the object's keys are the regions, and the values are an array of associated constituencies.
Lastly, Object.entries converts that object into an array of [key, value] pairs, which we then map into an array of objects, each comprising a region's name and the mean of its constituencies' voter turnout percentages.
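
To see the last two steps in isolation, here's a rough sketch that runs without D3 or the real CSV file. The rows are invented stand-ins for what d3.csvParse() returns, and mean() is a hypothetical plain-JavaScript substitute for d3.mean():

```javascript
// Invented sample rows; like d3.csvParse() output, every value is a string.
const sampleRows = [
  { Region: 'Region A', Pct_Turnout: '70' },
  { Region: 'Region A', Pct_Turnout: '80' },
  { Region: 'Region B', Pct_Turnout: '60' },
];

// Hypothetical helper standing in for d3.mean(values, accessor).
const mean = (values, accessor) =>
  values.reduce((sum, v) => sum + Number(accessor(v)), 0) / values.length;

// Step 2: group the rows into an object keyed by region.
const grouped = sampleRows.reduce((last, row) => {
  if (!last[row.Region]) last[row.Region] = [];
  last[row.Region].push(row);
  return last;
}, {});

// Step 3: turn the object into [key, value] pairs and map each to a summary.
const summary = Object.entries(grouped).map(([region, areas]) => ({
  region,
  meanPctTurnout: mean(areas, d => d.Pct_Turnout),
}));

console.log(summary);
// [ { region: 'Region A', meanPctTurnout: 75 },
//   { region: 'Region B', meanPctTurnout: 60 } ]
```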

You may have noted that the function signature for the call to Array.prototype.map is a little unusual:

.map(([region, areas]) => ({

Here, we use a new ES2015 feature called destructuring assignment to give each element in our array a temporary name. Normally, the callback signature is the following:

function(item, index, array) {}

However, because we know item is an array with two elements, we can give each of them a nickname, making our code easier to read (we don't use the index or array arguments this particular time, but if we did, we'd just put those arguments after the destructuring bit).
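
Here's a quick sketch of destructuring the first argument while still using the index. The pairs are made up, but they have the same [key, value] shape that Object.entries() produces:

```javascript
// Made-up [name, count] pairs for illustration.
const pairs = [['Region A', 3], ['Region B', 5]];

// Destructure the first argument; the index still arrives as the second.
const labels = pairs.map(([name, count], index) =>
  (index + 1) + '. ' + name + ': ' + count + ' areas');

console.log(labels); // [ '1. Region A: 3 areas', '2. Region B: 5 areas' ]
```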

Lastly, we pass our fully munged data to an as-of-yet-unwritten function, renderChart(), which we'll add next.

We can also simplify the above to this:

  const regionsPctTurnout = d3.nest() 
    .key(d => d.Region) 
    .rollup(d => d3.mean(d, leaf => leaf.Pct_Turnout)) 
    .entries(data);

d3.nest() is part of the d3-collection microlib, which we'll cover in--you guessed it--Chapter 4, Making Data Useful. Note that .entries() returns objects with key and value properties rather than region and meanPctTurnout, so the rest of this chapter sticks with the first version. D3 is a very un-opinionated library, which means you can accomplish many tasks in a variety of ways--there often really isn't a proper way of doing things. I will try to expose a variety of ways to accomplish tasks throughout the book; feel free to choose whichever you prefer.

Twelve (give or take a few) bar blues

With that done, let's render some data.

Create a new function in main.js, renderChart():

function renderChart(data) { 
  chart.attr('width', window.innerWidth) 
    .attr('height', window.innerHeight); 
}

All this does is take our earlier chart variable and set its width and height to that of the window. We're almost at the point of getting some bars onto that graph; hold tight!

First, however, we need to define our scales, which decide how D3 maps data values to pixel values. Put another way, a scale is simply a function that maps an input domain to an output range. This can be annoying to remember, so I'm going to shamelessly steal an exercise from Scott Murray's excellent tutorial on scales from Interactive Data Visualization for the Web:

When I say "input," you say "domain." Then I say "output," and you say "range." Ready? Okay:
Input! Domain!
Output! Range!
Input! Domain!
Output! Range!
Got it? Great.

It seems silly, but I frequently find myself muttering the above when I have a deadline and am working on a chart late at night. Give it a go!
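
If chanting isn't your thing, it can help to see how little is going on inside a scale. The following is a minimal sketch of the idea only; makeLinearScale() is a hypothetical helper, not how D3 actually implements its scales:

```javascript
// Hypothetical helper, NOT D3's implementation: builds a linear scale function.
function makeLinearScale(domain, range) {
  const [d0, d1] = domain; // input: data values
  const [r0, r1] = range;  // output: pixel values
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Input (domain) runs from 0 to 100; output (range) runs from 0 to 500.
const toPixels = makeLinearScale([0, 100], [0, 500]);

console.log(toPixels(0), toPixels(50), toPixels(100)); // 0 250 500
```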

Next, add this code to renderChart():

const x = d3.scaleBand()
  .domain(data.map(d => d.region))
  .rangeRound([50, window.innerWidth - 50])
  .padding(0.1);

The x scale is now a function that maps inputs from a domain composed of our region names to a range of values between 50 and the width of your viewport (minus 50), with some spacing defined by the 0.1 value given to .padding(). What we've created is a band scale, which is like an ordinal scale, but the output is divided into sections. We'll talk more about scales later on in the book.
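
To get a feel for what a band scale computes, here's a simplified sketch. makeBandScale() is a hypothetical helper written for this illustration; it skips the rounding that .rangeRound() performs and assumes a single padding value is used both between bands and at either end:

```javascript
// Hypothetical helper, NOT D3's implementation: a simplified band scale.
function makeBandScale(domain, range, padding) {
  const [start, stop] = range;
  const n = domain.length;
  // Each step is one band plus one gap; the ends get one gap each as well.
  const step = (stop - start) / (n - padding + padding * 2);
  const bandwidth = step * (1 - padding);
  const offset = start + step * padding;
  const scale = name => offset + step * domain.indexOf(name);
  scale.bandwidth = () => bandwidth;
  return scale;
}

// Made-up category names and pixel range.
const bands = makeBandScale(['A', 'B', 'C'], [0, 310], 0.1);

// Roughly 10, 110 and 210 for the positions, and 90 for the bandwidth
// (give or take floating-point noise).
console.log(bands('A'), bands('B'), bands('C'), bands.bandwidth());
```

Each category gets its own evenly sized slot, which is exactly what we want for bars.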

In this example, we use a uniform value of 50 for our margins, which we pass to our scales and elsewhere. Any arbitrary number hardcoded like this is often referred to as a magic number, inasmuch as, to anyone reading your code, it just looks like a random value that magically makes it work. This is bad; don't do this--it makes your code harder to read, and it means that you have to find and replace every value if you want to change it. I only do so here to demonstrate this fact. Throughout the rest of the book, we'll define things such as margins more intelligently; stay tuned!
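
As a taste of what a less magical version might look like, here's one common convention, sketched under my own assumptions. The margin object, the innerBounds() helper, and the 800 by 600 size are all hypothetical, and unlike our chart this sketch also reserves a top margin:

```javascript
// Name the margins once instead of sprinkling 50 everywhere.
const margin = { top: 50, right: 50, bottom: 50, left: 50 };

// Hypothetical helper: the ranges left over once the margins are taken out.
function innerBounds(outerWidth, outerHeight, m) {
  return {
    xRange: [m.left, outerWidth - m.right],
    yRange: [outerHeight - m.bottom, m.top], // inverted, since y grows downward
  };
}

const bounds = innerBounds(800, 600, margin);
console.log(bounds.xRange, bounds.yRange); // [ 50, 750 ] [ 550, 50 ]
```

Changing a margin then means editing one line rather than hunting for every 50.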

Still inside renderChart(), we define another scale named y:

const y = d3.scaleLinear()
  .domain([0, d3.max(data, d => d.meanPctTurnout)])
  .range([window.innerHeight - 50, 0]);

Similarly, the y scale is going to map a linear domain (which runs from zero to the max value of our data, the latter of which we acquire using d3.max) to a range between window.innerHeight (minus our 50 pixel margin) and 0. Inverting the range is important because SVG, and therefore D3, considers the top of a graphic to be y=0. If ever you find yourself trying to troubleshoot why a D3 chart is upside down, try switching the range values in one of your scales.

Now, we define our axes. Add this just after the preceding line, inside renderChart:

const xAxis = d3.axisBottom().scale(x); 
const yAxis = d3.axisLeft().scale(y);

We've told each axis what scale to use when placing ticks and which side of the axis to put the labels on. D3 will automatically decide how many ticks to display, where they should go, and how to label them. Since most D3 elements are objects and functions at the same time, we can change the internal state of both scales without assigning the result to anything. The domain of x is a list of discrete values. The domain of y is a range from 0 to the d3.max of our dataset, the largest value.

Now, we will draw the axes on our graph:

chart.append('g')
  .attr('class', 'axis')
  .attr('transform', 
    `translate(0, ${window.innerHeight - 50})`)
  .call(xAxis);

Hot new ES2015 feature alert! Above, the transform argument is in backticks (`), which denote a template literal. Template literals are just like normal strings, except for two differences: they can span multiple lines, and you can embed arbitrary JavaScript expressions in them via the ${} syntax. Above, we merely echo out the value of window.innerHeight - 50, but you can write any expression whose result can be converted to a string, for instance, using Array.prototype.join to output the contents of an array; it's really handy!
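
Here's a short sketch of those features outside of D3. The height and margin values are made up for illustration:

```javascript
// Made-up values for illustration.
const height = 600;
const margin = 50;

// Any expression can go inside ${}; its result is converted to a string.
const transform = `translate(0, ${height - margin})`;

// Array.prototype.join works nicely for building lists.
const summary = `Regions: ${['A', 'B', 'C'].join(', ')}`;

// Template literals can also span lines; the newline is kept.
const twoLines = `first line
second line`;

console.log(transform);                   // translate(0, 550)
console.log(summary);                     // Regions: A, B, C
console.log(twoLines.split('\n').length); // 2
```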

We've appended an element called g to the graph, given it the axis CSS class, and moved the element to a place in the bottom-left corner of the graph with the transform attribute.

Finally, we call the xAxis function and let D3 handle the rest.

The drawing of the other axis works exactly the same, but with different arguments:

  chart.append('g') 
    .attr('class', 'axis') 
    .attr('transform', 'translate(50, 0)') 
    .call(yAxis);

Now that our graph is labeled, it's finally time to draw some data:

  chart.selectAll('rect') 
    .data(data) 
    .enter() 
    .append('rect') 
    .attr('class', 'bar') 
    .attr('x', d => x(d.region)) 
    .attr('y', d => y(d.meanPctTurnout)) 
    .attr('width', x.bandwidth()) 
    .attr('height', d =>
        (window.innerHeight - 50) - y(d.meanPctTurnout));

Okay, there's plenty going on here, but this code is saying something very simple. This is what it says:

For all rectangles (rect) in the graph, load our data
Go through it
For each item, append a rect
Then, define some attributes on it

Ignore the fact that there aren't any rectangles initially; what you're doing is creating a selection that is bound to data and then operating on it. I can understand that it feels a bit weird to operate on nonexistent elements (this was personally one of my biggest stumbling blocks when I was learning D3), but it's an idiom that shows its usefulness later on when we start adding and removing elements due to changing data.

The x scale helps us calculate the horizontal positions, and bandwidth() gives the width of the bar. The y scale calculates vertical positions, and we manually get the height of each bar from y to the bottom. Note that whenever we needed a different value for every element, we defined an attribute as a function (x, y, and height); otherwise, we defined it as a value (width).
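
The height arithmetic is easier to trust once you've run it on a few numbers. This sketch uses made-up dimensions and a hand-written stand-in for the y scale, so treat it as an illustration of the math rather than of D3 itself:

```javascript
// Made-up numbers: a 600px-tall chart, a 50px margin, and data from 0 to 80.
const chartHeight = 600;
const margin = 50;
const maxValue = 80;
const baseline = chartHeight - margin; // 550, where the x axis sits

// Hypothetical stand-in for the y scale: domain [0, maxValue], range [baseline, 0].
const y = value => baseline - (value / maxValue) * baseline;

// Same arithmetic as the 'y' and 'height' attributes above.
const bar = value => ({ y: y(value), height: baseline - y(value) });

console.log(bar(0));  // { y: 550, height: 0 }
console.log(bar(40)); // { y: 275, height: 275 }
console.log(bar(80)); // { y: 0, height: 550 }
```

Bigger values give a smaller y (closer to the top) and a taller bar, and every bar's bottom edge lands on the baseline.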

Let's add some flourish and make each bar grow out of the horizontal axis. Time to dip our toes into animations!

Modify the code you just added to resemble the following; the lines that differ are the y and height attributes and everything from .transition() onward:

chart.selectAll('rect') 
  .data(data) 
  .enter() 
  .append('rect') 
  .attr('class', 'bar') 
  .attr('x', d => x(d.region)) 
  .attr('y', window.innerHeight - 50)
  .attr('width', x.bandwidth()) 
  .attr('height', 0) 
    .transition() 
    .delay((d, i) => i * 20) 
    .duration(800) 
    .attr('y', d => y(d.meanPctTurnout)) 
    .attr('height', d =>
        (window.innerHeight - 50) - y(d.meanPctTurnout));

The difference is that we statically put all bars at the bottom (window.innerHeight - 50) with a height of zero and then entered a transition with .transition(). From here on, we define the transition that we want.

First, we delay each bar's transition by 20 milliseconds more than the one before it, using i * 20. Most D3 callbacks receive the datum (whatever data has been bound to the element, typically named d) and the index (the ordinal position of the item currently being evaluated, typically named i), while this is set to the current DOM element. If we were using, say, classes, this last point would be fairly important; inside such a callback, this would refer to the rect SVGElement object instead of whatever context we actually want to use. However, because we're mainly going to use factory functions for everything, figuring out which context is assigned to this is far less of a worry.

This gives the bar chart a neat effect, with the bars gradually appearing from left to right instead of jumping up at once. Next, we say that we want each animation to last just shy of a second, with .duration(800). At the end, we define the final values for the animated attributes--y and height are the same as in the previous code--and D3 will take care of the rest.
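
To see how the delay and duration combine, here's a quick sketch of the timing for each bar. The bar count of 12 is an assumption made for the example; the real number depends on how many regions are in the data:

```javascript
const delayPerBar = 20; // milliseconds, as in .delay((d, i) => i * 20)
const duration = 800;   // milliseconds, as in .duration(800)
const barCount = 12;    // assumption: the real number depends on the data

// Work out when each bar starts and finishes animating.
const schedule = Array.from({ length: barCount }, (_, i) => ({
  bar: i,
  startsAt: i * delayPerBar,
  endsAt: i * delayPerBar + duration,
}));

console.log(schedule[0]);            // { bar: 0, startsAt: 0, endsAt: 800 }
console.log(schedule[barCount - 1]); // { bar: 11, startsAt: 220, endsAt: 1020 }
```

Under these assumptions the whole animation wraps up in just over a second, with each bar trailing slightly behind its neighbor to the left.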

Save your file and refresh. If everything went according to plan, you should have a chart that looks like the following:

According to this, voter turnout was fairly high during the EU referendum, with the south-west having the highest turnout. Hey, look at this; we kind of just did some data journalism here! Remember that you can look at the entire code on GitHub at http://github.com/aendrew/learning-d3-v4/tree/chapter1 if you didn't get something similar to the preceding screenshot.

We still need to do just a bit more, mainly using CSS to style the SVG elements.

We could have just gone to our HTML file and added CSS, but then that means opening that yucky index.html file. Also, where's the fun in writing HTML when we're learning some newfangled JavaScript?

First, create an index.css file in your styles/ directory:

html, body { 
  padding: 0; 
  margin: 0; 
} 

.axis path, .axis line { 
  fill: none; 
  stroke: #eee; 
  shape-rendering: crispEdges; 
} 

.axis text { 
  font-size: 11px; 
} 

.bar { 
  fill: steelblue; 
}

Then, just add the following line to the top of main.js:

import * as styles from 'styles/index.css';

I know. Crazy, right? No <style> tags needed!

It's worth noting that anything involving require() or import that isn't a JS file is handled by a Webpack loader. Although the author of this text is a fan of Webpack, all we're doing is importing the styles into main.js with Webpack instead of including them globally via a <style> tag. This is cool because, instead of uploading a dozen files when deploying your finished code, you effectively deploy one optimized bundle. You can also scope CSS rules to the modules that include them and all sorts of other nifty stuff; for more information, refer to https://github.com/webpack-contrib/css-loader.

Looking at the preceding CSS, you can now see why we added all those classes to our shapes. We can now directly reference them when styling with CSS. We made the axes thin, gave them a light gray color, and used a smaller font for the labels. The bars should be light blue. Save this and wait for the page to refresh. We've made our first D3 chart:

I recommend fiddling with the width and height values used in renderChart() to get a feel for the power of D3. You'll notice that everything scales and adjusts to any size without you having to change other code. Smashing!