Jan 13, 2017

Drawing a bar chart with d3.js for JS beginners (Part II)

D3.js is a powerful data visualization JavaScript (JS) library developed by Mike Bostock.
In this post, I will go over how to read a .csv file with d3 and set up a basic bar chart using the .csv data. Please take a look at part 1 of this post to read about d3 and JS basics.



1. Reading a file with d3.js


The file I will be working with is a comma-separated values (.csv) file that contains information and statistics on recent graduates. Feel free to click the link and download the file by right-clicking on Raw and choosing Save Link As, or use some awesome dataset you already have!

Screen Shot of the file I'll be working with

In the previous post, we explored how to bind data to HTML elements and use them by writing JS functions that handle the data. Reading and working with data from an external file follows a similar concept, with only small differences. First, to read a .csv file, we will use the d3.csv() method.

d3.csv() is an asynchronous method, meaning that the rest of your code is executed even while JavaScript is simultaneously waiting for the file to finish downloading into the browser. -Scott Murray

As noted by Scott Murray, the author of many data viz books, d3.csv() is an asynchronous method so we must make sure the data is downloaded completely before we do anything with the data. The easiest way to accomplish this is to call any method that references the data within the d3.csv() callback function:

Code 1

var dataset;
d3.csv("my_file.csv", function(error, data){
    if (error){  // if there was an error
        console.log(error); // print the error. console.log() is roughly JS's equivalent of C's printf() or Python's print()
        return; // stop here, because there is no data to work with
    }

    dataset = data;
    //some function(s) that uses the data must be here
});

Let's break down what is happening in the above code. First, d3.csv() is called to load a .csv file. It takes two arguments: the path to your .csv file and an anonymous JS function, the callback. d3 hands the loaded data to this callback once the download finishes, so any code that manipulates the data must be wrapped inside it.


Each row is loaded as an Object, and the columns are mapped as attributes
Here, the anonymous JS function takes two arguments: error and data. d3 always passes the error as the first argument and the data as the second, so the position is what matters, not the name. If there was a problem loading the file, the first parameter holds the error; otherwise the second parameter holds our data, handed off to the JS function. Again, you could name these variables anything you want.

Inside our anonymous JS function, if (error){ console.log(error); } prints the error message to the browser's developer tools console if there was an error, and dataset = data; stores the loaded .csv data in the variable dataset declared outside of the d3 function.
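To see why code that uses the data has to live inside the callback, here is a minimal sketch that runs without d3. fakeLoad is a hypothetical stand-in for an asynchronous loader such as d3.csv(), and the row it returns is made up; only the ordering of events is the point.

```javascript
// Hypothetical stand-in for an asynchronous loader such as d3.csv().
// It hands made-up rows to the callback on a later turn of the event loop.
function fakeLoad(callback) {
    setTimeout(function () {
        callback(null, [{ Major_category: "Engineering" }]);
    }, 0);
}

var dataset;      // stays undefined until the callback runs
var log = [];     // records the order in which things happen

fakeLoad(function (error, data) {
    if (error) { console.log(error); return; }
    dataset = data;
    log.push("inside the callback: " + dataset.length + " row(s) loaded");
    console.log(log.join("\n"));
});

// This line runs BEFORE the callback, so the data is not there yet.
log.push("right after the call: dataset is " + typeof dataset);
```

The "right after the call" entry is logged first, which is exactly the situation the callback protects us from.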

Also note that d3.csv() was smart enough to map the very first row of the .csv file as the attribute names. Now we are ready to use the data and the data manipulation code will be inserted where I have commented //some function(s) ...
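If it helps to picture the header-to-property mapping, here is a simplified sketch in plain JS. parseCsvSketch is a hypothetical helper written for this illustration, not d3's actual parser (it splits on commas only and ignores quoted fields), and the sample rows, including the "Arts" category, are made up. It also shows that values arrive as strings when no conversion is applied.

```javascript
// Simplified illustration only: splits on commas and does not handle
// quoted fields the way a real CSV parser does.
function parseCsvSketch(text) {
    var lines = text.trim().split("\n");
    var header = lines[0].split(",");       // first row becomes the property names
    return lines.slice(1).map(function (line) {
        var cells = line.split(",");
        var row = {};
        header.forEach(function (name, j) {
            row[name] = cells[j];           // every value stays a string
        });
        return row;
    });
}

// Made-up sample using two column names that appear in this post.
var sampleCsv = "Major_category,ShareWomen\n" +
                "Engineering,0.25\n" +
                "Arts,0.5";

var parsedRows = parseCsvSketch(sampleCsv);
console.log(parsedRows[0].Major_category);    // Engineering
console.log(typeof parsedRows[0].ShareWomen); // string
```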



2. Let's draw a bar chart


*Drumroll* We finally get to use our data! How EXCITING !!

The standard way to draw on the web is to use Scalable Vector Graphics (SVG). Simply think of SVG as a blank canvas you would draw on. Using d3, you can plot various shapes, text, and lines onto this canvas. Now let's write some code inside our d3.csv() function in code 1.

Code 2

var dataset;
d3.csv("../files/recent-grads.csv", function(error, data){
    if(error){ console.log(error); return; } // print error and stop if file load fails

    dataset = data; // store the .csv data into the variable dataset

    // Add the svg canvas
    var svg = d3.select("body")
        .append("svg") // append the svg canvas to the body element
        .attr("width", 700)
        .attr("height", 100); // set our canvas size using the attr(ibute) method

    // Let's draw on the canvas
    svg.selectAll("div") // Note that I now select the variable svg, which is defined above, instead of d3.select("body").selectAll("p")..
        .data(dataset.filter(function(d){ return d.Major_category == "Engineering"; })) // select rows that have Engineering as the Major_category
        .enter()
        .append("rect") // rect is a SVG shape, for a rectangle
        .attr("width", 20) // set the width of our bars (rectangle)
        .attr("height", 60); // set the height of our bars
});

Here, we wrote new code to add the svg canvas and to draw on the canvas. Notice how the d3.select("body").append("svg").. is stored into var svg.

var svg = d3.select("body").append("svg").attr("width",700).attr("height",100);

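This is done to make selecting the svg element easier next time, i.e. we can simply write svg.selectAll("div"). Nothing inside the new svg matches "div", so the selection starts out empty and every row of data ends up in .enter(), where a rect is appended for it; svg.selectAll("rect") would be the more conventional selector.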

The .attr() method sets attributes of elements, and .filter() lets us keep only the rows we want before binding them. Note that dataset is a plain JS array, so the .filter() here is JavaScript's built-in array method rather than a d3 method. We filtered to select rows where the Major_category is Engineering by passing it a JS function:

dataset.filter(function(d){ return d.Major_category == "Engineering"; })

Inside our anonymous JS function, we refer to our Major_category column by using the JS property notation: almost all JS values have properties, and these properties can be accessed with value.property or value["property"]. d3 also has its own .filter() for selections, which the D3 API reference covers.
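Here is a small sketch of the filter and the two property notations, runnable on its own without d3. The rows are made up for illustration (including the "Arts" category); only the column names come from this post.

```javascript
// Made-up rows shaped like the objects d3.csv() hands to the callback.
var sampleRows = [
    { Major_category: "Engineering", ShareWomen: "0.25" },
    { Major_category: "Arts",        ShareWomen: "0.5" },
    { Major_category: "Engineering", ShareWomen: "0.125" }
];

// Array filter keeps the rows for which the function returns true.
var engineeringRows = sampleRows.filter(function (d) {
    return d.Major_category === "Engineering";
});

// Dot notation and bracket notation read the same property.
var sameValue = sampleRows[1].ShareWomen === sampleRows[1]["ShareWomen"];

console.log(engineeringRows.length); // 2
console.log(sameValue);              // true
```

Bracket notation is the one to reach for when a column name contains spaces or other characters that are not valid in a JS identifier.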

Now let's see the output of code 2:

Output of Code 2
Yay! We have a bar! But wait... it's ugly, it's black, and there really is just one bar. Is this really a bar chart? Before we can call this a bar chart, we will need to write a couple more lines of code.

Code 3

var dataset;
d3.csv("../files/recent-grads.csv", function(error, data){
    if(error){ console.log(error); return; } // print error and stop if file load fails

    dataset = data; // store the .csv data into the variable dataset

    // Add the svg canvas
    var svg = d3.select("body")
        .append("svg") // append the svg canvas to the body element
        .attr("width", 700)
        .attr("height", 100); // set our canvas size using the attr(ibute) method

    // Let's draw on the canvas
    svg.selectAll("div") // Note that I now select the variable svg, which is defined above, instead of d3.select("body").selectAll("p")..
        .data(dataset.filter(function(d){ return d.Major_category == "Engineering"; })) // filter and grab only the rows with the Engineering Major_category
        .enter()
        .append("rect") // rect is a SVG shape, for a rectangle
        .attr("width", 20) // set the width of our bars (rectangle)
        .attr("x", function(d, i){ return i*21; })
        .attr("height", function(d){ return d.ShareWomen * 100.; })
        .attr("fill", "teal");
});

This is where things get really exciting! I changed 4 lines going from code 2 to code 3: I removed .attr("height", 60) and added the three lines that set x, height, and fill. One issue with code 2 was that all our rect elements were drawn on top of each other because the x position wasn't specified. This is why it looked like we had one bar on our chart. To set the x positions so that the bars sit side by side, I wrote a JS function that assigns x coordinates to our bars:

.attr("x", function(d, i){ return i*21; })

This JS function is slightly different from what we've been writing earlier: it takes 2 arguments, d and i. D3 always passes the data as the first argument and the index of the element as the second. The index starts at 0, so i*21 returns 0*21, 1*21, 2*21, ... as the JS function is called on each row of our dataset (21 is chosen to be bar width + 1 because we want the bars to have some space in between each other).
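The same arithmetic can be checked in plain JS, since Array.prototype.map also passes the element first and a zero-based index second. The names barWidth, gap, and placeholderData are made up for this sketch.

```javascript
var barWidth = 20;  // same width as the bars in this post
var gap = 1;        // one pixel of space between bars

// Four placeholder items standing in for four rows of data.
var placeholderData = ["a", "b", "c", "d"];

// map() passes (element, index), much like D3 passes (d, i).
var xPositions = placeholderData.map(function (d, i) {
    return i * (barWidth + gap);
});

console.log(xPositions); // [ 0, 21, 42, 63 ]
```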

Now we need to set the bar heights! This is done by:

.attr("height", function(d, i){ return d.ShareWomen *100; })

Notice that a function can declare a parameter it never uses (i in this case). JS won't complain. It's also perfectly OK to leave i out here. This function reads the ShareWomen attribute of the JS object (each row for us) and returns that number * 100 for the height attribute (*100 is an arbitrary scale I defined).
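One detail worth knowing: without a conversion step, d3.csv() gives us every value as a string, so d.ShareWomen * 100 only works because JS converts the string to a number for multiplication. A short sketch with made-up share values shows this, along with a case where the implicit conversion would not do what we want.

```javascript
// Made-up share values, as strings, the way they arrive from a .csv file.
var shareStrings = ["0.25", "0.5", "0.125"];

// Multiplication converts the string to a number implicitly.
var implicitHeights = shareStrings.map(function (s) { return s * 100; });

// The unary plus makes the conversion explicit.
var explicitHeights = shareStrings.map(function (s) { return +s * 100; });

// Careful: + with a string concatenates instead of adding.
var concatenated = shareStrings[0] + 100;

console.log(implicitHeights); // [ 25, 50, 12.5 ]
console.log(explicitHeights); // [ 25, 50, 12.5 ]
console.log(concatenated);    // 0.25100
```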

Finally, we write .attr("fill", "teal") to fill our rectangles with the color teal.

Now let's see the output of code 3:

Output of Code 3
This looks amazing! Except for the fact that our bar chart is upside down. This is because, unlike Cartesian coordinates, the SVG coordinate system puts the (0,0) origin at the upper left corner, with y increasing downward.

SVG coordinate system


To make our bars look like they're growing from the bottom up, we can simply shift each bar down so that it starts at a y coordinate of (svg canvas height - bar height) and ends at the bottom of the canvas.
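As a quick check of that arithmetic, here is a small sketch in plain JS. barGeometry is a hypothetical helper made up for this illustration, and the share value is invented; the canvas height of 100 and the *100 scale match the code in this post.

```javascript
var canvasHeight = 100; // same as the svg height used in this post

// Hypothetical helper: returns the y and height attributes for one bar.
function barGeometry(share) {
    var barHeight = share * 100;       // same arbitrary scale as before
    return {
        y: canvasHeight - barHeight,   // top edge of the bar
        height: barHeight
    };
}

var bar = barGeometry("0.25");
console.log(bar.y, bar.height);   // 75 25
console.log(bar.y + bar.height);  // 100, the bottom of the canvas
```

Whatever the share is, y + height always equals the canvas height, so every bar sits on the bottom edge.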

Code 4

var dataset;
d3.csv("../files/recent-grads.csv", function(error, data){
    if(error){ console.log(error); return; } // print error and stop if file load fails

    dataset = data; // store the .csv data into the variable dataset

    // Add the svg canvas
    var svg = d3.select("body")
        .append("svg") // append the svg canvas to the body element
        .attr("width", 700)
        .attr("height", 100); // set our canvas size using the attr(ibute) method

    // Let's draw on the canvas
    svg.selectAll("div") // Note that I now select the variable svg, which is defined above, instead of d3.select("body").selectAll("p")..
        .data(dataset.filter(function(d){ return d.Major_category == "Engineering"; })) // filter and grab only the rows with the Engineering Major_category
        .enter()
        .append("rect") // rect is a SVG shape, for a rectangle
        .attr("width", 20) // set the width of our bars (rectangle)
        .attr("x", function(d, i){ return i*21; })
        .attr("y", function(d){ return 100 - d.ShareWomen*100; })
        .attr("height", function(d){ return d.ShareWomen * 100.; })
        .attr("fill", "teal");
});


Output of code 4:

Output of Code 4



3. Wrapping up part 2


We have finally drawn a bar chart! Thanks for following this far!

In this post, we used our knowledge of D3 and JS basics gained in part 1 to start drawing our bar chart on an SVG canvas. We covered some new d3 methods and a little bit more on JS functions. For more information on the JS concepts we covered in this post, please look at this awesome, FREE, interactive e-book on JavaScript. Finally, if there are any errors or clarifications I need to make in this post, do not hesitate to shoot me an email.

In the next post, I will go over how to make our bar chart prettier by adding labels and axes with a more structured framework for our d3 code. Hope to see you there!


About

I am a computational scientist finishing my PhD at U of Penn. I picked up programming coming into graduate school and after years of computational research, I'm amazed by what data can do. I love to use data analytics to find trends, which when exposed, empower people to make informed decisions about the world they live in. I'm also the co-founder of Penn Data Science Group.