How to turn CSV data into interactive visualizations with R and rCharts

Once your data are in the right format, just a couple of lines of R code can generate a robust chart or graph from your spreadsheet.

R for interactive graphics

JavaScript is great for creating interactive Web graphics, but not everyone wants to become a JavaScript (or jQuery) expert to get a quick look at the data. With R and the rCharts package, just a few lines of code let you turn a data file into interactive graphics. If you don't know R, no worries: We assume no prior knowledge. (To get up to speed on the basics, though, head to our Beginner's Guide to R.)

Slides 2 through 6 demonstrate how to format data for visualizing; slides 7 through 9 are the actual graphing steps.

Step one: If you haven't already, download and install R and the RStudio IDE. (RStudio isn't required but is a nice R environment.)

The rCharts package

The add-on package rCharts makes it easy to create interactive graphics within R while tapping the power of several different JavaScript libraries. To install and load rCharts, run the following code in the RStudio console (the bottom left panel within RStudio):

install.packages('devtools')
library('devtools')
install_github('ramnathv/rCharts')
library('rCharts')


You need to run install commands only once, but you'll need the library commands in each new R session.
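
If you rerun a script often, one option is to skip the install step when a package is already present. Here is a minimal sketch using base R's requireNamespace(). The helper name install_if_missing is made up for illustration, and it only handles CRAN packages such as devtools or reshape2 (rCharts still needs install_github):

```r
# Hypothetical helper (not part of rCharts or devtools):
# install a CRAN package only when it is not already available.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # TRUE if the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

install_if_missing("stats")  # ships with R, so nothing is installed here
```

You would still call library() afterward in each new session; the helper only takes care of the one-time install.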

Read data into R

Now it's time to find some data. I downloaded an Excel file called "Percentage of individuals using the Internet" from the ITU (International Telecommunication Union). It needs minimal cleanup: Delete the first title row and the notes columns, and save it as a CSV file. (I called mine pctInternetUse.csv.)

Load it into R with:

mydata <- read.csv("pctInternetUse.csv")

Make sure to either include the path along with the file name, or save the file to your R working directory. What's your current working directory? Use the function getwd() to see.
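
Here is a quick way to check the path before reading the file, sketched with base R only. The temporary file and its made-up values stand in for pctInternetUse.csv so the example runs anywhere; csv_dir, csv_path and mydata_demo are names I chose for illustration:

```r
# Stand-in for the real CSV: a tiny file with made-up values in a temp folder
csv_dir  <- tempdir()
csv_path <- file.path(csv_dir, "pctInternetUse.csv")
writeLines(c("Country,2000,2001", "A,10,30", "B,20,40"), csv_path)

getwd()                 # the folder R searches when you give a bare file name
file.exists(csv_path)   # TRUE means read.csv() will be able to find the file

# Reading with the full path works regardless of the working directory
mydata_demo <- read.csv(csv_path)
```

With your own file, replace csv_dir with the folder where you saved the CSV.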

Data tweaks

The commands:

str(mydata)

and

head(mydata)

show some basic info about the mydata data frame. The years all show as X2000, X2001 etc. because read.csv() by default adjusts column names so they are syntactically valid R names, and those can't start with a digit; the country title came in as X. I'll change all this with the names() function:

names(mydata) <- c("Country", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012")

Another look with str(mydata) shows the column names that I want.
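
To see where the X prefix comes from, and to avoid typing every year by hand, here is a small sketch in base R. The toy data frame and its values are invented for illustration; make.names() is the base function that read.csv() uses to adjust names when check.names is left at its default of TRUE:

```r
# How R adjusts raw headers: a blank becomes "X" and "2000" becomes "X2000"
make.names(c("", "2000", "2001"))

# Toy stand-in for the imported data (values are made up)
toy <- data.frame(X = c("A", "B"), X2000 = c(10, 20), X2001 = c(30, 40))

# Build the new names programmatically instead of listing each year
names(toy) <- c("Country", 2000:2001)
names(toy)
```

For the real data frame, the equivalent would be names(mydata) <- c("Country", 2000:2012).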

Graph just a few data points

Since 200+ countries will make for a crowded chart, we'll extract just a few countries -- the U.S., Japan and Finland -- using subset():

mydata3 <- subset(mydata, Country=="United States" | Country == "Japan" | Country == "Finland")
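
When the list of countries grows, chaining | gets unwieldy. Here is a sketch of the same kind of filter written with %in%, shown on a made-up toy data frame (the names toy, countries and toy3 are my own):

```r
# Toy stand-in for mydata (percentages are made up)
toy <- data.frame(Country = c("United States", "Japan", "Finland", "Norway"),
                  `2000` = c(1, 2, 3, 4),
                  check.names = FALSE)

# Keep only the rows whose Country appears in the vector
countries <- c("United States", "Japan", "Finland")
toy3 <- subset(toy, Country %in% countries)
toy3$Country
```

Adding a country then means adding one string to the vector rather than another condition.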

Finally, one last data-formatting issue: There's no "year" variable for the x axis. Right now, each year is its own column. We could have fixed this in Excel, but let's do it in R with the reshape2 package.

Reshape the data

Install and load reshape2:

install.packages("reshape2")
library("reshape2")


Now, create reshaped data frames where each row starts with a country and all the other year columns "melt" into single variable and value columns:

mydata3_reshaped <- melt(mydata3, id="Country")

Look at the new data frame with str(mydata3_reshaped) to confirm that it's been reformatted. The last piece is to rename the "variable" and "value" columns:

names(mydata3_reshaped) <- c("Country", "Year", "Percent")
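
To make the wide-to-long change concrete, here is a small sketch on a toy data frame with made-up values. It uses melt()'s variable.name and value.name arguments, which name the new columns in the same step as an alternative to renaming afterward. Converting Year to a number is optional and shown only because melt() stores the old column names as a factor; whether a given chart library prefers numbers or strings is something to check:

```r
library("reshape2")

# Toy wide data frame: one row per country, one column per year (values made up)
wide <- data.frame(Country = c("A", "B"),
                   `2000` = c(10, 20),
                   `2001` = c(30, 40),
                   check.names = FALSE)

# Name the melted columns directly instead of renaming afterward
long <- melt(wide, id.vars = "Country",
             variable.name = "Year", value.name = "Percent")

# melt() stores the old column names as a factor; convert to numbers if needed
long$Year <- as.integer(as.character(long$Year))
long
```

The two-row, three-column wide table becomes a four-row long table with one country, year and percentage per row.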

Formatting the data was a major part of the process.

Note that the interactive R console is at the bottom left of the RStudio window. At the top left is a window where you can run commands from a file by selecting lines and then clicking Run or pressing Ctrl-Enter.

Generate a graph with one line of code

At last, time to visualize! We'll start with a simple graph of one country, so let's create another data frame with just the U.S. data from mydata3_reshaped:

mydataus <- subset(mydata3_reshaped, Country=="United States")

Creating an actual interactive graphic takes just one line of code!

rCharts supports several different JavaScript libraries. I'll start with Morris.js and its associated rCharts function, mPlot():

plot1 <- mPlot(x="Year", y="Percent", type="Line", data=mydataus)

View the graphic by typing:

plot1

in the R console.

Want a bar chart?

plot2 <- mPlot(x="Year", y="Percent", type="Bar", data=mydataus)

Grouped bar chart

This slightly more elaborate grouped bar chart uses hPlot() to access the Highcharts library (paid license needed for commercial use):

plot3 <- hPlot(x="Year", y="Percent", group="Country", type="column", data=mydata3_reshaped)

To see the HTML being generated, use:

plot3$print("chart3")

This prints the HTML in the console ("chart3" sets the chart id). To save the HTML to a file, use sink():

sink("mychartfile.html") # sends output to file

plot3$print("chart3")

sink() # returns output to console
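
One thing to watch with sink(): if the command in the middle fails, output keeps going to the file until you call sink() again. Here is a minimal sketch of a wrapper that always restores the console, using base R's on.exit(). The helper name with_sink is hypothetical, and the cat() call is a stand-in for plot3$print("chart3"):

```r
# Hypothetical helper: run `expr` with console output redirected to `path`,
# restoring the console afterward even if `expr` stops with an error.
with_sink <- function(path, expr) {
  sink(path)
  on.exit(sink())   # undo the redirect when the function exits
  force(expr)       # expr is evaluated here, while the sink is active
  invisible(path)
}

# Stand-in for plot3$print("chart3"): anything that prints to the console
demo_file <- tempfile(fileext = ".html")
with_sink(demo_file, cat("<div id='chart3'></div>\n"))
readLines(demo_file)
```

With a real chart, the call would look like with_sink("mychartfile.html", plot3$print("chart3")).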

More visualization options

Here's a grouped bar chart similar to the one on the previous slide, but this one uses the NVD3 library:

plot4
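
The call that builds plot4 isn't shown here. Below is a sketch of what such a call might look like, assuming rCharts' nPlot() function with its formula interface and NVD3's multiBarChart type. The toy data frame is made up, and the original chart may well have used different options:

```r
library("rCharts")

# Toy long-format data in the same shape as mydata3_reshaped (values made up)
toy_long <- data.frame(Country = rep(c("A", "B"), times = 2),
                       Year = rep(c("2000", "2001"), each = 2),
                       Percent = c(10, 20, 30, 40))

# nPlot() takes a formula (y ~ x); "multiBarChart" is NVD3's grouped bar type
plot4 <- nPlot(Percent ~ Year, group = "Country",
               data = toy_long, type = "multiBarChart")
```

For the real data, data = mydata3_reshaped would replace the toy frame; typing plot4 in the console then displays the chart.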

Several more libraries are supported -- even Leaflet for mapping -- as are other graphic types such as bubble charts.

Unfortunately, you'll soon discover that not all libraries have the same R-command formats within rCharts. For more info and examples, see the project page on creator Ramnath Vaidyanathan's GitHub site.

For more on R graphics, see Painless data visualization with R.