Google Analytics is a useful tool for measuring website usage -- everything from simple page views to the kind of complex ad campaign tracking marketers might need. However, I find the user interface to be, well, less than ideal. The good news is that Google Analytics provides a robust API that enables you to tap into your data programmatically, meaning you can conveniently pull and package data in ways that might not be as easy to do on the Web.
You don't need to know R to follow along with the steps here. In fact, after extracting data, you can save it to a CSV file to use in Excel, if you prefer.
Step one: Get R
First, if it's not on your system already, download and install R from the R Project for Statistical Computing website. When you run the R application, you'll see a console window where you can type in text commands. And, of course, make sure you've got a Google Analytics account and some data to work with.
There are several R packages available that have functions specifically designed for Google Analytics, including ganalytics, RGoogleAnalytics and rga ("R Google Analytics"). I'll be using rga for this tutorial, but any of them would work.
Like ganalytics, rga resides on GitHub. To easily install any of the Google Analytics packages from GitHub, first install and load the R package devtools by typing the following commands into the R console window:
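A minimal version of those two commands, assuming you're installing devtools from CRAN with R's default repository settings:

```r
# Download and install devtools from CRAN (needed once per machine)
install.packages("devtools")

# Load devtools into the current R session
library(devtools)
```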
Then install and load rga from package author Bror Skardhamar's account:
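Here's a sketch of that step. It assumes the package lives in the rga repository under the GitHub username skardhamar, and that your devtools version accepts the "username/repository" form. Older devtools releases took the repository and username as separate arguments, so check the documentation for the version you have.

```r
# Install rga from GitHub (the repository path is an assumption;
# confirm it on the author's GitHub page)
install_github("skardhamar/rga")

# Load rga into the current R session
library(rga)
```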
(You only have to run the first three commands once per machine, but you need to load library(rga) each time you open R.)
Step two: Allow rga to access your Google Analytics account
On a Mac, authentication is easy: Create an instance of the Google Analytics API authentication object by typing the following in your R console window:
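As a sketch, assuming rga's rga.open() function and an instance named ga (the name the rest of this tutorial uses):

```r
library(rga)

# Start the authentication flow and create an API object named "ga"
# (the instance name is a choice; later commands here assume "ga")
rga.open(instance = "ga")
```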
That will open a browser window that asks you to give rga permission to access your Google data. When you accept, you'll be given a code to copy and paste back into your R console window where it says, "Please enter code here."
On Windows, I find that running one extra line of code before opening an rga instance helps avoid authentication errors:
options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
Next, you need to find the profile ID for your Google Analytics account. This is not the ID in the tracking code that you add to a website so Google Analytics can monitor it. Instead, on your Google Analytics Admin page, go to View Settings and you'll see the ID under "View ID."
Or, you can list all available profiles in your account from the R console window using the ga object; the profile ID will be listed in the first column.
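One way to list the profiles, assuming the authenticated instance is named ga and that your version of rga exposes a getProfiles() method that returns a table of profiles:

```r
# List the profiles (views) the authenticated account can access
profiles <- ga$getProfiles()

# Show the first few rows; the profile ID is expected in the first column
head(profiles)
```

The variable name profiles is just for illustration; calling ga$getProfiles() on its own prints the same list to the console.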
Whichever way you find it, save that value in a variable so you don't have to keep typing it. You can use a command like:
id <- "1234567"
(Replace the number with your actual ID, and make sure to put it between quote marks.) This stores your profile ID as the variable "id."
Step three: Extract data
Now we're ready to start pulling some data using the ga instance we just created. The getData method extracts data from your Google Analytics account, which you can then store in a new R variable. If you want to see all available methods for your ga object, run:
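Assuming rga builds the ga object as an R reference class (which the ga$method() syntax suggests), R's standard reference-class introspection should list its methods:

```r
# List the methods defined on the ga object's reference class
ga$getRefClass()$methods()
```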
You can query the Google API for metrics and dimensions. Metrics are things like page views, visits and organic searches; dimensions include information like traffic sources and visitor type. (See Google's Dimensions & Metrics Reference for full details.)
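Here's a sketch of a simple query that asks for two metrics broken out by one dimension. The argument names (start.date, end.date, metrics, dimensions) reflect my understanding of rga's getData method. The dates are placeholders, and the metric and dimension names should be checked against Google's reference, since Google has renamed some of them over time. The variable name my_results and the file name ga_results.csv are hypothetical.

```r
# Assumes "ga" is an authenticated rga instance and "id" holds your profile ID.
# The date range and field names below are placeholders; swap in your own.
my_results <- ga$getData(
  id,
  start.date = "2013-01-01",
  end.date   = "2013-12-31",
  metrics    = "ga:visits,ga:pageviews",
  dimensions = "ga:source"
)

# Inspect the first few rows of the returned data
head(my_results)

# Optionally save the results as a CSV file to open in Excel
write.csv(my_results, "ga_results.csv", row.names = FALSE)
```

The write.csv() call is base R, so it works on any data frame regardless of which Google Analytics package produced it.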