Beginner's guide to R: Easy ways to do basic data analysis

Part 3 of our hands-on series covers pulling stats from your data frame, and related topics.

So you've read your data into an R object. Now what?

Examine your data object

Before you start analyzing, you might want to take a look at your data object's structure and a few row entries. If it's a 2-dimensional table of data stored in an R data frame object with rows and columns -- one of the more common structures you're likely to encounter -- here are some ideas. Many of these work on 1-dimensional vectors as well.

Many of the commands below assume that your data are stored in a variable called mydata; mydata is simply the name of that variable, not part of any function's name.

[This story is part of Computerworld's "Beginner's guide to R." To read from the beginning, check out the introduction; there are links on that page to the other pieces in the series.]

If you type:

head(mydata)

R will display mydata's column headers and first 6 rows by default. Want to see, oh, the first 10 rows instead of 6? That's:

head(mydata, n=10)

Or just:

head(mydata, 10)

Note: If your object is just a 1-dimensional vector of numbers, such as (1, 1, 2, 3, 5, 8, 13, 21, 34), head(mydata) will give you the first 6 items in the vector.
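
To try these commands without reading in a file first, here is a minimal sketch. It assumes mydata is a copy of R's built-in PlantGrowth sample data set, and fib is just a hypothetical name for the vector described above; substitute your own object names.

```r
# Assumption: stand in for your own data with the built-in PlantGrowth data set
mydata <- PlantGrowth

head(mydata)          # column headers plus the first 6 rows
head(mydata, n = 10)  # first 10 rows
head(mydata, 10)      # same result; n is the second argument

# fib is a hypothetical name for the 1-dimensional vector mentioned above
fib <- c(1, 1, 2, 3, 5, 8, 13, 21, 34)
head(fib)             # first 6 items: 1 1 2 3 5 8
```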

To see the last few rows of your data, use the tail() function:

tail(mydata)

Or:

tail(mydata, 10)

tail() can be useful when you've read in data from an external source: it helps you see whether anything got garbled (or whether there was a footnote row at the end that you didn't notice).
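
As a rough illustration of the footnote problem, the sketch below builds a small made-up data frame (raw, with invented region and sales columns) whose last row is a stray footnote, the kind of thing that can sneak in from a spreadsheet export. The data and names are hypothetical, and dropping the row with a negative n in head() is just one of several ways to clean it up.

```r
# Hypothetical data: a table as it might arrive from an external file,
# with a footnote that ended up as the final row
raw <- data.frame(
  region = c("North", "South", "East", "West", "Source: made-up figures"),
  sales  = c("120", "95", "143", "110", ""),
  stringsAsFactors = FALSE
)

tail(raw, 2)  # the last 2 rows reveal the stray footnote

# One possible cleanup: keep everything except the last row,
# then convert the sales column from text to numbers
clean <- head(raw, -1)
clean$sales <- as.numeric(clean$sales)
```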

To quickly see how your R object is structured, you can use the str() function:

str(mydata)

This will tell you the type of object you have; in the case of a data frame, it will also tell you how many rows (observations in statistical R-speak) and columns (variables to R) it contains, along with the type of data in each column and the first few entries in each column.

[Figure: R's str function. Results of the str() function on the sample data set PlantGrowth.]

For a vector, str() tells you how many items there are -- for 8 items, it'll display as [1:8] -- along with the type of item (number, character, etc.) and the first few entries.
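
To see both kinds of output for yourself, here is a short sketch using the built-in PlantGrowth data set and a hypothetical 8-item vector named fib8 (the name and values are only for illustration).

```r
# Data frame: reports 30 obs. of 2 variables, then each column's
# type and first few entries
str(PlantGrowth)

# Vector: fib8 is a hypothetical 8-item numeric vector
fib8 <- c(1, 1, 2, 3, 5, 8, 13, 21)
str(fib8)
# num [1:8] 1 1 2 3 5 8 13 21
```

The num label is R's abbreviation for numeric data; a character vector would show chr instead.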

Various other data types return slightly different results.
