Beginner's guide to R: Syntax quirks you'll want to know

Part 5 of our hands-on guide covers some R mysteries you'll need to understand.

Page 2 of 2

R command line differs from the Unix shell

When you start working in the R environment, it looks quite similar to a Unix shell. In fact, some R command-line actions behave as you'd expect if you come from a Unix environment, but others don't.

Want to cycle through your last few commands? The up arrow works in R just as it does in Unix -- keep hitting it to see prior commands.

The list function, ls(), will give you a list, but not of files as in Unix. Rather, it will provide a list of objects in your current R session.

Want to see your current working directory? pwd just throws an error; what you want is getwd().

rm(my_variable) will delete a variable from your current session.
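As a rough illustration of the commands above, the following sketch creates an object, lists the session contents, checks the working directory and then removes the object again. The name my_variable and its contents are placeholders chosen for this example.

```r
# Create an object so there is something to list (placeholder name and values)
my_variable <- c(1, 2, 3)

# ls() returns the names of objects in the session, not files on disk
objs_before <- ls()
print(objs_before)

# getwd() is the R counterpart of the Unix pwd command
wd <- getwd()
print(wd)

# rm() deletes the object from the session
rm(my_variable)
still_there <- "my_variable" %in% ls()
print(still_there)
```

If what you actually want is a list of files, as Unix ls would give you, base R provides list.files() for that.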

R does include a Unix-like grep() function. For more on using grep in R, see this brief writeup on Regular Expressions with The R Language at regular-expressions.info.
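Here is a minimal sketch of how grep() behaves on a character vector, which is where it differs most from the Unix tool. The file names are made up for the example.

```r
# A made-up vector of file names to search through
files <- c("data.csv", "notes.txt", "plot.png", "results.csv")

# By default grep() returns the positions of the matching elements
csv_idx <- grep("\\.csv$", files)

# value = TRUE returns the matching strings themselves
csv_names <- grep("\\.csv$", files, value = TRUE)

# grepl() returns TRUE/FALSE for every element, which is handy for subsetting
is_csv <- grepl("\\.csv$", files)

print(csv_idx)
print(csv_names)
print(is_csv)
```

Note that R's grep() searches the elements of a vector rather than the lines of a file, and the pattern comes first, followed by the vector to search.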

Terminating your R expressions

R doesn't need semicolons to end a line of code (although it's possible to put multiple commands on a single line separated by semicolons, you don't see that very often). Instead, R uses line breaks (i.e., new line characters) to determine when an expression has ended.

What if you want one expression to go across multiple lines? The R interpreter tries to guess if you mean for it to continue to the next line: If you obviously haven't finished a command on one line, it will assume you want to continue instead of throwing an error. Open some parentheses without closing them, use an open quote without a closing one or end a line with an operator like + or - and R will wait to execute your command until it comes across the expected closing character and the command otherwise looks finished.
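The rules above can be sketched with a few short expressions. The variable names are arbitrary. The third case is not described above but follows from the same rule: if a line already looks complete, R will not wait for more, so an operator belongs at the end of a line rather than the start of the next one.

```r
# Two expressions on one line, separated by a semicolon
x <- 5; y <- 10

# Trailing operator: the expression is clearly unfinished, so R reads on
total <- x +
  y

# Leading operator: the first line is already complete, so R stops there;
# the next line is evaluated separately and its value is not stored
total2 <- x
+ y

# Open parenthesis: R waits until it finds the closing one
avg <- mean(
  c(x, y)
)

print(total)
print(total2)
print(avg)
```

At the interactive prompt, R changes the prompt from > to + while it is waiting for the rest of an expression.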

Syntax cheating: Run SQL queries in R

If you've got SQL experience and R syntax starts giving you a headache -- especially when you're trying to figure out how to get a subset of data with proper R syntax -- you might start longing for the ability to run a quick SQL SELECT command to query your data set.

You can.

The add-on package sqldf lets you run SQL queries on an R data frame (there are separate packages allowing you to connect R with a local database). Install and load sqldf, and then you can issue commands such as:

sqldf("select * from mtcars where mpg > 20 order by mpg desc")

This will find all rows in the mtcars sample data frame that have an mpg greater than 20, ordered from highest to lowest mpg.

Most R experts will discourage newbies from "cheating" this way: Falling back on SQL makes it less likely you'll power through learning R syntax. However, it's there for you in a pinch -- or as a useful way to double-check whether you're getting back the expected results from an R expression.
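For that kind of double-checking, here is one way the same query might be written in base R, as a sketch rather than the only approach. The sqldf lines are commented out so the example runs without the add-on package; the object name high_mpg is arbitrary.

```r
# With sqldf installed, the SQL version would be:
# install.packages("sqldf")
# library(sqldf)
# sqldf("select * from mtcars where mpg > 20 order by mpg desc")

# Base R version: keep only the rows where mpg is above 20
high_mpg <- mtcars[mtcars$mpg > 20, ]

# Then sort those rows from highest to lowest mpg
high_mpg <- high_mpg[order(high_mpg$mpg, decreasing = TRUE), ]

print(head(high_mpg))
```

The comma inside the square brackets matters: the condition before it selects rows, and leaving the part after it empty keeps all columns.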

Examine and edit data with a GUI

And speaking of cheating, if you don't want to use the command line to examine and edit your data, R has a couple of options. The edit() function brings up an editor where you can look at and edit an R object, such as:

edit(mtcars)

Invoking R's data editing window with the edit() function.

This can be useful if you've got a data set with a lot of columns that are wrapping in the small command-line window. However, since there's no way to save your work as you go along -- changes are saved only when you close the editing window -- and there's no command-history record of what you've done, the edit window probably isn't your best choice for editing data in a project where it's important to repeat/reproduce your work.
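One detail worth sketching: edit() returns the edited object rather than changing the original, so the result has to be assigned to a name if you want to keep it. The name my_cars is a placeholder, and the interactive() check is only there so the example does not try to open a window when run as a script.

```r
# Work on a copy so the built-in mtcars data stays untouched
my_cars <- mtcars

# edit() returns the modified data when the window is closed;
# without the assignment, the changes would be lost
if (interactive()) {
  my_cars <- edit(my_cars)
}

print(dim(my_cars))
```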

In RStudio you can also examine a data object (although not edit it) by clicking on it in the workspace tab in the upper right window.

Saving and exporting your data

In addition to saving your entire R workspace with the save.image() function and various ways to save plots to image files, you can save individual objects for use in other software. For example, if you've got a data frame just so and would like to share it with colleagues as a tab- or comma-delimited file, say for importing into a spreadsheet, you can use the command:

write.table(myData, "testfile.txt", sep="\t")

This will export all the data from an R object called myData to a tab-separated file called testfile.txt in the current working directory. Changing sep="\t" to sep="," will generate a comma-separated file and so on.
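To try the export end to end, the following sketch writes a small data frame with a tab separator and with a comma separator (written as ","), then reads both files back in. Here myData is just the first rows of mtcars, and the files go to a temporary directory instead of the working directory; both are choices made for this example.

```r
# Stand-in data for the example
myData <- head(mtcars)

# Write to a temporary directory so nothing is left in the working directory
tab_file <- file.path(tempdir(), "testfile.txt")
csv_file <- file.path(tempdir(), "testfile.csv")

# Same call, different separators
write.table(myData, tab_file, sep = "\t")
write.table(myData, csv_file, sep = ",")

# Read the files back to confirm they contain what was written
tab_back <- read.table(tab_file, sep = "\t", header = TRUE)
csv_back <- read.table(csv_file, sep = ",", header = TRUE)

print(tab_back)
print(csv_back)
```

By default write.table() also writes the row names as a first column; add row.names = FALSE if the spreadsheet on the other end doesn't need them. For comma-separated output, base R also offers write.csv(), which sets the separator for you.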

Next: More resources for boosting your R skills.

This article, Beginner's guide to R: Syntax quirks you'll want to know, was originally published at Computerworld.com.

Copyright © 2017 IDG Communications, Inc.
