Beginner's guide to R: Syntax quirks you'll want to know

transparent binary code binary code computer coding technical programming 000000123354
Credit: iStockphoto

Part 5 of our hands-on guide covers some R mysteries you'll need to understand.

I mentioned at the outset that R syntax is a bit quirky, especially if your frame of reference is, well, pretty much any other programming language. Here are some unusual traits of the language you may find useful to understand as you embark on your journey to learn R.

[This story is part of Computerworld's "Beginner's guide to R." To read from the beginning, check out the introduction; there are links on that page to the other pieces in the series.]

Assigning values to variables

In pretty much every other programming language I know, the equals sign assigns a certain value to a variable. You know, x = 3 means that x now holds the value of 3.

Not in R. At least, not necessarily.

In R, the primary assignment operator is <- as in:

x <- 3

But not:

x = 3

To add to the potential confusion, the equals sign actually can be used as an assignment operator in R -- but not all the time. When can you use it and when can you not?

The best way for a beginner to deal with this is to use the preferred assignment operator <- and forget that equals is ever allowed. Hey, if it's good enough for Google's R style guide -- they advise not using equals to assign values to variables -- it's good enough for me.

(If this isn't a good enough explanation for you, however, and you really really want to know the ins and outs of R's 5 -- yes, count 'em, 5 -- assignment options, check out the R manual's Assignment Operators page.)

One more note about variables: R is a case-sensitive language. So, variable x is not the same as X. That applies to pretty much everything in R; for example, the function subset() is not the same as Subset().

c is for combine (or concatenate, and sometimes convert/coerce.)

When you create an array in most programming languages, the syntax goes something like this:

myArray = array(1, 1, 2, 3, 5, 8);

Or:

int myArray = {1, 1, 2, 3, 5, 8};

Or maybe:

myArray = [1, 1, 2, 3, 5, 8]

In R, though, there's an extra piece: To put multiple values into a single variable, you need the c() function, such as:

my_vector <- c(1, 1, 2, 3, 5, 8)

If you forget that c, you'll get an error. When you're starting out in R, you'll probably see errors relating to leaving out that c() a lot. (At least I certainly did.)

1 2 3 4 5 6 Page
FREE Computerworld Insider Guide: IT Certification Study Tips
Join the discussion
Be the first to comment on this article. Our Commenting Policies