Hi. I’m Sharon Machlis at IDG, here with episode 35 of Do More With R: Export your data to Excel – with formatting or multiple tabs!
I’m going to use two packages for this: rio and openxlsx. I’ll also load dplyr, because I almost always end up using dplyr.
For data, I’m going to use the state.x77 data set that comes with R. It’s a matrix; I’ll turn it into data frame with as.data.frame() so I can add a column with state names, which are character strings. The next lines of code add the state names. That last line moves the State name column to position number 1. I also divided the Illiteracy column by 100 so that a number like 2.1 – which stands for 2.1 percent – will be .021. That’s just so I can show some Excel percent formatting.
If you only want to send your data to a single Excel sheet without formatting, rio makes it incredibly easy. It’s just export(data frame name, “file name ending in dot xlsx”). The file name can be anything, as long as it ends in .xlsx. (rio::export() can create a lot of other file types also, if you change the file name extension to something like dot csv or dot json).
You can see the new Excel file here. It’s basic with no formatting, but it’s got all the data.
When it creates an Excel file, rio::export() is a wrapper for the openxlsx package’s write.xlsx() function. To add arguments when I create my file, I’ll switch to write.xlsx().
Next, I’ll add simple header formatting. An easy way to do that is to create a headline style object with the createStyle() function. Then I can use that style in the write.xlsx command. You see it here in the headerStyle argument.
There are a couple of ways to format the cells. If you’re happy with Excel’s out-of-the-box formats like “Number”, “Currency”, and “Percentage”, you can set those with a class() command. In the third line, I set the class of the Illiteracy column to “percentage.” Now if I create an Excel file, that column will have Excel’s Perentage format. You can see the Illiteracy column now is in percent style.
One more cool option for write.xlsx: You can create an Excel table, complete with dropdown filters and sorting and table styles.
It’s as easy as adding asTable = TRUE and then choosing any of Excel’s available table styles (here I use Light 2). Now I’ve got a table!
Unfortunately, I’m almost never happy with Excel’s default formats. I don’t want 2 decimal places after the percent column, I want one. And for the Income column, I want to add a dollar sign but no decimal places.
For more specific styling in cells below the header, I need a slightly more robust openxlsx process. Each step is simple, but there’s just a bit more to it than a single write command.
Let me show you some cod examples. First you create a workbook object. Next you add a worksheet to that workbook. Then write data to the sheet. There are a couple of different ways to add styling. One is what I showed before, adding a class to a data frame column. Another way is to create a specific style – here I made a percent format with one decimal place. Then I need to use addStyle() to apply the style to worksheet number one in my workbook (I can specify sheets by number or name). For body styling, I also need to specify the exact rows and columns I want for addStyle(). In the last line here, I want my style in column 3, for rows 2 through the last row of my data (that’s the nrow(mydf)). And, last step: Save the workbook with saveWorkbook(). Let me show you an example with the states data frame.
This code creates a workbook, adds a worksheet, writes data to the sheet”. Next, I create some styles. The dollar style adds a dollar sign and comma, and rounds to no decimal places because there’s no .00 in the format. The percent style adds a percent sign and one decimal place (but no comma). And the basic comma format has a comma and no decimal places.Then I apply each style with addStyle()
Finally, I save the workbook to a file called with_granular_styling.xlsx. I added overwrite = TRUE – that makes sure I get a new version of my file if the file already exists. See I’ve got commas in Population and Area, dollar sign but no cents in Income, and one decimal place for Illiteracy.
Last thing I want to show you is how to write data to multiple tabs. Here I’ve created 4 new data frames from the original states data, by region: Northeast, West, South, and NorthCentral. Next, I made a list of those data frames, and named each item in the list. Now if I want to quickly export those to a single Excel file, one data frame per tab, without worrying about styling, I can use the write.xlsx() function. My data frame list is the first argument and an Excel file name is the 2nd argument. And, there’s my data, one data frame per tab!
That’s it for this episode, thanks for watching! For more R tips, head to the Do More With R page at go dot infoworld dot com slash more with R, all lowercase except for the R.
You can also find the Do More With R playlist on the YouTube IDG Tech Talk channel -- where you can subscribe so you never miss an episode. Hope to see you next time!