
R tip: How to use Git and GitHub with R projects

InfoWorld | Aug 13, 2019

See how to use Git version control and sync with GitHub -- all right within RStudio

Copyright © 2019 IDG Communications, Inc.

Hi. I’m Sharon Machlis at IDG Communications, here with episode 33 of Do More With R: Use Git and GitHub with Your R Projects.
When a viewer suggested I cover git version control, my first reaction was: I don’t have an hour. But thanks to RStudio and the usethis package, git with R has become easier. Let me show you.
There are a couple of non-R initial system set-up tasks that I won’t be doing. One is installing git on your system. You can find git at git-scm.com/downloads, or google “install git.” You need this if you want git version control on your local system.
If you want to sync your local project with a GitHub repository, you’ll also need to set up a free GitHub account at github.com/join.
If you have trouble with these set-up steps, I highly recommend checking out “Happy Git and GitHub for the useR” at happygitwithr.com by Jenny Bryan and Jim Hester. Check out chapters 4, 6, and 7 for setup, including telling git your name and email address, which is important. If you forget to do that, RStudio will prompt you with the git commands you need. Those need to be run in a terminal window, not the RStudio R console.
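As a rough sketch of the "tell git your name and email" step, these are the standard git commands to run in a terminal. The name and email shown are placeholders, not values from the video; substitute your own.

```shell
# Placeholder identity: replace with your own name and email.
git config --global user.name "Jane Doe"
git config --global user.email "jane.doe@example.com"

# Print back what git has stored, to confirm it took effect.
git config --global --get user.name
git config --global --get user.email
```

The --global flag stores the identity for every repository under your user account, so this normally only needs doing once per machine.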
Now to the R and RStudio part. First you need to tell RStudio you want to use git. In RStudio go to Tools > Global Options > Git/SVN. Check the box to enable version control. And then make sure to tell RStudio where your “git executable” is. On a Mac, it’s likely to be at /usr/bin/git or /usr/local/bin/git. On Windows, look for the git.exe file somewhere like C:/Program Files/git/bin.
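If you are not sure where the git executable lives, one way to find out on a Mac or Linux system is to ask the shell. This is a minimal sketch and assumes git is already installed and on your PATH; in a Windows Command Prompt, where git plays a similar role.

```shell
# Print the full path of the git executable found on your PATH.
command -v git

# Print the installed version, which also confirms git runs at all.
git --version
```

The path printed by the first command is what goes in RStudio's "git executable" box.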
Next, I’ll create a new project in RStudio, and I’ll make sure to check “create a git repository.” I’ll add a little test R code.
To use git version control on this file, I need to “commit” that file. That “registers” it with git. Any future commits save the changes made since your previous commit.
So why would you do this? One reason: Because git lets you revert back to a previous version of the file! No need to save multiple versions of a file: scriptworking.R, scriptfinal.R, scriptreallyfinal….
To commit a file with git in RStudio, go to the git tab in the top right pane. First, select one or more files by checking the box next to their names. That “stages” the file, which roughly means you’ve marked it as ready to be committed. To actually commit the file, click the Commit button. That opens up the commit window. Type in a commit message – a note explaining a little about this commit. In this case, it’s my first commit. Then click the second Commit button. You can now close the Commit window.
I’m going to make a change to the file that I won’t like, one that I’m going to want to undo. Let me save the file and commit that. Obviously this would be easy to correct without git, but changes you’d like to roll back can of course be a lot more complex – and involve more than one file.
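For comparison, here is roughly what staging and committing look like as terminal commands. This is a sketch in a throwaway directory; the file name test.R, its contents, and the demo identity are made up for illustration.

```shell
# Work in a throwaway directory so no real project is touched.
commit_demo="$(mktemp -d)"
cd "$commit_demo"
git init -q

# Demo-only identity, stored in this repository alone.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Create a file, stage it (the checkbox), then commit it (the Commit button).
echo 'x <- 1:10' > test.R
git add test.R
git commit -q -m "First commit"

# Make a change that will later need undoing, and commit that too.
echo 'x <- "oops"' > test.R
git add test.R
git commit -q -m "Change I will want to undo"

# List both commits with their short SHAs and messages.
git log --oneline
```

The checkbox in the Git tab corresponds to git add, and the second Commit button corresponds to git commit with your message.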
How can you get back to an earlier version of a file you committed?
First you need the git ID of that commit, called an S.H.A, or SHA. Go back to the Git tab and click the Diff button and then History. You can see all your project commits here with their commit messages (Hopefully you see why having descriptive commit messages is useful here!). I’ll copy the SHA I want to revert to.
For the git rollback command, you need the RStudio Terminal, not the R Console. Go to Tools > Terminal to open a terminal tab. Then type git reset --hard, followed by a space and the SHA.
Seriously, how cool is that? The file is back to its previous commit.
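The whole rollback can be sketched end to end in a terminal. This example builds a throwaway repository with one good commit and one bad one, then resets to the good one; the file name and contents are invented for the demo, and the SHA is captured in a variable rather than copied by hand from the History window.

```shell
# Throwaway repository so the reset cannot touch real work.
reset_demo="$(mktemp -d)"
cd "$reset_demo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# The commit to return to.
echo 'x <- 1:10' > test.R
git add test.R
git commit -q -m "First commit"
good_sha="$(git rev-parse HEAD)"

# The commit to undo.
echo 'x <- "oops"' > test.R
git add test.R
git commit -q -m "Change I want to undo"

# Show the history, then roll back to the earlier commit.
git log --oneline
git reset --hard "$good_sha"

# The file now holds the contents from the first commit.
cat test.R
```

Be aware that git reset --hard also throws away any uncommitted changes in the working directory and drops the later commits from the branch, so it is worth checking the SHA before running it.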
Now let’s say I’d like to create a GitHub repository for this committed version of my project files, so I can sync back and forth between my local system and the cloud – all with version control. Especially useful if I’m collaborating with someone else, but also useful as a cloud backup.
I can do this right in R, with the usethis package.
I’ll load usethis and then look at the help file for the use_github function. Scroll down and you’ll see you need to set up a GitHub Personal Access Token – and there’s a link you can click to do just that.
Click the button at the top right to generate a token for your project. I just use the top group for controlling a repo as my token permissions. Then scroll down and generate the token.
You get one crack at saving that token, so remember to copy it before you close the page!
If you’re not worried about anything super-sensitive in your GitHub, the best way to store this is in your R environment file. Run usethis’s edit_r_environ() function and add your token string as GITHUB_PAT. You need to restart R for this to take effect.

Now see what happens when I reload usethis and run use_github().
I’ve got a GitHub repo that will easily sync with my local system. If you’ve done this before on the command line, no more manually setting up origins and masters.
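For anyone who has not done the command-line version, this is approximately the manual wiring that gets skipped. The sketch uses a local bare repository as a stand-in for GitHub so it can run offline; with a real GitHub repo, the remote would be that repo's URL instead of a local path.

```shell
# Scratch area holding a fake "cloud" repo and a local project.
origin_demo="$(mktemp -d)"
git init -q --bare "$origin_demo/remote.git"   # stand-in for the GitHub repo
git init -q "$origin_demo/project"
cd "$origin_demo/project"
git config user.name "Demo User"
git config user.email "demo@example.com"

# One commit so there is something to push.
echo 'x <- 1:10' > test.R
git add test.R
git commit -q -m "First commit"

# Manually register the remote as "origin" and push the current branch,
# setting it as the upstream for future pushes and pulls.
git remote add origin "$origin_demo/remote.git"
git push -q -u origin HEAD

# Show the remote that is now configured.
git remote -v
```

Pushing HEAD rather than a named branch keeps the sketch working whether your git calls its default branch master or main.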
If I make and commit another change, I can now push it up to the GitHub repo in the cloud by using the push button.
If I go to my repo and refresh the page you can see my change is there.
If there’s a change in the cloud version and I use the pull button, my local file is updated.
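The push and pull buttons correspond to git push and git pull. Here is a rough offline sketch of the round trip, again with a local bare repository standing in for GitHub and a second clone standing in for a change made in the cloud; all directory and file names are invented for the demo.

```shell
# Scratch area: a fake "cloud" repo plus two working copies.
sync_demo="$(mktemp -d)"
git init -q --bare "$sync_demo/remote.git"

# First working copy. Git may warn that the cloned repository is empty;
# that is expected here.
git clone -q "$sync_demo/remote.git" "$sync_demo/laptop"
git -C "$sync_demo/laptop" config user.name "Demo User"
git -C "$sync_demo/laptop" config user.email "demo@example.com"
echo 'x <- 1:10' > "$sync_demo/laptop/test.R"
git -C "$sync_demo/laptop" add test.R
git -C "$sync_demo/laptop" commit -q -m "First commit"
git -C "$sync_demo/laptop" push -q -u origin HEAD     # the push button

# Second working copy, standing in for an edit made elsewhere.
git clone -q "$sync_demo/remote.git" "$sync_demo/elsewhere"
git -C "$sync_demo/elsewhere" config user.name "Other User"
git -C "$sync_demo/elsewhere" config user.email "other@example.com"
echo 'y <- mean(x)' >> "$sync_demo/elsewhere/test.R"
git -C "$sync_demo/elsewhere" commit -q -am "Change made elsewhere"
git -C "$sync_demo/elsewhere" push -q origin HEAD

# Back on the first copy: bring down the new commit (the pull button).
git -C "$sync_demo/laptop" pull -q --ff-only
cat "$sync_demo/laptop/test.R"
```

The --ff-only flag is a cautious choice for the sketch: the pull succeeds only if the local branch can simply be moved forward, and stops instead of merging if the two copies have diverged.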
There’s a ton more you can do with git and GitHub, including setup so you don’t have to keep entering your user name and password if that’s happening to you. And, there are other cloud git services besides GitHub, such as GitLab and Bitbucket. But that’s all I have time for this episode, thanks for watching! For more R tips, head to the Do More With R page at go.infoworld.com/morewithR (all lowercase except for the R).
You can also find the Do More With R playlist on the YouTube IDG Tech Talk channel.
Hope to see you next episode!