Sometimes the data you want is available on a Web page, but not in a form you can easily download. That's where Web scraping comes in. Most general-purpose programming languages have a library for collecting data from an HTML page. R does too -- a new package called rvest by Hadley Wickham, modeled after Python's Beautiful Soup.
Watch how easy it is to import data from a Web page into R. Code from the video is below.
Note: If you don't have rvest installed on your system, you can download and install it with
install.packages("rvest"). Get SelectorGadget at SelectorGadget.com.
Note that CSS can change on Web pages -- in fact, the best CSS for the National Weather Service forecast has already changed in the few weeks since I recorded this video. Another good reason to use SelectorGadget, which makes it easy to find the CSS pattern you want.
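library("rvest")

# Download and parse the forecast page.
# Note: rvest 0.3.0 and later deprecate html() in favor of read_html().
htmlpage <- html("http://forecast.weather.gov/MapClick.php?lat=42.31674913306716&lon=-71.42487878862437&site=all&smap=1#.VRsEpZPF84I")

# Select the nodes matching the CSS pattern found with SelectorGadget.
forecasthtml <- html_nodes(htmlpage, "#detailed-forecast-body b , .forecast-text")

# Extract the text from those nodes.
forecast <- html_text(forecasthtml)

# Combine the pieces into a single string.
paste(forecast, collapse = " ")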
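Because the live forecast page and its CSS can change, here is a minimal offline sketch of the same steps that you can run without a network connection. The HTML string is hypothetical sample markup made up for illustration (it only loosely imitates the forecast page), and the names sample_html, page, nodes and forecast_text are likewise illustrative. It uses read_html(), and it stops with a message if the selector matches nothing, which is one way to notice that a page's CSS has changed.

```r
library(rvest)

# Hypothetical sample markup (an assumption for illustration, not the real page).
sample_html <- paste0(
  '<html><body><div id="detailed-forecast-body">',
  '<div class="row"><div class="forecast-label"><b>Tonight</b></div>',
  '<div class="forecast-text">Mostly clear, with a low around 30.</div></div>',
  '<div class="row"><div class="forecast-label"><b>Friday</b></div>',
  '<div class="forecast-text">Sunny, with a high near 55.</div></div>',
  '</div></body></html>'
)

# Parse the HTML string instead of downloading a URL.
page <- read_html(sample_html)

# Same CSS pattern as in the code from the video.
nodes <- html_nodes(page, "#detailed-forecast-body b , .forecast-text")

# If nothing matched, the CSS pattern is probably out of date.
if (length(nodes) == 0) {
  stop("Selector matched nothing; the page's CSS may have changed.")
}

# Extract the text and combine it into a single string.
forecast_text <- paste(html_text(nodes), collapse = " ")
print(forecast_text)
```

To try this against a real page, swap the sample string for a URL in read_html() and replace the CSS pattern with whatever SelectorGadget currently suggests.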