Monday, June 27 • 1:00pm - 2:15pm
Extracting data from the web: APIs and beyond (Part 1)

Instructors: Karthik Ram, Garrett Grolemund and Scott Chamberlain

No matter what your domain of interest or expertise, the internet is a treasure trove of useful data that comes in many shapes, forms, and sizes, from beautifully documented, fast APIs to data that must be scraped from deep inside 1990s HTML pages. In this 3-hour tutorial you will learn how to programmatically read in various types of web data from experts in the field (the founders of the rOpenSci project and the training lead of RStudio). By the end of the tutorial you will have a basic idea of how to wrap an R package around a standard API, extract common non-standard data formats, and scrape data from web pages into tidy data frames.

Background Knowledge
Familiarity with base R and ability to write functions.

R with the latest versions of httr, rvest, and curl. It would also be helpful to have a recent release of R and RStudio.

Target Audience
Any R user with an interest in retrieving data from the web.

Website for materials: All material for the tutorial will be posted at: http://ropensci.github.io/user2016-tutorial/ (including instructions on packages that you'll need to install ahead of time).

More information and code available on our GitHub repository

Scott Chamberlain

rOpenSci, University of California, Berkeley, United States of America

Garrett Grolemund

Educator, RStudio

Karthik Ram

co-founder, rOpenSci
Karthik Ram is a co-founder of rOpenSci and a data science fellow at the University of California's Berkeley Institute for Data Science. Karthik primarily works on a project that develops R-based tools to facilitate open science and access to open data.

Monday June 27, 2016 1:00pm - 2:15pm PDT
Campbell Rehearsal Hall