Tuesday, June 28 • 2:30pm - 3:30pm
Writing a dplyr backend to support out-of-memory data for Microsoft R Server

Poster #10

Over the last two years, the dplyr package has become very popular in the R community for the way it streamlines and simplifies many common data manipulation tasks. A feature of dplyr is that it’s extensible; by defining new methods, one can make it work with data sources other than those it supports natively. The dplyrXdf package is a backend that extends dplyr functionality to Microsoft R Server’s xdf files, which store data on disk to get around R’s in-memory limitations. dplyrXdf supports all the major dplyr verbs and pipeline notation, and provides some additional features to make working with xdf files easier. In this talk, I’ll share my experiences writing a new backend for dplyr, and demonstrate how to use dplyr and dplyrXdf to carry out data wrangling tasks on large datasets that exceed the available memory.
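
The extensibility idea in the abstract (a generic verb that gains a new method for a new kind of data source) can be illustrated with a small sketch. This is not dplyr or dplyrXdf code, and it is written in Python rather than R purely as a language-neutral analogy: `functools.singledispatch` stands in for R's method dispatch, and the names `ChunkedCsv`, `filter_rows`, and `SAMPLE` are hypothetical, invented for this illustration. The assumption is that an out-of-memory source can be read one chunk at a time, so a verb never needs the whole dataset in memory at once.

```python
import csv
import io
from functools import singledispatch
from typing import Callable, Dict, Iterator, List

Row = Dict[str, str]


class ChunkedCsv:
    """Hypothetical out-of-memory source: rows are streamed in fixed-size chunks."""

    def __init__(self, text: str, chunk_size: int = 2) -> None:
        self.text = text  # stands in for a large file on disk
        self.chunk_size = chunk_size

    def chunks(self) -> Iterator[List[Row]]:
        """Yield lists of at most chunk_size rows, in file order."""
        reader = csv.DictReader(io.StringIO(self.text))
        chunk: List[Row] = []
        for row in reader:
            chunk.append(row)
            if len(chunk) == self.chunk_size:
                yield chunk
                chunk = []
        if chunk:  # final, possibly shorter, chunk
            yield chunk


@singledispatch
def filter_rows(data, predicate: Callable[[Row], bool]) -> List[Row]:
    """Generic 'verb': dispatches on the type of the data source."""
    raise TypeError(f"no filter_rows method for {type(data).__name__}")


@filter_rows.register(list)
def _filter_list(data: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    # Natively supported case: everything is already in memory.
    return [row for row in data if predicate(row)]


@filter_rows.register(ChunkedCsv)
def _filter_chunked(data: ChunkedCsv, predicate: Callable[[Row], bool]) -> List[Row]:
    # New "backend" method: only one chunk is examined at a time.
    # A real backend would likely write matches back to disk; this sketch
    # collects them in a list to keep the example short.
    kept: List[Row] = []
    for chunk in data.chunks():
        kept.extend(row for row in chunk if predicate(row))
    return kept


SAMPLE = "name,mpg\na,21\nb,33\nc,18\nd,30\n"
source = ChunkedCsv(SAMPLE, chunk_size=2)
efficient = filter_rows(source, lambda row: float(row["mpg"]) > 25)
print([row["name"] for row in efficient])
```

The calling code is the same for both kinds of source; only the registered method differs, which is the property the abstract attributes to dplyr backends.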

Speakers
Hong Ooi

Microsoft


Tuesday June 28, 2016 2:30pm - 3:30pm PDT
Sponsor Pavilion