Wednesday, June 29 • 11:40am - 11:45am
Chunked, dplyr for large text files

During a data analysis project, a new version of the raw data may become available, or the data may be changed outside of your control. `daff` is an R package that helps keep track of such changes. It can find differences in values between data.frames, store these differences, render them, and apply them as a patch to a new data.frame. It can also merge two versions of a data.frame that share a common parent version. It wraps the daff.js library by Paul Fitzpatrick (http://github.com/paulfitz/daff) using the V8 package.
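The diff, patch, and merge workflow described above can be illustrated with a small, language-neutral sketch. This is not daff's API or its algorithm. The helper names (`diff_rows`, `apply_patch`, `merge3`) and the sample data are hypothetical. The sketch assumes each table is a mapping from a row key to a dict of column values, and that all rows share the same columns.

```python
# Conceptual sketch only: NOT daff's API or algorithm.
# A table is modelled as {row_key: {column: value}} (an assumption).

def diff_rows(old, new):
    """Return a patch describing how to turn `old` into `new`."""
    patch = {"added": {}, "removed": [], "changed": {}}
    for key, row in new.items():
        if key not in old:
            patch["added"][key] = dict(row)
        else:
            # Keep only the cells whose value differs from the old row.
            cells = {c: v for c, v in row.items() if old[key].get(c) != v}
            if cells:
                patch["changed"][key] = cells
    patch["removed"] = [k for k in old if k not in new]
    return patch

def apply_patch(table, patch):
    """Apply a patch to a copy of `table` and return the result."""
    result = {k: dict(r) for k, r in table.items() if k not in patch["removed"]}
    for key, cells in patch["changed"].items():
        if key in result:
            result[key].update(cells)
    for key, row in patch["added"].items():
        result[key] = dict(row)
    return result

def merge3(parent, a, b):
    """Naive three-way merge: replay onto `a` what `b` changed since `parent`.

    If both versions changed the same cell, `b` wins (a simplification).
    """
    return apply_patch(a, diff_rows(parent, b))

parent = {1: {"name": "Ann", "score": 10}, 2: {"name": "Bob", "score": 20}}
a = {1: {"name": "Ann", "score": 11}, 2: {"name": "Bob", "score": 20}}
b = {1: {"name": "Ann", "score": 10}, 3: {"name": "Cy", "score": 30}}

patch = diff_rows(parent, b)        # row 2 removed, row 3 added
patched = apply_patch(parent, patch)
merged = merge3(parent, a, b)       # keeps a's edit to row 1 plus b's changes
```

Here the patch is a plain data structure, so it can be stored and applied later to another copy of the table, which mirrors the store-and-apply idea in the abstract.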

Moderators
Joseph Rickert

Program Manager, Microsoft
Joseph is a Program Manager at Microsoft, which he joined through its acquisition of Revolution Analytics. He is a data scientist and R language evangelist, passionate about analyzing data and teaching people about R. He is a regular contributor to the Revolutions blog and an...

Speakers
Edwin de Jonge

Statistics Netherlands
Edwin de Jonge is a research and statistical consultant who has worked at Statistics Netherlands for more than 25 years. He has a background in theoretical and computational physics. He has long experience in methodological research, including data cleaning, visualization and network analysis...


Wednesday June 29, 2016 11:40am - 11:45am PDT
SIEPR 120