This event has ended. Visit the official site or create your own event on Sched.
Click here to return to main conference site. For a one page, printable overview of the schedule, see this.
Back To Schedule
Wednesday, June 29 • 11:24am - 11:42am
Grid Computing in R with Easy Scalability

Log in to save this to your schedule, view media, leave feedback and see who's attending!

Parallel computing is useful for speeding up computing tasks and many R packages exist to aid in using parallel computing. Unfortunately it is not always trivial to parallelize jobs and can take a significant amount of time to accomplish, time that may be unavailable. My presentation will demonstrate an alternative method that allows for processing of multiple jobs simultaneously across any number of servers using Redis message queues. This method has proven very useful since I began implementing it at my company over two years ago. In this method, a main Redis server handles communication with any number of R processes on any number of servers. These processes, known as workers, inform the server that they are available for processing and then wait indefinitely until the server passes them a task. In this presentation, it will be demonstrated how trivial it is to scale up or down by adding or removing workers. This will be demonstrated with sample jobs run on workers in the Amazon cloud. Additionally, this presentation will show you how to implement such a system yourself with the rminions package I have been developing. This package is based on what I have learned over the past couple of years and contains functionality to easily start workers, queue jobs, and even perform R-level maintenance (such as installing packages) on all connected servers simultaneously!

avatar for Dirk  Eddelbuettel

Dirk Eddelbuettel

Debian and R Projects

avatar for Jonathan  Adams

Jonathan Adams

Systems & Actuarial Analyst, ARMtech Insurance Services

Wednesday June 29, 2016 11:24am - 11:42am PDT
Econ 140