Name: Resource-Aware Scheduling Strategies for Parallel Machine Learning R Programs through RAMBO
Start: 2016-06-29T11:06:00-0700
End: 2016-06-29T11:24:00-0700

Resource-Aware Scheduling Strategies for Parallel Machine Learning R Programs through RAMBO

We present resource-aware scheduling strategies for parallel R programs that estimate resource demands in order to use parallel computer architectures efficiently. We concentrate on applications that consist of independent tasks. The R programming language is increasingly used to process large data sets in parallel, which requires substantial resources. One important application is parameter tuning of machine learning algorithms, where evaluations need to be executed in parallel to reduce runtime. Here, the resource demands of tasks vary heavily depending on the algorithm configuration. Running such an application in a naive parallel way leads to inefficient resource utilization and thus to long runtimes. To address this, the R package “parallel” offers a scheduling strategy called “load balancing”, which dynamically allocates tasks to worker processes. This option is recommended when tasks have widely different computation times or when computer architectures are heterogeneous. We analyzed the memory and CPU utilization of parallel applications with our TraceR profiling tool and found that the load balancing mechanism is not sufficient for parallel tasks with high variance in resource demands. A scheduling strategy needs to know the resource demands of a task before execution to map applications efficiently to available resources. Therefore, we build a regression model that estimates resource demands based on previously evaluated tasks. Resource estimates such as runtime are then used to guide our scheduling strategies. These strategies are integrated into our RAMBO (Resource-Aware Model-Based Optimization) framework. Compared to the standard mechanisms of the parallel package, our approach yields improved resource utilization.
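
The idea of letting runtime estimates guide the schedule can be illustrated with a small sketch. It is not the RAMBO implementation and does not use the R "parallel" package: it is written in Python purely for illustration, and every name and number in it (fit_line, greedy_schedule, the history and pending data) is a hypothetical assumption. It fits a one-feature least-squares line to previously observed runtimes, predicts runtimes for pending tasks, and compares two orderings fed to a "next free worker" assignment: plain submission order, which roughly resembles dynamic load balancing without prior knowledge, and longest-estimated-first, a classic heuristic swapped in here as a stand-in for a resource-aware strategy.

```python
import heapq


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def greedy_schedule(task_ids, est, n_workers):
    """Give each task, in the given order, to the worker that frees up first.

    Returns the per-worker assignment and the estimated makespan.
    """
    heap = [(0.0, w) for w in range(n_workers)]  # (accumulated load, worker id)
    assignment = {w: [] for w in range(n_workers)}
    for t in task_ids:
        load, w = heapq.heappop(heap)
        assignment[w].append(t)
        heapq.heappush(heap, (load + est[t], w))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


# Hypothetical history: one configuration parameter and the observed runtime (s).
history_x = [1, 2, 3, 4]
history_y = [2.0, 4.0, 6.0, 8.0]
intercept, slope = fit_line(history_x, history_y)

# Hypothetical pending tasks, keyed by name, with their parameter value.
pending = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 6}
est = {t: intercept + slope * x for t, x in pending.items()}

submission_order = list(pending)
longest_first = sorted(pending, key=lambda t: est[t], reverse=True)

_, makespan_fifo = greedy_schedule(submission_order, est, n_workers=2)
_, makespan_lpt = greedy_schedule(longest_first, est, n_workers=2)

print("submission order:", makespan_fifo)
print("longest estimated first:", makespan_lpt)
```

With these made-up numbers, submission order leaves the expensive task "e" for last and ends with an estimated makespan of 18 seconds, while starting the longest estimated task first ends at 12 seconds on the same two workers. The gap depends entirely on how good the estimates are, which is why the abstract stresses building the regression model from previously evaluated tasks.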

Moderators

Dirk Eddelbuettel

Debian and R Projects

Speakers

Helena Kotthaus

Department of Computer Science 12, TU Dortmund University, Dortmund, Germany

Wednesday June 29, 2016 11:06am - 11:24am PDT
Econ 140

Contributed talk, Performance

user2016
