user2016: Full Schedule

1:00pm PDT

Continuous Integration and Teaching Statistical Computing with R

In this talk we will discuss two statistical computing courses taught as part of the undergraduate and master's curriculum in the Department of Statistical Science at Duke University. The primary goal of these courses is to teach advanced R along with modern software development practices. We will focus in particular on our adoption of continuous integration tools (GitHub and Wercker) as a way to automate and improve the feedback cycle for students as they work on their assignments. Overall, we have found that these tools, when used appropriately, help reduce learner frustration, improve code quality, reduce instructor workload, and introduce powerful tools that are relevant long after the completion of the course. We will discuss several of the classes' open-ended assignments and explore instances where continuous integration made sense as well as cases where it did not.

Moderators

Przemyslaw Biecek

University of Warsaw

Speakers

Colin Rundel

Assistant Professor of the Practice, Dept of Statistical Science, Duke University

Tuesday June 28, 2016 1:00pm - 1:18pm PDT
SIEPR 130

Contributed talk, Teaching

1:18pm PDT

Integrated R labs for high school students

The Mobilize project developed a year-long, high-school-level Introduction to Data Science course, which has been piloted in 27 public schools in the Los Angeles Unified School District. The curriculum is innovative in many ways, including the use of R and the associated curricular support materials. Broadly, there are three main approaches to teaching R. One has users learning to code in their browser (Code School and DataCamp), another has them working directly in the R console (swirl), and a final approach is to have students follow along with an external document (OpenIntro). The integrated R labs developed by Mobilize bridge the gap between working at the console and following an instructional document. Through the mobilizr package, students can load labs written to accompany the course directly into the Viewer pane in RStudio, allowing them to work through material without ever leaving RStudio. By providing the labs as part of the curricular materials we reduce the burden on teachers and allow students to work at their own pace. We will discuss the functionality of the labs as they stand, as well as developments in the .Rpres format that could allow for even more interactive learning.

Moderators

Przemyslaw Biecek

University of Warsaw

Speakers

Amelia McNamara

Smith College

Tuesday June 28, 2016 1:18pm - 1:36pm PDT
SIEPR 130

Contributed talk, Teaching

1:36pm PDT

Introducing Statistics with intRo

intRo is a modern web-based application for performing basic data analysis and statistical routines, together with an accompanying R package. Leveraging the power of R and Shiny, intRo implements common statistical functions in a powerful and extensible modular structure, while remaining simple enough for the novice statistician. This simplicity lends itself to a natural presentation in an introductory statistics course as a substitute for other commonly used statistical software packages, such as Excel and JMP. intRo is currently deployed at the URL http://www.intro-stats.com. In this talk, we describe the underlying design and functionality of intRo, including its extensible modular structure, illustrate its use with a live demo, and discuss future improvements that will enable a wider adoption of intRo in introductory statistics courses.

Moderators

Przemyslaw Biecek

University of Warsaw

Speakers

Andee Kaplan

Iowa State University

Tuesday June 28, 2016 1:36pm - 1:54pm PDT
SIEPR 130

Contributed talk, Teaching

1:54pm PDT

A first-year undergraduate data science course

In this talk we will discuss an R-based first-year undergraduate data science course taught at Duke University for an audience of students with little to no computing or statistical background. The course focuses on data wrangling and munging, exploratory data analysis, data visualization, and effective communication. The course is designed to be a first course in statistics for students interested in pursuing a quantitative major. Unlike most traditional introductory statistics courses, this course approaches statistics from a model-based, instead of an inference-based, perspective, and introduces simulation-based inference and Bayesian inference later in the course. A heavy emphasis is placed on reproducibility (with R Markdown) and version control and collaboration (with git/GitHub). We will discuss in detail course structure, logistics, and pedagogical considerations, and give examples from the case studies used in the course. We will also share student feedback and an assessment of the success of the course in recruiting students to the statistical science major.

Moderators

Przemyslaw Biecek

University of Warsaw

Speakers

Mine Cetinkaya-Rundel

Duke University

Tuesday June 28, 2016 1:54pm - 2:12pm PDT
SIEPR 130

Contributed talk, Teaching

2:12pm PDT

Teaching R to 200 people in a week

Across disciplines, scholars are waking up to the potential benefits of computational competence. This has created a surge in demand for computational education which has gone widely underserved. Software Carpentry and similar efforts have worked to fill this gap with short, intensive introductions to computational tools, including R. Such an approach has numerous advantages; however, it is labor intensive, with student:instructor ratios typically below ten, and it is diffuse, introducing three major tools in two days. I recently adapted Software Carpentry strategies and tactics to provide a deeper introduction to R over the course of a week with a student:instructor ratio above 50. Here, I reflect on what worked and what I would change, with the goal of providing other educators with ideas for improving computational education. Aspects of the course that worked well include live coding during lectures, which builds in flexibility, demonstrates the debugging process, and forces a slower pace; multiple channels of feedback combined with flexibility to adapt to student needs and desires; and iterative, progressively more-open-ended exercises to solidify syntactical understanding and relate functions, idioms, and techniques to larger goals. Aspects of the course that I would change and caution other educators about include increasing the frequency and shortening the duration of student exercises, delaying the introduction of non-standard evaluation, and avoiding any prerequisite statistical understanding. These and other suggestions will benefit a variety of R instructors, whether for intensive introductions, traditional computing courses, or as a component of statistics courses.

Moderators

Przemyslaw Biecek

University of Warsaw

Speakers

Michael Levy

PhD Candidate, University of California, Davis

Network analysis, environmental social science, R users' groups, teaching R and stats

Tuesday June 28, 2016 2:12pm - 2:30pm PDT
SIEPR 130

Contributed talk, Teaching

10:30am PDT

swirl-tbp: a package for interactively learning R programming and data science through the addition of 'template-based practice' problems in swirl

The R package 'swirl' allows users to learn R programming by completing interactive lessons within the R console. Lessons (written in the YAML mark-up language) can include educational content such as text, graphics, and links, and multiple choice or open-ended questions. If a user does not answer a question correctly, hints may be provided until the correct answer is given. Although 'swirl' is a valuable package for learning important concepts in R and data science, users are limited in their ability to practice these concepts as 'swirl' lessons are static, so that a user repeating a lesson will see the same questions each time. This motivates an extension to 'swirl' that includes 'template-based problems' that would allow a user to practice on an endless supply of problems for a given topic.

We describe and implement a new package, 'swirl-tbp', that introduces 'template-based practice' problems to the 'swirl' framework. Specifically, 'swirl-tbp' extends 'swirl' by allowing instructors to include template-based problems in 'swirl' lessons. Template-based problems are problems that include numbers, variable names, or other features that are randomly generated at run-time. As a result, a user can be provided with an endless supply of practice problems that differ, e.g., with respect to the numbers used. This allows users to repeatedly practice problems in order to reinforce concepts and practice their problem-solving skills. We demonstrate the utility of 'swirl-tbp' by showing template-based problems for practicing basic R programming concepts such as vector creation and statistical concepts such as the calculation of probabilities involving normally distributed random variables.
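
The idea of a template-based problem can be illustrated with a minimal sketch. It is written in Python rather than R, and it is not the 'swirl-tbp' implementation or its lesson syntax: the function names `make_normal_problem` and `check_answer`, the parameter ranges, and the tolerance are all hypothetical choices made for illustration. The sketch fills a fixed question template with numbers drawn at run-time and computes the matching answer, so each call can yield a different normal-probability exercise.

```python
import random
from statistics import NormalDist


def make_normal_problem(rng):
    """Fill a question template with randomly drawn numbers (hypothetical helper)."""
    mu = rng.randint(50, 100)              # mean, drawn at run-time
    sigma = rng.randint(5, 15)             # standard deviation, drawn at run-time
    cutoff = mu + rng.choice([-10, -5, 5, 10])
    question = (
        f"X is normally distributed with mean {mu} and standard deviation "
        f"{sigma}. What is P(X <= {cutoff})?"
    )
    # The answer is computed from the same drawn values, so it always
    # matches the generated question.
    answer = NormalDist(mu, sigma).cdf(cutoff)
    return {"question": question, "answer": answer}


def check_answer(problem, response, tol=1e-3):
    """Return True when the response is within tol of the computed answer."""
    return abs(response - problem["answer"]) <= tol


if __name__ == "__main__":
    rng = random.Random()                  # unseeded: a new problem on each run
    problem = make_normal_problem(rng)
    print(problem["question"])
    print("Correct answer accepted:", check_answer(problem, problem["answer"]))
```

Passing a seeded `random.Random` instance makes the generated problem reproducible, which is convenient when checking the generator itself.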

Moderators

Pierre Lafaye De Micheaux

Université de Montréal

Speakers

Garrett Dancik

Eastern Connecticut State University

Thursday June 30, 2016 10:30am - 10:48am PDT
Barnes & McDowell & Cranston

Contributed talk, Teaching

10:48am PDT

Dynamic Data in the Statistics Classroom

The call for using real data in the classroom has long meant using datasets which are culled, cleaned, and wrangled prior to any student working with the observations. However, teaching statistics should also include actually retrieving data. Nowadays, there are many different sources of data that are continually updated by the organization hosting the data website. The R tools to download such dynamic data have improved enough to make accessing the data possible even in an introductory statistics class. We provide four full analyses on dynamic data as well as an additional six sources of dynamic data that can be brought into the classroom.

Moderators

Pierre Lafaye De Micheaux

Université de Montréal

Speakers

Johanna Hardin

Pomona College

Thursday June 30, 2016 10:48am - 11:06am PDT
Barnes & McDowell & Cranston

Contributed talk, Teaching

11:06am PDT

Using Shiny for Formative Assessments

Shiny has become a popular approach for R developers to create interactive dashboards. Given the rich set of features available in Shiny, it also has the capability for data entry and collection. This talk introduces a framework for using Shiny to conduct formative assessments whereby students both complete an assessment online and receive immediate feedback and scores on their performance. Examples using multiple choice assessments and Likert-type self-report assessments will be provided along with feedback templates using R Markdown for rapid development. The implications of this approach in course development will also be discussed.

Moderators

Pierre Lafaye De Micheaux

Université de Montréal

Speakers

Jason M Bryer

Executive Director, Excelsior College

Principal Investigator for FIPSE First in the World grant; Diagnostic Assessment and Achievement of College Skills (www.DAACS.net). Author of many R packages including likert, TriMatch, PSAboot, and multilevelPSA.

Thursday June 30, 2016 11:06am - 11:24am PDT
Barnes & McDowell & Cranston

Contributed talk, Teaching

11:24am PDT

Revolutionize how you teach and blog: add interactivity

R vignettes, blog posts and teaching materials are typically standard web pages generated with R Markdown. DataCamp has developed a framework to make this static content interactive: R code chunks are converted into an R-session backed editor so readers can experiment. This talk will explain the inner workings of the technology, as well as the tutorial R package that makes the transition to interactive web pages seamless. Some hands-on examples will showcase the remarkable ease with which you can convert R Markdown documents, vignettes and Jekyll-powered blogs into interactive R playgrounds.

Moderators

Pierre Lafaye De Micheaux

Université de Montréal

Speakers

Filip Schouwenaars

DataCamp

Thursday June 30, 2016 11:24am - 11:42am PDT
Barnes & McDowell & Cranston

Contributed talk, Teaching

11:42am PDT

Using Jupyter notebooks with R in the classroom

When teaching statistics to non-programmers, the challenges of programming in R often exceed the challenge presented by new statistics concepts. This presentation will discuss a recent paper comparing methods for teaching programming (Jacobs, Gorman, Rees, and Craig, 2016), including the use of Jupyter notebooks. Jupyter notebooks are run in a server-client Notebook Application that allows editing and running Jupyter notebooks in a web browser. The audience will be able to execute a live Jupyter notebook running R code, demonstrating the most successful approach in their paper.

Moderators

Pierre Lafaye De Micheaux

Université de Montréal

Speakers

Tanya Tickel Schlusser

unaffiliated

Thursday June 30, 2016 11:42am - 12:00pm PDT
Barnes & McDowell & Cranston

Contributed talk, Teaching