MOOC Development: It Takes a Village

Chromebook Data Science

We’re finally able to announce the official launch of our newest MOOC, Chromebook Data Science, a set of 12 courses offered on the Leanpub platform. Jeff Leek has explained this program in detail in a separate blog post, but briefly, these MOOCs are our attempt to minimize barriers to entry into data science. The courses are pay what you want, so the entire course set can be taken at no cost. All the learning happens through a web browser, so any laptop or Chromebook can be used to complete the material. And the content has been developed so that no background knowledge in computing is required.

The point of this blog post, however, is to thank and note all of the people outside of our group whose work helped make the development of this content possible.

Thank You

In addition to content developed by members of our group, we have built upon the work of others to generate the content in these courses. As we worked, I did my best to keep an exhaustive list of everyone whose work we leaned on. This post is my humble attempt to thank all these people.

Big Thanks

It probably goes without saying that much of the content generated has been either directly influenced or indirectly inspired by the work of Hadley Wickham and Jenny Bryan. Specifically, however, we would be remiss not to thank Hadley both for writing R for Data Science and for his contributions to the tidyverse packages. Additionally, we rely heavily on Jenny Bryan’s instructional approach to teaching version control and on the googlesheets package throughout the Course Set.

Beyond this, I think the best way to give individual thanks would be by course in our MOOC. This way, each of you knows where your work has been used and can most easily see how we’ve used and attributed your work.

Data Tidying

In the Data Tidying course, learners are taught within the tidy data framework. These concepts would not be easily accessible and programmatically relatable to new learners without Hadley’s (and others’!) contributions to the tidyverse set of packages.

Additionally, we used examples of data tidying from Miles McBain and Sharla Gelfand in this course to demonstrate what untidy data are and what they look like once they have been tidied. Thank you to Sharla and Miles for their wonderful blog posts demonstrating data tidying:

Lastly, in this course we relied heavily on Suzan Baert’s four amazing dplyr tutorials. I’ve attributed her work throughout the lessons and have linked to her blog posts in our courses. If you haven’t looked through them yet, I highly recommend it: Part 1, Part 2, Part 3, Part 4

Data Visualization

Data Visualization is taught in this course set using ggplot2 exclusively, so more thanks to Hadley for his work and to all contributors to the ggplot2 package!

Additionally, we used a graph from a blog post by Lisa Charlotte Rost to demonstrate how to take a plot from exploratory and unpolished to polished and ready for publication. If you’re unfamiliar with Lisa Charlotte Rost’s work in data visualization (spoiler: she’s amazing!), check out Datawrapper and its blog.

Getting Data

In the Getting Data course, we owe thanks to:

Data Analysis

In the Data Analysis course, we relied heavily on David Robinson’s blog post, Text analysis of Trump’s tweets confirms he writes only the (angrier) Android half, as a wonderful example of how to formulate a data science question and determine if you have the data you need. We are also grateful for his contributions to the tidytext package.

The lessons in this course also benefited from:

Written & Oral Communication in Data Science

In Written and Oral Communication in Data Science, we utilized the work of others as examples of how to communicate effectively as a data scientist:

Getting a Job in Data Science

In Getting a Job in Data Science we’re thankful for contributions from: