SURV675

Modern Workflows in Data Science

Large data, a fast pace of production, and collaboration are hallmarks of the new data environment. In this context, researchers must understand data workflows and adopt consistent, reproducible practices in order to collaborate and reliably produce insights. This course covers several of these essential topics. We will discuss the main types of workflows in data and survey sciences and how tools such as GitHub can enhance collaboration and ensure reproducibility. We will also discuss reproducible documents such as R Markdown and Jupyter Notebooks before covering how to work with distributed data using Spark. We will finish the course by discussing dashboards and how to develop them using R Shiny.

Spring 2024

0 reviews
Average rating: N/A

Spring 2023

0 reviews
Average rating: N/A
