Tidyverse workshop for Northwestern Data Science and Programming Workshops Fall 2019
Instructor: Katie Evans
- Collection of packages for data manipulation, exploration, and visualization that share a common syntax
- Intended to make data scientists more productive by guiding them through workflows
- Packages share data structures and conventions, so the output of one tool can be passed directly to the next
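- dplyr: The dplyr package is the core tidyverse package for data manipulation (filtering rows, selecting columns, creating variables, summarising, and joining tables). One of its greatest advantages is that you can use the pipe operator (%>%), which comes from the magrittr package and is re-exported by dplyr, to chain functions together.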
- tidyr: The tidyr package complements dplyr. It reshapes data into tidy form (one variable per column, one observation per row), which makes the data easier to manipulate and pre-process with dplyr.
- readr: The readr package is used to import rectangular text files (such as CSV and TSV) into R as tibbles and to write data frames back out to files.
- stringr: The stringr package is used for working with strings. It provides a cohesive set of functions for tasks such as detecting, extracting, and replacing patterns, designed to make working with strings as easy as possible.
- ggplot2: The ggplot2 package is widely used to produce charts and visualizations. Plots are built up in layers that map variables in the data to visual properties.
- lubridate: The lubridate package makes it easier to deal with dates and times in R, from converting strings to dates to calculating the hours between two time points.
- purrr: The purrr package provides a toolkit for functional programming in R. We can use the functions provided by purrr, such as map(), to replace many loops with a single line of code.
- forcats: The forcats package is dedicated to dealing with categorical variables (factors), for example reordering or recoding their levels.
- broom: The broom package takes the messy output of built-in modeling functions in R, such as lm(), and turns it into tidy data frames.
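
As a minimal sketch of how dplyr and tidyr work together through the pipe, the example below reshapes a small table with tidyr and then summarises it with dplyr. The `scores` data and its column names are made up for illustration, and `pivot_longer()` assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one row per student, one column per exam
scores <- tibble(
  student = c("Ana", "Ben", "Cal"),
  midterm = c(82, 90, 75),
  final   = c(88, 85, 93)
)

# Each step's output is passed to the next function by %>%
exam_means <- scores %>%
  # tidyr: move the two exam columns into exam/score pairs (long form)
  pivot_longer(cols = c(midterm, final),
               names_to = "exam",
               values_to = "score") %>%
  # dplyr: compute the mean score for each exam
  group_by(exam) %>%
  summarise(mean_score = mean(score)) %>%
  # dplyr: sort the result alphabetically by exam name
  arrange(exam)

print(exam_means)
```

Without the pipe, the same steps would have to be written as nested function calls or with intermediate variables; chaining them keeps the code in the order the operations happen.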