-
Notifications
You must be signed in to change notification settings - Fork 5
/
index.qmd
69 lines (58 loc) · 7.8 KB
/
index.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
---
title: "BST 260 Introduction to Data Science"
---
## Course Information
* Instructor: [Rafael A.Irizarry](https://rafalab.dfci.harvard.edu/)
* Teaching fellows: Corri Sept, Nikhil Vytla, and Yuan Wang
* Location: Kresge 202A and 202B, Harvard School of Public Health
* Date and time: Mon & Wed 9.45 - 11:15am
* Textbooks: <https://github.com/rafalab/dsbook-part-1>, <https://github.com/rafalab/dsbook-part-2>
* Slack: <https://bst260fall2024.slack.com/>
* Canvas: <https://canvas.harvard.edu/courses/143922/pages/Course%20Information>
* GitHub repo: <https://github.com/datasciencelabs/2024>
* Remember to read the [syllabus](syllabus.qmd), [listen to SD](https://www.youtube.com/embed/aL_fP5axQV4).
## Lectures
Lecture slides, class notes, and problem sets are linked below. New material is added approximately on a weekly basis.
| Dates | Topic | Slides | Reading |
|:-------------------|:---------|:----------|:----------|
| Sep 04 | Productivity Tools| [Intro](slides/00-intro.qmd), [Unix](slides/productivity/01-unix.qmd) | Installing R and RStudio on [Windows](https://teacherscollege.screenstepslive.com/a/1108074-install-r-and-rstudio-for-windows) or [Mac](https://teacherscollege.screenstepslive.com/a/1135059-install-r-and-rstudio-for-mac), [Getting Started](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/getting-started.html), [Unix](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/unix.html)|
| Sep 09, Sep 11| Productivity Tools| [RStudio](slides/productivity/02-rstudio.qmd), [Quarto](slides/productivity/03-quarto.qmd), [Git and GitHub](slides/productivity/04-git.qmd) | [RStudio Projects, Quarto](https://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/reproducible-projects.html), [Git](http://rafalab.dfci.harvard.edu/dsbook-part-1/productivity/git.html) |
| Sep 16, Sep 19 | R | [R basics](slides/R/05-r-basics.qmd), [Vectorization](slides/R/06-vectorization.qmd) | [R Basics](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/R-basics.html), [Vectorization](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/programming-basics.html#sec-vectorization) |
| Sep 23 | R | [Tidyverse](slides/R/07-tidyverse.qmd), [ggplot2](slides/R/08-ggplot2.qmd) | [dplyr](http://rafalab.dfci.harvard.edu/dsbook-part-1/R/tidyverse.html), [ggplot2](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/ggplot2.html) |
| Sep 25 | R | [Tyding data](slides/R/09-tidyr.qmd) | [Reshaping Data](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/reshaping-data.html)
| Sep 30 | Wrangling | [Intro](slides/wrangling/10-intro-to-wrangling.qmd), [Data Importing](slides/wrangling/11-importing-files.qmd), [Dates and Times](slides/wrangling/12-dates-and-times.qmd), [Locales](slides/wrangling/13-locales.qmd), [Data APIs](slides/wrangling/14-data-apis.qmd), [Web scraping](slides/wrangling/15-web-scraping.qmd), [Joining tables](slides/wrangling/16-joining-tables.qmd) | [Importing data](https://rafalab.dfci.harvard.edu/dsbook-part-1/R/importing-data.html), [dates and times](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/dates-and-times.html), [Locales](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/locales.html), [Joining Tables](http://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/joining-tables.html), [Extracting data from the web](https://rafalab.dfci.harvard.edu/dsbook-part-1/wrangling/web-scraping.html)|
Oct 02 | Data visualization | | [Distributions](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/distributions.html), [Dataviz Principles](http://rafalab.dfci.harvard.edu/dsbook-part-1/dataviz/dataviz-principles.html) |
| Oct 07, Oct 09 | Probability | | [Monte Carlo](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/continuous-probability.html#monte-carlo), [Random Variables & CLT](http://rafalab.dfci.harvard.edu/dsbook-part-2/prob/random-variables-sampling-models-clt.html)|
| Oct 16 | Midterm 1 | | Covers material from Sep 04-Oct 11|
| Oct 21, Oct 23 | Inference | | [Parameters & Estimates](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/parameters-estimates.html), [Confidence Intervals](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/confidence-intervals.html) |
| Oct 28, Oct 30 | Statistical Models | | [Data-driven Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/models.html), [Bayesian Statistics](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/bayes.html), [Hierarchical Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/inference/hierarchical-models.html) |
| Nov 04, Nov 06 | Linear models | | [Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/regression.html), [Multivariate Regression](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/multivariate-regression.html) |
| Nov 13 | Linear models | | [Measurement Error Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/measurement-error-models.html), [Treatment Effect Models](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/treatment-effect-models.html), [Association Tests](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-tests.html), [Association Not Causation](http://rafalab.dfci.harvard.edu/dsbook-part-2/linear-models/association-not-causation.html) |
| Nov 18, Nov 20 | High dimensional data| | [Matrices in R](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/matrices-in-R.html), [Applied Linear Algebra](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/linear-algebra.html), [Dimension Reduction](https://rafalab.dfci.harvard.edu/dsbook-part-2/highdim/dimension-reduction.html) |
| Nov 25 | Midterm 2 | | Midterm 2: cover material from Sep 04-Nov 22|
| Dec 02, Dec 04 | Machine Learning | |[Notation and terminology](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/notation-and-terminology.html), [Evaluation Metrics](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/evaluation-metrics.html), [conditional probabilities](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/conditionals.html), [smoothing](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/smoothing.html) |
| Dec 09, Dec 11 | Machine Learning | |[Resampling methods](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/resampling-methods.html), [ML algorithms](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/algorithms.html), [ML in practice](https://rafalab.dfci.harvard.edu/dsbook-part-2/ml/ml-in-practice.html) |
| Dec 16, Dec 18 | Other topics | | |
## Problem sets
| Problem set| Topic | Due Date | Difficulty |
|:-------|:--------------|:---------|:----------|
| [01](psets/pset-01-unix-quarto.qmd) | Unix, Quarto|Sep 11 | easy |
| [02](psets/pset-02-r-vectorization.qmd) | R | Sep 19 |easy |
| [03](psets/pset-03-tidyverse.qmd) | Tidyverse | Sep 27 |medium |
| [04](psets/pset-04-wrangling.qmd) | Wrangling | Oct 4 | medium |
| 05 | Covid 19 data visualization | Oct 11 |medium |
| 06 | Probability | Oct 25 | easy |
| 07 | Predict the election |Nov 04 | hard |
| 08 | Confidence intervals for excess mortality |Nov 15 | hard |
| 09 | Matrices | Nov 22 | easy |
| 10 | Digit reading | Dec 13 | hard |
|Final Project| Your choice | Dec 20 | hard |
## Office hour times
| Meeting | Time | Location |
|---------|----------|------------------------|
| Rafael Irizarry | Mon 8:45-9:45AM | Kresge 203 |
| Corri Sept | Tue | 3:00-4:00PM | Kresge 201 |
| Nikhil Vytla | Wed | 2:00-3:00PM | Kresge 201 |
| Yuan Wang | Fri | 1:00-2:00PM | [Zoom](https://harvard.zoom.us/j/98901120095?pwd=CrhPTA0Nco2rY6ggzYIfq6kQmP4VZn.1) |
## Acknowledgments
We thank [Maria Tackett](https://github.com/matackett) and [Mine Çetinkaya-Rundel](https://github.com/mine-cetinkaya-rundel) for sharing their [web page template](https://github.com/rstudio-conf-2022/teach-ds-course-website/tree/main), which we used in creating this website.