---
title: ':milky_way: Material for an Upcoming Course in EDA'
output:
  github_document:
    toc: true
---
:rocket: Work in progress :construction_worker:
# :notes: Info
The course will be hands-on. We have access to a computer room, but if possible, I **suggest** that you **bring your own laptop**. This way you can be sure that R and RStudio are installed on it, and after the workshop you will be ready to start your own data explorations.
## :hammer: Tools
:floppy_disk: You can install R and RStudio on your laptop.
- :link: [R - CRAN](https://cran.r-project.org/)
- :link: [RStudio](https://rstudio.com/products/rstudio/download/#download)
Afterwards, you can install the [Tidyverse :milky_way:](https://www.tidyverse.org/), which collects most of the packages that we will use for our explorations. To install it, open RStudio and type in your R console:
```{r, eval = FALSE}
install.packages("tidyverse")
```
If you get any :x: error message, we will fix it together :sparkler:.
Alternatively, [RStudio :cloud: Cloud](https://rstudio.cloud/) lets you run RStudio in the cloud, with nothing to install on your laptop.
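Whether you work on your laptop or in the cloud, a quick check like the sketch below can tell you if the Tidyverse is available. It uses only base R's `requireNamespace()`; the variable name `tidyverse_ok` is just an illustrative choice.

```{r, eval = FALSE}
# TRUE if the tidyverse package is installed and can be loaded, FALSE otherwise
tidyverse_ok <- requireNamespace("tidyverse", quietly = TRUE)

if (tidyverse_ok) {
  message("tidyverse is installed, you are ready to go!")
} else {
  message("tidyverse not found, try install.packages(\"tidyverse\") again.")
}
```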
# :snowboarder: Slides
1. :link: [Introduction](https://othomantegazza.github.io/eda-class/slides/00-intro.html#1)
My contact details and not much else...
1. :link: [Meet R](https://othomantegazza.github.io/eda-class/slides/01-meet-r.html#1)
What is an object in R? What is a variable? Why do we need functions?
1. :link: [Load and Manipulate Data - *Tidyverse, part 1*](https://othomantegazza.github.io/eda-class/slides/02-intro-to-tidyverse.html#1)
A quick introduction to the tidyverse, including how to manipulate data with [dplyr](https://dplyr.tidyverse.org/articles/dplyr.html) and how to [pipe](https://magrittr.tidyverse.org/) many steps of your analysis.
1. :link: [Visualize Data - *Tidyverse, part 2*](https://othomantegazza.github.io/eda-class/slides/03-intro-to-the-tidyverse.html#1)
Build a graphical representation of your data with ggplot2.
1. :link: [Clean Data - *Tidyverse, part 3*](https://othomantegazza.github.io/eda-class/slides/04-intro-to-tidyverse.html#1)
Most of the time you'll need to clean and reshape your data with [Tidyr](https://tidyr.tidyverse.org/) and [Janitor](https://sfirke.github.io/janitor/).
1. :link: [More practice - *Tidyverse, part 4*](https://othomantegazza.github.io/eda-class/slides/05-intro-to-the-tidyverse.html#1)
Practice more Exploratory Data Analysis with Open Data from the City of Milan.
1. :link: [Your Turn!](https://othomantegazza.github.io/eda-class/slides/06-your-turn.html)
Pick a dataset and explore it!
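
As a small taste of the topics in the slides above (objects, functions, dplyr and the pipe), here is a minimal sketch. It assumes that dplyr is installed (it comes with the Tidyverse). The built-in `mtcars` dataset and the helper `mpg_to_kml()` are my own choices for illustration and are not taken from the slides.

```{r, eval = FALSE}
library(dplyr)

# A function is a reusable recipe: this one converts miles per gallon
# to kilometres per litre (hypothetical helper, for illustration only)
mpg_to_kml <- function(mpg) {
  mpg * 1.609344 / 3.785411784
}

# A variable is a name bound to an object: `cars_summary` holds a data frame
cars_summary <- mtcars %>%
  mutate(kml = mpg_to_kml(mpg)) %>%          # add a new column
  filter(hp > 100) %>%                       # keep rows matching a condition
  group_by(cyl) %>%                          # group by number of cylinders
  summarise(n = n(), mean_kml = mean(kml)) %>%
  arrange(desc(mean_kml))                    # most efficient group first

cars_summary
```

Each step takes the result of the previous one through the pipe `%>%`, so the analysis reads top to bottom like a recipe.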
# :books: Resources
The R community is active online, and committed to creating a friendly and welcoming environment for everybody, newcomers included.
This includes writing outstanding :book: open access material that you can use to learn R :whale:.
## :rice: R Building Blocks
- :link: [R programming for Data Science - Roger D. Peng](https://bookdown.org/rdpeng/rprogdatascience/) - :tiger: Jump start your R!
- :link: [Advanced R - Hadley Wickham](https://adv-r.hadley.nz/) - :elephant: Everything you wish to know about R.
## :milky_way: R for Data Science
:saxophone: Remember to read the articles on the packages' website!! :saxophone:
- :link: [R for Data Science - Grolemund, Wickham](https://r4ds.had.co.nz/) - :bird: An overview of most data science topics, with great tips.
- :link: [Introduction to Statistical Learning in R - Gareth James et al.](https://faculty.marshall.usc.edu/gareth-james/ISL/) - :dog: Kick start your statistical models.
Check the [:books: bookdown](https://bookdown.org/) repository for more books on data science, including [:earth_africa: geocomputation](https://geocompr.robinlovelace.net/), [:tophat: forecasting](https://otexts.com/fpp2/) and [:pick: text mining](https://www.tidytextmining.com/)!
## :art: Visualization in R
- :link: [Data Visualization - Kieran Healy](https://socviz.co) - :tropical_fish: Communication oriented data visualization in R.
- :link: [R Graphics Cookbook - Winston Chang](https://r-graphics.org/) - :octopus: Practical introduction to visualization with ggplot2.
Also, check the [Viz chapters in "R for Data Science"](https://r4ds.had.co.nz/data-visualisation.html) (see above) :point_up:.
## :blossom: Life Science
- :link: [HarvardX Biomedical Data Science Open Online Training - Love, Irizarry](https://rafalab.github.io/pages/harvardx.html) - :snail: Full course on R for life science.
- :link: It goes together with [this book](https://rafalab.github.io/dsbook/).
## :hibiscus: Extra
Did I mention that the R community is great? Online you can find wonderful learning material.
### Gina Reynolds' Flipbooks
by [@EvaMaeRey](https://twitter.com/EvaMaeRey)
- :link: [GGplot flipbook](https://evamaerey.github.io/ggplot_flipbook/ggplot_flipbook_xaringan.html#1).
- :link: [Tidyverse in Action](https://evamaerey.github.io/tidyverse_in_action/tidyverse_in_action.html#1)
- :link: [Interactive Maps](https://evamaerey.github.io/little_flipbooks_library/leaflet/leaflet#1)
...and [Others](https://github.com/EvaMaeRey/little_flipbooks_library)
### Dataviz and R blogs
- :link: [Alison Hill - Data Scientist & Professional Educator](https://alison.rbind.io/)
### Data Art and Great Unconventional Viz
- :link: [Fronkonstin - Experiments in R](https://fronkonstin.com/), by [@aschinchon](https://twitter.com/aschinchon).
- :link: [Data Imaginist](https://www.data-imaginist.com/), by [@thomasp85](https://twitter.com/thomasp85).
- :link: [Chi's Impe[r]fect Blog](https://chichacha.netlify.com/), by [@chisatini](https://twitter.com/chisatini)
Also check out the work of [Cédric Scherer](https://twitter.com/CedScherer), [Sil Aarts](https://silaarts.netlify.com/post/config-file/), [Jake Kaupp](https://twitter.com/jakekaupp) and [many other TidyTuesday participants](https://nsgrantham.shinyapps.io/tidytuesdayrocks/), which you can browse with Neal Grantham's app.
*This list is far from complete; suggestions are welcome!* :raised_hands:
# :violin: Practice
- :link: [Tidy Tuesday](https://github.com/rfordatascience/tidytuesday) - :fish_cake: Best community, weekly social data exercises in R (see also the [R4DS learning community](https://www.jessemaegan.com/post/r4ds-the-next-iteration/)).
- :link: [Kaggle](https://www.kaggle.com/) - :shaved_ice: Advanced Data Science and Machine Learning community.
- :link: [Data is Beautiful - Reddit](https://www.reddit.com/r/dataisbeautiful/) :oden: - Monthly data visualization competitions.
# :raised_hands: Acknowledgements
I would like to thank the [University of Milano](https://www.unimi.it/it) and the [PhD School in Molecular and Cell Biology](http://eng.dbs.unimi.it/ecm/home/teaching/doctoral-schools/molecular-and-cellular-biology) for financing and hosting this workshop. Thanks to [Accurat](https://www.accurat.it/) for the great support.
:mortar_board: Best!