forked from James-SR/MITx-14.310x
# Module 1: Introduction to the Course
***
**Module Sections:**
* Welcome to the Course
* Introduction to R
* Introductory Lecture - Data is Beautiful, Insightful, Powerful, Deceitful
* Finger Exercises
* Module 1: Homework
**Module Content:**
* [Module 1 Slides](./files/M1/Lecture_Slides_01.pdf)
* [Homework 1 Background Paper - The Persistent Effects Of Peru’s Mining Mita](./files/M1/M1Paper.pdf)
* [R Instructions](./files/M1/R_Instructions.pdf)
* [Intro to R Zip File](./files/M1/14_310x_Intro_to_R_.zip)
* [Citations Data for Homework 1](./files/M1/CitesforSara.csv)
## Introduction to R
First, we set up the environment and install the course files.
```{r, eval = FALSE}
library(swirl)
install_course_zip("./files/M1/14_310x_Intro_to_R_.zip",multi=FALSE)
swirl()
```
Suppose `z` is a vector of three numbers, e.g.
```{r}
z <- c(1.1, 9, 3.14)
```
We can take the square root of `z - 1` and assign it to a new variable called `my_sqrt`:
```{r}
my_sqrt <- sqrt(z - 1)
```
The result is a vector of length three:
```{r}
my_sqrt
```
Next, we create a new variable called `my_div` that holds the value of `z` divided by `my_sqrt`.
```{r}
my_div <- z / my_sqrt
```
The first element of my_div is equal to the first element of z divided by the first element of my_sqrt, and so on...
```{r}
my_div
```
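As a quick check of the element-by-element behaviour described above, the sketch below rebuilds the same vectors and compares `my_div` with a division carried out one position at a time. The name `manual` is introduced here purely for illustration.

```{r}
z <- c(1.1, 9, 3.14)
my_sqrt <- sqrt(z - 1)
my_div <- z / my_sqrt

# Divide position by position and collect the results by hand
manual <- c(z[1] / my_sqrt[1], z[2] / my_sqrt[2], z[3] / my_sqrt[3])

# TRUE if the vectorised result matches the manual one
all.equal(my_div, manual)
```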
When given two vectors of the same length, R performs the specified arithmetic operation (`+`, `-`, `*`, etc.) element by element. If the vectors are of different lengths, R 'recycles' the shorter vector until it is the same length as the longer vector.
If the length of the longer vector is not a multiple of the length of the shorter one, R still applies the 'recycling' method, but issues a warning.
```{r}
c(1, 2, 3, 4) + c(0, 10, 100)
```
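To contrast the two recycling cases, here is a minimal sketch using base R only. The names `even` and `uneven` are illustrative. The handler simply reports the warning text and then lets the calculation finish.

```{r}
# Lengths 4 and 2: the shorter vector is recycled exactly twice, no warning
even <- c(1, 2, 3, 4) + c(0, 10)
even

# Lengths 4 and 3: recycling still happens, but R signals a warning
uneven <- withCallingHandlers(
  c(1, 2, 3, 4) + c(0, 10, 100),
  warning = function(w) {
    message("Caught warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")  # continue and return the result
  }
)
uneven
```

In the uneven case the fourth element is paired with the first element of the shorter vector again, so the result still has length four.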
## Module 1 Homework
This is a sample of the homework answers. Some questions were observational or required interpretation (of maps, for example), so they are not included here.
```{r}
# install.packages("tidyverse")  # run once if the package is not yet installed
library(tidyverse)
papers <- as_tibble(read_csv("./files/M1/CitesforSara.csv"))
# View(papers)  # opens the data viewer; interactive sessions only

# 17: keep only the journal, year, cites, title and au1 columns
papers_select <- select(papers, journal, year, cites, title, au1)
papers_select
```
Q.19 Let's take a look at some of the most popular papers.
Using the filter() function, how many records have 100 or more citations?
```{r}
# First, let's look at our data
head(papers)
arrange(papers, desc(cites), title)

# Keep papers with at least 100 citations; the printed tibble header shows the row count
papers %>%
  filter(cites >= 100)
```
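If a single number is wanted rather than a printed tibble, one option is to pipe the filtered rows into `nrow()`. The sketch below uses a small made-up tibble, `toy_papers`, as a stand-in for the real citations data. Its values are hypothetical and chosen only to make the result easy to check.

```{r}
library(dplyr)

# Hypothetical toy data standing in for the real `papers` tibble
toy_papers <- tibble(
  journal = c("Econometrica", "Econometrica", "AER", "AER"),
  cites   = c(250, 40, 100, 99),
  au1     = c("Smith", "Jones", "Smith", "Lee")
)

# Count the rows with at least 100 citations
n_popular <- toy_papers %>%
  filter(cites >= 100) %>%
  nrow()
n_popular
```

The same pipeline should work on `papers` itself, assuming it has a numeric `cites` column as used above.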
Q.20 Use the group_by() function to group papers by journal. How many total citations exist for the journal "Econometrica"?
```{r}
papers %>%
  group_by(journal) %>%
  summarise(sum(cites))
# or, without the pipe
summarize(group_by(papers, journal), SumOfCites = sum(cites))
```
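The grouped summary lists every journal. To pull out the total for one journal as a plain number, a possible approach is to filter first and then extract the summed column with `pull()`. This is a sketch on made-up data: `toy_papers` and its values are hypothetical, not taken from the course file.

```{r}
library(dplyr)

# Hypothetical toy data standing in for the real `papers` tibble
toy_papers <- tibble(
  journal = c("Econometrica", "Econometrica", "AER", "AER"),
  cites   = c(250, 40, 100, 99)
)

# Total citations for one journal, returned as a single number
econometrica_total <- toy_papers %>%
  filter(journal == "Econometrica") %>%
  summarise(total = sum(cites)) %>%
  pull(total)
econometrica_total
```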
Q.21 How many distinct primary authors (au1) exist in this dataset?
```{r}
papers %>%
  summarise(n_distinct(au1))
# or
n_distinct(papers$au1)

# 21: select the columns whose names contain "female"
papers_female <- select(papers, contains("female"))
papers_female
```