-
Notifications
You must be signed in to change notification settings - Fork 5
/
Part2-Analysis.R
203 lines (156 loc) · 7.23 KB
/
Part2-Analysis.R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
#
# Let's Practice!
# ===============
#
# In this section we will cover:
#
# 1. Working directories
# 2. Reading in data
# 3. Finding help
# 4. Basic data manipulation (renaming, subsetting, rotating)
#
#
# Part I: Working Directories
# ---------------------------
#
# A working directory is the folder on your computer where you are currently
# working. You can find out your current working directory by typing `getwd()`
getwd()
# If you've downloaded and un-zipped this directory to your desktop, you might
# see something like `/Users/<yourname>/Desktop/IntroR`. This is the default
# place where R will begin reading and writing files. For example, you can use
# the function `list.files()` to view the files in your current working
# directory. These are the same files that we downloaded earlier. If you're
# using Rstudio, you can compare the file list with the "Files" tab in the
# bottom right panel.
#
# In order to use `list.files()`, we should provide it with a file path. We can
# provide it a ".", which means "here" to most computer programs.
list.files(".")
# You can see that the first entry in here is "data". This is where we've placed
# the fungicide example data.
list.files("data")
# Part II: Reading in Data
# ------------------------
#
#
# We can use the `read.table()` function to read these data in to R. It's
# important to remember that while in R, these data are simply a copy kept *in
# memory*, not on the disk, so we don't have to worry too much about
# accidentally deleting the data :).
#
# So, how do we actually USE the `read.table()` function? A good first step to
# figuring out how you can use a function is to look at it's help page. The way
# you can do that is by typing either help("function_name") or ?function_name.
stop("
Type ?read.table and answer these three questions:
1. What does it do? (Description)
2. What are the first three arguments and their defaults? (Usage/Arguments)
3. What does it return? (Value)
")
# In order to read our data into R, we will need to provide three things:
#
# 1. The path to the data set : data/FungicideExample.csv
# 2. If the first row are column names : yes
# 3. The separator for each cell in the data : comma
#
# Now that we have these elements, we can read the data into a variable, which
# we can call "fungicide". Once we do this, we can check the dimensions to make
# sure that we have all of the data.
fungicide <- read.csv("data/FungicideExample.csv", header = TRUE, sep = ",")
nrow(fungicide)
ncol(fungicide)
# We can print the data to screen by simply typing its name
fungicide
# We can also use the `str()` function (short for "structure") to have a broad
# overview of what our data looks like. This is useful for data frames with many
# columns.
str(fungicide)
# Part III: Finding the right tool for the job
# --------------------------------------------
#
# The data presented here are from
# http://www.apsnet.org/edcenter/advanced/topics/EcologyAndEpidemiologyInR/DiseaseProgress/Pages/StripeRust.aspx
# and consist of three cultivars of wheat treated with different fungicides
# Jagger wheat is a succeptible variety whereas Cutter and 2137 are both
# resistant varieties. It is assumed that the fungicide will only be able to
# prevent new infections for two weeks after applicaton. With these data, we
# want to answer the following questions:
#
# 1. How does the size of the fungicide effect compare to the effect of
# cultivar?
# 2. Would it be the same as cultivar?
#
# To answer these questions, we will use the summary statistic, Area Under the
# Disease Progress Curve (AUDPC).
#
# So, how do you calculate this? You could code the trapezoid rule yourself, OR
# you could find a package that was designed for this.
stop("
Do an internet search for AUDPC in R. What did you find?
")
# The first thing that likely popped up was the function `audpc()` in the
# *agricolae* package. If we want to use it, we can download the package to our
# computer with the function `install.packages()`. This will install a package
# from CRAN and place it into your R *Library*. Where is your R library? Type
# `.libPaths()` to find out!
#+ eval = FALSE
.libPaths()
install.packages("agricolae", repos = "https://cran.rstudio.com")
# Now that we have the *agricolae* package installed, we can load into our R
# workspace using the `library()` function, which looks in our library and loads
# the functions and data sets within that package. You can find out about what
# functions are avialable with help(package = "agricolae").
#
# To accomplish our task, however, we need to find out how to use the `audpc()`
# function. First we should load the agricolae package
library("agricolae")
# Now we can look up help for `audpc()`
#+ eval = FALSE
?audpc
# This function will take in a data frame of severity and a vector of dates.
# Since our vector of dates is in the first column, we will need to manipulate
# the data frame by subsetting.
jdate <- fungicide$Julian.Date # this is also the first column
jdate
fungicide.audpc <- audpc(evaluation = fungicide[, -1], dates = jdate, type = "relative")
# Well this doesn't look good. What can this error message mean? Let's take a
# look at the audpc help page one more time; this time, we'll look at the
# Examples section. Try to copy and paste the code one line at a time and see
# what happens. Check the evaluaton data relative to the dates and see how it's
# relevant.
stop("
Stop and look at the examples from ?audpc.
")
# We can see from example 3, that the data must be arranged where dates are in
# separate columns. We need to transpose our data. To do this, we can use the
# `t()` function:
t(fungicide[, -1])
# Now we can plug this into the `audpc()` function:
fungicide.audpc <- audpc(evaluation = t(fungicide[, -1]), dates = jdate, type = "relative")
fungicide.audpc
# This gives us a vector with the AUDPC values per treatment. The vector format
# does not make it particularly easy to compare the effect of cultivar or
# treatment. A better format to store these data would be in a table. Since all
# the values are numeric, we can place them into a matrix using the `matrix()`
# function. We are going to put the samples in rows and experiments in columns.
fungicide.res <- matrix(fungicide.audpc, nrow = 3, ncol = 2, byrow = TRUE)
fungicide.res
# We have our result, but there are no labels for the rows and columns. Notice
# how we selected `byrow = TRUE`. This means that the matrix was filled row by
# row. So the first two elements of the vector would go in the first row, the
# next two would go in the second, and so on and so forth.
names(fungicide.audpc)[1:2]
names(fungicide.audpc)[3:4]
names(fungicide.audpc)[5:6]
# Since this is a small matrix, we can easily name these rows and columns
# ourselves using the `rownames()` and `colnames()` functions.
rownames(fungicide.res) <- c("2137", "Cutter", "Jagger")
colnames(fungicide.res) <- c("Control", "Treated")
fungicide.res
# If we wanted to use these data as a table in a paper, we should export it to a
# csv file. We do this using the function `write.table()`
dir.create("results")
write.table(fungicide.res, file = "results/audpc.csv", sep = ",",
col.names = NA,
row.names = TRUE)