\documentclass[11pt]{article} % use larger type; default would be 10pt
\usepackage{framed}
\usepackage[utf8]{inputenc} % set input encoding (not needed with XeLaTeX)
\usepackage{geometry} % to change the page dimensions
\geometry{a4paper} % or letterpaper (US) or a5paper or....
\usepackage{graphicx} % support the \includegraphics command and options
% \usepackage[parfill]{parskip} % Activate to begin paragraphs with an empty line rather than an indent
%%% PACKAGES
\usepackage{booktabs} % for much better looking tables
\usepackage{array} % for better arrays (eg matrices) in maths
\usepackage{paralist} % very flexible & customisable lists (eg. enumerate/itemize, etc.)
\usepackage{verbatim} % adds environment for commenting out blocks of text & for better verbatim
\usepackage{subfig} % make it possible to include more than one captioned figure/table in a single float
% These packages are all incorporated in the memoir class to one degree or another...
%%% HEADERS & FOOTERS
\usepackage{fancyhdr} % This should be set AFTER setting up the page geometry
\pagestyle{fancy} % options: empty , plain , fancy
\renewcommand{\headrulewidth}{0pt} % customise the layout...
\lhead{}\chead{}\rhead{}
\lfoot{}\cfoot{\thepage}\rfoot{}
%%% SECTION TITLE APPEARANCE
\usepackage{sectsty}
\allsectionsfont{\sffamily\mdseries\upshape} % (See the fntguide.pdf for font help)
% (This matches ConTeXt defaults)
%%% ToC (table of contents) APPEARANCE
\usepackage[nottoc,notlof,notlot]{tocbibind} % Put the bibliography in the ToC
\usepackage[titles,subfigure]{tocloft} % Alter the style of the Table of Contents
\renewcommand{\cftsecfont}{\rmfamily\mdseries\upshape}
\renewcommand{\cftsecpagefont}{\rmfamily\mdseries\upshape} % No bold!
\begin{document}
\tableofcontents
\newpage
%------------------------------------------------------%
\section{Statistics One - Lecture Topics}
\begin{description}
\item [Lecture 1:] Experimental research
\item [Lecture 2:] Correlational research
\item [Lecture 3:] Variables, distributions, and scales
\item [Lecture 4:] Summary statistics
\item [Lecture 5:] Correlation
\item [Lecture 6:] Measurement
\item [Lecture 7:] Introduction to regression
\item [Lecture 8:] Null hypothesis significance testing
\item [Lecture 9:] The central limit theorem
\item [Lecture 10:] Confidence intervals
\item [Lecture 11:] Multiple regression
\item [Lecture 12:] The general linear model
\item [Lecture 13:] Moderation
\item [Lecture 14:] Mediation
\item [Lecture 15:] Student’s t-test
\item [Lecture 16:] Analysis of variance (ANOVA)
\item [Lecture 17:] Factorial ANOVA
\item [Lecture 18:] Repeated measures ANOVA
\item [Lecture 19:] Chi-square tests
\item [Lecture 20:] Binary logistic regression
\item [Lecture 21:] Assumptions revisited
\item [Lecture 22:] Non-parametric statistics
\item [Lecture 23:] Generalized linear model
\item [Lecture 24:] Course summary
\end{description}
%------------------------------------------------------%
\section{Statistics One - Lab Topics}
\begin{description}
\item[ Lab 1:] Introduction to R
\item[ Lab 2:] Histograms and summary statistics
\item[ Lab 3:] Scatterplots and correlations
\item[ Lab 4:] Regression
\item[ Lab 5:] Confidence intervals
\item[ Lab 6:] Multiple regression
\item[ Lab 7:] Moderation and mediation
\item[ Lab 8:] Group comparisons (t-tests, ANOVA, post-hoc tests)
\item[ Lab 9:] Factorial ANOVA
\item[ Lab 10:] Chi-square
\item[ Lab 11:] Non-parametric tests
\item[ Lab 12:] Non-linear regression
\end{description}
%------------------------------------------------------%
\newpage
\section{Statistics One - Week 1}
\subsection{Using Packages}
\begin{itemize}
\item \texttt{install.packages()}
\item \texttt{library()}
\end{itemize}
Suppose we wish to use the \textbf{\emph{caret}} package for some statistical analyses.
As this package is not part of the base \texttt{R} installation, we must install it.
We use the \texttt{install.packages()} command to perform this operation. (N.B. Note the plural ``packages'', as this function can install several packages at once, and note that the package name is given in quotation marks.)
\bigskip
Some options will be presented to the user, in particular, the choice of CRAN mirror.
Simply select the mirror for the country you are in, or the one nearest to you.
\bigskip
Some packages require the latest version of \texttt{R} to be installed. This problem is usually avoided by updating your version of \texttt{R} regularly.
\bigskip
A package may be installed on your PC, but that does not mean that \texttt{R} can use it straight away. The package must first be loaded using the \texttt{library()} command (this time the quotation marks may be omitted).
\begin{framed}
\begin{verbatim}
install.packages("caret")
library(caret)
\end{verbatim}
\end{framed}
\subsection{What Packages Are Available?}
Q3 : What should you type in the R console to check what packages you have installed and loaded on your computer?

To check which packages are currently \textbf{loaded}, use the \texttt{search()} command with no arguments. To list the packages that are \textbf{installed}, whether loaded or not, use \texttt{library()} or \texttt{installed.packages()}, again with no arguments.
\begin{framed}
\begin{verbatim}
library()
installed.packages()
search()
\end{verbatim}
\end{framed}
\begin{verbatim}
> search()
[1] ".GlobalEnv" "package:nlme" "package:stats"
[4] "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads"
[10] "package:base"
\end{verbatim}
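The distinction between installed and loaded packages can also be checked programmatically. The following is a minimal sketch; it uses the \texttt{stats} package only because it ships with every \texttt{R} installation, and the variable names are illustrative.
\begin{framed}
\begin{verbatim}
# Installed: the package exists in a library on disk
is.installed <- "stats" %in% rownames(installed.packages())
# Loaded: the package is attached to the current search path
is.loaded <- "package:stats" %in% search()
is.installed
is.loaded
\end{verbatim}
\end{framed}
Both values should be \texttt{TRUE} in a default session. For a package that has been installed but not yet passed to \texttt{library()}, only the first would be \texttt{TRUE}.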
\subsection{Getting Help}
The command to get help about an object, data set or function is simply \texttt{help()}; the \texttt{?} prefix is a shorthand for the same command.
%Q4 : What should you type to get help about the “data.frame” function?
\begin{framed}
\begin{verbatim}
help(sort)
help(list)
help(iris)
?sort
?list
?iris
\end{verbatim}
\end{framed}
\subsection{Sequences}
\begin{framed}
\begin{verbatim}
1:10
seq(1,10)
10:20
20:10
seq(1,10,length=3)
seq(1,10,by=1.5)
\end{verbatim}
\end{framed}
\begin{verbatim}
> 1:10
[1] 1 2 3 4 5 6 7 8 9 10
> seq(1,10)
[1] 1 2 3 4 5 6 7 8 9 10
>
> 10:20
[1] 10 11 12 13 14 15 16 17 18 19 20
> 20:10
[1] 20 19 18 17 16 15 14 13 12 11 10
>
> seq(1,10,length=3)
[1] 1.0 5.5 10.0
> seq(1,10,by=1.5)
[1] 1.0 2.5 4.0 5.5 7.0 8.5 10.0
\end{verbatim}
\subsection{Data Frames}
Q5 : Create two vectors, the first one named "numbers" including all natural numbers from 1 to 10, and the second one named "words" containing the following series: "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten".
\begin{framed}
\begin{verbatim}
numbers <- 1:10
words <- c("One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten")
\end{verbatim}
\end{framed}
From these two vectors, create a data frame "nw" with each vector as a separate column. What should you type to check the attributes of "nw"?
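
One possible answer is sketched below, assuming the two vectors are defined as in the exercise. The \texttt{stringsAsFactors = FALSE} argument is an addition made here, so that the words are stored as character strings in every version of \texttt{R}.
\begin{framed}
\begin{verbatim}
numbers <- 1:10
words <- c("One", "Two", "Three", "Four", "Five",
           "Six", "Seven", "Eight", "Nine", "Ten")
# Each vector becomes one column of the data frame
nw <- data.frame(numbers, words, stringsAsFactors = FALSE)
# List the attributes: column names, class and row names
attributes(nw)
\end{verbatim}
\end{framed}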
\subsection{Attributes of a Data Frame}
The \texttt{attributes()} function returns an object's attribute list. The following code demonstrates how to obtain the attributes of the \texttt{iris} and \texttt{Titanic} data sets (a data frame and a table, respectively).
\begin{framed}
\begin{verbatim}
attributes(iris)
attributes(Titanic)
\end{verbatim}
\end{framed}
The output for the Titanic data set should look like this.
\begin{verbatim}
> attributes(Titanic)
$dim
[1] 4 2 2 2
$dimnames
$dimnames$Class
[1] "1st" "2nd" "3rd" "Crew"
$dimnames$Sex
[1] "Male" "Female"
$dimnames$Age
[1] "Child" "Adult"
$dimnames$Survived
[1] "No" "Yes"
$class
[1] "table"
\end{verbatim}
\newpage
%-----------------------------------------------------------------%
\subsection{Accessing Cells, Rows and Columns of a Data Frame}
Q6 : What command should you type to get R to return the number “8” from the data frame "nw"?

Q7 : What command should you type to get R to return the word “eight” from the data frame "nw"?
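
A data frame cell can be addressed as \texttt{[row, column]}, or by extracting a column with \texttt{\$} and then indexing it. The sketch below rebuilds \texttt{nw} so that it runs on its own. Note that the word is stored with a capital letter; \texttt{tolower()} converts it to lower case if that form is wanted.
\begin{framed}
\begin{verbatim}
numbers <- 1:10
words <- c("One", "Two", "Three", "Four", "Five",
           "Six", "Seven", "Eight", "Nine", "Ten")
nw <- data.frame(numbers, words, stringsAsFactors = FALSE)

nw[8, 1]               # row 8, column 1: the number 8
nw$numbers[8]          # same cell via the column name
nw[8, 2]               # row 8, column 2: the word "Eight"
nw[8, "words"]         # same cell, column selected by name
tolower(nw$words[8])   # "eight"
\end{verbatim}
\end{framed}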
\subsection{Matrices}
%Question 8
%What should you type to create a matrix “a” comprising all natural numbers from 1 to 10, with 2 rows and 5 columns.
In the following example, we create a matrix \texttt{Mat1} comprising all natural numbers from 1 to 12, structured with 2 rows and 6 columns, and a matrix \texttt{Mat2} containing the same numbers in 3 rows and 4 columns.
\begin{verbatim}
dozen <- 1:12
Mat1 <- matrix(dozen, nrow=2)
Mat2 <- matrix(1:12, ncol=4)
\end{verbatim}
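By default, \texttt{matrix()} fills the matrix column by column. The sketch below checks the dimensions of the two matrices and shows the \texttt{byrow = TRUE} option, which fills row by row instead; \texttt{Mat3} is a name introduced here for illustration.
\begin{framed}
\begin{verbatim}
dozen <- 1:12
Mat1 <- matrix(dozen, nrow = 2)    # 2 rows, 6 columns
Mat2 <- matrix(1:12, ncol = 4)     # 3 rows, 4 columns
dim(Mat1)
dim(Mat2)
Mat1[1, ]    # first row when filled by column: 1 3 5 7 9 11
Mat3 <- matrix(dozen, nrow = 2, byrow = TRUE)
Mat3[1, ]    # first row when filled by row: 1 2 3 4 5 6
\end{verbatim}
\end{framed}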
%------------------------------------%
\newpage
\subsection{Combining Vectors}
\begin{itemize}
\item \texttt{rbind()} combine several vectors by row
\item \texttt{cbind()} combine several vectors by column
\end{itemize}
%Question 9
Create a vector "x" comprising all natural numbers from 1 to 6 and another vector "y" comprising all natural numbers from 5 to 10. What should you type to combine them in a matrix of 2 rows and 6 columns?
%Question 10
Using the same vectors "x" and "y", what should you type to combine them in a matrix of 6 rows and 2 columns?
\begin{framed}
\begin{verbatim}
x <- 1:6
y <- 5:10
cbind(x,y)
rbind(x,y)
\end{verbatim}
\end{framed}
%---------------------------------------------------%
\begin{verbatim}
> cbind(x,y)
x y
[1,] 1 5
[2,] 2 6
[3,] 3 7
[4,] 4 8
[5,] 5 9
[6,] 6 10
>
> rbind(x,y)
[,1] [,2] [,3] [,4] [,5] [,6]
x 1 2 3 4 5 6
y 5 6 7 8 9 10
\end{verbatim}
%-----------------------------------------------%
%-------------------------------------------------------------%
\section{Statistics One - Week 2 Exercise}
\subsection{Inspecting a Data Frame}
%Question 1
How many rows of data are in the data file?
Answer for Question 1
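
The course data file is not reproduced in these notes, so the following sketch uses the built-in \texttt{iris} data set as a stand-in. For the exercise, replace \texttt{iris} with the data frame you have read in.
\begin{framed}
\begin{verbatim}
nrow(iris)    # number of rows: 150
dim(iris)     # rows and columns: 150 5
head(iris)    # first six rows
\end{verbatim}
\end{framed}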
\subsection{Column Names}
%Question 2
What is the name of the dependent variable?
Answer for Question 2
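
Column names are returned by \texttt{names()}. As a sketch, using the built-in \texttt{iris} data set as a stand-in for the course data file:
\begin{framed}
\begin{verbatim}
names(iris)
# "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"
str(iris)     # also shows the type of each column
\end{verbatim}
\end{framed}
Which column is the dependent variable follows from the study design, not from the output of \texttt{R}.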
\subsection{The Summary of a Data Frame}
%Question 3
What is the mean of SR across all subjects?
Answer for Question 3
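
The \texttt{summary()} function reports the minimum, quartiles, mean and maximum of every numeric column, and \texttt{mean()} gives the mean of a single column. A sketch on the built-in \texttt{iris} data set, standing in for the course data:
\begin{framed}
\begin{verbatim}
summary(iris)               # summary of every column
mean(iris$Sepal.Length)     # mean of one column, about 5.84
\end{verbatim}
\end{framed}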
\subsection{Using the psych package}
%Question 4
What is the variance of SR across all subjects?
Answer for Question 4
%Question 5
What is the mean of SR for all subjects at pretest?
Answer for Question 5
%Question 6
What is the standard deviation of SR for all subjects at posttest?
Answer for Question 6
%Question 7
What is the median of SR for all subjects at posttest?
Answer for Question 7
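
The \texttt{describe()} function in the \textbf{\emph{psych}} package reports the mean, standard deviation, median and other statistics in one call. The sketch below uses the built-in \texttt{iris} data set as a stand-in for the course data, with \texttt{Species} playing the role of the pretest/posttest condition. The base \texttt{R} functions are shown alongside, so the example still runs if \textbf{\emph{psych}} is not installed.
\begin{framed}
\begin{verbatim}
x <- iris$Sepal.Length
var(x)                        # variance across all rows

# Restrict to one condition before summarising
setosa <- subset(iris, Species == "setosa")
mean(setosa$Sepal.Length)     # 5.006
sd(setosa$Sepal.Length)
median(setosa$Sepal.Length)   # 5

# The same statistics from psych, if the package is available
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(setosa$Sepal.Length))
}
\end{verbatim}
\end{framed}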
\subsection{Groups}
%Question 8
Which group has the highest mean at posttest?
Answer for Question 8
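
Group means can be compared with \texttt{tapply()}, or with \texttt{describeBy()} from the \textbf{\emph{psych}} package. A sketch on the built-in \texttt{iris} data set, standing in for the course data, where \texttt{Species} is the grouping variable:
\begin{framed}
\begin{verbatim}
# Mean of Sepal.Length within each group
group.means <- tapply(iris$Sepal.Length, iris$Species, mean)
group.means
# Name of the group with the highest mean: "virginica"
names(which.max(group.means))

# Full descriptive statistics per group, if psych is available
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describeBy(iris$Sepal.Length, group = iris$Species))
}
\end{verbatim}
\end{framed}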
\subsection{Testing Normality}
\begin{itemize}
\item QQ-plots
\item Skew and Kurtosis
\end{itemize}
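Both checks can be sketched as follows, on one column of the built-in \texttt{iris} data set as a stand-in for the course data. Points close to the reference line in the QQ-plot, and skew and kurtosis values close to zero, suggest an approximately normal distribution. The \texttt{skew()} and \texttt{kurtosi()} functions come from the \textbf{\emph{psych}} package, so that part is skipped if the package is not installed.
\begin{framed}
\begin{verbatim}
x <- iris$Sepal.Length
qqnorm(x)     # sample quantiles against normal quantiles
qqline(x)     # reference line through the quartiles

if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::skew(x))
  print(psych::kurtosi(x))
}
\end{verbatim}
\end{framed}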
%Question 9
Which one best approximates a normal distribution?
\begin{itemize}
\item WM group at pretest
\item WM group at posttest
\item PE group at pretest
\item PE group at posttest
\item DS group at pretest
\item DS group at posttest
\end{itemize}
%Question 10
Which group showed the biggest gains in SR?
Answer for Question 10
\end{document}