-
Notifications
You must be signed in to change notification settings - Fork 9
/
Copy pathChapter9.Rmd
112 lines (70 loc) · 2.98 KB
/
Chapter9.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
# Ch. 9 Inferences Based on Two Samples {-}
Sections covered: 9.1, 9.2, 9.3, 9.4
## 9.1 $z$ Tests and CI's for a Difference Between Two Population Means {-}
(Cases 1 & 2)
Skip: $\beta$ and the Choice of Sample Size (pp. 366-367)
### R {-}
Since you aren't given the original data, R isn't very helpful here, but you can write a function that you could reuse, such as:
```{r}
options(scipen = 999) # get rid of scientific notation
# this function only has to be run once per session, and then you can reuse it.
diffmeans <- function(xbar, ybar, delta0,
sigma1, sigma2,
m, n, type = "twosided") {
Z <- ((xbar - ybar) - delta0)/sqrt(((sigma1^2)/m) + ((sigma2^2)/n))
if (type == "twosided") {
pvalue <- pnorm(-abs(Z))*2
} else if (type == "lowertail") {
pvalue <- pnorm(Z)
} else {
pvalue <- 1 - pnorm(Z)
}
print(c("The p-value is", round(pvalue, 4)))
}
# Example 9.1, p. 365
diffmeans(xbar = 29.8, ybar = 34.7,
delta0 = 0, sigma1 = 4, sigma2 = 5,
m = 20, n = 25, type = "twosided")
# Example 9.4, p. 368
# As long as you use the correct order of parameters, you don't have to write out the names:
diffmeans(2258, 2637, -200, 1519, 1138, 663, 413, "lowertail")
```
## 9.2 The Two-Sample $t$ Test and CI {-}
(Case 3)
Use: https://www.statology.org/welchs-t-test-calculator/ (choose "Enter raw data" or "Enter summary data" as appropriate) to calculate the degrees of freedom.
Skip: Pooled $t$ Procedures (pp. 377-378)
Skip: Type II Error Probabilities (pp. 378-379)
Given two random samples, use `t.test()` with different parameters to carry out a two-sample hypothesis test.
For demonstration purposes, x and y here are samples of 10 numbers that drawn from the standard normal distribution.
```{r}
set.seed(1)
x <- rnorm(10)
set.seed(2)
y <- rnorm(10)
t.test(x, y, var.equal = TRUE, alternative = "less")
```
Thus, H_0 is rejected at alpha = .05.
In `t.test()`, "less" indicates that H_A: delta < 0. Also try using `alternative = "two.sided"` or `alternative = "two.sided"` for a different alternative hypothesis.
## 9.3 Analysis of Paired Data {-}
Skip: Paired Data and Two-Sample $t$ Procedures (pp. 386-387)
Skip: Paired Versus Unpaired Experiments (pp. 387-388)
Two ways of doing paired test:
(Here we continue to use the `x` and `y` in section 9.2 above)
**[Method 1]** Take x-y and do a one-sample test.
```{r}
t.test(x-y, alternative = "less")
```
**[Method 2]** Give another parameter `paired = TRUE`. In R, the default parameter for `paired` in `t.test()` is `FALSE`; here we set it to `TRUE` and leave `x` and `y` as two separate inputs.
```{r}
t.test(x, y, alternative = "less", paired = TRUE)
```
It's clear that they give the same results.
## 9.4 Inferences Concerning a Difference Between Population Proportions {-}
Skip: Type II Error Probabilities and Sample Sizes (pp. 394-395)
### R {-}
A/B Testing question from class:
```{r}
clicks <- c(25, 20)
people <- c(100, 100)
prop.test(clicks, people, correct = FALSE)
```