-
Notifications
You must be signed in to change notification settings - Fork 0
/
mortality_vaccinated_uk.Rmd
105 lines (73 loc) · 4.23 KB
/
mortality_vaccinated_uk.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
---
title: "Compared all-cause mortality of vaccinated vs unvaccinated in England"
output:
html_document:
df_print: paged
html_notebook:
theme: journal
---
```{r include=FALSE}
library(dplyr)
library(readxl)
library(ggplot2)
library(reshape2)
library(formattable)
library(gt)
```
This document analyzes mortality data published by the United Kingdom's Office for National Statistics. It compares the mortality of people, aged between 10 and 59, with different vaccination status.
No conclusion can be drawn from this analysis because the groups with different vaccination statuses have different age structure, therefore different age mortality.
This document has been created using the R statistical programming language. The source code of this document can be found online on [GitHub](https://github.com/gfraiteur/mortality/blob/main/mortality_vaccinated_uk.Rmd). We are including the scripts that collect and transform the data to make it easier for people with knowledge of programming to verify each step of the analysis.
If you want to analyze the data yourself, download it from [this page](https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathsbyvaccinationstatusengland) of the United Kingdom's Office for National Statistics web site.
We are using the following script to load the data:
```{r}
url = "https://www.ons.gov.uk/file?uri=%2fpeoplepopulationandcommunity%2fbirthsdeathsandmarriages%2fdeaths%2fdatasets%2fdeathsbyvaccinationstatusengland%2fdeathsoccurringbetween2januaryand24september2021/datasetfinalcorrected3.xlsx"
filename = "datasetfinalcorrected3.xlsx"
if ( !file.exists(filename)){
download.file(url, filename, mode="wb" )
}
dataset <- read_excel("datasetfinalcorrected3.xlsx",
sheet = "Table 4", col_types = c("skip",
"numeric", "text", "text", "numeric",
"numeric", "numeric", "numeric",
"skip", "numeric", "numeric"), na = ":",
skip = 2)
names(dataset) = c( "week","vaccination_status", "age_group", "deaths_number", "population", "population_total", "death_rate", "death_rate_min", "death_date_max")
dataset = dataset %>% filter( !is.na(dataset$vaccination_status))
```
The column *vaccination status* of this data set is interesting to us. The original data differentiates the population according to the following vaccination statuses:
* Unvaccinated
* Within 21 days of first dose
* 21 days or more after first dose
* Second dose
We can now draw a graph of this data for the 10-59 age group:
```{r warning=FALSE}
dataset %>%
filter( age_group == "10-59") %>%
ggplot(aes(x=week,y=death_rate, color = vaccination_status, shape = vaccination_status)) +
geom_point() +
geom_line() +
scale_x_continuous(name = "Week of 2021", minor_breaks = FALSE) +
scale_y_continuous(name = "Weekly mortality rate") +
ggtitle("Weekly mortality rate for different vaccination statues")
```
As we can see, after the 6th week, the mortality of the vaccinated becomes greater than the one of the unvaccinated.
Let's compare, for both populations, the average mortality of these groups after the 20th week of 2021.
```{r}
mortality_tail =
dataset %>%
filter( age_group == "10-59") %>%
filter( week >= 20 ) %>%
group_by( vaccination_status ) %>%
summarise( death_rate = mean(death_rate) )
mortality_compared = (( mortality_tail %>% filter( vaccination_status == "Second dose"))$death_rate) / (( mortality_tail %>% filter( vaccination_status == "Unvaccinated"))$death_rate)
```
This gives the following results:
```{r echo=FALSE}
mortality_tail %>%
knitr::kable( col.names = c("Vaccination Status", "Death Rate"))
```
We can see that the people having vaccinated with two doses have a `r percent(mortality_compared-1, 0)` higher risk to die than people that have not received any injection against covid19 between weeks 20 and `r max(dataset$week)`.
## Possible interpretations
This data is difficult to interpret because the age group is very wide. The cohorts of vaccinated and unvaccinated people have different age structures and therefore,
naturally, different mortality.
The next step of the analysis would be to unskew the data using the age structure of the two cohorts, based on vaccination data and on normal mortality rate for each group.