---
title: "Affective Uplift During Video Game Play: A Naturalistic Case Study"
shorttitle: Affective Uplift During Play
leftheader: Affective Uplift During Play
author:
  - name: Matti Vuorre
    corresponding: yes
    affiliation: "1,2"
    address: Tilburg University
    email: mjvuorre@uvt.nl
  - name: Nick Ballou
    affiliation: "2"
  - name: Thomas Hakman
    affiliation: "2"
  - name: Kristoffer Magnusson
    affiliation: "2,3"
  - name: Andrew K. Przybylski
    affiliation: "2"
    address: Oxford Internet Institute
    email: andy.przybylski@oii.ox.ac.uk
affiliation:
  - id: 1
    institution: Tilburg School of Social and Behavioral Sciences, Tilburg University
  - id: 2
    institution: Oxford Internet Institute, University of Oxford
  - id: 3
    institution: Centre for Psychiatry Research, Department of Clinical Neuroscience, Karolinska Institutet, & Stockholm Health Care Services, Region Stockholm, Sweden
authornote: |
  \noindent \textbf{This pre-print is not yet peer-reviewed.}
wordcount: "4,802"
bibliography: references.bib
keywords: "video games, well-being, mood, play behavior, telemetry"
csl: apa.csl
floatsintext: yes
linenumbers: no
draft: no
mask: no
figurelist: no
tablelist: no
footnotelist: no
documentclass: apa7
classoption: jou
output:
  papaja::apa6_pdf:
    number_sections: false
    keep_tex: true
  papaja::apa6_docx:
    number_sections: false
editor_options:
  chunk_output_type: console
---
```{r}
#| label: prepare
#| cache: false
library(papaja)
library(scales)
library(cmdstanr)
library(splines)
library(emmeans)
library(posterior)
library(lme4)
library(brms)
library(ggdist)
library(patchwork)
library(tidyverse)
source("R/functions.R")
source("R/common.R")
# Output and temporary files
dir.create("models", FALSE)
dir.create("data", FALSE)
# Document options
knitr::opts_chunk$set(
  cache = TRUE,
  warning = FALSE,
  message = FALSE,
  fig.width = 8,
  fig.asp = 0.618
)
```
How do video games affect players' well-being? Games are often studied for their potential in catalyzing psychological change over timescales spanning from weeks to months [e.g., effects on school performance, depression, or life satisfaction, @sauterSocialContextGaming2021; @vuorreTimeSpentPlaying2022], and the surrounding public debate has typically focused on play's far-reaching consequences on players' mental health, social attitudes, or cognitive development [@fergusonDoesSexualizationVideo2022; @hilgardOverestimationActionGameTraining2019; @mathurFindingCommonGround2019]. In stark contrast, typical play appears to be motivated by short-term goals, such as wanting to unwind after a long day, escape to a pleasant non-reality in the moment, or engage in uplifting social interaction over periods of hours [@bourgonjonPlayersPerspectivesPositive2016; @kahnTrojanPlayerTypology2015; @stensengAreThereTwo2021]. Such short-term dynamics between play and affect can exist but need not necessarily accumulate into long-term impacts. For example, games might provide relief, relaxation, and brief improvements in mood over several hours [@riegerEatingGhostsUnderlying2015; @russonielloEffectivenessCasualVideo2009; @tyackRestorativePlayVideogames2020], after which the effects taper out as individuals return to their baseline moods.
Understanding whether and when games' short-term effects emerge is critical for establishing games' potential for mood-related interventions, as well as for building a theoretical foundation for repeated short-term gaming experiences' long-term effects on mental health. Substantial existing evidence suggests that games can provide short-term boosts to well-being [@bowmanMoodGameSelective2015; @tyackRestorativePlayVideogames2020], possibly to a greater extent than non-interactive media such as videos [@riegerEatingGhostsUnderlying2015]. Much of that work took place under the "mood repair" and "mood management" labels [@zillmannMoodManagementCommunication1988], which describe how media might support users in balancing internal states following unpleasant feelings, possibly through addressing basic psychological needs [@BallouDeterding2023Basic; @ReineckeEtAl2012characterizing; @riegerEatingGhostsUnderlying2015; @TamboriniEtAl2011Media; @Tyack2019Need]. On the other hand, games might also affect players negatively: Frustrating gaming experiences, for example, can lead to negative consequences such as immediate post-play aggression [@przybylskiCompetenceimpedingElectronicGames2014].
At present, however, the prevalence and magnitude of these short-term effects remain poorly understood. Despite the above examples, the validity and generalizability of research on games' short-term affective effects remain limited by three challenges. First, a substantial portion of gameplay research has relied on artificial stimuli: games created or substantially modified by academic researchers [@bowmanMoodGameSelective2015]. While such customized games allow for greater experimental control, they are unlikely to reflect actual games' rich complexity [@mcmahanConsiderationsUseCommercial2011]. This limited ecological validity and generalizability of research stimuli (games) constrains current inferences about popularly played games' psychological effects.
The second challenge is providing an ecologically valid *context* for play. Research participants typically play games in (online or physical) labs that do not resemble the natural contexts of play, such as when, with whom, and why people choose to play [@tyackRestorativePlayVideogames2020]. In lab settings, research participants play to satisfy study requirements rather than out of the intrinsic motivations that typically lead them to play. While beneficial for clarifying causal inferences, the extrinsically motivated play behaviors necessary in lab studies might relate differently to well-being than intrinsically motivated, naturally occurring play does [@bruhlmannMotivationalProfilingLeague2020; @howardStudentMotivationAssociated2021]. Therefore, results from such studies are less likely to generalize accurately to how games are played in the real world.
The third challenge concerns the timescale of effects: How quickly do potential effects emerge, and how long are they sustained? For example, some studies indicate that by the end of a half-hour game session, players may exhibit changes in stress [@russonielloEffectivenessCasualVideo2009], aggressive affect [@przybylskiCompetenceimpedingElectronicGames2014], and vitality [@tyackRestorativePlayVideogames2020]. When and how video games' effects evolve during that initial half-hour remains unclear and difficult to study, because researchers are typically unable to ask questions at a sufficient temporal resolution. Notable exceptions are @boweyPredictingBeliefsNPC2021 and @frommelGatheringSelfReportData2021, who used non-player characters to ask questions directly within the game; however, they did not enquire about well-being, leaving it unclear when and how the affective dimensions of play change on short timescales.
Here, we aim to address these three challenges to better understand how real play in natural contexts might predict mood on short timescales. Specifically, we examined an intensive longitudinal dataset from the popular commercially available game PowerWash Simulator [PWS, @vuorreIntensiveLongitudinalDataset2023], which includes mood questions embedded in the game itself, to ask three questions: First, to what extent does mood change from immediately before video game play to during play? Second, how heterogeneous are these changes in the population of similar players? And third, how do changes in mood develop over the course of a gaming session?
## Methods
<!-- Data wrangling code -->
```{r}
#| label: data-get
# Download data from OSF PWS database
PWS_DATA_PATH <- Sys.getenv("PWS_DATA_PATH", unset = "data-raw/data.zip")
if (!file.exists(PWS_DATA_PATH)) {
  dir.create(
    dirname(PWS_DATA_PATH),
    showWarnings = FALSE,
    recursive = TRUE
  )
  download.file(
    url = "https://osf.io/download/j48qf/",
    destfile = PWS_DATA_PATH
  )
}
# Unzip required data tables
if (!file.exists("data/demographics.csv")) {
  dir.create("data", FALSE)
  unzip(
    zipfile = PWS_DATA_PATH,
    files = c(
      "data/demographics.csv",
      "data/study_prompt_answered.csv"
    )
  )
}
# Load data to R session
dat <- read_csv(
  "data/study_prompt_answered.csv",
  col_select = c(
    pid,
    time = Time_utc,
    duration = CurrentSessionLength,
    prompt = LastStudyPromptType,
    mood = response
  )
)
# Convert types and units
dat <- dat |>
  mutate(
    pid = factor(pid),
    # Mood to 0-1 scale
    mood = mood / 1000,
    # hours indicates session duration in hours
    hours = duration / 60,
    .keep = "unused"
  )
# Create session indicators (a new session starts when the running
# session duration resets to a smaller value)
dat <- dat |>
  arrange(pid, time) |>
  mutate(
    new_session = hours < lag(hours, default = 999),
    session = cumsum(new_session),
    .by = pid
  ) |>
  select(-c(new_session)) |>
  mutate(ps = paste(pid, session, sep = "_"))
dat_sum <- dat |>
  summarize(
    when = "raw",
    n_pid = length(unique(pid)),
    n_session = length(unique(ps)),
    n_obs = n()
  )
# Keep only wellbeing (mood) prompt responses
dat <- dat |>
  filter(prompt == "Wellbeing") |>
  select(-prompt)
# Calculate basic data summaries
dat_sum <- dat |>
  summarize(
    when = "mood",
    n_pid = length(unique(pid)),
    n_session = length(unique(ps)),
    n_obs = n()
  ) |>
  bind_rows(dat_sum)
```
```{r}
#| label: fig-session-durations
#| include: false
#| fig.asp: 0.5
#| fig.cap: Summaries of session durations. X-axes are truncated at 10 hours.
p_durations <- dat |>
  filter(hours == max(hours), .by = ps) |>
  ggplot(aes(hours)) +
  scale_x_continuous(
    "Session duration",
    expand = expansion(c(0.05, 0.05))
  ) +
  scale_y_continuous(
    "Sessions",
    expand = expansion(c(0, 0.05))
  ) +
  coord_cartesian(xlim = c(0, 10)) +
  geom_histogram(binwidth = 0.33)
p_mean_durations <- dat |>
  filter(hours == max(hours), .by = ps) |>
  summarise(hours = mean(hours), n = n(), .by = pid) |>
  ggplot(aes(hours)) +
  scale_x_continuous(
    "Mean session duration",
    expand = expansion(c(0.05, 0.05))
  ) +
  scale_y_continuous(
    "Players",
    expand = expansion(c(0, 0.05))
  ) +
  coord_cartesian(xlim = c(0, 10)) +
  geom_histogram(binwidth = 0.33)
p_ecdf_durations <- dat |>
  filter(hours == max(hours), .by = c(session, pid)) |>
  ggplot(aes(hours)) +
  scale_x_continuous(
    "Session duration",
    expand = expansion(c(0.05, 0.05))
  ) +
  scale_y_continuous(
    "Cumulative proportion",
    expand = expansion(0.01),
    breaks = (0:10) / 10
  ) +
  stat_ecdf() +
  coord_cartesian(xlim = c(0, 10))
p_durations | p_mean_durations | p_ecdf_durations
# Exclude sessions longer than 5 hours
dat <- dat |>
  filter(hours <= 5)
dat_sum <- dat |>
  summarize(
    when = "5h",
    n_pid = length(unique(pid)),
    n_session = length(unique(ps)),
    n_obs = n()
  ) |>
  bind_rows(dat_sum)
```
```{r}
#| label: wrangle-more
#| include: false
# Drop rows with missing mood responses
dat <- dat |>
  drop_na(mood) |>
  mutate(pid = fct_drop(pid))
# Create indicators for analyses:
# post = 1 for all but the first response within each session
dat <- dat |>
  mutate(
    post = factor(
      row_number() > 1,
      levels = c(FALSE, TRUE),
      labels = c("0", "1")
    ),
    .by = c(pid, session)
  ) |>
  # Censoring indicator for responses at the scale endpoints
  mutate(
    cl = case_when(
      mood == 0 ~ "left",
      mood == 1 ~ "right",
      TRUE ~ "none"
    )
  )
dat_sum <- dat |>
  summarize(
    when = "non-na",
    n_pid = length(unique(pid)),
    n_session = length(unique(ps)),
    n_obs = n()
  ) |>
  bind_rows(dat_sum)
# How many sessions had pre-mood & post-mood
dat |>
  mutate(
    has_pre = any(hours == 0),
    has_post = any(hours != 0),
    has_both = has_pre & has_post,
    .by = ps
  ) |>
  distinct(ps, has_pre, has_post, has_both) |>
  summarise(
    n_pre = number2(sum(has_pre)),
    n_post = number2(sum(has_post)),
    n_both = number2(sum(has_both)),
    p_pre = percent2(mean(has_pre)),
    p_post = percent2(mean(has_post)),
    p_both = percent2(mean(has_both))
  )
# Use contrast coding with unit difference
contrasts(dat$post) <- c(-0.5, 0.5)
# Analyze a subset of participants if required
# (for testing etc.; be careful)
N_SUBSET_PROPORTION <- as.numeric(Sys.getenv("N_SUBSET_PROPORTION", unset = 1))
dat <- dat |>
  filter(
    pid %in%
      sample(
        unique(dat$pid),
        size = length(unique(dat$pid)) * N_SUBSET_PROPORTION
      )
  )
# Write data for any supplementary notebooks
write_rds(dat, "data/data.rds")
```
<!-- End code -->
In this study, we analyzed data from a large open dataset on PowerWash Simulator (PWS) play and psychological experiences [@vuorreIntensiveLongitudinalDataset2023]. The data were collected in a research edition of PWS that recorded gameplay events, game status records, participant demographics, and responses to psychological survey items. We developed the research edition of PWS in collaboration with PWS's developer, FuturLab, who made it freely available on Steam to anyone who owned the original game (£19.99 on 2023-09-20). From the players' perspective, the research edition was nearly identical to the main game, with the addition of in-game pop-ups that inquired about psychological states during play.
### PowerWash Simulator
PWS is a first-person simulation game developed by FuturLab. In the game, players run a small power washing business and take jobs from a variety of clients in different locations, which serve as the game's levels. The core mechanic of PWS is aiming and using a pressure washer to remove dirt from various objects and levels, ranging from Ferris wheels to skateparks. Progression happens sequentially through a career mode in which the player earns credits for cleaning objects and completing cleaning jobs. These credits can be used to upgrade the pressure washer to increase its range and effectiveness, as well as to purchase cosmetic modifications for the washer or avatar. The game offers a multiplayer mode, which was disabled in the research version.
Critically, in addition to regular gameplay, the research edition surfaced psychological survey items to the player during play sessions. These survey items were integrated into the game as pop-ups using the existing in-game character dialogue system and delivered by a newly created character called "The Researchers", making them both conversational and part of the game lore, and thereby minimizing disruption to the play experience. The maximum number of questions per hour was six, with a window of at least five minutes between pop-ups. In addition, at the beginning of each play session (at player login), there was a 10% probability that the player was asked a question about their mood before starting play. Players were also given the option to self-report mood in the main menu once every 30 minutes, but we excluded those menu reports from this manuscript.
### Participants
```{r}
#| label: demographics
demo <- read_csv(
  "data/demographics.csv",
  col_select = c(pid, country, gender, age)
) |>
  filter(pid %in% unique(dat$pid))
age <- median_qi(demo, age, .width = .8, na.rm = TRUE)
age <- str_glue("{age[1]} ({age[2]}, {age[3]}; 1st and 9th deciles)")
gender <- count(demo, gender, sort = TRUE) |>
  mutate(p = n / sum(n)) |>
  mutate(x = str_glue("{gender} ({number2(n)}, {percent2(p)})")) |>
  slice(1:4) |>
  pull(x) |>
  paste(collapse = ", ")
n_country <- length(unique(demo$country[!is.na(demo$country)]))
country <- count(demo, country, sort = TRUE) |>
  mutate(p = n / sum(n)) |>
  mutate(x = str_glue("{country} ({number2(n)}, {percent2(p)})")) |>
  slice(1:4) |>
  pull(x) |>
  knitr::combine_words()
```
After downloading the PWS research edition and starting the game for the first time, but before entering the game menu, participants gave informed consent, confirmed that they were 18 years old or older, and answered optional demographic questions. The characteristics of the full sample of 11,080 players in the PWS dataset are described in @vuorreIntensiveLongitudinalDataset2023; here, we describe the subset of data relevant to our questions (see Data analysis below). All participants were at least 18 years old, provided informed consent, answered at least one mood question, and did not request their data to be deleted. The median age was `r age`, and the four most frequent gender responses were `r gender`. Participants played in `r n_country` countries, with the `r country` being the most represented. Recruitment happened in multiple waves through several avenues inside and outside of the game [@vuorreIntensiveLongitudinalDataset2023]. Study participation was incentivized through cosmetic in-game rewards (e.g., item skins). For every 12 questions answered, players could unlock a reward; five rewards were available in total. These rewards could only be unlocked in the research version but were usable in both the research and main versions of PWS.
The study procedures were granted ethical approval by Oxford University's Central University Research Ethics Committee (SSH_OII_CIA_21_011).
### Measures
We measured mood with a single item: "How are you feeling right now?" [@killingsworthWanderingMindUnhappy2010]. Participants responded using a visual analogue scale (VAS) with endpoints "Very bad" and "Very good" that recorded 1000 possible values, which we rescaled to the unit interval (0-1) for this study. Consequently, our results can also be interpreted on the "[proportion] of maximum possible" scale [POMP, @cohenProblemUnitsCircumstance1999]. While well-being is often studied with multi-item scales to differentiate between dimensions of positive and negative affect, the frequent probing of mental states in this study required a minimally intrusive instrument that would interrupt the participants' play experience as little as possible. Moreover, such single-item assessments have previously been validated and are recommended for intensive longitudinal studies [@songExaminingConcurrentPredictive2022].
### Data analysis
For the analyses reported here, we used a subset of the data in @vuorreIntensiveLongitudinalDataset2023 that was relevant to our questions. The full dataset contains `r number2(filter(dat_sum, when=="raw")$n_obs)` in-game survey responses, but here we ignored the enjoyment, focus, autonomy, competence, and immersion items and focused on the `r number2(filter(dat_sum, when=="mood")$n_obs)` mood responses from `r number2(filter(dat_sum, when=="mood")$n_session)` sessions and `r number2(filter(dat_sum, when=="mood")$n_pid)` players. We then excluded sessions longer than 5 hours in duration (`r percent2(1 - filter(dat_sum, when=="5h")$n_session / filter(dat_sum, when=="mood")$n_session)`) and dropped all responses with missing values (`r percent2(1 - filter(dat_sum, when=="non-na")$n_session / filter(dat_sum, when=="5h")$n_session)`). We made these decisions to reduce the complexity of our anticipated models, and under the belief that very long sessions are likely to be qualitatively different, and very rare, compared to typically shorter sessions. Our final dataset consisted of `r number2(filter(dat_sum, when=="non-na")$n_obs)` mood responses from `r number2(filter(dat_sum, when=="non-na")$n_session)` sessions and `r number2(filter(dat_sum, when=="non-na")$n_pid)` players.
Our first and main research question concerned the difference between players' moods at the beginning of each session (pre-play) and during the subsequent play session (during play). This contrast does not represent a causal hypothesis (see Limitations, below): Players could begin (and end) their play sessions for whatever reason, and these reasons are likely to confound the pre-during contrast. For example, a player might come home after a stressful day at work and then play PWS. Coming home from a stressful work environment might then cause the person both to (1) choose to play and (2) experience an elevated mood, in which case we would err in treating play itself as the causal antecedent of any potential mood consequences. Generally, reasons for starting to play are likely to contribute to the pre-during contrast, and we are unable to disentangle them from any changes specifically caused by play.
```{r}
#| include: false
# How many sessions per player
dat |>
  count(pid, session) |>
  count(pid) |>
  summarise(median(n))
# How many observations per session
dat |>
  count(ps) |>
  summarise(median(n))
```
We estimated this contrast within a three-level hierarchical regression model that nested observations within sessions, and sessions within players. We judged this three-level hierarchy most appropriate because individuals typically contributed data over many sessions (the median player contributed five sessions' data), and sessions typically had multiple observations (the median session included two observations). More formally, we modelled the mood report of the $i^{th}$ observation in the $j^{th}$ person's $k^{th}$ session as censored-normally distributed with a common variance, using the following equations:
\begin{align*}
\text{mood}_{ijk} &\sim \text{CensNorm}^{[0, 1]}(\beta_{0jk} + \beta_{1j}\text{during}_{ijk}, \sigma^2), \\
\beta_{0jk} &= \gamma_0 + u_{0j} + v_{0k}, \\
\beta_{1j} &= \gamma_1 + u_{1j}, \\
\begin{bmatrix}
u_{0j} \\ u_{1j}
\end{bmatrix} &\sim \text{MVN}\left(
\begin{bmatrix}
0 \\ 0
\end{bmatrix},
\begin{pmatrix}
\tau_{0} & \\
\rho_{01} & \tau_{1}
\end{pmatrix}
\right), \\
v_{0k} &\sim \text{Normal}\left(0, \kappa_{0}\right)
\end{align*}
We specified a censored (at 0 and 1) Gaussian model of mood because a VAS necessarily limits response options at the lower and upper ends. Ignoring censoring would leave the contrast susceptible to ceiling or floor effects and might confound changes in the mood distribution's location with changes in its scale. We then regressed mean mood on an intercept and a coefficient for *during* play (coded as pre-play: -0.5; during play: 0.5), and allowed both parameters to vary randomly across players ($u_{0j}$ and $u_{1j}$). Thus, to answer RQ2, we could examine $\tau_1$, which describes the variability of individuals' mood changes around the mean mood change ($\gamma_1$). In addition, we included random intercepts over player sessions ($v_{0k}$). Although equal residual variances across people in natural observation seem unlikely, we estimated only one residual deviation parameter to limit model complexity.
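To make this coding concrete, the following sketch shows how the pre-play and during play means are recovered from the intercept and the *during* coefficient; the numbers are made up for illustration and are not estimates from our model.

```{r}
#| label: contrast-coding-example
#| eval: false
# Illustration only: hypothetical values, not estimates from fit1
gamma_0 <- 0.70 # grand mean of pre-play and during play mood
gamma_1 <- 0.03 # during play minus pre-play difference
c(
  pre = gamma_0 - 0.5 * gamma_1, # 0.685
  during = gamma_0 + 0.5 * gamma_1 # 0.715
)
```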
We analyzed the data with R and used the brms package to estimate the models via Stan's Hamiltonian Monte Carlo sampling algorithm and to post-process the results [@burknerBrmsPackageBayesian2017; @rcoreteamLanguageEnvironmentStatistical2023; @standevelopmentteamStanModelingLanguage2021]. These probabilistic methods are especially helpful for complex models in which some variance parameters might be small, as we anticipated here for the session-level variances. We drew `r number2(N_ITER*4/2)` samples from the model's posterior distribution using brms' default prior distributions on all parameters and used numerical and graphical checks to ensure model convergence and adequacy.
```{r}
#| label: model-1-brms
model <- bf(
  mood | cens(cl) ~
    1 + post +
    (1 + post | pid) +
    (1 | ps)
) +
  gaussian()
fit1 <- brm(
  model,
  data = dat,
  silent = 0,
  iter = N_ITER,
  control = list(adapt_delta = .95),
  file = "models/brm-prepost-pid-ps-censored"
)
```
```{r}
#| label: fig-convergence
#| fig.cap: Convergence diagnostic plot showing bivariate scatterplots of the model's population-level parameters' posterior draws.
#| eval: false
pairs(
  fit1,
  variable = c("b_", "sd_", "cor_"),
  regex = TRUE,
  off_diag_args = list(size = .33, shape = 1)
)
```
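In addition to the graphical check above, the numerical diagnostics can be summarized directly from the posterior draws. The following sketch flags parameters using conventional rule-of-thumb thresholds (1.01 for $\widehat{R}$ and 400 for effective sample sizes); these thresholds are generic heuristics rather than values specific to this analysis.

```{r}
#| label: convergence-numerical
#| eval: false
# Flag parameters whose convergence diagnostics look potentially problematic
as_draws_df(fit1) |>
  summarise_draws("rhat", "ess_bulk", "ess_tail") |>
  filter(rhat > 1.01 | ess_bulk < 400 | ess_tail < 400)
```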
## Results
```{r}
#| label: fig-descriptives
#| fig-height: 2.2
#| include: false
#| fig.cap: 'A. Histogram of session durations (note log10 x-axis). B. Summary of how many sessions each participant completed. C. Histogram of all mood ratings.'
p0 <- dat |>
  ggplot() +
  scale_x_continuous(
    expand = expansion(c(0.01, 0.01)),
  ) +
  scale_y_continuous(
    expand = expansion(c(0.001, 0.05)),
  ) +
  geom_histogram(
    col = "white",
    bins = 50,
    linewidth = .25,
    boundary = 0
  )
dat_session_durations <- dat |>
  filter(hours == max(hours), .by = c(pid, session))
dat_session_counts <- dat |>
  distinct(pid, session) |>
  count(pid, name = "sessions")
p_durations <- p0 %+%
  dat_session_durations +
  aes(hours) +
  labs(x = "Session duration (hours)", y = "Sessions")
p_sessions <- p0 %+%
  dat_session_counts +
  coord_cartesian(xlim = c(0, 75)) +
  aes(sessions) +
  labs(x = "Sessions", y = "Participants")
p_mood <- p0 +
  aes(mood) +
  labs(x = "Mood", y = "Responses")
(p_durations | p_sessions | p_mood) +
  plot_annotation(tag_levels = "A")
```
```{r}
#| include: false
# Medians with 10th and 90th percentiles of session durations, per-player
# session counts, and mood ratings (overall, pre-play, during play)
tmp <- list(
  dat_session_durations$hours,
  dat_session_counts$sessions,
  dat$mood,
  dat$mood[dat$post == 0],
  dat$mood[dat$post == 1]
) |>
  map(
    ~ median_qi(.x, .width = 0.8, na.rm = TRUE)
  ) |>
  map(
    ~ mutate(
      .x,
      across(
        c(y, ymin, ymax),
        ~ number2(., .01)
      )
    ) |>
      str_glue_data("{y} [{ymin}, {ymax}]")
  )
tmp
```
The median session duration was `r tmp[[1]]` hours [10th and 90th percentiles in brackets]; the median player contributed data from `r tmp[[2]]` sessions, and the median mood was `r tmp[[3]]` (pre-session: `r tmp[[4]]`, during play: `r tmp[[5]]`). We illustrate these basic features of the data in Figure \@ref(fig:fig-data).
### RQ1: Mood changes from pre- to during play
```{r}
#| label: fig-data
#| include: true
#| fig.env: "figure*"
#| fig.cap: "A. Scatterplots of three participants' (rows) mood responses (pre-play: red; during play: blue) over eight sessions' (columns) durations. B. Histograms of session-mean (C. person-mean) moods before (top) and during (bottom) play sessions. D. Differences in session-mean (E. player-mean) mood differences (during session - pre-play). F. Scatterplot of person-mean mood reports at the beginning (x-axis) and during gameplay sessions (y-axis). Identity line is shown in green, and an exploratory GAM regression line is shown in blue."
# Plot: Example raw sessions
set.seed(99)
dat_example_sessions <- dat |>
  add_count(pid, session, name = "obs_per_session") |>
  filter(obs_per_session >= 4) |>
  mutate(session = as.numeric(as.factor(session)), .by = pid) |>
  mutate(session_per_person = length(unique(session)), .by = pid)
# Default: show players with at least 8 qualifying sessions
min_sessions <- 8
n_filtered <- dat_example_sessions |>
  distinct(pid, session_per_person) |>
  filter(session_per_person >= 8) |>
  nrow()
# Relax the requirement if fewer than 3 players have >= 8 qualifying sessions
if (n_filtered < 3) min_sessions <- 0
dat_example_sessions <- dat_example_sessions |>
  filter(session_per_person >= min_sessions) |>
  arrange(pid, session) |>
  filter(pid %in% sample(unique(pid), 3)) |>
  mutate(
    Session = str_glue("Session {session}"),
    Person = str_glue("Person {fct_anon(pid)}")
  )
p_mood_example <- dat_example_sessions |>
  filter(session <= 8) |>
  ggplot() +
  aes(hours, mood, col = post) +
  scale_color_brewer(
    "Pre-session measure",
    palette = "Set1",
    aesthetics = c("color", "fill")
  ) +
  scale_x_continuous(
    "Session duration (hours)"
  ) +
  scale_y_continuous(
    "Mood",
    limits = c(0, 1),
    breaks = c(0, 0.25, 0.5, 0.75, 1.0),
    labels = c("0", "", "0.5", "", "1")
  ) +
  geom_point(size = 1.5, alpha = 1) +
  facet_grid(
    rows = vars(Person),
    cols = vars(Session),
    scales = "fixed"
  ) +
  theme(
    legend.position = "none",
    strip.text.y = element_blank()
  )
# Plot: Raw mood histograms at pre and during play
p_mood_prepost_raw <- dat |>
  mutate(post = factor(post, labels = c("Pre-play", "During play"))) |>
  ggplot(aes(mood, fill = post)) +
  scale_color_brewer(
    "Pre-session measure",
    palette = "Set1",
    aesthetics = c("color", "fill")
  ) +
  geom_histogram(bins = 30, col = "white") +
  scale_y_continuous(
    "Observations",
    expand = expansion(c(0.01, 0.1)),
    breaks = scales::extended_breaks(5)
  ) +
  scale_x_continuous(
    "Mood",
    expand = expansion(c(0.01))
  ) +
  facet_wrap("post", ncol = 1, scales = "free_y") +
  theme(legend.position = "none")
# Plot: Person-session-mean mood histograms at pre and during play
p_mood_prepost_sessions <- p_mood_prepost_raw %+%
  summarise(
    p_mood_prepost_raw$data,
    mood = mean(mood, na.rm = TRUE), .by = c(post, pid, ps)
  ) +
  scale_y_continuous(
    "Sessions",
    expand = expansion(c(0.01, 0.1)),
    breaks = scales::extended_breaks(5)
  )
# Plot: Person-mean mood histograms at pre and during play
p_mood_prepost_players <- p_mood_prepost_sessions %+%
  summarise(
    p_mood_prepost_raw$data,
    mood = mean(mood, na.rm = TRUE), .by = c(post, pid)
  ) +
  scale_y_continuous(
    "Players",
    expand = expansion(c(0.01, 0.1)),
    breaks = scales::extended_breaks(5)
  )
# Plot: Difference histogram (sessions)
p_mood_difference_sessions <- p_mood_prepost_sessions$data |>
  pivot_wider(names_from = post, values_from = mood) |>
  mutate(Difference = `During play` - `Pre-play`) |>
  drop_na(Difference) |>
  ggplot(aes(Difference)) +
  geom_histogram(bins = 50, col = "white") +
  geom_vline(xintercept = 0, linewidth = .5, col = "#2ca25f") +
  scale_y_continuous(
    "Sessions",
    expand = expansion(c(0.01, 0.1))
  ) +
  scale_x_continuous(
    expand = expansion(c(0.01))
  ) +
  coord_cartesian(xlim = c(-.4, .6))
# Plot: Difference histogram (players)
p_mood_difference_players <- p_mood_prepost_players$data |>
  pivot_wider(names_from = post, values_from = mood) |>
  mutate(Difference = `During play` - `Pre-play`) |>
  drop_na(Difference) |>
  ggplot(aes(Difference)) +
  geom_histogram(bins = 50, col = "white") +
  geom_vline(xintercept = 0, linewidth = .5, col = "#2ca25f") +
  scale_y_continuous(
    "Players",
    expand = expansion(c(0.01, 0.1))
  ) +
  scale_x_continuous(
    expand = expansion(c(0.01))
  ) +
  coord_cartesian(xlim = c(-.4, .6))
p_mood_difference <- (
  (p_mood_difference_sessions +
    theme(
      axis.title.x = element_blank(),
      axis.text.x = element_blank(),
      axis.ticks.x = element_blank()
    )) /
    p_mood_difference_players
)
p_mood_biscatter_sessions <- p_mood_difference_sessions$data |>
  ggplot(aes(`Pre-play`, `During play`)) +
  scale_x_continuous(
    expand = expansion(0.01)
  ) +
  scale_y_continuous(
    expand = expansion(0.01)
  ) +
  geom_point(
    alpha = .2, size = 0.33, shape = 1
  ) +
  geom_abline(linewidth = .5, col = "#2ca25f") +
  geom_smooth(
    method = "gam",
    se = FALSE,
    linewidth = .75,
    col = "dodgerblue"
  ) +
  theme(aspect.ratio = 1)
p_mood_biscatter_players <- p_mood_difference_players$data |>
  ggplot(aes(`Pre-play`, `During play`)) +
  scale_x_continuous(
    expand = expansion(0.01)
  ) +
  scale_y_continuous(
    expand = expansion(0.01)
  ) +
  geom_point(
    alpha = .2, size = 0.33, shape = 1
  ) +
  geom_abline(linewidth = .5, col = "#2ca25f") +
  geom_smooth(
    method = "gam",
    se = FALSE,
    linewidth = .75,
    col = "dodgerblue"
  ) +
  theme(aspect.ratio = 1)
p_mood_example /
  (
    p_mood_prepost_sessions |
      p_mood_prepost_players |
      p_mood_difference |
      p_mood_biscatter_players
  ) +
  plot_layout(heights = c(5, 5)) +
  plot_annotation(tag_levels = "A")
```
We first focused on our primary research question: To what extent do PWS players' moods change from pre-play to during play? We visualized the relevant data in Figure \@ref(fig:fig-data): Panel A shows mood responses from three example participants' first eight sessions of play. Figure \@ref(fig:fig-data) B (C) then shows histograms of all sessions' (players') aggregated pre- and during play moods to facilitate visual comparison of the raw data. We show the differences in these aggregated moods in Figure \@ref(fig:fig-data) D (sessions) and E (players). Moreover, Figure \@ref(fig:fig-data) F plots player-mean during play moods against pre-play moods. Overall, these figures suggested small increases in mood from pre- to during play, but also that this difference was broadly distributed over sessions and players, and that it was greater for lower pre-play moods (Panel F).
```{r}
#| label: tbl-avg
#| include: true
#| tbl-cap: Summaries of key population-level estimates
fit_tbl <- as_draws_df(
  fit1,
  c("b_", "sd_", "sigma"),
  regex = TRUE
) |>
  transmute(
    `Pre-play` = b_Intercept + b_post1 * -0.5,
    `During play` = b_Intercept + b_post1 * 0.5,
    Difference = b_post1,
    `Difference (scaled)` = Difference /
      (sd_pid__Intercept + sd_pid__post1 + sd_ps__Intercept + sigma),
    `(SD) Difference` = sd_pid__post1,
    `Positive shifts` = pnorm(0, b_post1, sd_pid__post1, lower.tail = FALSE)
  ) |>
  summarise_draws(
    mean = ~ mean(.),
    ~ quantile2(., probs = c(.025, .975))
  ) |>
  mutate(
    res = str_glue("{number2(mean, .001)} [{number2(q2.5, .001)}, {number2(q97.5, .001)}]"),
    resp = str_glue("{percent2(mean, .1)} [{percent2(q2.5, .1)}, {percent2(q97.5, .1)}]"),
  )
# Show 'Positive shifts' (row 6) as a percentage
fit_tbl[6, "res"] <- fit_tbl[6, "resp"]
fit_tbl |>
  mutate(Variable = variable, Estimate = res, .keep = "none") |>
  papaja::apa_table(
    span_text_columns = FALSE,
    caption = "Summaries of the hierarchical model's key population-level estimates.",
    note = "Numbers indicate posterior means and 95% CIs. Difference (scaled) is the standardized during play--pre-play difference."
  )
```
We then turned to the model's results regarding (differences in) players' moods. They confirmed the visual impressions described above: Table \@ref(tab:tbl-avg) indicates that the average PWS player experiences a `r filter(fit_tbl, variable=="Difference")$res` unit increase in mood during PWS play, on a VAS from 0 to 1. To aid interpretation, we also scaled this difference by the total random variation estimated by the model. This standardized pre-play to during play contrast was `r filter(fit_tbl, variable=="Difference (scaled)")$res`.
### RQ2: Heterogeneity in mood changes
Above, we estimated that the average player's mood increased by approximately `r filter(fit_tbl, variable=="Difference")$res` units (on a 0-1 scale) from the beginning of the session to during play. However, that number does not indicate how representative this "average player" is. In other words, we do not know how variable this mood increase is likely to be in the population of similar players. We therefore next turned to our second research question: How heterogeneous are mood shifts in the population of similar PWS players? As a first approximation to an answer, we looked at the model's standard deviation of the person-specific mood increases. It was `r filter(fit_tbl, variable=="(SD) Difference")$res`. In comparison to the average person's estimated difference, that quantity indicated a moderate degree of heterogeneity between individuals. To give a more concrete quantity describing heterogeneity in this mood uplift, we then calculated the model-estimated proportion of individuals in this population who are expected to experience positive mood changes from pre- to during play. This proportion was `r filter(fit_tbl, variable=="Positive shifts")$resp`: Nearly three quarters of individuals are predicted to experience mood lifts during PWS play.
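This proportion follows directly from the normal model for person-specific differences. As an illustration with made-up numbers (not our estimates), if the average difference were 0.03 units with a between-player standard deviation of 0.05, the implied share of players with a positive shift would be roughly 73%:

```{r}
#| label: positive-shift-example
#| eval: false
# Hypothetical values for illustration; the reported estimate instead uses the
# posterior draws of b_post1 and sd_pid__post1 (see the tbl-avg chunk above)
pnorm(0, mean = 0.03, sd = 0.05, lower.tail = FALSE)
#> [1] 0.7257469
```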
In sum, the results from our model contrasting pre- and during play moods indicated that mood increased slightly during play, and that those increases were somewhat robust across people.
### RQ3: Time course of mood changes during play
The above analysis provides an easily interpretable contrast between during-play moods and moods just before play. However, it does not address the time course of moods *within* sessions. We therefore next turned to our third question: How do (changes in) players' moods evolve during gameplay sessions? To answer this question, we used session time (in hours) as a continuous predictor and allowed mood changes during sessions to be non-linear by estimating a natural cubic spline with 4 degrees of freedom using the R package lme4 [@lme4]. Like the main model, this was a three-level hierarchical model, with random intercepts at the session and participant levels, and random participant slopes for each piece of the spline. Moreover, in a separate model, we examined how within-session change related to mood before play by including pre-play mood as a covariate and modelling the continuous hour-by-pre-play interaction. We also modelled pre-play mood with a natural cubic spline to allow the relationship to be non-linear.
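In the code below, `ns4()` is a spline-basis helper defined in R/functions.R and not shown in this document. The following sketch illustrates one way such a helper can be written: wrapping `splines::ns()` with fixed knots so that the same basis is used when predicting at new time points. The knot values shown are illustrative assumptions, not necessarily those used in the actual helper.

```{r}
#| label: ns4-sketch
#| eval: false
# Hypothetical sketch of an ns4()-style helper; the definition in
# R/functions.R may place knots differently.
ns4 <- function(x) {
  splines::ns(
    x,
    knots = c(0.5, 1, 2), # assumed interior knots (hours); gives 4 df
    Boundary.knots = c(0, 5) # sessions were truncated at 5 hours
  )
}
```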
```{r}
#| label: model-2-3-lmer
fit2 <- fit_cached(
  "models/lmm-ns4.Rds",
  lmer(
    mood ~ ns4(hours) + (1 | ps) + (1 + ns4(hours) | pid),
    data = dat
  )
)
dat_pre <- dat |>
  filter(
    # Session has a wellbeing measure at time = 0
    hours[1] == 0,
    .by = c(pid, session)
  ) |>
  mutate(
    pre = mood[1],
    .by = c(pid, session)
  ) |>
  filter(
    hours > 0
  ) |>
  droplevels()
fit3 <- fit_cached(
  "models/lmm-pre-interaction.Rds",
  lmer(
    mood ~ ns4(hours) * ns(pre, 5) + (1 + ns4(hours) | pid) + (1 | ps),
    data = dat_pre
  )
)
```
The main model without an interaction with pre-play mood included all mood responses, sessions, and participants as above. The interaction model, however, required each session to have a pre-play mood measure, which left `r number2(lme4::ngrps(fit3)["pid"])` players, `r number2(lme4::ngrps(fit3)["ps"])` sessions, and `r number2(nobs(fit3))` observations in that model.
We chose not to model censoring in these analyses due to the increased computational cost. However, we performed sensitivity analyses with and without censoring on a reduced data set (1,000 participants), which indicated nearly identical results. At worst, ignoring censoring resulted in slightly different intercepts; that is, the whole curve was shifted up or down.
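We do not reproduce the sensitivity analysis here; the sketch below outlines its general form. The subset size follows the description above, whereas the model file names and other details are illustrative rather than the exact code used.

```{r}
#| label: censoring-sensitivity-sketch
#| eval: false
# Sketch of the censoring sensitivity check described in the text
set.seed(1)
pid_subset <- sample(unique(dat$pid), 1000)
dat_subset <- filter(dat, pid %in% pid_subset)
fit_sens_censored <- brm(
  bf(mood | cens(cl) ~ ns4(hours) + (1 + ns4(hours) | pid) + (1 | ps)) +
    gaussian(),
  data = dat_subset,
  file = "models/brm-sensitivity-censored"
)
fit_sens_uncensored <- brm(
  bf(mood ~ ns4(hours) + (1 + ns4(hours) | pid) + (1 | ps)) +
    gaussian(),
  data = dat_subset,
  file = "models/brm-sensitivity-uncensored"
)
# Compare the two fits' population-level spline curves and intercepts
```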
```{r}
#| label: fig-ct1
#| include: true
#| fig.width: 4
#| fig.cap: "Estimated (changes) in mood as a function of session duration. Top: Average mood during a gaming session. Bottom: Change in mood during a session compared to mood at the beginning of a session. Gray ribbons indicate 95\\% confidence bands. We truncated the x-axis at three hours for this figure."
XMAX <- 3
XOUT <- 100
hours <- seq(0, XMAX, length.out = XOUT)
emm2_mood <- emmeans(
  fit2,
  ~hours,
  at = list(hours = hours),
  lmer.df = "asymptotic"
)
emm2_diff <- contrast(
  emm2_mood,
  method = "trt.vs.ctrl",
  ref = "hours0"
) |>
  confint() |>
  as.data.frame() |>
  mutate(
    hours = hours[-1],
    res = str_glue(
      "{number2(estimate, .001)} ",
      "[{number2(asymp.LCL, .001)}, {number2(asymp.UCL, .001)}]"
    )
  )
bind_rows(
  "Mood" = as_tibble(emm2_mood),
  "Difference" = as_tibble(emm2_diff) |>
    rename(emmean = estimate),
  .id = "x"
) |>
  mutate(x = fct_rev(x)) |>
  ggplot(aes(hours, emmean)) +
  scale_y_continuous(
    "Value",
    breaks = extended_breaks(7)
  ) +
  geom_hline(
    data = tibble(
      x = c("Mood", "Difference") |> fct_rev(),
      y = c(NaN, 0)
    ),
    aes(yintercept = y),
    linewidth = .5, col = "#2ca25f"
  ) +
  geom_line() +
  geom_ribbon(
    aes(
      ymin = asymp.LCL,
      ymax = asymp.UCL
    ),
    alpha = 0.25
  ) +
  scale_x_continuous(
    "Session duration (hours)",
    breaks = c(0, 0.5, 1, 2, 3, 4, 5),
    labels = c("0m", "30m", "1h", "2h", "3h", "4h", "5h"),
    expand = expansion(0.01)
  ) +
  facet_wrap(
    "x",
    ncol = 1,
    scales = "free_y",
    strip.position = "left"
  ) +
  theme(
    axis.title.y = element_blank(),
    strip.placement = "outside",
    strip.text = element_text(size = rel(1), hjust = 0.5)
  )
```
```{r}
#| label: fig-ct2
#| fig.env: "figure*"
#| include: true
#| fig.asp: 0.5
#| fig.cap: "Estimated (changes) in mood as a function of session duration and pre-play mood. Top. Mood for the average player during a gaming session with pre-play mood at 5th, 25th, 50th, and 75th percentiles (columns). Ribbons indicate 95\\% confidence. Bottom. Same as above but with change in mood on the y-axis."
pre <- dat_pre |>
  summarise(pre = pre[1], .by = ps) |>
  pull(pre)
pre_values <- quantile(pre, c(0.05, .25, 0.5, 0.75))
pre_labels <- c("5th", "25th", "Median", "75th")
emm3_mood <- emmeans(
  fit3,
  ~ hours + pre,
  at = list(
    hours = hours,
    pre = pre_values
  ),
  lmer.df = "asymptotic"
)
emm3_diff <- emm3_mood |>
  contrast(
    method = "trt.vs.ctrl",
    ref = "hours0",
    by = "pre"
  ) |>
  confint() |>
  mutate(
    hours = rep(hours[-1], length(pre_values)),
    emmean = estimate,
    .keep = "unused"
  )
bind_rows(
  "Mood" = as_tibble(emm3_mood),
  "Difference" = as_tibble(emm3_diff),
  .id = "x"
) |>
  mutate(x = fct_rev(x)) |>
  mutate(pre = factor(pre, levels = pre_values, labels = pre_labels)) |>
  ggplot(aes(hours, emmean)) +
  scale_y_continuous(
    "Value",
    breaks = extended_breaks(7)
  ) +
  geom_hline(
    data = tibble(
      x = c("Mood", "Difference") |> fct_rev(),
      y = c(NaN, 0)
    ),
    aes(yintercept = y),
    linewidth = .5, col = "#2ca25f"
  ) +
  geom_line() +
  geom_ribbon(
    aes(
      ymin = asymp.LCL,
      ymax = asymp.UCL
    ),
    alpha = 0.25
  ) +
  scale_x_continuous(
    "Session duration (hours)",
    breaks = c(0, 0.5, 1, 2, 3, 4, 5),
    labels = c("0", "30m", "1h", "2h", "3h", "4h", "5h"),
    expand = expansion(0.01)
  ) +
  facet_grid(
    x ~ pre,
    scales = "free_y",
    switch = "y",
  ) +
  theme(
    axis.title.y = element_blank(),
    strip.placement = "outside",
    strip.text = element_text(size = rel(1), hjust = 0.5)
  )
```
This continuous-time analysis added three important nuances to the simpler pre-during play contrast presented above. First, Figure \@ref(fig:fig-ct1) shows how the average mood increased during a session, suggesting a small but sharp uplift early in the session that was slightly greater in magnitude than the pre-during contrast. Second, the bulk of this increase occurred early in play sessions, with an increase of `r slice(emm2_diff, floor(XOUT/XMAX*0.25))$res` units for the average player after 15 minutes of play. Third, the rate and shape of change depended on participants' initial mood levels. Figure \@ref(fig:fig-ct2) shows (changes in) estimated mood over a typical session at different percentiles of pre-play mood, where the lower percentiles (5th percentile of pre-play mood = `r number2(pre_values[1], .01)`, and 25th = `r number2(pre_values[2], .01)`) showed greater uplift in mood during a session compared to median or higher pre-play mood levels.
---
abstract: Do video games affect players' well-being? In this case study, we examined `r number2(nrow(dat))` intensive longitudinal in-game mood reports from `r number2(length(unique(dat$ps)))` play sessions of `r number2(length(unique(dat$pid)))` players of the popular game PowerWash Simulator. We compared players' moods at the beginning of play sessions with their moods during play, and found that the average player reported a mood `r filter(fit_tbl, variable=="Difference")$res` visual analogue scale (VAS; 0-1) units higher during play than at the beginning of play sessions. Moreover, we predict that `r filter(fit_tbl, variable=="Positive shifts")$resp` of similar players experience this affective uplift during play, and that the bulk of it happens during the first 15 minutes of play. We do not know whether these results indicate causal effects, or to what extent they generalize to other games or player populations. Yet these results, based on in-game subjective reports from players of a popular commercially available game, suggest good external validity, and as such offer a promising glimpse of the scientific value of transparent industry-academia collaborations in understanding the psychological roles of popular digital entertainment.
---
## Discussion
The current study corroborates what qualitative research and reports from video game players around the world have long suggested: People feel good playing games. Specifically, we find that playing a popular commercial video game, PowerWash Simulator, is linked with a small improvement in mood, that this improvement is experienced by `r filter(fit_tbl, variable=="Positive shifts")$resp` of players, and that the bulk of the improvement occurs during the first 15 minutes of play.