---
bibliography: themusiclab.bib
csl: nature.csl
header-includes:
  - \usepackage[left]{lineno}
  - \usepackage{caption}
  - \captionsetup[figure]{labelformat=empty}
  - \usepackage{tabu}
  - \usepackage{afterpage}
  - \usepackage{mdframed}
  - \usepackage{color}
notes-after-punctuation: no
output:
  pdf_document:
    fig_caption: yes
    latex_engine: lualatex
    keep_tex: true
  word_document: default
  html_document: default
urlcolor: blue
---
```{r config, include = FALSE}
# config
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
library(tidyverse)
library(broom)
library(lsr)
library(lme4)
library(lmerTest)
library(patchwork)
library(car)
library(renv)
library(TOSTER)
# create a snapshot of all package versions for posterity
renv::consent(provided = TRUE)
renv::snapshot()
# Rmd config
options(scipen = 999) # prevent scientific notation for numerals
format_p <- function(p) { # format p-values automatically
  if (p < .001) {
    return("< .001")
  } else {
    return(paste0("= ", round(p, digits = 3)))
  }
}
```
# Infants relax in response to unfamiliar foreign lullabies
Constance M. Bainbridge\*†^1^, Mila Bertolo\*†^1^, Julie Youngers^1,2^, S. Atwood^1,3^, Lidya Yurdum^1^, Jan Simson^1^, Kelsie Lopez^1,4^, Feng Xing^1,5^, Alia Martin^6^ & Samuel A. Mehr\*^1,6,7^
\small
^1^Department of Psychology, Harvard University, Cambridge, MA 02138, USA.
^2^Department of Psychology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada.
^3^Department of Psychology, University of Washington, Seattle, WA 98105, USA.
^4^Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, USA.
^5^Department of Education, Johns Hopkins University, Baltimore, MD 21218, USA.
^6^School of Psychology, Victoria University of Wellington, Wellington 6012, New Zealand.
^7^Data Science Initiative, Harvard University, Cambridge, MA 02138, USA.
†These authors contributed equally and are listed alphabetically.
\*Corresponding author. Emails: cbainbridge@g.harvard.edu; mila_bertolo@g.harvard.edu; sam@wjh.harvard.edu
\bigskip
\normalsize
\begin{mdframed}[backgroundcolor=gray!20]
Music is characterized by acoustical forms that are predictive of its behavioral functions. For example, adult listeners accurately identify unfamiliar lullabies as infant-directed on the basis of their musical features alone. This property could reflect a function of listeners' experiences, the basic design of the human mind, or both. Here, we show that American infants ($N = 144$) relax in response to 8 unfamiliar foreign lullabies, relative to matched non-lullaby songs from other foreign societies, as indexed by heart rate, pupillometry, and electrodermal activity. They do so consistently throughout the first year of life, suggesting the response is not a function of their musical experiences, which are limited relative to those of adults. The infants' parents overwhelmingly chose lullabies as the songs that they themselves would use to calm their fussy infant, despite their unfamiliarity. Together, these findings suggest that infants are predisposed to respond to universal features of lullabies.
\end{mdframed}
\bigskip
\linenumbers
Music is a human universal [@Mehr2019; @Jacoby2017; @Jacoby2019] that appears often in the lives of infants and their families [@Mendoza2019a; @Custodero2003; @Custodero2003a; @Mehr2014; @Trehub1997a]. Infants demonstrate a remarkable variety of responses to music as they develop: in the first few days of life, newborns remember melodies heard in the womb [@Granier-Deferre2011a]; distinguish consonant from dissonant intervals [@Zentner1996]; and detect musical beats [@Winkler2009]. Older infants differentiate synchronous movement from asynchronous movement in response to music [@Hannon2017]; become attuned to the rhythms of their native culture’s music by their first birthday [@Hannon2005]; garner social information from the songs they hear [@Mehr2016; @Mehr2017b]; and recall music in impressive detail [@Trainor2004; @Volkova2006] after long delays [@Mehr2016].
Why are infants so interested in music? One possibility centers on the dynamics of parent-offspring interactions. Relative to other animals, human infants are helpless; to survive, they rely on resources provided by parents and alloparents [@Hrdy2009]. Such resources, whether material (like food) or not (like attention) constitute parental investment [@Trivers1972]. Human parental investment is routinely provided to infants in response to their elicitations, which often take the form of fussiness and crying [@Soltis2004].
Infant-directed songs may credibly signal parental attention to infants, conveying information to infants that an adult is nearby, attending to them, and keeping them safe [@Mehr2017; @Mehr2020]. Singing indicates the location, proximity, and orientation of the singer (even when the singer is not visible, as at night); and it is also costly, in that the singer could be expending their energy on some other activity. Because parental attention is a key resource for helpless infants, they likely are predisposed to attend to signals of it: infants should be particularly interested in and reassured by vocal music with features suggesting that it is directed toward them.
Studies of people with genomic imprinting disorders provide a unique test of this hypothesis because these disorders are characterized by divergent behaviors related to parental investment [@Haig2003; @Ubeda2008]. For example, infants with Prader-Willi syndrome elicit less parental investment than do typically developing infants: they have feeding difficulties, nursing less often; and they tend to be lethargic [@Cassidy2008]. Children with Angelman syndrome show the opposite pattern: they elicit *more* parental investment, with frequent drooling and chewing, uncoordinated overfeeding, and high degrees of social engagement [@Williams2006].
Genomic imprinting disorders also alter the psychology of music, in a fashion consistent with the idea that infant-directed song signals parental investment. Compared to the relaxation response that typically developing people display during passive music listening, Prader-Willi syndrome is associated with an increased relaxation response [@Mehr2017a], and Angelman syndrome is associated with a reduced relaxation response [@Kotler2019]. These effects are specific to music; they were not elicited by listening to pleasant speech, suggesting that singing is a particularly effective means of satisfying parental investment elicitations in Prader-Willi syndrome, and a particularly ineffective means of doing so in Angelman syndrome.
Credible signals have evolved repeatedly in many species with similar patterns across senders and receivers [@MaynardSmith2003; @Mehr2020]. The resulting innate links between the forms and functions of vocal signals [@Morton1977; @Owren2001; @Endler1993] explain why, for example, hostile vocalizations across species — from growling tigers to shrieking eagles — are recognized as hostile by human listeners [@Filippi2017]. Because these signals are shaped by natural selection, they are expected to show consistency across members of a species.
Infant-directed vocalizations appear to fit this pattern. Infant-directed speech is acoustically distinct from adult-directed speech across cultures [@Moser2020; @Fernald1989; @Kuhl1997; @Piazza2017; @Broesch2018; @Bryant2007]. Lullabies, a common form of infant-directed song, are reliably distinguishable from other songs [@Trehub1993a]; in a representative sample of music from small-scale societies, adult naïve listeners considered foreign lullabies likely to be "used to soothe a baby", relative to dance, healing, and love songs [@Mehr2018]. This result, which has also been supported by a massive conceptual replication (*N* = 29,357), is explained in large part by the striking musical consistency of lullabies found across cultures: their slow tempos and smooth, minimally-accented melodic contours [@Mehr2019]. Strikingly, these same musical features appear in infant-directed or low-arousal Western music [@Trainor1997; @Trehub1997; @Gomez2007; @Rock1999].
If infant-directed song indeed functions as a credible signal of parental attention, then the universal features of the signal should produce reliable relaxation effects in the receiver: singing should satisfy infants' fussy demands for parental investment, calming them. Common sense suggests that infants are calmed by infant-directed song, but this question has typically been tested with songs that are known to the infant and/or sung in a familiar language, making it difficult to measure the specific soothing effects of infant-directed song independently of the more general soothing effects of familiar sounds. Adults' ratings of the familiarity and perceived relaxation of music are positively correlated [@Tan2012], and parents produce music for their children often [@Mendoza2019a; @Custodero2003; @Custodero2003a; @Mehr2014; @Trehub1997a], so familiar music may produce mere-exposure effects [@Zajonc2001] on infant relaxation.
Indeed, infant arousal, as indexed by electrodermal activity, decreased in response to maternal singing in a "soothing" style, relative to a "playful" style; but both styles were produced in familiar songs [@Cirelli2019]. Listening to live or recorded lullabies reduced heart rate in pre-term infants, more so than a silent control, but the songs were well-known and produced in a familiar language [@Garunkstiene2014]. Singing reduced distress after a still-face procedure, as indexed by increased smiling and decreased ratings of negative affect, but the effects were driven by the familiarity of the songs [@Cirelli2020]. Infants attended longer to singing than speech before becoming fussy, when both were produced in a foreign language [@Corbeil2016], but whether this effect reflects increased attention to songs or increased relaxation as a result of listening to music is unknown. In sum, while there is some evidence that infant-directed songs produce relaxation effects in infants, the effects in prior studies may be attributable to infants' familiarity with the songs, rather than the songs' acoustic properties (as would be predicted by a credible signaling account [@Mehr2017; @Mehr2020]).
In this paper, we ask whether infants relax in response to infant-directed songs produced in unfamiliar languages from foreign societies. We played infants pairs of songs drawn from the *Natural History of Song* Discography [@Mehr2019], a collection of lullabies, dance songs, healing songs, and love songs recorded in 86 world cultures, that were either infant-directed (the lullabies) or not (the other song types). We measured infants' heart rate, pupil dilation, electrodermal activity, frequency of blinking, and gaze direction. Based on prior results in a similar listening paradigm [@Mehr2017a; @Kotler2019], we preregistered a hypothesis that infants would show decreased heart rate (i.e., a relaxation response) during the lullabies, relative to the non-lullabies. We report a test of that hypothesis, a series of planned exploratory analyses of other measures of infants' responses, and a measure of parents' intuitions about the songs.
# Methods
## Participants
```{r descriptives}
# get data
hr <- read.csv("./data/IPL_hr_clean.csv")
studylog <- read.csv("./data/IPL_studylog.csv", header = TRUE)
# descriptives
n_female <- n_groups(hr %>%
  filter(female_baby == 1) %>%
  group_by(id))
ages <- hr %>%
  group_by(id) %>%
  summarise(age = mean(age))
# data exclusions and why
excluded <- studylog %>%
  filter(fussout == 1 | exclude == 1) %>%
  select(id, fussout, exclude_reason) %>%
  group_by(id)
n_excluded <- n_groups(excluded)
n_excluded_fussy <- n_groups(excluded %>%
  filter(fussout == 1))
n_excluded_attn <- n_groups(excluded %>%
  filter(exclude_reason == "never looked at stim"))
n_excluded_tech <- n_groups(
  excluded %>%
    filter(fussout == 0) %>%
    filter(
      exclude_reason == "missing front camera video" |
        exclude_reason == "missing part of front camera video" |
        exclude_reason == "missing e4 marker" |
        exclude_reason == "no E4 sync marker" |
        exclude_reason == "very poor BVP signal" |
        exclude_reason == "bad HR"
    )
)
n_excluded_error <- n_groups(excluded %>%
  filter(exclude_reason == "wrong condition run"))
# how many completed all trials
n_complete <- n_groups(studylog %>%
  group_by(id) %>%
  filter(fussout == 0 & exclude == 0 & exclude_trials. == "No"))
# how many included participants per seat type
seat_count <- studylog %>%
  filter(fussout == 0 & exclude == 0 & seat != "") %>%
  select(id, seat)
seat_highchair <- count(seat_count %>% filter(seat == "highchair"))
seat_recliner <- count(seat_count %>% filter(seat == "recliner"))
seat_lap <- count(seat_count %>% filter(seat == "lap"))
```
We recruited 144 typically-developing infants from the greater Boston area (`r n_female` females, mean age = `r mean(ages$age) %>% round(1)` months, SD = `r sd(ages$age) %>% round(1)`, range: `r min(ages$age) %>% round(1)`-`r max(ages$age) %>% round(1)`). Data from an additional `r n_excluded` infants were collected but excluded due to infant fussiness (*n* = `r n_excluded_fussy`); lack of attention (*n* = `r n_excluded_attn`); technical error (*n* = `r n_excluded_tech`); or experimenter error (*n* = `r n_excluded_error`). Nearly all infants were born full-term. Information about language exposure was available from 98 of the participants; of these, none of the languages spoken at home matched those used in the stimuli of this study (see Table 1).
Infants who became fussy and ended their participation partway through the study were included in the analyses if they attended to the first pair of songs and the subsequent preference trial (see Stimuli, below). Most infants (*n* = `r n_complete`) contributed data for all four song pairs and preference trials. For compensation, parents received a $5 gift card and infants were given a prize. All testing took place at the Music Lab at Harvard University. Parents provided informed consent prior to their and their infant's participation. This research was approved by the Committee on the Use of Human Subjects, Harvard University's Institutional Review Board.
## Stimuli
We chose 16 songs from the *Natural History of Song* Discography [@Mehr2019] that were originally produced in 15 different societies and languages (Table 1). Eight of the songs were infant-directed, having been used as lullabies (i.e., they were originally used to soothe, calm, or put an infant/child to sleep) in the societies where they were recorded, according to the anthropologist or ethnomusicologist who collected each recording. The other 8 songs were originally produced in the context of expressing love (5); healing the sick (2); or dancing (1).
We chose this particular subset of 16 songs by first limiting the corpus to those songs produced by a single singer with no instrumental accompaniment; then, using adults' ratings of the songs from a previous study [@Mehr2018], we chose a set of lullabies rated as likely to be "used to soothe a baby" and a set of non-lullabies with low ratings on that item.
We paired the lullabies and non-lullabies from these sets so as to match the perceived gender of the singer as closely as possible, because infants are sensitive to the gender of voices [@Miller1983]. We ordered the pairs such that those with larger differences on the rating "used to soothe a baby" were presented first, so as to maximize the measurable differences in responses to lullabies vs. non-lullabies in each infant, even if they became inattentive or fussy partway through the study. All recordings were normalized to approximately balance their perceived loudness and were also manually edited to remove background noise and microphone artifacts, using noise reduction filters and equalization.
<!-- table 1: info about songs, cultures, languages -->
\afterpage{%
\begin{table}[p]
\small
\tabulinesep=1.1mm
\begin{tabu} to \textwidth {l@{\hskip 0.3in}X[l]X[l]X[l]@{\hskip 0.3in}lX[l]X[l]X[l]}
& \multicolumn{3}{@{}l}{Lullaby} & \multicolumn{4}{@{}l}{Paired non-lullaby} \\
\hline
Gender & Society & Region & Language & Type & Society & Region & Language \\
\hline
Female & Saami & Scandinavia & Lule Saami & Love & Nenets & North Asia & Tundra Nenets \\
& Nahua & Maya Area & Western Nahuatl & Love & Serbs & Southeastern Europe & Serbian Standard \\
& Iglulik Inuit & Arctic and Subarctic & Western Canadian Inuktitut & Dance & Chachi & Northwestern South America & Cha'palaa \\
& Kuna & Central America & Border Kuna & Love & Highland Scots & British Isles & Scottish Gaelic \\
Male & Iroquois & Eastern Woodlands & Cherokee & Love & Kurds & Middle East & Central Kurdish \\
& Hopi & Southwest and Basin & Hopi & Healing & Hawaiians & Polynesia & Hawaiian \\
& Ona & Southern South America & Selk'nam & Love & Chuuk & Micronesia & Chuukese \\
& Highland Scots & British Isles & Scottish Gaelic & Healing & Seri & Northern Mexico & Seri \\
\end{tabu}
\caption*{\textbf{Table 1 | The songs infants heard.} Using the \textit{Natural History of Song} Discography\textsuperscript{1}, we chose 8 lullabies and paired them with non-lullabies drawn from the other three song types in the corpus (dance, love, or healing), matching the perceived gender of the singer. All songs were produced by solo voices without instrumental accompaniment.}
\end{table}
\clearpage
}
We generated animations of two characters who lip-synced to each song, giving the impression that they were singing (Fig. 1; videos are available at https://osf.io/2t6cy). Each character sang four songs, such that one exclusively sang lullabies while the other exclusively sang non-lullabies. The videos were counterbalanced on four dimensions: which was the first song heard (lullaby or non-lullaby), which character was the lullaby singer (red or blue), which side the lullaby singer appeared on (left or right, to match character placement during silent preference trials; see Procedure), and the perceived gender of the singer (male or female). This yielded 16 conditions, which we balanced across ages, such that each counterbalancing condition included infants across the full range of ages tested.
Regardless of counterbalancing condition, we varied the presentation order of lullabies vs. non-lullabies, so that they did not appear in strict alternation, which could introduce order effects. This yielded trial orderings that were either P-L-N-P-N-L-P-L-N-P-N-L-P or P-N-L-P-L-N-P-N-L-P-L-N-P, where L denotes a lullaby singing trial, N denotes a non-lullaby singing trial, and P denotes a preference trial. Because there were two characters, and each character sang four songs, each infant in the experiment listened to 8 of the 16 songs.
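The counterbalancing scheme amounts to a full crossing of four binary factors. A minimal sketch (the factor names here are ours, for illustration, not the study's actual variable names):

```r
# Full crossing of the four counterbalanced factors described above.
conditions <- expand.grid(
  first_song     = c("lullaby", "non-lullaby"),
  lullaby_singer = c("red", "blue"),
  lullaby_side   = c("left", "right"),
  singer_gender  = c("female", "male")
)
nrow(conditions) # 16 counterbalancing conditions
```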
<!-- fig 1: diagram of order of events (no code required) -->
\afterpage{%
\begin{figure}[p]
\centering
\includegraphics[height=4.5in]{./viz/IPL_fig1.pdf}
\caption*{\textbf{Fig. 1 | Structure of the experiment.} Infants viewed videos of animated characters who either appeared in silence (during preference trials) or who sang the songs one at a time, next to a distracting animation of slowly-moving colored boxes.}
\end{figure}
\clearpage
}
## Procedure
Infants sat in a high chair (*n* = `r seat_highchair`), recliner (*n* = `r seat_recliner`), or a parent’s lap (*n* = `r seat_lap`) approximately 150 cm away from a 107.5 x 60.5 cm television screen; parents chose the seat based on the physical size of the infant and whether the infant was comfortable sitting in it. When infants sat in a high chair or recliner the parent sat behind them. When infants sat on their parent's lap, the parent listened to masking music through passive noise-canceling headphones throughout the experiment; we also asked parents to keep their eyes closed. We recorded videos of the infants at ultra high definition (8-bit 4K at 150Mbps; Panasonic Lumix GH5S and Lumix G Vario 14-140mm lens).
Fig. 1 depicts the order of events. The experiment began with a 14 s baseline preference trial, in which the two animated characters were presented simultaneously in silence. Four sets of three trials followed, with each set consisting of two singing trials and one preference trial. On the singing trials, one of the animated characters sang a song, appearing alone on the screen next to a screen-saver-like animation (to reduce the likelihood that infants would look only at the singer). Each singing trial was 14 s long. The preference trials were identical to the baseline preference trial. Attention-grabbing animations appeared at the center of the screen before each preference trial. The experiment lasted about five minutes.
Characters on the screen were 25 cm wide. They were presented 45 cm apart when appearing simultaneously during the preference trials. Videos were presented at 4K resolution and audio played from two speakers (Neumann KH80 DSP) at approximately the height of the infants' ears, 125 cm apart, placed such that the infant was seated at the apex of an equilateral triangle formed with the two speakers. The songs had a maximum volume of approximately 60 dB.
## Infant measures
### Psychophysiology
We recorded infant heart rate and electrodermal activity with a physiological monitor (Empatica E4) attached to the infant's thigh or calf, depending on the size of the infant, and usually on the left side. The monitor records heart rate via a photoplethysmograph at the site of the device and electrodermal activity via electrodes attached to the side or bottom of the infant's foot (with BIOPAC isotonic gel); it has been successfully validated in adults [@vanLier2020].
### Pupillometry
```{r pupilsReliability}
# get data
all_pupil_annotations <- read_csv("./data/IPL_pupils.csv") %>%
  rename(participant = video)
# extract reliability data (for later on)
pupils_reliability_annotations <- all_pupil_annotations %>%
  group_by(stimulus) %>%
  mutate(n_annotations = n()) %>%
  filter(n_annotations > 1) %>%
  ungroup() %>%
  mutate(num = nth_annotation)
# reshape annotations
pupils_rel_ann_wide <- pupils_reliability_annotations %>%
  select(
    stimulus,
    num,
    pupil_area,
    pupil_area_rel,
    width,
    height,
    left,
    top
  ) %>%
  pivot_longer(-c(stimulus, num)) %>%
  mutate(name = paste0(name, "_", num)) %>%
  select(-num) %>%
  pivot_wider()
# reshape visibility categories
pupils_rel_radio_responses <- pupils_reliability_annotations %>%
  select(stimulus, num, noticeRadios) %>%
  pivot_longer(-c(stimulus, num)) %>%
  mutate(name = paste0(name, "_", num)) %>%
  select(-num) %>%
  pivot_wider()
# combine reshaped data
pupils_reliability_wide <-
  full_join(pupils_rel_ann_wide, pupils_rel_radio_responses, by = "stimulus") %>%
  mutate(comb_radio = if_else(
    noticeRadios_1 == noticeRadios_2,
    noticeRadios_1,
    "differentRadios"
  ))
# overall reliability
pupils_reliability_correlation_all <- pupils_reliability_wide %>%
  select(comb_radio, pupil_area_1, pupil_area_2) %>%
  na.omit() %>%
  summarise(r = cor(pupil_area_1, pupil_area_2))
# reliability by visibility type
pupils_reliability_correlations <- pupils_reliability_wide %>%
  select(comb_radio, pupil_area_1, pupil_area_2) %>%
  na.omit() %>%
  group_by(comb_radio) %>%
  summarise(r = cor(pupil_area_1, pupil_area_2))
```
We developed a new procedure to manually annotate pupil dilation and applied it to still images from 30 of the infants. We extracted still images of the infant's face from the videos and used the `dlib` face recognition library [@King2009] to automatically rotate the frame, levelling the eye horizontally; and to isolate one of the infant's eyes (we randomly selected either the left or right eye for each infant). Workers on Amazon Mechanical Turk then viewed each eye image (see Supplementary Fig. 1) and were asked (1) to adjust its brightness and contrast, so as to optimize visibility of the pupil; (2) to rate how visible the pupil was (from one of six options: \textit{Pupil is clearly visible; Pupil is visible, but it's difficult to see; Pupil is NOT visible, but I could see enough of it to make a guess about its outline; Pupil is NOT visible but the eye is still open; Pupil is NOT visible because the eye is closed; Other}); and (3) to draw a superimposed ellipse on the image, surrounding the visible area of the pupil.
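If the pupil area is computed from the axes of the drawn ellipse (an assumption on our part; the exact conversion is not detailed here), the calculation would be:

```r
# Hypothetical conversion: area of an annotated ellipse from its width
# and height in pixels (pi * semi-major axis * semi-minor axis)
ellipse_area <- function(width, height) {
  pi * (width / 2) * (height / 2)
}
ellipse_area(20, 10) # about 157 square pixels
```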
We set two qualification criteria for workers based on their performance in 10 eye images: (1) a correlation of at least $r = .8$ between their annotations (i.e., width and height of their ellipses) and the mean annotations from a pilot study ($N = 46$ workers); and (2) in at least 7 of the 10 images, a matching visibility rating with the option selected by at least 15% of pilot participants. Workers were not aware of whether or not an image being annotated was counted toward the qualification, but they were told that their performance was being evaluated in real time.
The pool of qualified workers then annotated 3 images per second of infant video, drawn from the singing trials only and presented in a random order. Each worker annotated approximately 263 images and spent 22.5 s per image, on average. Four images per trial were "validation images" that were presented more than once to the same worker, providing a measure of internal reliability of the annotations. Reliability was high, as measured in two ways. First, visibility ratings were internally consistent (that is, validation images were generally classified repeatedly in the same fashion by annotators; see confusion matrix in Supplementary Fig. 2). Second, the annotated pupil sizes were internally consistent: validation annotations correlated with the original annotations at $r = `r pupils_reliability_correlation_all[['r']] %>% round(2)`$ (using total pupil area). The degree of reliability varied as a function of how visible the pupil was; validation images marked \textit{Pupil is clearly visible} correlated at $r = `r pupils_reliability_correlations %>% filter(comb_radio == 'clearly') %>% pull(r) %>% round(2)`$, whereas those marked \textit{Pupil is NOT visible, but I could see enough of it to make a guess about its outline} correlated less strongly, at $r = `r pupils_reliability_correlations %>% filter(comb_radio == 'estimate') %>% pull(r) %>% round(2)`$.
To produce the data used in analyses, we computed a relative pupil size measure by dividing the pupil area by the full eye area, in pixels and within-participants, so as to adjust for increases in visible pupil size due to motion toward or away from the camera (which would erroneously increase or decrease the visible area of the pupil, respectively). Last, we removed all observations above the 99th percentile and below the 1st percentile; these appeared to be impossibly large or small values due to face recognition errors in the automated image extraction.
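The normalization and trimming steps above can be sketched as follows (with simulated data; the variable names are illustrative, not the study's actual column names):

```r
set.seed(1)
# simulated per-frame annotations, in pixels
pupil_area <- rnorm(1000, mean = 150, sd = 15)
eye_area <- rnorm(1000, mean = 1000, sd = 50)

# relative pupil size adjusts for motion toward or away from the camera
rel_pupil <- pupil_area / eye_area

# drop observations outside the 1st-99th percentile range
lims <- quantile(rel_pupil, c(0.01, 0.99))
rel_trimmed <- rel_pupil[rel_pupil >= lims[1] & rel_pupil <= lims[2]]
length(rel_trimmed) # 980 of 1000 observations retained
```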
### Gaze and blinking
We manually annotated infant gaze and blinks, frame-by-frame at 60 fps using Datavyu [@DatavyuTeam2014]. Annotators worked with the audio muted, so that they remained unaware of the songs each character sang.
For gaze, we randomly selected 20% of the videos, which a second person then annotated, independently of the first set of annotations. We assessed reliability by correlating trial-wise durations of gaze toward the two locations on the screen across pairs of annotators for each infant. Reliability was high (median *r* = .98, interquartile range: .90-.99).
For blinks, which are more difficult to annotate and, given their sparsity, more likely to produce internally unreliable annotations, we used a slightly different procedure. Two annotators independently annotated all the videos, and we assessed reliability by correlating the two annotators' trial-wise counts of blinks for each infant. The distribution of correlations was strongly left-skewed, with approximately ten low outliers (*r*s < .6). The annotators revisited these outlying videos and either corrected evident errors or, where they disagreed about the timing and frequency of the blinking, elected to drop the infant from analyses. The decision to drop these participants was made blind to the results of any analyses. Among the remaining participants (*n* = 140), reliability was high (median *r* = .94, interquartile range: .85-1).
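The per-infant reliability correlations described above can be sketched as follows (simulated annotations for three infants; column names are illustrative):

```r
library(dplyr)

set.seed(2)
# simulated trial-wise blink counts from two independent annotators
blinks <- data.frame(
  id = rep(1:3, each = 12),
  trial = rep(1:12, times = 3),
  count_a1 = rpois(36, lambda = 4)
)
blinks$count_a2 <- blinks$count_a1 + rbinom(36, 1, 0.2) # small disagreements

# one correlation per infant, then summarize across infants
reliability <- blinks %>%
  group_by(id) %>%
  summarise(r = cor(count_a1, count_a2))
median(reliability$r) # median reliability across infants
```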
## Parent measures
After the infant completed the experiment, parents used a tablet to view videos of singing trials for the 8 songs that their infant had not heard during the study, presented in pairs. For each pair, we asked parents to choose the song they would prefer to sing if their baby were fussy (assuming they already knew how to sing both songs). We analyzed all available data, regardless of whether the parent's infant had completed the experiment. Parents also completed a survey concerning their infant's home musical environment, for use in a separate study.
## Statistical power
Because the experimental method we designed is new, no identical benchmark exists on which we could base a power analysis. Instead, we used data from a similar listening experiment in adults [@Mehr2017a] to compute a plausible within-subjects effect size, based on the difference in mean heart rate during speech vs. song in people with Prader-Willi syndrome (*d* = 0.36). We chose a target sample size of *N* = 144 prior to running the experiment, providing power greater than .99 for the main planned comparison (i.e., mean heart rate during lullaby trials relative to non-lullaby trials). This sample size also facilitated even counterbalancing of stimuli across a wide range of infant ages, maximizing our ability to measure age effects while avoiding effects of stimulus ordering.
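This calculation can be approximated with base R's `power.t.test` (a sketch: we assume the paired comparison reduces to a one-sample *t*-test on within-subject difference scores, entering the standardized effect *d* = 0.36 as `delta` against `sd = 1`):

```r
# Approximate power for the planned within-subjects comparison at N = 144,
# treating the paired test as a one-sample t-test on difference scores
power.t.test(
  n = 144,           # infants contributing paired means
  delta = 0.36,      # standardized effect size (Cohen's d)
  sd = 1,            # SD of the standardized differences
  sig.level = 0.05,
  type = "one.sample",
  alternative = "two.sided"
)
```

Under these assumptions the computed power is approximately .99; the exact figure depends on how the paired design is parameterized.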
# Results
## Confirmatory analysis
```{r heartRate}
# get data
hr <- read.csv("./data/IPL_hr_clean.csv")
# mean-based analyses
hr_lul_means <- hr %>%
filter(lultrial == 1) %>%
group_by(id) %>%
summarise(mean_lul_hr = mean(zhr_pt, na.rm = TRUE))
hr_lul_descriptives <- t.test(hr_lul_means$mean_lul_hr) %>%
tidy() %>%
mutate(sd = sd(hr_lul_means$mean_lul_hr, na.rm = TRUE)) %>%
mutate(cohen.d = cohensD(hr_lul_means$mean_lul_hr, mu = 0))
hr_nlul_means <- hr %>%
filter(lultrial == 0) %>%
group_by(id) %>%
summarise(mean_nlul_hr = mean(zhr_pt, na.rm = TRUE))
hr_nlul_descriptives <- t.test(hr_nlul_means$mean_nlul_hr) %>%
tidy() %>%
mutate(sd = sd(hr_nlul_means$mean_nlul_hr, na.rm = TRUE)) %>%
mutate(cohen.d = cohensD(hr_nlul_means$mean_nlul_hr, mu = 0))
hr_joined <- inner_join(hr_lul_means, hr_nlul_means)
t_mean_hr <-
t.test(hr_joined$mean_lul_hr, hr_joined$mean_nlul_hr, paired = TRUE)
# predict hr effect size from age
hr_age <-
inner_join(ages, hr_joined) %>% mutate(hr_diff = mean_lul_hr - mean_nlul_hr)
hr_age_effect <- lm(hr_diff ~ age, hr_age) %>%
glance()
# gender effects
hr_gender <- hr %>%
select(id, zhr_pt, lultrial, female_parent, female_singer) %>%
filter(!is.na(lultrial))
hr_gender_same <- hr_gender %>%
filter((female_parent == 1 &
female_singer == 1) |
(female_parent == 0 & female_singer == 0)) %>%
group_by(id, lultrial) %>%
summarise(mean_hr = mean(zhr_pt, na.rm = TRUE)) %>%
spread(lultrial, mean_hr) %>%
mutate(diff = `1` - `0`)
hr_gender_different <- hr_gender %>%
filter((female_parent == 0 &
female_singer == 1) |
(female_parent == 1 & female_singer == 0)) %>%
group_by(id, lultrial) %>%
summarise(mean_hr = mean(zhr_pt, na.rm = TRUE)) %>%
spread(lultrial, mean_hr) %>%
mutate(diff = `1` - `0`)
hr_gendersame_descriptives <- t.test(hr_gender_same$diff) %>%
tidy() %>%
mutate(sd = sd(hr_gender_same$diff, na.rm = TRUE))
hr_genderdifferent_descriptives <-
t.test(hr_gender_different$diff) %>%
tidy() %>%
mutate(sd = sd(hr_gender_different$diff, na.rm = TRUE))
t_hr_gender <-
t.test(hr_gender_different$diff, hr_gender_same$diff, paired = FALSE) %>%
tidy()
```
We preregistered the prediction that infants' heart rate would decrease more substantially as a result of listening to foreign lullabies than non-lullabies (the preregistration is available at https://osf.io/f69mn). To this end, we normalized heart rate values during singing trials relative to the previous trial (where the previous trial was either a singing trial or a silent preference trial), such that *z*-scores are interpretable as immediate changes in heart rate, indexing moment-to-moment relaxation (n.b., this normalization procedure was also preregistered): positive *z*-scores thus indicate an increase in heart rate from the previous trial, and negative scores a decrease.
In the main analyses, we analyzed trial-wise mean *z*-scores for each infant, split by song type. As in previous work [@Mehr2017a; @Kotler2019], we trimmed (a) all values on trials for which there were fewer than 5 heart rate observations during the normalization period (the previous trial), as this would produce uninterpretable standard deviation values with which to compute *z*-scores; and (b) extreme values, defined as $|z|$ > 5. These trimming rules dropped 2.19% and 0.31% of the heart rate observations, respectively, and 2 of the 144 participants. These decisions did not substantively affect any of the results.
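The normalization and trimming steps can be sketched as follows (a sketch with hypothetical input `hr_raw` and column names `hr`, `id`, and `trial`; not the original pipeline):

```r
library(dplyr)

# Summarize each trial, then look up the previous trial's stats per infant
prev_stats <- hr_raw %>%
  group_by(id, trial) %>%
  summarise(m = mean(hr, na.rm = TRUE),
            s = sd(hr, na.rm = TRUE),
            n_obs = sum(!is.na(hr)),
            .groups = "drop") %>%
  group_by(id) %>%
  arrange(trial, .by_group = TRUE) %>%
  mutate(prev_m = lag(m), prev_s = lag(s), prev_n = lag(n_obs)) %>%
  ungroup() %>%
  select(id, trial, prev_m, prev_s, prev_n)

hr_norm <- hr_raw %>%
  left_join(prev_stats, by = c("id", "trial")) %>%
  filter(prev_n >= 5) %>%                      # rule (a): enough baseline samples
  mutate(zhr_pt = (hr - prev_m) / prev_s) %>%  # z relative to the previous trial
  filter(abs(zhr_pt) <= 5)                     # rule (b): trim extreme values
```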
Mean normalized heart rate during lullabies (Fig. 2a) differed significantly from 0, indicating a decrease in heart rate relative to the previous trial (in *z*-scores, *M* = `r hr_lul_descriptives$estimate %>% round(2)`, *SD* = `r hr_lul_descriptives$sd %>% round(2)`, 95% CI [`r hr_lul_descriptives$conf.low %>% round(2)`, `r hr_lul_descriptives$conf.high %>% round(2)`]; *t*(`r hr_lul_descriptives$parameter`) = `r hr_lul_descriptives$statistic %>% round(2)`, *p* < .001, *d* = `r hr_lul_descriptives$cohen.d %>% round(2)`, one-sample *t*-test). In contrast, mean normalized heart rate during non-lullabies did not differ significantly from 0, indicating no change in heart rate relative to the previous trial (*M* = `r hr_nlul_descriptives$estimate %>% round(2) %>% format(nsmall = 2)`, *SD* = `r hr_nlul_descriptives$sd %>% round(2)`, 95% CI [`r hr_nlul_descriptives$conf.low %>% round(2)`, `r hr_nlul_descriptives$conf.high %>% round(2)`]; *t*(`r hr_nlul_descriptives$parameter`) = `r hr_nlul_descriptives$statistic %>% round(2)`, *p* = `r hr_nlul_descriptives$p.value %>% round(2)`). The within-subjects difference between mean heart rates (i.e., the main preregistered analysis) showed a clear difference between song types, such that lullabies decreased heart rates significantly more than non-lullabies (Fig. 2a; *t*(`r t_mean_hr$parameter`) = `r t_mean_hr$statistic %>% round(2)`, *p* = `r t_mean_hr$p.value %>% signif(1)`, paired *t*-test). These findings confirm the preregistered prediction of reduced heart rate in response to unfamiliar foreign lullabies.
```{r fig2, fig.width = 7, fig.height = 4, fig.cap = "\\textbf{Fig. 2 | Lullabies reduce infant heart rate.} \\textbf{a}, The points depict mean trial-wise heart rates, normalized to the previous 14 s trial (regardless of its type), for each infant, with the gray lines indicating the pairs of points that represent the same infants; the violin plots (coloured areas) are kernel density estimations; the horizontal black lines indicate the means across all participants; and the shaded white boxes indicate the 95\\% confidence intervals of the means. The points are jittered to improve clarity. Heart rates were reduced during lullabies (the mean $z$-score was negative and significantly different than 0, denoted by the horizontal dotted line), relative to the previous trial, but no such effect was found for non-lullabies. Within-infants, heart rate during lullabies was significantly lower than during non-lullabies. \\textbf{b}, An analysis of heart rate over time, averaged across all trials, shows that while heart rate drops initially in all singing trials, the drop is more pronounced in lullabies, driving the overall effect. The lines and confidence bands are from a generalized additive model that does not account for nesting. \\textsuperscript{\\ast\\ast\\ast}$p<.001$; \\textsuperscript{\\ast\\ast}$p<.01$"}
# fig 2a: mean heart rate violinplots
hr_reshape <-
hr_joined %>% pivot_longer(
cols = c(mean_lul_hr, mean_nlul_hr),
values_to = "zhr",
names_to = "songtype"
)
ylab <- expression(paste("Mean heart rate (", italic("z"), ")"))
title2a <- expression(bold("a"))
fig2a <- ggplot(
data = hr_reshape,
aes(
y = zhr,
x = songtype
)
) +
geom_hline(
yintercept = 0,
linetype = "dashed",
alpha = .8,
size = .5
) +
geom_violin(aes(fill = songtype),
trim = FALSE,
alpha = .8
) +
scale_fill_manual(values = c("blue", "red")) +
geom_line(aes(group = id),
position = position_jitter(
width = .025,
seed = 6012
),
alpha = .1
) +
geom_point(
aes(y = zhr),
position = position_jitter(
width = .025,
seed = 6012
),
size = 1.1,
pch = 21,
fill = "white"
) +
stat_summary(
geom = "crossbar",
fun.data = mean_cl_normal,
fun.args = list(conf.int = 0.95),
fill = "white",
width = 0.8,
alpha = 0.8,
size = 0.4
) +
stat_summary(
fun = "mean",
width = 0.9,
size = 0.4,
geom = "crossbar"
) +
geom_segment(aes(
y = 1.75,
yend = 1.75,
x = 1.05,
xend = 1.95
),
size = 0.1
) +
annotate(
geom = "text",
x = 1.5,
y = 1.8,
label = "**",
size = 4
) +
annotate(
geom = "text",
x = 0.7,
y = -.9,
label = "***",
size = 4
) +
scale_x_discrete(labels = c("Lullaby", "Non-lullaby")) +
theme_bw() +
theme(
axis.text = element_text(colour = "black", size = 10),
axis.title.x = element_text(size = 10, color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = "none"
) +
ylab(ylab) +
xlab("") +
ggtitle(title2a)
# fig 2b: timewise heart rates within trials
hrsongs <- hr %>% drop_na(lultrial)
ylab <- expression(paste("Heart rate (", italic("z"), ")"))
titleb <- expression(bold("b"))
fig2b <- ggplot(data = hrsongs) +
geom_smooth(aes(
x = time_trial,
y = zhr_pt,
color = factor(lultrial)
),
method = "gam"
) +
scale_color_manual(values = c("red", "blue")) +
scale_x_continuous(breaks = seq(0, 14, by = 1)) +
annotate(
geom = "text",
x = 11,
y = -0.25,
label = 'bold("Lullaby")',
color = "blue",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
annotate(
geom = "text",
x = 10,
y = 0.1,
label = 'bold("Non-lullaby")',
color = "red",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
theme_bw() +
theme(
axis.text = element_text(colour = "black", size = 10),
axis.title.x = element_text(size = 10, color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = "none"
) +
ylab(ylab) +
xlab("Time during trial (s)") +
ggtitle(titleb)
# plot
fig2a + fig2b + plot_layout(widths = c(1, 2))
```
We conducted three planned follow-up analyses. First, to determine what drove the mean difference in heart rate across lullabies and non-lullabies, we visualized the trajectory of heart rate within singing trials in a time-series analysis (Fig. 2b). While heart rates dropped almost immediately following the onset of singing, regardless of song type, this drop was more pronounced during lullabies. Because time-wise heart rate trends were nonlinear, and in the absence of any a priori predictions about those trends, we elected not to model them directly.
Second, we tested whether the heart rate effects were driven by any particular age range of infants. They were not: a regression of the within-subjects difference between mean heart rate during lullabies vs. non-lullabies on infant age found no significant effect (Supplementary Fig. 3; *F*(1, `r hr_age_effect$df.residual`) = `r hr_age_effect$statistic %>% round(2)`, *p* = `r hr_age_effect$p.value %>% round(2)`, *R*^2^ = `r hr_age_effect$r.squared %>% round(2)`, omnibus test).
Third, we tested whether a match between the gender of the infant's primary caregiver (as specified by the parent who attended the experiment with the infant) and the perceived gender of the singers predicted any difference in the within-subjects main effect: for instance, infants with male primary caregivers might relax more than infants with female primary caregivers when hearing male-sounding lullabies, because male singers may sound more familiar to them. We found no evidence for such an effect: the within-subjects main effect was of comparable size across infants (main effect when gender of singer was matched to primary caregiver: *M* = `r hr_gendersame_descriptives$estimate %>% round(2)`, *SD* = `r hr_gendersame_descriptives$sd %>% round(2)`, 95% CI [`r hr_gendersame_descriptives$conf.low %>% round(2)`, `r hr_gendersame_descriptives$conf.high %>% round(2)`]; main effect when gender of singer was not matched to primary caregiver: *M* = `r hr_genderdifferent_descriptives$estimate %>% round(2)`, *SD* = `r hr_genderdifferent_descriptives$sd %>% round(2)`, 95% CI [`r hr_genderdifferent_descriptives$conf.low %>% round(2)`, `r hr_genderdifferent_descriptives$conf.high %>% round(2)`]; *t*(`r t_hr_gender$parameter %>% round(2)`) = `r t_hr_gender$statistic %>% round(2)`, *p* = `r t_hr_gender$p.value %>% round(2)`, independent samples *t*-test).
## Exploratory analyses
We conducted a series of exploratory analyses to test for convergent evidence supporting the preregistered result reported above, and to examine an alternate interpretation of the heart rate findings suggested by an anonymous reviewer: that rather than relaxing infants, the lullabies simply captured their attention more so than the other songs. Indeed, in some contexts, heart rate decreases can indicate increased attention to a stimulus [@Richards2000], and music is known to attract infants' attention [@Corbeil2016]. Additional measures can arbitrate between these interpretations.
First, we analyzed infants' pupil dilation, an indicator of both attention to a stimulus [@Laeng2012] and emotional arousal in response to it [@Bradley2008], including during music listening [@Laeng2016; @Widmann2018]. If the lullabies relaxed infants, then pupil size should decrease during lullabies, relative to non-lullabies — contrasting sharply with an attention account for the heart rate findings, which would predict increases in pupil size.
Second, we analyzed infants' electrodermal activity, an indicator of arousal used in prior studies of relaxation responses to music [@Cirelli2020; @Cirelli2019]. If the lullabies relaxed infants, then electrodermal activity should decrease during lullabies, relative to non-lullabies. Increased attention, however, does not imply a directional effect on electrodermal activity.
Third, we analyzed infants' gaze and rate of blinking, as measures of interest in the songs. These measures do not bear on the relaxation hypothesis, but rather, they test the degree to which infants' attention to the animated characters varied as a function of whether they were singing lullabies or non-lullabies.
Last, in two additional analyses (unrelated to the relaxation and attention accounts described above), we explored the degree to which the perceived infant-directedness of the songs was predictive of infants' heart rates; and the degree to which *parents* made inferences about the different song types.
### Relaxation response as indexed by pupillometry
```{r pupillometry}
# remove reliability annotations
pupil_annotations <- all_pupil_annotations %>%
filter(nth_annotation == 1)
# trim <1%ile & >99%ile
percentiles <-
quantile(
pupil_annotations %>% pull(pupil_area_rel),
probs = c(.01, .99),
na.rm = T
)
pupil_annotations <- pupil_annotations %>%
filter(pupil_area_rel > percentiles[1] &
pupil_area_rel < percentiles[2])
# bin by second
pupil_annotations_binned <- pupil_annotations %>%
group_by(participant, trial) %>%
mutate( # compute frame relative to beginning of trial
rel_frame = frame - min(frame),
# compute seconds for binning from relative frames
rel_second = floor(rel_frame / 60)
) %>%
group_by(participant, trial, rel_second) %>%
summarise(
frame = min(frame),
# count frames and NAs in this second
n_frames_in_sec = n(),
n_NAs_in_sec = sum(is.na(width)),
# compute mean of values / annotations
pupil_area_rel = mean(pupil_area_rel, na.rm = T),
# compute first values for categorical vars that are the same anyways
lultrial = first(lultrial),
eye = first(eye)
) %>%
ungroup()
pupil_annotations_binned <- pupil_annotations_binned %>%
group_by(participant) %>%
mutate( # z-score relative pupil area
z_area_rel = scale(pupil_area_rel)
) %>%
ungroup() %>%
mutate( # recompute frame_rel, as this might be slightly offset if the first frame in the second is NA
frame_rel = rel_second * 60
)
# model
pupil_model <- lmer(z_area_rel ~ (1 | trial) + rel_second + lultrial, data = pupil_annotations_binned)
pupil_model_coef <- summary(pupil_model)[["coefficients"]]
pupil_omnibus_test <-
linearHypothesis(pupil_model, c("rel_second = 0", "lultrialnon-lullaby = 0"))
```
We obtained pupil size annotations only for the singing trials, so they could not always be normalized to the previous trial (as in the heart rate analyses). Instead, we normalized across all available data from each infant, after binning observations by second to reduce noise. We analyzed changes in pupil dilation over the course of a singing trial, collapsing across all trials, and tested for differences between lullabies and non-lullabies.
Consistent with a relaxation account, and in contrast to an attention account, pupils were smaller during lullabies than during non-lullabies (Fig. 3). We fit a random-effects linear model to the *z*-scored observations, predicted from the time course of each trial, with a random effect of trial (*N* = `r formatC(pupil_annotations_binned %>% nrow(), format = "d", big.mark = ",")` binned relative pupil size observations from 30 infants, mean `r pupil_annotations_binned %>% group_by(participant) %>% count() %>% pull(n) %>% mean() %>% round(1)` observations per infant; likelihood ratio $\chi^{2} = `r pupil_omnibus_test['Chisq'][2,] %>% round(3)`$, $p `r pupil_omnibus_test['Pr(>Chisq)'][2,] %>% format_p()`$). The model showed that pupil size was smaller during lullabies than non-lullabies, on average ($t(`r pupil_model_coef['lultrialnon-lullaby', 'df'] %>% round()`) = `r pupil_model_coef['lultrialnon-lullaby', 't value'] %>% round(3)`$, $p `r pupil_model_coef['lultrialnon-lullaby', 'Pr(>|t|)'] %>% format_p()`$, $\beta = `r pupil_model_coef['lultrialnon-lullaby', 'Estimate'] %>% round(3)`$). We found no time-by-trial-type interaction; this is likely because pupil size appeared to regress to the mean by the end of each trial (see Fig. 3).
```{r fig3, fig.height = 3.4, fig.width = 4.2, fig.cap = "\\textbf{Fig. 3 | Pupil dilation is reduced during lullabies.} Collapsing across all singing trials, pupil size was lower during lullabies than non-lullabies, in the subset of the participants studied ($N = 30$). The blue and red lines and confidence bands are from a LOESS regression that does not account for nesting."}
# fig 3: time-wise pupillometry
ggplot(
data = pupil_annotations_binned,
aes(
x = rel_second,
y = z_area_rel,
color = factor(lultrial)
)
) +
geom_smooth(
method = "loess",
span = 1.5
) +
scale_color_manual(values = c("non-lullaby" = "red", "lullaby" = "blue")) +
scale_x_continuous(breaks = seq(0, 14, by = 1)) +
scale_y_continuous(
breaks = seq(-0.2, 0.2, by = 0.05),
expand = expansion(mult = 0.05)
) +
annotate(
geom = "text",
x = 10.5,
y = -0.075,
label = 'bold("Lullaby")',
color = "blue",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
annotate(
geom = "text",
x = 10,
y = 0.11,
label = 'bold("Non-lullaby")',
color = "red",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
theme_bw() +
theme(
axis.text = element_text(colour = "black", size = 10),
axis.title.x = element_text(size = 10, color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = "none"
) +
ylab("Pupil size relative to eye size (z)") +
xlab("Time during trial (s)")
```
### Relaxation response as indexed by electrodermal activity
```{r eda}
# load in data
eda <- read.csv("./data/IPL_eda_clean.csv")
# cleaning
eda_clean <- eda %>% filter(abs(zeda) < 5)
eda_time <- eda_clean %>%
group_by(id, time_allbin) %>%
summarise(zeda = mean(zeda))
# model eda during whole experiment
eda_model <- lmer(zeda ~ time_allbin + (1 | id), eda_time)
eda_model_coef <- summary(eda_model)[["coefficients"]]
# get starting edas for each id/trial pair
eda_deda <- eda_clean %>%
select(id, trial, lultrial, time_allbin, time_trial, zeda) %>%
filter(!is.na(lultrial)) %>%
group_by(id, trial) %>%
mutate(startbin = min(time_allbin))
# merge in starting edas and compute zeda change scores (deda)
starteda <- eda_deda %>%
filter(time_allbin == startbin) %>%
group_by(id, trial) %>%
summarise(starteda = as.numeric(mean(zeda, na.rm = TRUE)))
eda_deda <- left_join(eda_deda, starteda, by = c("id", "trial")) %>%
mutate(deda = zeda - starteda)
# change in centered zeda within trial, with interaction
eda_model_song <- lmer(deda ~ time_trial * lultrial + (1 | trial) + (1 | id), eda_deda)
eda_model_song_coef <- summary(eda_model_song)[["coefficients"]]
eda_model_song_CI <- confint(eda_model_song) # This takes quite a while
eda_song_omnibus_test <- linearHypothesis(eda_model_song, c("time_trial = 0", "lultrial = 0", "time_trial:lultrial = 0"))
# end-of-trial predictions
eda_song_14s_diff_test <- linearHypothesis(eda_model_song, c("lultrial + 14 * time_trial:lultrial= 0"))
eda_song_14s_diff_beta <- eda_model_song_coef["lultrial", "Estimate"] + 14 * eda_model_song_coef["time_trial:lultrial", "Estimate"]
eda_song_14s_diff_beta_CI.lower <- eda_model_song_CI["lultrial", "2.5 %"] + 14 * eda_model_song_CI["time_trial:lultrial", "2.5 %"]
eda_song_14s_diff_beta_CI.upper <- eda_model_song_CI["lultrial", "97.5 %"] + 14 * eda_model_song_CI["time_trial:lultrial", "97.5 %"]
```
We used the same normalization approach as in the pupillometry analysis, because normalizing to the previous trial, as in the heart rate analyses, produced a distribution with unacceptably long tails (*z*s > 100). This is likely because the short trial length (14 s) affords only minimal variability in electrodermal activity, which generally changes much more slowly than heart rate does, inflating *z*-scored values. Normalizing to the full experiment period produced an acceptably narrow range of *z*-scores, such that applying the same trimming criterion as for heart rate (|*z*| > 5) removed only 4 of nearly 100,000 observations.
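A toy example illustrates why previous-trial normalization misbehaves for a slow-moving signal (illustrative values only, not actual data):

```r
# A nearly flat 14 s baseline has a tiny SD, so even a modest absolute
# change in the next trial maps onto an enormous z-score
baseline <- c(4.001, 4.002, 4.001, 4.002, 4.001)  # e.g., skin conductance
(4.2 - mean(baseline)) / sd(baseline)             # z on the order of hundreds
```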
First, we noted an overall positive trend in electrodermal activity throughout the study, irrespective of the songs the infant was listening to. We fit a random-effects linear model to all *z*-scored observations (*N* = `r formatC(nobs(eda_model), format = "d", big.mark = ",")` from `r summary(eda_model)[['ngrps']]` infants, mean 180 observations per infant), which showed that electrodermal activity steadily increased throughout the experiment, on average ($t(`r eda_model_coef['time_allbin', 'df']%>% round()`) = `r eda_model_coef['time_allbin', 't value']%>% round(1)`$, $p `r eda_model_coef['time_allbin', 'Pr(>|t|)'] %>% format_p()`$, $\beta = `r eda_model_coef['time_allbin', 'Estimate']%>% round(3)`$).
Note that this result contrasts sharply with infants' responses during a distress induction procedure, as in previous research on the calming effects of singing [@Cirelli2020]. In that type of study, arousal and fussiness increase during a negative interaction (e.g., a still-face procedure) and subsequently decrease during a positive "recovery phase". The overall upward trend is unsurprising here, however, given the structure of this experiment: infants often become bored and fussy during repetitive experiments, increasing arousal.
As such, we measured the rate of increase in electrodermal activity, and analyzed changes in electrodermal activity as a function of lullaby or non-lullaby listening *relative to this increase*. This required centering the *z*-scores infant- and trial-wise. The key question is thus whether listening to a lullaby yields lower electrodermal activity than the predicted overall trial-wise increase, all else equal.
The results supported the relaxation account (Fig. 4). We fit a random-effects linear model of electrodermal activity change scores over time, trial-wise, so as to test for a time by song type interaction. The model fit was acceptable (likelihood ratio $\chi^2 = `r eda_song_omnibus_test['Chisq'][2,] %>% round(1)`$, $p `r eda_song_omnibus_test['Pr(>Chisq)'][2,] %>% format_p()`$), the interaction term was significant ($t(`r eda_model_song_coef['time_trial:lultrial', 'df']%>% round()`) = `r eda_model_song_coef['time_trial:lultrial', 't value']%>% round(1)`$, $p `r eda_model_song_coef['time_trial:lultrial', 'Pr(>|t|)'] %>% format_p()`$, $\beta = `r eda_model_song_coef['time_trial:lultrial', 'Estimate']%>% round(3)`$), and a general linear hypothesis test showed an expected difference in electrodermal activity between lullabies and non-lullabies at the end of the trial (time = 14 s; $\beta = `r eda_song_14s_diff_beta %>% round(3)`$, 95% CI [`r eda_song_14s_diff_beta_CI.lower %>% round(3)`, `r eda_song_14s_diff_beta_CI.upper %>% round(3)`], $\chi^2 = `r eda_song_14s_diff_test[[2, 'Chisq']] %>% round(1)`$, $p `r eda_song_14s_diff_test[[2, 'Pr(>Chisq)']] %>% format_p()`$, $d = `r (eda_song_14s_diff_beta / sd(eda_deda[['deda']])) %>% round(2)`$). These results indicate that lullabies attenuated increases in electrodermal activity.
```{r fig4, fig.height = 3.4, fig.width = 4.2, fig.cap = "\\textbf{Fig. 4 | Lullabies attenuate increases in arousal.} The black dotted line denotes the expected rise in electrodermal activity during a trial, from a linear model. This rise is attenuated during lullaby trials but not during non-lullaby trials, such that the expected level of electrodermal activity by the end of a lullaby trial is reduced. The blue and red lines and confidence bands are from a generalized additive model that does not account for nesting."}
# fig 4: time-wise electrodermal activity
ylab <- expression(paste("Electrodermal activity (centered ", italic("z"), ")"))
ggplot(data = eda_deda) +
geom_smooth(
aes(
x = time_trial,
y = deda
),
color = "black",
size = 0.5,
linetype = "dashed",
method = "lm",
se = FALSE
) +
geom_smooth(aes(
x = time_trial,
y = deda,
color = factor(lultrial)
),
method = "gam"
) +
scale_color_manual(values = c("red", "blue")) +
scale_x_continuous(breaks = seq(0, 14, by = 1)) +
scale_y_continuous(breaks = seq(-0.02, 0.1, by = 0.02)) +
annotate(
geom = "text",
x = 11,
y = -0.01,
label = 'bold("Lullaby")',
color = "blue",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
annotate(
geom = "text",
x = 10,
y = 0.07,
label = 'bold("Non-lullaby")',
color = "red",
parse = TRUE,
vjust = "inward",
hjust = "inward",
size = 3.5
) +
theme_bw() +
theme(
axis.text = element_text(colour = "black", size = 10),
axis.title.x = element_text(size = 10, color = "black"),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
legend.position = "none"
) +
ylab(ylab) +
xlab("Time during trial (s)")
```
### Visual attention to singers
```{r gaze}
# get data
gaze <- read.csv("./data/IPL_gaze_clean.csv")
# attention check: how long did babies look at the singer during the singing trials?
gaze_attention <- gaze %>%
filter(lultrial == 0 | lultrial == 1) %>% # during singing trials
filter(
lullaby_side == "left" & lultrial == 1 & lookdir == "l" |
lullaby_side == "left" &
lultrial == 0 & lookdir == "r" |
lullaby_side == "right" &
lultrial == 1 & lookdir == "r" |
lullaby_side == "right" &
lultrial == 0 & lookdir == "l"
) %>%
group_by(id) %>%
summarise(looktime = mean(look))
# gaze during singing trials (lullaby singers vs non-lullaby singers)
gaze_lul <- gaze %>%
filter(lultrial == 1) %>%
filter(lullaby_side == "left" &
lookdir == "l" | lullaby_side == "right" & lookdir == "r") %>%
group_by(id) %>%
summarise(lt_lul = mean(look, na.rm = TRUE) / 1000) # converting to seconds
gaze_nlul <- gaze %>%
filter(lultrial == 0) %>%
filter(lullaby_side == "left" &
lookdir == "r" | lullaby_side == "right" & lookdir == "l") %>%
group_by(id) %>%
summarise(lt_nlul = mean(look, na.rm = TRUE) / 1000)
gaze_singers <- inner_join(gaze_lul, gaze_nlul)
# describe & test
gaze_singing_lul <- t.test(gaze_singers$lt_lul) %>%
tidy() %>%
mutate(sd = sd(gaze_singers$lt_lul))
gaze_singing_nlul <- t.test(gaze_singers$lt_nlul) %>%
tidy() %>%
mutate(sd = sd(gaze_singers$lt_nlul))
t_gaze_singing <-
t.test(gaze_singers$lt_lul, gaze_singers$lt_nlul, paired = TRUE) %>%
tidy()
# gaze during preference trials (lullaby singers vs non-lullaby singers)
gaze_lul_pref <- gaze %>%
filter(is.na(lultrial)) %>%
filter(trial > 1) %>% # first trial is silent, but not a preference trial
filter(lullaby_side == "left" &
lookdir == "l" | lullaby_side == "right" & lookdir == "r") %>%
group_by(id) %>%
summarise(lt_lul = mean(look, na.rm = TRUE) / 1000)
gaze_nlul_pref <- gaze %>%
filter(is.na(lultrial)) %>%
filter(trial > 1) %>%
filter(lullaby_side == "left" &
lookdir == "r" | lullaby_side == "right" & lookdir == "l") %>%
group_by(id) %>%
summarise(lt_nlul = mean(look, na.rm = TRUE) / 1000)
gaze_pref <- inner_join(gaze_lul_pref, gaze_nlul_pref)
# describe & test
gaze_pref_lul <- t.test(gaze_pref$lt_lul) %>%
tidy() %>%
mutate(sd = sd(gaze_pref$lt_lul))
gaze_pref_nlul <- t.test(gaze_pref$lt_nlul) %>%
tidy() %>%
mutate(sd = sd(gaze_pref$lt_nlul))
t_gaze_pref <-
t.test(gaze_pref$lt_lul, gaze_pref$lt_nlul, paired = TRUE) %>%
tidy()
# equivalence tests
eq_raw_difference_value <- 1
eq_test_alpha <- .05
gaze_singing_eq_test <- TOSTpaired.raw(
n = nrow(gaze_singers),
m1 = gaze_singing_lul$estimate,
m2 = gaze_singing_nlul$estimate,
sd1 = gaze_singing_lul$sd,
sd2 = gaze_singing_nlul$sd,
r12 = cor(gaze_singers$lt_lul, gaze_singers$lt_nlul),
low_eqbound = -eq_raw_difference_value,
high_eqbound = eq_raw_difference_value,
alpha = eq_test_alpha,
verbose = F,
plot = F
)
gaze_pref_eq_test <- TOSTpaired.raw(
n = nrow(gaze_pref),
m1 = gaze_pref_lul$estimate,
m2 = gaze_pref_nlul$estimate,
sd1 = gaze_pref_lul$sd,
sd2 = gaze_pref_nlul$sd,
r12 = cor(gaze_pref$lt_lul, gaze_pref$lt_nlul),
low_eqbound = -eq_raw_difference_value,
high_eqbound = eq_raw_difference_value,
alpha = eq_test_alpha,
verbose = F,
plot = F
)
```
Last, we ran two sets of exploratory analyses concerning infants' visual attention to the animated characters. In previous research, infants demonstrated social preferences for a person who had previously sung a song familiar to the infant [@Mehr2016; @Mehr2017b]; as such, we explored whether such a preference could be elicited purely on the basis of a difference in the types of songs a singer produced.
We found no evidence for such an effect. Infants looked for comparable durations to the two characters during singing trials (Supplementary Fig. 4; in seconds, lullabies: *M* = `r gaze_singing_lul$estimate %>% round(2)`, *SD* = `r gaze_singing_lul$sd %>% round(2)`, 95% CI [`r gaze_singing_lul$conf.low %>% round(2)`, `r gaze_singing_lul$conf.high %>% round(2)`]; non-lullabies: *M* = `r gaze_singing_nlul$estimate %>% round(2)`, *SD* = `r gaze_singing_nlul$sd %>% round(2)`, 95% CI [`r gaze_singing_nlul$conf.low %>% round(2)`, `r gaze_singing_nlul$conf.high %>% round(2)`]; *t*(`r t_gaze_singing$parameter`) = `r t_gaze_singing$statistic %>% round(2)`, *p* = `r t_gaze_singing$p.value %>% round(2)`). The two one-sided tests procedure for equivalence testing [@Lakens2018] confirmed that these rates of attention were statistically equivalent ($\Delta = `r eq_raw_difference_value`$ s; $\Delta_{L}: t(`r gaze_singing_eq_test[['TOST_df']]`) = `r gaze_singing_eq_test[['TOST_t1']] %>% round(3)`, p `r gaze_singing_eq_test[['TOST_p1']] %>% format_p()`; \Delta_{U}: t(`r gaze_singing_eq_test[['TOST_df']]`) = `r gaze_singing_eq_test[['TOST_t2']] %>% round(3)`, p `r gaze_singing_eq_test[['TOST_p2']] %>% format_p()`$).
The same pattern was observed during the preference trials: attention to the two characters in silence, after they had each sung a lullaby or non-lullaby, did not differ (Supplementary Fig. 4; attention in seconds to the lullaby singer: *M* = `r gaze_pref_lul$estimate %>% round(2)`, *SD* = `r gaze_pref_lul$sd %>% round(2)`, 95% CI [`r gaze_pref_lul$conf.low %>% round(2)`, `r gaze_pref_lul$conf.high %>% round(2)`]; to the non-lullaby singer: *M* = `r gaze_pref_nlul$estimate %>% round(2)`, *SD* = `r gaze_pref_nlul$sd %>% round(2)`, 95% CI [`r gaze_pref_nlul$conf.low %>% round(2)`, `r gaze_pref_nlul$conf.high %>% round(2)`]; *t*(`r t_gaze_pref$parameter`) = `r t_gaze_pref$statistic %>% round(2)`, *p* = `r t_gaze_pref$p.value %>% round(2)`). These rates were statistically equivalent ($\Delta = `r eq_raw_difference_value`$ s; $\Delta_{L}: t(`r gaze_pref_eq_test[['TOST_df']]`) = `r gaze_pref_eq_test[['TOST_t1']] %>% round(3)`, p `r gaze_pref_eq_test[['TOST_p1']] %>% format_p()`; \Delta_{U}: t(`r gaze_pref_eq_test[['TOST_df']]`) = `r gaze_pref_eq_test[['TOST_t2']] %>% round(3)`, p `r gaze_pref_eq_test[['TOST_p2']] %>% format_p()`$). Note that these analyses include a few more infants than the heart rate analyses do; this is because some infants completed the study but were subsequently excluded from the heart rate analyses due to a poor physiology monitor signal, while still providing usable gaze data.
```{r blinks}
# load blink data; take each infant's median blink count per trial,
# separately for lullaby and non-lullaby trials
blinks <- read.csv("./data/IPL_blink_clean.csv") %>%
  filter(!is.na(lultrial)) %>%
  group_by(id, lultrial) %>%
  summarise(n_blink = median(blink))
# summarize during lullabies vs non-lullabies
blink_lul <- blinks %>%
filter(lultrial == 1) %>%
ungroup() %>%
summarise(
Q1 = quantile(n_blink, 0.25),
median = median(n_blink),
Q3 = quantile(n_blink, 0.75)
)
blink_nlul <- blinks %>%
filter(lultrial == 0) %>%
ungroup() %>%
summarise(
Q1 = quantile(n_blink, 0.25),
median = median(n_blink),
Q3 = quantile(n_blink, 0.75)
)
# test
blink_test <-
wilcox.test(filter(blinks, lultrial == 0)$n_blink,
filter(blinks, lultrial == 1)$n_blink,
paired = TRUE
)
# approximate the z statistic from the two-sided p-value
blink_z <- qnorm(blink_test$p.value / 2)
```
As an additional exploratory measure, we counted eye blinks during the singing trials, as blink inhibition may index perceived stimulus salience [@Shultz2011a]. Infants blinked slightly less during lullabies (blinks per trial: median = `r blink_lul$median`, interquartile range: `r blink_lul$Q1`-`r blink_lul$Q3`) than during non-lullabies (median = `r blink_nlul$median`, interquartile range: `r blink_nlul$Q1`-`r blink_nlul$Q3`), suggesting that they were more interested in the singers during lullabies than during non-lullabies (*z* = `r blink_z %>% round(2)`, *p* = `r blink_test$p.value %>% round(2)`, Wilcoxon signed-rank test). Blinking was rare, however, so this exploratory result should be interpreted with caution: it may be an artifact of restricted range.
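The z statistic reported for the blink comparison is recovered from the two-sided Wilcoxon p-value by the inverse-normal transform. The same conversion is sketched below on simulated count data (the counts are toy values, not the blink data; the sign of z is not recovered by this transform).

```{r wilcoxon_z_sketch}
# recover an approximate (unsigned) z from a two-sided Wilcoxon p-value;
# the counts below are simulated for illustration, not the blink data
set.seed(2)
blinks_a <- rpois(30, lambda = 2)  # toy per-trial blink counts, condition A
blinks_b <- rpois(30, lambda = 4)  # toy per-trial blink counts, condition B
w <- wilcox.test(blinks_a, blinks_b, paired = TRUE, exact = FALSE)
z <- qnorm(w$p.value / 2)  # inverse-normal of half the two-sided p-value
```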
### Relation between songs' infant-directedness and relaxation effects
```{r infantDirectedness}
# add song identifiers to hr data
ids <- hr %>%
  filter(!is.na(zhr_pt), !is.na(lultrial))
# song identifiers from the Natural History of Song corpus, keyed by
# singer sex, counterbalancing order, and trial number
song_key <- tribble(
  ~female_singer, ~lullaby_order, ~trial, ~song,
  # lullaby first, female singers
  1, "first", 2, 21,
  1, "first", 6, 93,
  1, "first", 8, 9,
  1, "first", 12, 99,
  1, "first", 3, 26,
  1, "first", 5, 18,
  1, "first", 9, 97,
  1, "first", 11, 78,
  # lullaby second, female singers
  1, "second", 3, 21,
  1, "second", 5, 93,
  1, "second", 9, 9,
  1, "second", 11, 99,
  1, "second", 2, 26,
  1, "second", 6, 18,
  1, "second", 8, 97,
  1, "second", 12, 78,
  # lullaby first, male singers
  0, "first", 2, 101,
  0, "first", 6, 111,
  0, "first", 8, 95,
  0, "first", 12, 43,
  0, "first", 3, 104,
  0, "first", 5, 81,
  0, "first", 9, 94,
  0, "first", 11, 23,
  # lullaby second, male singers
  0, "second", 3, 101,
  0, "second", 5, 111,
  0, "second", 9, 95,
  0, "second", 11, 43,
  0, "second", 2, 104,
  0, "second", 6, 81,
  0, "second", 8, 94,
  0, "second", 12, 23
)
ids <- left_join(ids, song_key, by = c("female_singer", "lullaby_order", "trial"))
# get naive listener ratings regarding perceived song functions
# from Mehr & Singh et al. (2018, Curr Bio) Exp. 1 (data at https://osf.io/d7cn9)
naiv <- read.csv("./data/NAIV_Exp1.csv") %>%
select(starts_with("baby"))
# collapse to per-song means of perceived infant-directedness
naiv_means <- colMeans(naiv, na.rm = TRUE) %>%
as.data.frame() %>%
mutate(song = 1:118) %>%
rename("idsness" = ".")
# merge in hr data
ids_dat <- left_join(ids, naiv_means, by = "song")
# model
ids_model <- lmer(zhr_pt ~ idsness + (1 | id) + (1 | trial), data = ids_dat)
ids_model_coefficients <- summary(ids_model)[["coefficients"]]
```
The lullabies we studied differ acoustically from the non-lullabies in several ways: they tend to be less accented and slower in tempo, with smaller pitch ranges and more variable macrometers than the other songs [@Mehr2019]. These features are reflected in naïve listeners' ratings: the lullabies are perceived as having lower melodic and rhythmic complexity, slower tempo, a less steady beat, lower arousal, lower valence, and lower pleasantness [@Mehr2018]. Together, these features predict the degree to which listeners perceive a song as infant-directed [@Moser2020; @Mehr2018].