analysis/exp3_analysis_main.Rmd

---
title: 'Experiment 3: Main Analyses'
date: "`r Sys.Date()`"
output:
  github_document:
    toc: yes
    toc_depth: 3
  pdf_document:
    toc: yes
    toc_depth: 3
editor_options:
  markdown:
    wrap: 72
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
options(dplyr.summarise.inform = FALSE)
library(magrittr)
library(lme4)
library(lmerTest)
library(broom.mixed)
library(insight)
library(kableExtra)
```

# Setup

Variable names:

-   Experiment: exp3\_
-   Data (\_d\_)
    -   d = main df
    -   count = sums of response types
    -   noOther = just *he* and *she* responses
-   Models (\_m\_)
    -   all = effect of Condition and Name Gender Rating, including
        *other* responses
    -   cond = effect of Condition only
    -   noOther = effect of Conditions (Last vs First+Full) and Name
        Gender Rating, only on *he* and *she* responses
    -   FF = dummy coded with First + Full Name conditions as 0, Last
        Name condition as 1
    -   L = dummy coded with Last Name condition as 0, First + Full Name
        conditions as 1

Load data and select columns used in model. See data/exp3_data_about.txt
for more details.

```{r load-data}
exp3_d <- read.csv("../data/exp3_data.csv",
                   stringsAsFactors = TRUE) %>%
  rename("Participant" = "SubjID", "Item" = "Name") %>%
  select(
    Participant, Condition, GenderRating,
    Item, He, She, Other
  )
str(exp3_d)
```

Center gender rating for names: Original scale from 1 to 7, with 1 as
most masculine and 7 as most feminine. Mean-centered with higher still
as more feminine.

```{r center-gender-rating}
exp3_d %<>% mutate(GenderRatingCentered = scale(GenderRating, scale = FALSE))
```

Set contrasts for name conditions, now weighted to account for uneven
sample sizes. This uses Scott Fraundorf's function for weighted
contrasts. (The psycholing package version doesn't support doing 2v1
comparisons, only 1v1.) Condition1 is Last vs First+Full. Condition2 is
First vs Full.

```{r contrast-coding}
source("centerfactor.R")
contrasts(exp3_d$Condition) <- centerfactor(
  exp3_d$Condition, c("last", "first")
)
contrasts(exp3_d$Condition)
```

# Data Summary

Responses by condition.

```{r count-responses}
exp3_d %<>% mutate(ResponseAll = case_when(
  He    == 1 ~ "He",
  She   == 1 ~ "She",
  Other == 1 ~ "Other"
))

exp3_d_count <- exp3_d %>%
  group_by(Condition, ResponseAll) %>%
  summarise(n = n()) %>%
  pivot_wider(
    names_from = ResponseAll,
    values_from = n
  ) %>%
  mutate(
    She_HeOther = She / (He + Other),
    She_He = She / He
  ) %>%
  select(She, He, Other, She_HeOther, She_He)

kable(exp3_d_count, digits = 3)
```

# Model 1: With *Other* Responses

Effects of Condition (first name, last name, full name) and Gender
Rating on the likelihood of a *she* response, as opposed to a *he* or
*other* response. Participant and Item are included as random
intercepts, with items defined as the unique first, last and first +
last name combinations. Because the condition manipulations were fully
between-subject and between-item, fitting a random slope model was not
possible.

Because Experiment 3 always introduces the character with a full name,
then manipulates the name form in the subsequent 3 references, the main
analysis is one model, as opposed to the 2 for Experiment 1.

Condition1 is the contrast between last and first+full. Condition2 is
the contrast between first and full.

```{r model-all}
exp3_m_all <- glmer(
  She ~ Condition * GenderRatingCentered + (1 | Participant) + (1 | Item),
  data = exp3_d, family = binomial
)
summary(exp3_m_all)
```

-   Fewer *she* responses overall

-   Last Name vs First+Full Names condition effect only trending

-   More *she* responses as first names become more feminine

-   Larger effect of first name gender in First+Full Name conditions
    than in Last Name conditions, which makes sense because there are 4
    repetitions of the gendered first name, as opposed to only 1.

## Odds Ratios: Intercept

```{r OR-intercept-all}
exp(get_intercept(exp3_m_all))
exp(-get_intercept(exp3_m_all))
```

0.22x less likely to use *she* overall (or: 4.59x more likely to use
*he* and *other* overall), p\<.001

## Odds Ratios: Last vs First+Full

```{r OR-L-FF-all}
exp3_m_all %>%
  tidy() %>%
  filter(term == "Condition1") %>%
  pull(estimate) %>%
  exp()
```

1.17x more likely to use *she* than *he* and *other* in First + Full
compared to Last, p=0.09

## Odds Ratios: Last Only

Dummy code with Last Name as 0, so that intercept is the Last Name
condition only.

```{r dummy-code-L-all}
exp3_d %<>% mutate(Condition_Last = case_when(
  Condition == "first" ~ 1,
  Condition == "full"  ~ 1,
  Condition == "last"  ~ 0
))
exp3_d$Condition_Last %<>% as.factor()
```

Model with just Condition (to more directly compare to Exp 1).

```{r model-L-all}
exp3_m_cond_L <- glmer(
  She ~ Condition_Last + (1 | Participant) + (1 | Item),
  data = exp3_d, family = binomial
)
summary(exp3_m_cond_L)
```

```{r OR-L-all}
exp(get_intercept(exp3_m_cond_L))
exp(-get_intercept(exp3_m_cond_L))
```

0.17x times less likely to use *she* than *he* and *other* in the Last
Name condition (or: 5.72x more likely to use *he* and *other* in the
Last Name condition), p\<.001

## Odds Ratios: First and Full Only

Dummy code with First and Full Name as 0, so the intercept is the
combination of those two.

```{r dummy-code-FF-all}
exp3_d %<>% mutate(Condition_FF = case_when(
  Condition == "first" ~ 0,
  Condition == "full"  ~ 0,
  Condition == "last"  ~ 1
))
exp3_d$Condition_FF %<>% as.factor()
```

Model with just Condition (to more directly compare to Exp 1).

```{r model-FF-all}
exp3_m_cond_FF <- glmer(
  She ~ Condition_FF + (1 | Participant) + (1 | Item),
  data = exp3_d, family = binomial
)
summary(exp3_m_cond_FF)
```

```{r OR-FF-all}
exp(get_intercept(exp3_m_cond_FF))
exp(-get_intercept(exp3_m_cond_FF))
```

0.22x times less likely to use *she* than *he* and *other* in the First
and Full Name conditions (or: 4.46x more likely to use *he* and *other*
in the First and Full Name conditions), p\<.001

# Model 2: Without *Other* Responses

The sentence completion prompt for Experiment 3 is more open-ended than
in Experiment 1. So, we get a much higher proportion of *other*
responses (31% vs 7%), which I didn't anticipate.

```{r count-other}
sum(exp3_d$Other)
sum(exp3_d$Other) / length(exp3_d$Other)
```

```{r ubset-other}
exp3_d_noOther <- exp3_d %>% filter(Other == 0)
```

So, rerun the main model predicting the likelihood of *she* responses vs
*he* responses, with *other* responses excluded.

```{r model-other}
exp3_m_noOther <- glmer(
  She ~ Condition * GenderRatingCentered + (1 | Participant) + (1 | Item),
  data = exp3_d_noOther, family = binomial
)
summary(exp3_m_noOther)
```

These results are more similar to what we predicted from the previous
experiments:

-   Fewer *she* responses overall
-   Fewer *she* responses in the Last Name condition as compared to the
    First + Full Name conditions (although we wouldn't predict as large
    as a difference as in Exp1, because here there is one instance of
    the first name in the Last Name condition)
-   More *she* responses as first names become more feminine
-   Larger effect of first name gender in First+Full Name conditions
    than in Last Name conditions (which makes sense because there are
    4repetitions of the gendered first name, as opposed to only 1.)

But, to keep the analyses consistent between experiments and avoid
post-hoc decision weirdness, both versions are reported.

## Odds Ratios: Intercept

```{r OR-intercept-other}
exp(get_intercept(exp3_m_noOther))
exp(-get_intercept(exp3_m_noOther))
```

0.65x less likely to use *she* than *he* overall (or: 1.53x more likely
to use *he* than *she* overall), p\<.001

## Odds Ratios: Last vs First+Full

```{r OR-L-FF-other}
exp3_m_noOther %>%
  tidy() %>%
  filter(term == "Condition1") %>%
  pull(estimate) %>%
  exp()
```

1.29x more likely to use *she* than *he* in First+Full than in Last (or:
1.29x more likely to use *he* than *she* in Last than in First+Full),
p\<.001

## Odds Ratios: Last Only

Dummy code with Last Name as 0, so that intercept is the Last Name
condition only.

```{r dummy-code-L-other}
exp3_d_noOther %<>% mutate(Condition_Last = case_when(
  Condition == "first" ~ 1,
  Condition == "full"  ~ 1,
  Condition == "last"  ~ 0
))
exp3_d_noOther$Condition_Last %<>% as.factor()
```

```{r model-L-other}
exp3_m_noOther_L <- glmer(
  She ~ Condition_Last + (1 | Participant) + (1 | Item),
  data = exp3_d_noOther, family = binomial
)
summary(exp3_m_noOther_L)
```

```{r OR-L-other}
exp(get_intercept(exp3_m_noOther_L))
exp(-get_intercept(exp3_m_noOther_L))
```

0.51x times less likely to use *she* than *he* in the Last Name
condition (or: 1.97x more likely to use *he* than *she* in the Last Name
condition), p=.10

## Odds Ratios: First and Full Only

Dummy code with First and Full Name as 0, so the intercept is the
combination of those two.

```{r dummy-code-FF-other}
exp3_d_noOther %<>% mutate(Condition_FF = case_when(
  Condition == "first" ~ 0,
  Condition == "full"  ~ 0,
  Condition == "last"  ~ 1
))
exp3_d_noOther$Condition_FF %<>% as.factor()
```

```{r model-FF-other}
exp3_m_noOther_FF <- glmer(
  She ~ Condition_FF + (1 | Participant) + (1 | Item),
  data = exp3_d_noOther, family = binomial
)
summary(exp3_m_noOther_FF)
```

```{r OR-FF-other}
exp(get_intercept(exp3_m_noOther_FF))
exp(-get_intercept(exp3_m_noOther_FF))
```

0.74x times less likely to use *she* than *he* and *other* in the First
and Full Name conditions (or: 1.35x more likely to use *he* and *other*
in the First and Full Name conditions), p=.46