.se_type_1, .se_type_2 hard coded? #32

acoppock · 2023-09-29T05:55:53Z

Thanks to the team for all the work on this project and package. In projoint_level, I think that the values of .se_type_1 and .se_type_2 aren't being passed to pj_estimate, which seems to hard code both values as "classical".

As a result, the two calls below yield the same standard error estimates.

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

summary(projoint(reshaped_data, .se_type_2 = "classical"))
summary(projoint(reshaped_data, .se_type_2 = "CR2"))

yhoriuchi · 2023-09-30T15:05:15Z

Thank you, @acoppock, for noting this issue. This is due to my lack of sufficient experience in package development. I set defaults in some internal functions used within projoint(). I believe I fixed this problem. See:
5f0ad37
1ac1818

But I still see the same standard errors when I run the following code:

library(projoint)

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

summary(projoint(reshaped_data))
summary(projoint(reshaped_data, .se_type_2 = "stata"))

If you add .se_type_2 = "CR2", you get an error message:

> summary(projoint(reshaped_data, .se_type_2 = "CR2"))
Error in check_se_type(se_type, clustered) : 
  `se_type` must be either 'HC0', 'HC1', 'stata', 'HC2', 'HC3', 'classical' or 'none' with no `clusters`.
You passed: CR2 which is reserved for a case with clusters.

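This is because the default for .clusters_2 is NULL. The error message also suggests that the hard-coding issue was resolved, since "CR2" is now evidently passed through to the estimation step rather than being replaced by "classical".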
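For reference, the same behavior can be reproduced with estimatr directly, outside projoint. This is a minimal sketch on made-up data (the `toy` data frame and its columns are hypothetical, not part of the package): `lm_robust()` rejects `se_type = "CR2"` when no `clusters` are supplied and accepts it once a cluster variable is given.

```r
library(estimatr)

set.seed(1)
# Hypothetical toy data: 50 respondents with 4 observations each
toy <- data.frame(
  id = rep(1:50, each = 4),
  selected = rbinom(200, 1, 0.5)
)

# Without clusters, "CR2" is rejected; capture the error message as a string
res <- tryCatch(
  lm_robust(selected ~ 1, data = toy, se_type = "CR2"),
  error = function(e) conditionMessage(e)
)
print(res)

# With a cluster variable, "CR2" is accepted
fit <- lm_robust(selected ~ 1, data = toy, clusters = id, se_type = "CR2")
print(tidy(fit))
```
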

yhoriuchi · 2023-09-30T15:50:50Z

@acoppock, can you also try the following script?

library(projoint)
library(tidyverse)
library(estimatr)

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

cjdata <- reshaped_data@data %>% 
  select(id, task, profile, contains("att"), selected) %>% 
  pivot_longer(cols = contains("att"), 
               names_to = "attribute", 
               values_to = "attribute_level") %>% 
  mutate(attribute_level = as.character(attribute_level))

out1a <- projoint(reshaped_data, .remove_ties = FALSE) %>% 
  summary() %>% 
  filter(estimand == "mm_uncorrected") %>% 
  select("attribute_level" = att_level_choose, 
         "projoint_classical" = se)

out1b <- projoint(reshaped_data, .remove_ties = FALSE, .se_type_2 = "stata") %>% 
  summary() %>% 
  filter(estimand == "mm_uncorrected") %>% 
  select("attribute_level" = att_level_choose, 
         "projoint_stata" = se) 

out2a <- cjdata %>% 
  group_by(attribute_level) %>% 
  reframe(tidy(lm_robust(selected ~ 1, se_type = "classical", data = pick(everything())))) %>% 
  select(attribute_level, 
         "lm_robust_classical" = std.error) 

out2b <- cjdata %>% 
  group_by(attribute_level) %>% 
  reframe(tidy(lm_robust(selected ~ 1, se_type = "stata", data = pick(everything())))) %>% 
  select(attribute_level, 
         "lm_robust_stata" = std.error) 

out <- out1a %>% 
  left_join(out1b, by = "attribute_level") %>% 
  left_join(out2a, by = "attribute_level") %>% 
  left_join(out2b, by = "attribute_level")

All of these methods produce the same standard errors.
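One likely reason the "classical" and "stata" columns agree, at least for the `lm_robust(selected ~ 1, ...)` comparison above: in an intercept-only regression the classical and HC1 ("stata") variance formulas reduce to the same expression, RSS / (n (n - 1)). The sketch below checks this by hand in base R on made-up data (the vector `y` is hypothetical); it says nothing about how projoint computes its standard errors internally.

```r
set.seed(42)
y <- rbinom(200, 1, 0.5)   # hypothetical binary outcome
n <- length(y)
k <- 1                      # number of coefficients (intercept only)
e <- y - mean(y)            # residuals from regressing y on a constant

# Classical: s^2 * (X'X)^-1, with s^2 = RSS / (n - k) and X'X = n
se_classical <- sqrt(sum(e^2) / (n - k) / n)

# HC1 ("stata"): HC0 sandwich scaled by n / (n - k);
# with X a column of ones, the HC0 sandwich is sum(e^2) / n^2
se_hc1 <- sqrt(n / (n - k) * sum(e^2) / n^2)

print(c(classical = se_classical, hc1 = se_hc1))
print(all.equal(se_classical, se_hc1))
```

The two formulas differ once the model has regressors beyond the intercept or once clusters are used, so identical output here does not by itself show that the argument is being ignored.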

acoppock · 2023-10-02T10:08:41Z

Thank you for looking into this, @yhoriuchi!

On the clusters issue, I had assumed you would be clustering standard errors at the respondent level ("id") -- I believe that's standard practice? I have heard the argument that we don't need to cluster because the random assignment is at the profile level. However, following the rule to "cluster at the level of sampling or assignment, whichever is higher" would suggest clustering at the respondent level.

All that to say, I would have thought the default for .clusters_2 would have been to pass the id variable.
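To illustrate why the choice matters, here is a small sketch using estimatr directly on simulated data (the data frame `toy` and the respondent-level choice probabilities are assumptions for illustration only). Each respondent contributes several correlated observations, and the unclustered and respondent-clustered standard errors for the same mean are printed side by side.

```r
library(estimatr)

set.seed(123)
n_resp <- 100   # hypothetical number of respondents
n_obs  <- 10    # hypothetical observations per respondent

# Each respondent gets their own choice probability, so observations
# from the same respondent are correlated
toy <- data.frame(id = rep(seq_len(n_resp), each = n_obs))
p <- rep(runif(n_resp, 0.2, 0.8), each = n_obs)
toy$selected <- rbinom(nrow(toy), 1, p)

unclustered <- lm_robust(selected ~ 1, data = toy, se_type = "classical")
clustered   <- lm_robust(selected ~ 1, data = toy, clusters = id, se_type = "CR2")

# Same point estimate, different standard errors
print(c(classical = unclustered$std.error[[1]],
        CR2_by_id = clustered$std.error[[1]]))
```
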

By the way, I'm not sure how to correctly pass the id variable to .clusters_2; none of the following work yet.

summary(projoint(reshaped_data, .clusters_2 = id, .se_type_2 = "CR2"))
summary(projoint(reshaped_data, .clusters_2 = "id", .se_type_2 = "CR2"))
summary(projoint(reshaped_data, .clusters_2 = reshaped_data@data$id, .se_type_2 = "CR2"))

yhoriuchi added the "bug" label (Something isn't working) · Sep 30, 2023
