.se_type_1, .se_type_2 hard coded? #32

Open
acoppock opened this issue Sep 29, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@acoppock

Thanks to the team for all the work on this project and package. In projoint_level, I think the values of .se_type_1 and .se_type_2 aren't being passed to pj_estimate, which seems to hard-code both values as "classical".

As a result, the two calls below yield the same standard error estimates.

library(projoint)

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

summary(projoint(reshaped_data, .se_type_2 = "classical"))
summary(projoint(reshaped_data, .se_type_2 = "CR2"))
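
For readers unfamiliar with this kind of bug, here is a minimal sketch of the suspected pattern. The function names (inner_estimate, outer_buggy, outer_fixed) are hypothetical stand-ins and are not projoint's actual internals; the point is only that a wrapper can accept an argument and still leave the inner default in force if it never forwards it.

```r
# Hypothetical stand-ins for illustration; NOT projoint's real functions.

# Inner estimator with its own default.
inner_estimate <- function(x, se_type = "classical") {
  se_type  # return the value actually used, so the effect is visible
}

# Buggy wrapper: accepts .se_type but never forwards it,
# so the inner default always wins.
outer_buggy <- function(x, .se_type = "classical") {
  inner_estimate(x)
}

# Fixed wrapper: forwards the caller's choice explicitly.
outer_fixed <- function(x, .se_type = "classical") {
  inner_estimate(x, se_type = .se_type)
}

outer_buggy(1, .se_type = "CR2")  # "classical": the argument is silently ignored
outer_fixed(1, .se_type = "CR2")  # "CR2": the argument reaches the estimator
```

If the wrapper behaves like outer_buggy, any two calls that differ only in .se_type would return identical results, which matches the symptom described above.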
@yhoriuchi yhoriuchi added the bug Something isn't working label Sep 30, 2023
@yhoriuchi
Owner

Thank you, @acoppock, for noting this issue. This is due to my lack of sufficient experience in package development. I set defaults in some internal functions used within projoint(). I believe I fixed this problem. See:
5f0ad37
1ac1818

However, I still see the same standard errors when I run the following code:

library(projoint)

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

summary(projoint(reshaped_data))
summary(projoint(reshaped_data, .se_type_2 = "stata"))

If you set .se_type_2 = "CR2", you get an error message:

> summary(projoint(reshaped_data, .se_type_2 = "CR2"))
Error in check_se_type(se_type, clustered) : 
  `se_type` must be either 'HC0', 'HC1', 'stata', 'HC2', 'HC3', 'classical' or 'none' with no `clusters`.
You passed: CR2 which is reserved for a case with clusters.

This is because the default for .clusters_2 is NULL. The error message also suggests that the hard-coding issue was resolved, since the value of .se_type_2 now reaches the se_type check.
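
The same check can be reproduced with estimatr directly. This is a minimal sketch on simulated data (the toy data frame and its columns are made up for illustration and are not exampleData1): "CR2" is rejected when no clusters are supplied and accepted once a cluster variable is given.

```r
library(estimatr)

# Simulated stand-in data: 20 respondents, 5 rows each (NOT exampleData1).
set.seed(1)
toy <- data.frame(
  id = rep(1:20, each = 5),
  selected = rbinom(100, 1, 0.5)
)

# Without clusters, se_type = "CR2" triggers the error shown above.
# tryCatch() captures the message instead of stopping the script.
no_clusters <- tryCatch(
  lm_robust(selected ~ 1, data = toy, se_type = "CR2"),
  error = function(e) conditionMessage(e)
)

# With a cluster variable, the same se_type is accepted.
with_clusters <- lm_robust(selected ~ 1, data = toy,
                           clusters = id, se_type = "CR2")

is.character(no_clusters)  # TRUE: an error message was captured
```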

@yhoriuchi
Owner

@acoppock, can you also try the following script?

library(projoint)
library(tidyverse)
library(estimatr)

data("exampleData1")

outcomes <- paste0("choice", seq(from = 1, to = 8, by = 1))
outcomes <- c(outcomes, "choice1_repeated_flipped")

reshaped_data <- reshape_projoint(
  .dataframe = exampleData1, 
  .outcomes = outcomes)

cjdata <- reshaped_data@data %>% 
  select(id, task, profile, contains("att"), selected) %>% 
  pivot_longer(cols = contains("att"), 
               names_to = "attribute", 
               values_to = "attribute_level") %>% 
  mutate(attribute_level = as.character(attribute_level))

out1a <- projoint(reshaped_data, .remove_ties = FALSE) %>% 
  summary() %>% 
  filter(estimand == "mm_uncorrected") %>% 
  select("attribute_level" = att_level_choose, 
         "projoint_classical" = se)

out1b <- projoint(reshaped_data, .remove_ties = FALSE, .se_type_2 = "stata") %>% 
  summary() %>% 
  filter(estimand == "mm_uncorrected") %>% 
  select("attribute_level" = att_level_choose, 
         "projoint_stata" = se) 

out2a <- cjdata %>% 
  group_by(attribute_level) %>% 
  reframe(tidy(lm_robust(selected ~ 1, se_type = "classical", data = pick(everything())))) %>% 
  select(attribute_level, 
         "lm_robust_classical" = std.error) 

out2b <- cjdata %>% 
  group_by(attribute_level) %>% 
  reframe(tidy(lm_robust(selected ~ 1, se_type = "stata", data = pick(everything())))) %>% 
  select(attribute_level, 
         "lm_robust_stata" = std.error) 

out <- out1a %>% 
  left_join(out1b, by = "attribute_level") %>% 
  left_join(out2a, by = "attribute_level") %>% 
  left_join(out2b, by = "attribute_level")

All of these methods produce the same standard errors.
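
One possible reason these all agree, offered as a sketch rather than a diagnosis: in an intercept-only model with no clusters, the classical and HC1 variance formulas coincide algebraically, and, as I understand it, "stata" without clusters corresponds to HC1 in estimatr. If so, identical standard errors in this comparison would not by themselves show whether .se_type_2 is being passed. The base R check below uses simulated data, not exampleData1.

```r
# Simulated binary outcome (NOT exampleData1).
set.seed(1)
y <- rbinom(200, 1, 0.4)

n <- length(y)
k <- 1               # one coefficient: the intercept
e <- y - mean(y)     # residuals from regressing y on a constant

# Classical: sigma^2 * (X'X)^{-1}, with sigma^2 = sum(e^2) / (n - k) and X'X = n.
se_classical <- sqrt(sum(e^2) / (n - k) / n)

# HC0: (X'X)^{-1} X' diag(e^2) X (X'X)^{-1} = sum(e^2) / n^2.
se_hc0 <- sqrt(sum(e^2) / n^2)

# HC1: HC0 variance scaled by n / (n - k).
se_hc1 <- se_hc0 * sqrt(n / (n - k))

all.equal(se_classical, se_hc1)  # TRUE
```

With covariates in the model the two formulas generally diverge, so a comparison that includes a regressor, or one that uses clustered standard errors, would be a sharper test.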

@acoppock
Author

acoppock commented Oct 2, 2023

Thank you for looking into this, @yhoriuchi!

On the clusters issue, I had assumed you would be clustering standard errors at the respondent level ("id") -- I believe that's standard practice? I have heard the argument that we don't need to cluster because the random assignment is at the profile level. However, following the rule to "cluster at the level of sampling or assignment, whichever is higher" would suggest clustering at the respondent level.

All that to say, I would have thought the default for .clusters_2 would have been to pass the id variable.

By the way, I'm not sure how to correctly pass the id variable to .clusters_2; none of the following work yet:

summary(projoint(reshaped_data, .clusters_2 = id, .se_type_2 = "CR2"))
summary(projoint(reshaped_data, .clusters_2 = "id", .se_type_2 = "CR2"))
summary(projoint(reshaped_data, .clusters_2 = reshaped_data@data$id, .se_type_2 = "CR2"))
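
For reference, this is the kind of respondent-level clustering I have in mind, sketched with estimatr directly on simulated data. The toy data frame is made up for illustration; it is not exampleData1 and says nothing about how projoint expects .clusters_2 to be supplied.

```r
library(estimatr)

# Simulated stand-in data: 50 respondents, 10 rows each, with a
# respondent-specific choice probability so rows within id are correlated.
set.seed(42)
n_resp <- 50
n_rows <- 10
toy <- data.frame(id = rep(seq_len(n_resp), each = n_rows))
p <- rep(runif(n_resp, 0.2, 0.8), each = n_rows)
toy$selected <- rbinom(nrow(toy), 1, p)

# Unclustered classical standard error.
fit_classical <- lm_robust(selected ~ 1, data = toy, se_type = "classical")

# CR2 standard error clustered at the respondent level.
fit_cr2 <- lm_robust(selected ~ 1, data = toy,
                     clusters = id, se_type = "CR2")

# Side-by-side comparison of the two standard errors.
c(classical = fit_classical$std.error[[1]],
  CR2       = fit_cr2$std.error[[1]])
```

The point estimate is the same in both fits; only the standard error changes, and by how much depends on how strongly outcomes are correlated within respondent.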
