Add by= variable to tbl_svysummary(include=) by default #925

Closed
coreysparks opened this issue Jun 22, 2021 · 2 comments · Fixed by #927

@coreysparks

I may have found a bug; below is reproducible code that generates the error. I expected a simple summary of variable v151 for the levels of v139, but instead I get the following error:

Error: Problem with `mutate()` column `df_stats`.
i `df_stats = pmap(...)`.
x Can't subset columns that don't exist.
x Column `v139` doesn't exist.
Run `rlang::last_error()` to see where the error occurred


# load packages first so that select() and %>% are available below
library(dplyr); library(gtsummary)

dhs<-haven::read_dta("https://github.com/coreysparks/data/blob/master/ZZIR62FL.DTA?raw=true")
dhs<-haven::zap_labels(dhs)
dhs$v005<- dhs$v005/1000000  # rescale the sample weight
dhs2<- select(dhs, v021, v022, v005, v139, v151)

names(dhs2)
options(survey.lonely.psu = "adjust")
# this call triggers the error: v139 is passed to by= but is not listed in include=
t1<- survey::svydesign(ids = ~v021,
               strata=~v022,
               weights = ~v005,
               data=dhs2)%>%
  gtsummary::tbl_svysummary(by = "v139", include = c(v151))
@ddsjoberg
Owner

Hi @coreysparks! Thanks for the post! You'll need to add the by= variable to the include= argument to get your code working. It's added by default in tbl_summary(), but it looks like I need to add it for tbl_svysummary() as well.

library(gtsummary)

dhs<-haven::read_dta("https://github.com/coreysparks/data/blob/master/ZZIR62FL.DTA?raw=true")
dhs<-haven::zap_labels(dhs)
dhs$v005<- dhs$v005/1000000
dhs2<- select(dhs, v021, v022, v005, v139, v151)

names(dhs2)
#> [1] "v021" "v022" "v005" "v139" "v151"
library(dplyr); library(gtsummary)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
options(survey.lonely.psu = "adjust")
survey::svydesign(ids = ~v021,
                  strata=~v022,
                  weights = ~v005,
                  data=dhs2)%>%
  tbl_svysummary(by = "v139", include = c(v151, v139)) %>%
  as_kable() # convert to kable to display on github
| Characteristic | 1, N = 2,850 | 2, N = 1,599 | 3, N = 2,279 | 4, N = 1,562 | 97, N = 58 |
|:--|:--|:--|:--|:--|:--|
| sex of household head | | | | | |
| 1 | 1,975 (69%) | 1,180 (74%) | 1,494 (66%) | 1,162 (74%) | 41 (72%) |
| 2 | 876 (31%) | 418 (26%) | 786 (34%) | 400 (26%) | 16 (28%) |

Created on 2021-06-22 by the reprex package (v2.0.0)
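
For anyone who wants to try the workaround without downloading the DHS file, here is a minimal sketch of the same pattern on the `api` example data that ships with the survey package. The dataset, the design object `dclus1`, and the variables `stype`, `api00`, and `meals` are stand-ins chosen for illustration and are not part of the original report. The only point is that the by= variable is also listed in include=; in versions where tbl_svysummary() adds it automatically, listing it explicitly should be harmless.

```r
library(survey)
library(gtsummary)

# Example data from the survey package (an assumption for illustration;
# this is not the DHS file used in the issue)
data(api, package = "survey")

# One-stage cluster sample design, as in the survey package documentation
dclus1 <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

# Workaround: name the by= variable (stype) in include= as well
tbl <- tbl_svysummary(
  dclus1,
  by = stype,
  include = c(api00, meals, stype)
)

tbl
```

The result is a weighted summary of `api00` and `meals` with one column per level of `stype`.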

@ddsjoberg changed the title from "Issue with complex survey and tbl_svysummary, column does not exist" to "Add by= variable to tbl_svysummary(include=) by default" on Jun 22, 2021
@coreysparks
Author

Gotcha! Thanks for the swift reply!
