Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug? Monotonic effect ignores ordered factor levels that are not present in the data (but defined as factor levels) #1536

Closed
luwidmer opened this issue Aug 9, 2023 · 2 comments
Labels
Milestone

Comments

@luwidmer
Copy link

luwidmer commented Aug 9, 2023

In brms 2.19, monotonic effects ignore ordered factor levels that are not present in the data (but defined as factor levels) - even with drop_unused_levels = FALSE. The monotonic term vignette can be modified as follows to provoke the issue (by only having 2 of the 4 factor levels present in the data):

library(brms)
#> Loading required package: Rcpp
#> Warning: package 'Rcpp' was built under R version 4.3.1
#> Loading 'brms' package (version 2.19.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#> 
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#> 
#>     ar

income_options <- c("below_20", "20_to_40", "40_to_100", "greater_100")

income <- factor(
  income_options[1:2], # income_options[1:4] works
  levels = income_options, 
  ordered = TRUE
)

mean_ls <- c(30, 60, 70, 75)
ls <- mean_ls[income] + rnorm(length(income), sd = 7)

dat <- data.frame(income, ls)

income_model <- bf(ls ~ 1 + mo(income))
get_prior(income_model, data = dat)
#>                     prior     class      coef group resp dpar nlpar lb ub
#>                    (flat)         b                                      
#>                    (flat)         b  moincome                            
#>  student_t(3, 38.6, 16.1) Intercept                                      
#>     student_t(3, 0, 16.1)     sigma                                  0   
#>              dirichlet(1)      simo moincome1                            
#>        source
#>       default
#>  (vectorized)
#>       default
#>       default
#>       default

# This runs but only creates a 1-dimensional dirichlet distribution for the 
# monotone term if the data frame has only 2 rows (but 4 defined income levels)
fit1 <- brm(income_model, data = dat, drop_unused_levels = FALSE, refresh = 0) 
#> Compiling Stan program...
#> Start sampling
#> Warning: There were 271 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is NA, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess
summary(fit1)
#> Warning: Parts of the model have not converged (some Rhats are > 1.05). Be
#> careful when analysing the results! We recommend running more iterations and/or
#> setting stronger priors.
#> Warning: There were 271 divergent transitions after warmup. Increasing
#> adapt_delta above 0.8 may help. See
#> http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#>  Family: gaussian 
#>   Links: mu = identity; sigma = identity 
#> Formula: ls ~ 1 + mo(income) 
#>    Data: dat (Number of observations: 2) 
#>   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup draws = 4000
#> 
#> Population-Level Effects: 
#>           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> Intercept    30.97     10.16    11.29    54.52 1.01     1045     1366
#> moincome     18.76     20.62   -26.64    59.40 1.02     1061      935
#> 
#> Simplex Parameters: 
#>              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> moincome1[1]     1.00      0.00     1.00     1.00   NA       NA       NA
#> 
#> Family Specific Parameters: 
#>       Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
#> sigma    13.73     13.62     0.92    46.72 1.07       43       22
#> 
#> Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).

income_model_prior <- prior(normal(30, 1), class="Intercept") +
  prior(normal(45, 22.5), class="b", coef="moincome") +
  prior(dirichlet(c(2, 1, 1)), class = "simo", coef = "moincome1")

# This fails if the data frame has only 2 rows (but 4 defined income levels)
# with "Error: Invalid Dirichlet prior. Expected input of length 1."
fit2 <- brm(income_model, data = dat, prior = income_model_prior, drop_unused_levels = FALSE, refresh = 0) 
#> Error: Invalid Dirichlet prior. Expected input of length 1.
summary(fit2)
#> Error in eval(expr, envir, enclos): object 'fit2' not found

Created on 2023-08-09 with reprex v2.0.2

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.0 (2023-04-21 ucrt)
#>  os       Windows 10 x64 (build #####)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language EN
#>  collate  German_Switzerland.utf8
#>  ctype    German_Switzerland.utf8
#>  tz       Europe/Zurich
#>  date     2023-08-09
#>  pandoc   3.1.1 @ ####/RStudio-2023.06.1-524/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  ! package        * version date (UTC) lib source
#>    abind            1.4-5   2016-07-21 [1] CRAN (R 4.3.0)
#>    backports        1.4.1   2021-12-13 [1] CRAN (R 4.3.0)
#>    base64enc        0.1-3   2015-07-28 [1] CRAN (R 4.3.0)
#>    bayesplot        1.10.0  2022-11-16 [1] CRAN (R 4.3.0)
#>    bridgesampling   1.1-2   2021-04-16 [1] CRAN (R 4.3.0)
#>    brms           * 2.19.0  2023-03-14 [1] CRAN (R 4.3.0)
#>    Brobdingnag      1.2-9   2022-10-19 [1] CRAN (R 4.3.0)
#>    callr            3.7.3   2022-11-02 [1] CRAN (R 4.3.0)
#>    checkmate        2.2.0   2023-04-27 [1] CRAN (R 4.3.0)
#>    cli              3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
#>    coda             0.19-4  2020-09-30 [1] CRAN (R 4.3.0)
#>    codetools        0.2-19  2023-02-01 [1] CRAN (R 4.3.0)
#>    colorspace       2.1-0   2023-01-23 [1] CRAN (R 4.3.0)
#>    colourpicker     1.2.0   2022-10-28 [1] CRAN (R 4.3.0)
#>    crayon           1.5.2   2022-09-29 [1] CRAN (R 4.3.0)
#>    crosstalk        1.2.0   2021-11-04 [1] CRAN (R 4.3.0)
#>    curl             5.0.1   2023-06-07 [1] CRAN (R 4.3.1)
#>    digest           0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
#>    distributional   0.3.2   2023-03-22 [1] CRAN (R 4.3.0)
#>    dplyr            1.1.2   2023-04-20 [1] CRAN (R 4.3.0)
#>    DT               0.28    2023-05-18 [1] CRAN (R 4.3.1)
#>    dygraphs         1.1.1.6 2018-07-11 [1] CRAN (R 4.3.0)
#>    ellipsis         0.3.2   2021-04-29 [1] CRAN (R 4.3.0)
#>    evaluate         0.21    2023-05-05 [1] CRAN (R 4.3.0)
#>    fansi            1.0.4   2023-01-22 [1] CRAN (R 4.3.0)
#>    farver           2.1.1   2022-07-06 [1] CRAN (R 4.3.0)
#>    fastmap          1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
#>    fs               1.6.3   2023-07-20 [1] CRAN (R 4.3.1)
#>    generics         0.1.3   2022-07-05 [1] CRAN (R 4.3.0)
#>    ggplot2          3.4.2   2023-04-03 [1] CRAN (R 4.3.0)
#>    glue             1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
#>    gridExtra        2.3     2017-09-09 [1] CRAN (R 4.3.0)
#>    gtable           0.3.3   2023-03-21 [1] CRAN (R 4.3.0)
#>    gtools           3.9.4   2022-11-27 [1] CRAN (R 4.3.0)
#>    htmltools        0.5.5   2023-03-23 [1] CRAN (R 4.3.0)
#>    htmlwidgets      1.6.2   2023-03-17 [1] CRAN (R 4.3.0)
#>    httpuv           1.6.11  2023-05-11 [1] CRAN (R 4.3.0)
#>    igraph           1.5.0.1 2023-07-23 [1] CRAN (R 4.3.1)
#>    inline           0.3.19  2021-05-31 [1] CRAN (R 4.3.0)
#>    jsonlite         1.8.7   2023-06-29 [1] CRAN (R 4.3.1)
#>    knitr            1.43    2023-05-25 [1] CRAN (R 4.3.1)
#>    later            1.3.1   2023-05-02 [1] CRAN (R 4.3.0)
#>    lattice          0.21-8  2023-04-05 [1] CRAN (R 4.3.0)
#>    lifecycle        1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
#>    loo              2.6.0   2023-03-31 [1] CRAN (R 4.3.0)
#>    magrittr         2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
#>    markdown         1.7     2023-05-16 [1] CRAN (R 4.3.0)
#>    Matrix           1.6-0   2023-07-08 [1] CRAN (R 4.3.1)
#>    matrixStats      1.0.0   2023-06-02 [1] CRAN (R 4.3.1)
#>    mime             0.12    2021-09-28 [1] CRAN (R 4.3.0)
#>    miniUI           0.1.1.1 2018-05-18 [1] CRAN (R 4.3.0)
#>    munsell          0.5.0   2018-06-12 [1] CRAN (R 4.3.0)
#>    mvtnorm          1.2-2   2023-06-08 [1] CRAN (R 4.3.1)
#>    nlme             3.1-162 2023-01-31 [1] CRAN (R 4.3.0)
#>    pillar           1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
#>    pkgbuild         1.4.2   2023-06-26 [1] CRAN (R 4.3.1)
#>    pkgconfig        2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
#>    plyr             1.8.8   2022-11-11 [1] CRAN (R 4.3.0)
#>    posterior        1.4.1   2023-03-14 [1] CRAN (R 4.3.0)
#>    prettyunits      1.1.1   2020-01-24 [1] CRAN (R 4.3.0)
#>    processx         3.8.2   2023-06-30 [1] CRAN (R 4.3.1)
#>    promises         1.2.0.1 2021-02-11 [1] CRAN (R 4.3.0)
#>    ps               1.7.5   2023-04-18 [1] CRAN (R 4.3.0)
#>    purrr            1.0.1   2023-01-10 [1] CRAN (R 4.3.0)
#>    R.cache          0.16.0  2022-07-21 [1] CRAN (R 4.3.1)
#>    R.methodsS3      1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
#>    R.oo             1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
#>    R.utils          2.12.2  2022-11-11 [1] CRAN (R 4.3.1)
#>    R6               2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
#>    Rcpp           * 1.0.11  2023-07-06 [1] CRAN (R 4.3.1)
#>  D RcppParallel     5.1.7   2023-02-27 [1] CRAN (R 4.3.0)
#>    reprex           2.0.2   2022-08-17 [1] CRAN (R 4.3.1)
#>    reshape2         1.4.4   2020-04-09 [1] CRAN (R 4.3.0)
#>    rlang            1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
#>    rmarkdown        2.23    2023-07-01 [1] CRAN (R 4.3.1)
#>    rstan            2.26.22 2023-08-02 [1] local
#>    rstantools       2.3.1.1 2023-07-18 [1] CRAN (R 4.3.1)
#>    rstudioapi       0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
#>    scales           1.2.1   2022-08-20 [1] CRAN (R 4.3.0)
#>    sessioninfo      1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
#>    shiny            1.7.4.1 2023-07-06 [1] CRAN (R 4.3.1)
#>    shinyjs          2.1.0   2021-12-23 [1] CRAN (R 4.3.0)
#>    shinystan        2.6.0   2022-03-03 [1] CRAN (R 4.3.0)
#>    shinythemes      1.2.0   2021-01-25 [1] CRAN (R 4.3.0)
#>    StanHeaders      2.26.27 2023-06-14 [1] CRAN (R 4.3.1)
#>    stringi          1.7.12  2023-01-11 [1] CRAN (R 4.3.0)
#>    stringr          1.5.0   2022-12-02 [1] CRAN (R 4.3.0)
#>    styler           1.10.1  2023-06-05 [1] CRAN (R 4.3.1)
#>    tensorA          0.36.2  2020-11-19 [1] CRAN (R 4.3.0)
#>    threejs          0.3.3   2020-01-21 [1] CRAN (R 4.3.0)
#>    tibble           3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
#>    tidyselect       1.2.0   2022-10-10 [1] CRAN (R 4.3.0)
#>    utf8             1.2.3   2023-01-31 [1] CRAN (R 4.3.0)
#>    V8               4.3.3   2023-07-18 [1] CRAN (R 4.3.1)
#>    vctrs            0.6.3   2023-06-14 [1] CRAN (R 4.3.1)
#>    withr            2.5.0   2022-03-03 [1] CRAN (R 4.3.0)
#>    xfun             0.39    2023-04-20 [1] CRAN (R 4.3.0)
#>    xtable           1.8-4   2019-04-21 [1] CRAN (R 4.3.0)
#>    xts              0.13.1  2023-04-16 [1] CRAN (R 4.3.0)
#>    yaml             2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
#>    zoo              1.8-12  2023-04-13 [1] CRAN (R 4.3.0)
#> 
#>  [1] C:/Users/#####/AppData/Local/Programs/R/R-4.3.0/library
#> 
#>  D ── DLL MD5 mismatch, broken installation.
#> 
#> ──────────────────────────────────────────────────────────────────────────────
@luwidmer luwidmer changed the title Monotonic effect ignores ordered factor levels that are not present in the data (but defined as factor levels) Bug? Monotonic effect ignores ordered factor levels that are not present in the data (but defined as factor levels) Aug 9, 2023
paul-buerkner added a commit that referenced this issue Aug 9, 2023
@paul-buerkner
Copy link
Owner

Fixed :-)

@paul-buerkner paul-buerkner added this to the 2.20.0 milestone Aug 9, 2023
@luwidmer
Copy link
Author

luwidmer commented Aug 9, 2023

Awesome, thank you for this super-quick fix! Tested it and can confirm it works!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants