Warning when imputing level 2 factor variables with 2lonly.pmm #555

Closed
reiniervlinschoten opened this issue May 15, 2023 · 2 comments

reiniervlinschoten commented May 15, 2023

A weird bug is occurring, maybe related to #410.
When imputing a factor variable that should be constant within a class (e.g. smoking status for a patient with longitudinal measurements), the imputation raises a warning and leaves missing values in the imputed data.

The reprex below (from the vignette of the 2lonly.pmm function) produces this warning:

Warning message:
In `[<-.factor`(`*tmp*`, cc, value = c(`64` = 1, `64` = 1, `64` = 1,  :
  invalid factor level, NA generated

The resulting data frame still has missing values in column v.

library(mice)

# simulate some data
# x,y ... level 1 variables
# v,w ... level 2 variables

G <- 250 # number of groups
n <- 20 # number of persons
beta <- .3 # regression coefficient
rho <- .30 # residual intraclass correlation
rho.miss <- .10 # correlation with missing response
missrate <- .50 # missing proportion
y1 <- rep(rnorm(G, sd = sqrt(rho)), each = n) + rnorm(G * n, sd = sqrt(1 - rho))
w <- rep(round(rnorm(G), 2), each = n)
v <- rep(round(runif(G, 0, 3)), each = n)
x <- rnorm(G * n)
y <- y1 + beta * x + .2 * w + .1 * v
dfr0 <- dfr <- data.frame("group" = rep(1:G, each = n), "x" = x, "y" = y, "w" = w, "v" = v)
dfr[rho.miss * x + rnorm(G * n, sd = sqrt(1 - rho.miss)) < qnorm(missrate), "y"] <- NA
dfr[rep(rnorm(G), each = n) < qnorm(missrate), "w"] <- NA
dfr[rep(rnorm(G), each = n) < qnorm(missrate), "v"] <- NA

# empty mice imputation
imp0 <- mice(as.matrix(dfr), maxit = 0)
predM <- imp0$predictorMatrix
impM <- imp0$method

# multilevel imputation
predM1 <- predM
predM1[c("w", "y", "v"), "group"] <- -2
predM1["y", "x"] <- 1 # fixed x effects imputation
impM1 <- impM
impM1[c("y", "w", "v")] <- c("2l.pan", "2lonly.norm", "2lonly.pmm")

# turn v into a categorical variable
dfr$v <- as.factor(dfr$v)
levels(dfr$v) <- LETTERS[1:4]

# y ... imputation using pan
# w ... imputation at level 2 using norm
# v ... imputation at level 2 using pmm

# skip imputation on solaris
is.solaris <- function() grepl("SunOS", Sys.info()["sysname"])
if (!is.solaris()) {
  imp <- mice(dfr,
    m = 1, predictorMatrix = predM1,
    method = impM1, maxit = 1, paniter = 500
  )
}
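
The warning text points at `[<-.factor`, with numeric values being assigned into a factor. As a minimal sketch of that mechanism in isolation (the toy factor `f` is hypothetical and not part of the reprex, and this only illustrates the base R behaviour, not the internals of mice):

```r
# Toy factor with the same level labels as v in the reprex (hypothetical data)
f <- factor(c("A", "B", NA), levels = LETTERS[1:4])

# Assigning a number that is not one of the level labels triggers the warning;
# tryCatch captures the message instead of printing it
res <- tryCatch(
  {
    f[3] <- 1
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(res)

# Assigning an existing level label works without a warning
f[3] <- "A"
print(f)
```

If imputed values reach the factor as integer codes rather than labels, base R would replace them with NA in this way, which would be consistent with the missing values left in column v.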
@hanneoberman
Member

Hi! A quick fix could be to convert the factor to a numeric variable before imputation, and recode it back to a factor afterwards?
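
A rough sketch of that round trip, using only base R. The toy vector and the stand-in for the imputation step are assumptions for illustration; in practice the mice() call from the reprex would run on the numeric column in between.

```r
# Hypothetical toy data: a level-2 factor with missing values (not from the reprex)
v <- factor(c("A", "B", NA, "D", "C", NA), levels = LETTERS[1:4])

# Store the level labels so the codes can be mapped back later
v_levels <- levels(v)

# Convert to integer codes; missing values stay NA
v_num <- as.integer(v)

# ... run mice() here with v_num in place of the factor ...
# Stand-in for the imputation step: fill the gaps with observed codes,
# which is what predictive mean matching draws from its donors
v_imp <- v_num
v_imp[is.na(v_imp)] <- c(1L, 4L)

# Map the integer codes back to the original labels
v_back <- factor(v_imp, levels = seq_along(v_levels), labels = v_levels)
print(v_back)
```

Because predictive mean matching imputes values observed in the data, the imputed codes should always correspond to an existing level, so the mapping back should not create new NAs.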

@stefvanbuuren
Member

Thanks.

I patched mice.impute.2lonly.pmm() in mice 3.15.4, so we should not see the warning anymore.

@stefvanbuuren stefvanbuuren changed the title Error when imputing level 2 factor variables with 2lonly.pmm Warning when imputing level 2 factor variables with 2lonly.pmm May 15, 2023