allow conditional sampling in 'sim.fmsm()' ? #152

Open
kkmann opened this issue Jan 20, 2023 · 4 comments

@kkmann
Contributor

kkmann commented Jan 20, 2023

Hi,

this is related to #150. For a multistate model, I am trying to sample forward from the predictive distribution of an individual, given their current state and the sojourn time already spent in that state.

sim.fmsm() only seems to support sampling conditional on the last state; it cannot accept the corresponding sojourn time (the 'start' parameter in simulate.flexsurvreg()).

Would this be feasible to add to sim.fmsm()? I suspect the critical lines are around

https://github.com/chjackson/flexsurv-dev/blob/96dfe2ea0a360d4a79996e3b390d7cd27e0ab265/R/mstate.R#L862
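
For reference, a minimal base-R sketch of what conditioning on the sojourn time means for a single Weibull transition. It draws event times conditional on the event not having happened before `start`, using inverse-CDF sampling. The helper name `rweibull_cond` and the parameter values are hypothetical and only for illustration; this is not how flexsurv implements it.

```r
# Hypothetical helper (not part of flexsurv): draw event times T from a
# Weibull distribution conditional on T > start, by inverting the CDF.
rweibull_cond <- function(n, shape, scale, start = 0) {
  # probability mass that lies before the time already spent in the state
  p0 <- pweibull(start, shape = shape, scale = scale)
  # uniform draws restricted to the remaining mass between p0 and 1
  u <- runif(n, min = p0, max = 1)
  # map back to the time scale; results lie beyond 'start'
  qweibull(u, shape = shape, scale = scale)
}

set.seed(1)
# made-up parameters: shape 1.5, scale 2, one time unit already spent
x <- rweibull_cond(1000, shape = 1.5, scale = 2, start = 1)
summary(x)
```

The draws are on the original time scale, so the remaining time in the state would be `x - start`. When `start` is far in the tail, `p0` gets close to 1 and this simple version may lose numerical precision.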

@kkmann
Contributor Author

kkmann commented Jan 20, 2023

@danielinteractive, you might find this interesting too.

@kkmann changed the title from "allow conditional sammpling in 'sim.fmsm()' ?" to "allow conditional sampling in 'sim.fmsm()' ?" on Jan 20, 2023
@kkmann
Contributor Author

kkmann commented Jan 20, 2023

Here is a suggestion (admittedly super inefficient) of what this could look like. One of the many problems is that not preallocating the return vectors makes it very slow. Maybe chunked pre-allocation could work. Ideally, though, the functionality would be integrated into the sim.fmsm() function.

library(flexsurv)

mod_nobos_bos <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==1),
                             data = bosms3, dist = "weibull")
mod_nobos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==2),
                               data = bosms3, dist = "weibull")
mod_bos_death <- flexsurvreg(Surv(years, status) ~ 1, subset=(trans==3),
                             data = bosms3, dist = "weibull")

tmat <- rbind(c(NA, 1, 2), c(NA, NA, 3), c(NA, NA, NA))

mdl <- fmsm(mod_nobos_bos, mod_nobos_death, mod_bos_death, trans = tmat)

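# Draw nsim event times from a single transition model. 'start' is forwarded to
# simulate.flexsurvreg() (see #150) so that draws condition on time already spent.
sim <- function(mdl, newdata = data.frame(..null = NA), start = 0, nsim = 1L) {
  as.numeric(simulate(mdl, newdata = newdata, start = start, nsim = nsim)[1, 1:nsim])
}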

simulate_fmsm <- function(mdl, state, newdata = data.frame(..null = NA),
                          start = 0, tmax = Inf) {
  tmat <- attr(mdl, "trans")
  terminal_states <- as.integer(which(rowSums(!is.na(tmat)) == 0))
  from <- integer()
  to <- integer()
  t <- numeric()
  tt <- 0
  while (!(state %in% terminal_states) && (tt < tmax)) {
    # sample next state
    from <- c(from, state)
    next_states <- as.integer(which(!is.na(tmat[state, ])))
    next_trans_ids <- tmat[state, next_states]
    # sample from transitions
    times <- numeric(length(next_states))
    for (i in seq_along(next_trans_ids)) {
      trans <- next_trans_ids[i]
      # index the model list by transition id, not by loop position
      times[i] <- sim(mdl[[trans]], newdata = newdata, start = start, nsim = 1L)
    }
    state <- next_states[which.min(times)]
    to <- c(to, state)
    tt <- tt + min(times)
    t <- c(t, tt)
    start <- sqrt(.Machine$double.eps) # later states are entered fresh: left-truncate at (effectively) 0
  }
  # combine vectors into a data frame and return
  res <- data.frame(from = from, to = to, t = t)
  return(res)
}

simulate_fmsm(mdl, state = 1)
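
The chunked pre-allocation mentioned above could look roughly like the following base-R sketch. The names `new_buffer`, `buffer_push` and `buffer_values` and the chunk size are hypothetical; the idea is simply to extend the storage by a whole chunk when it fills up, instead of calling `c()` for every new transition.

```r
# Hypothetical sketch: a numeric buffer that grows in chunks.
new_buffer <- function(chunk = 64L) {
  list(values = numeric(chunk), n = 0L, chunk = chunk)
}

buffer_push <- function(buf, x) {
  if (buf$n == length(buf$values)) {
    # buffer is full: extend by one chunk in a single allocation
    buf$values <- c(buf$values, numeric(buf$chunk))
  }
  buf$n <- buf$n + 1L
  buf$values[buf$n] <- x
  buf
}

# return only the filled part of the buffer
buffer_values <- function(buf) buf$values[seq_len(buf$n)]

buf <- new_buffer(chunk = 4L)
for (x in c(0.5, 1.2, 3.4, 7.8, 9.1)) buf <- buffer_push(buf, x)
buffer_values(buf)
```

Whether this actually beats growing vectors with `c()` depends on R's copy-on-modify behaviour, so it would need profiling before being relied on.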

@chjackson
Owner

sim.fmsm simulates the entire pathway through a multistate model though. Do I understand correctly that you want to simulate data where each individual spends a time period of at least "start" in every one of their states, so that their sojourn times follow standard parametric distributions that are left-truncated by this time? That seems quite specialised. What is the application?

@kkmann
Contributor Author

kkmann commented Jan 20, 2023

No, only the sojourn time in the current state (the state passed as 'start' to sim.fmsm()) would be needed. Imagine simulating forward for each individual at an interim analysis in a clinical trial (CT) setting. For simulate.flexsurvreg() this is possible via the 'start' parameter (see #150), but sim.fmsm() seems to use a slightly different mechanism for sampling (or I missed the point), and I couldn't find a way to specify the time already spent in the starting state. Also, the 'start' parameter has a different meaning in sim.fmsm(). I think my example code achieves what I want (going directly via simulate.flexsurvreg()), but to me the sim.fmsm() function would be the natural place for this functionality.

Edit: also interesting link to #38.
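
To make the distinction concrete, here is a self-contained base-R sketch of the interim-analysis idea: only the first sojourn is conditioned on the time already spent in the current state, and every later state is entered with a fresh clock. The transition matrix matches the one used earlier in the thread; the Weibull parameters and the names `pars`, `rcond` and `sim_forward` are made up for illustration and do not come from a fitted model or from flexsurv.

```r
# Transition matrix as in the thread: 1 -> 2, 1 -> 3, 2 -> 3
tmat <- rbind(c(NA, 1, 2), c(NA, NA, 3), c(NA, NA, NA))

# Made-up Weibull parameters per transition (assumptions, not fitted values)
pars <- list(c(shape = 1.2, scale = 5),
             c(shape = 1.0, scale = 10),
             c(shape = 0.9, scale = 3))

# One Weibull draw conditional on exceeding 'start' (inverse-CDF method)
rcond <- function(p, start) {
  u <- runif(1, pweibull(start, p[["shape"]], p[["scale"]]), 1)
  qweibull(u, p[["shape"]], p[["scale"]])
}

sim_forward <- function(state, sojourn) {
  path <- state
  clock <- 0  # time measured from the interim analysis
  while (any(!is.na(tmat[state, ]))) {
    to <- which(!is.na(tmat[state, ]))
    # latent time for each competing transition, on the sojourn-time scale
    times <- vapply(tmat[state, to],
                    function(k) rcond(pars[[k]], sojourn), numeric(1))
    clock <- clock + min(times) - sojourn  # remaining time in this state
    state <- to[which.min(times)]
    path <- c(path, state)
    sojourn <- 0  # later states are entered fresh, so no conditioning
  }
  list(path = path, time = clock)
}

set.seed(42)
# individual currently in state 1, having already spent 2 time units there
res <- sim_forward(state = 1, sojourn = 2)
res
```

The same structure would apply with `simulate.flexsurvreg()` in place of `rcond()`, assuming its 'start' argument conditions the draws as described in #150.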
