Prior in Gaussian random walk #34

Open
ivandebono opened this issue Oct 9, 2020 · 4 comments

Comments

@ivandebono

Can you explain the choice of prior for the Gaussian random walk? How did you choose this value for sigma?

    log_r_t = pm.GaussianRandomWalk(
        "log_r_t",
        sigma=0.035,
        dims=["date"],
    )

@michaelosthege
Collaborator

The sigma here was chosen by Kevin while iterating on the model. This sigma just works well.

If you manage to put a prior on this & fit/sample the sigma, that would be very interesting.
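One way to get a feel for what a fixed `sigma=0.035` implies, before putting a prior on it: in a Gaussian random walk with independent Normal(0, sigma) steps, the change in `log_r_t` over n days has standard deviation sigma * sqrt(n). The sketch below checks that with a plain standard-library simulation rather than the actual PyMC3 model. The 30-day horizon and the helper names are assumptions made for illustration only.

```python
import math
import random
import statistics

SIGMA = 0.035  # fixed step scale quoted in this issue


def implied_log_sd(sigma: float, n_days: int) -> float:
    """Standard deviation of the change in log R_t after n_days independent steps."""
    return sigma * math.sqrt(n_days)


def simulate_log_change(sigma: float, n_days: int, rng: random.Random) -> float:
    """Total change in log R_t over n_days for one simulated walk."""
    return sum(rng.gauss(0.0, sigma) for _ in range(n_days))


rng = random.Random(0)
n_days = 30  # hypothetical horizon, chosen only for illustration

analytic_sd = implied_log_sd(SIGMA, n_days)
draws = [simulate_log_change(SIGMA, n_days, rng) for _ in range(5000)]
empirical_sd = statistics.pstdev(draws)

print(f"analytic sd of 30-day change in log R_t:  {analytic_sd:.3f}")
print(f"simulated sd of 30-day change in log R_t: {empirical_sd:.3f}")
print(f"one-sd multiplicative factor on R_t:      {math.exp(analytic_sd):.2f}")
```

With these numbers the analytic value is about 0.19 on the log scale, so the walk by itself lets R(t) drift by a factor of roughly 1.2 (one standard deviation) over a month. A larger sigma loosens that constraint and gives a less smooth R(t); a smaller one tightens it.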

@ivandebono
Author

What do you mean when you say it works well? Has it been fitted to some data?

I found that changing the fixed parameters in the model makes a significant difference to the final result. So it's not just sigma, but also the seed population, the minimum exposure, and the maximum of the generation interval.

@michaelosthege
Collaborator

For generation interval it's expected to make a difference. I expect the seed to make a difference for the first month and the sigma for the smoothing.
If you can share some plots and/or make a case for how to improve the model, that would be greatly appreciated!

@ivandebono
Author

Using United Kingdom data, I found that the differences were not that significant, especially towards the end of the time series.

The original values in the code:
[Plots: R_United Kingdom_0, Infections_United Kingdom_0]

Now some different values. Sigma is a random variable with a uniform prior. Generation interval prior runs from 0 to 30 days. Lower limit of exposure is 0.05.
[Plots: R_United Kingdom_1, Infections_United Kingdom_1]

[Plots: R_United Kingdom_2, Infections_United Kingdom_2]

[Plots: R_United Kingdom_3, Infections_United Kingdom_3]

[Plots: R_United Kingdom_4, Infections_United Kingdom_4]

[Plots: R_United Kingdom_5, Infections_United Kingdom_5]

Fixed sigma, and generation interval prior cutoff at 30 days.
[Plots: R_United Kingdom_6, Infections_United Kingdom_6]

Cutoff at 40 days.
[Plots: R_United Kingdom_7, Infections_United Kingdom_7]

I have some suggestions for the code, some of which I'm trying to implement myself. But I recognise the difficulty.

  1. A correction to find the number of true positives (given that test counts now mostly exceed case counts, and most tests are PCR). I implemented this separately, using PyMC3.
  2. A cross-correlation with deaths. This is tricky, because the case fatality rate depends heavily on the age profile of the cases. However, we could set the CFR as a random variable.
  3. A question: Why do you use the median of the posterior sample for R(t) rather than the mean?
  4. Some kind of extrapolation of the test volume backwards, towards the beginning of the time series when data are unavailable.
  5. A question: The method relies on tests being greater than cases. Now this isn't always the case, especially in European data sets. How does this affect the estimate of R(t)?
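On question 3, the thread does not give the maintainers' reason, so the following is only one possible consideration. Because the model samples `log_r_t` and R(t) is its exponential, the median survives the transform unchanged, while the mean does not: the mean of exp(x) is larger than exp of the mean of x for any non-constant sample. The sketch below illustrates this with hypothetical Normal draws standing in for posterior samples of `log_r_t` at one date; the location 0.1 and scale 0.4 are made-up values, not model output.

```python
import math
import random
import statistics

rng = random.Random(1)

# Hypothetical stand-in for posterior draws of log R_t on a single date.
# An odd sample size makes the median an actual sample value.
log_r_draws = [rng.gauss(0.1, 0.4) for _ in range(2001)]
r_draws = [math.exp(x) for x in log_r_draws]

median_r = statistics.median(r_draws)
mean_r = statistics.mean(r_draws)
exp_median_log = math.exp(statistics.median(log_r_draws))
exp_mean_log = math.exp(statistics.mean(log_r_draws))

print(f"median of R:        {median_r:.3f}")
print(f"exp(median log R):  {exp_median_log:.3f}")
print(f"mean of R:          {mean_r:.3f}")
print(f"exp(mean log R):    {exp_mean_log:.3f}")
```

The first two printed values agree (up to floating point), whereas the mean of R is pulled above both by the right tail of the skewed distribution. Whether that makes the median the better summary for reporting R(t) is a modelling choice.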
