[New] Relationship of OxCGRT index and parameter values (short-term prediction) #280

Inglezos · 2020-10-26T19:56:54Z

What do we need to document?

I am referring to https://lisphilar.github.io/covid19-sir/usage_policy.html, specifically to the "(Experimental): Relationship of OxCGRT index and parameter values" section.
What do all these results actually mean? Could you provide more detailed documentation and analysis of how the OxCGRT index is used and how it affects the parameter values for each country, with examples and practical explanations?

For example, what does the correlation table mean in practice, how should these results be interpreted, and what is their significance for the country-level analyses in general?
What does the scatter plot at the bottom, which shows the relationship between the reproduction number (Rt) and the OxCGRT stringency index for every country, mean in practice? Does it mean, for example, that Rt is lower for higher index values?
The results have to be reworked/scaled, in order to display properly all the points around 0-100 zone and to ignore the high outliers.
More detailed examples (preferably in notebook .ipynb format) should be provided about how the OxCGRT index and the policy measures affect specific countries (i.e. Japan, USA, India, Greece, Italy, Spain) and how such results are interpreted for each country.
Why is this feature "Experimental"? Does it not fully work? Is there an open issue for that and active development being done?


lisphilar · 2020-10-27T12:59:29Z

This is just an experimental analysis to find the relationship between the parameters and government responses. It is related to the discussion with @ilyasst and @joydisette in #3.
They are the authors of https://ilylabs.github.io/projects/COVID-trackers/

We need to perform machine/deep learning. #204 and #205 are also related.
To enhance this experimental analysis, we should have a dataset of parameter values.

However, parameter estimation for all countries is a very time-consuming task. We have too many countries and the number of phases is increasing every day.

#225 must be solved in advance.

Inglezos · 2020-10-27T13:43:05Z

Yes, I think #225 must definitely be solved in advance. You mean machine learning for pattern recognition of the relationships?
We don't have to do this for all the countries, but for a few at first.

> I am referring to https://lisphilar.github.io/covid19-sir/usage_policy.html, in specific to the (Experimental): Relationship of OxCGRT index and parameter values section.
> What all these results actually mean? Could you provide a more detailed documentation and analysis, about the OxCGRT index usage and how it affects the parameter values for each country, with examples and practical explanations?
>
> For example, what does that correlation table mean practically, how are these results interpreted and what's the results significance/impact on the general countries analyses?
>
> The scatter plot at the bottom for every country, depicting the relationship between the Reproductive number (Rt) and the OxCGRT stringency index, what does it mean practically? Does it mean that for example for higher index values, the Rt is lower?
> The results have to be reworked/scaled, in order to display properly all the points around 0-100 zone and to ignore the high outliers.

So, what do those results mean in practice?

lisphilar · 2020-10-27T16:40:26Z

As it stands, this is just an experimental analysis to find a way to predict the parameter values from government responses using deep learning.

The correlation table and scatter plot are for feature selection.

I do not know which index is necessary for the solution. The correlation table, scatter plot and deep learning are helpful for finding the useful index.

Inglezos · 2020-10-28T02:33:17Z

I would suggest using the StringencyIndexForDisplay, GovernmentResponseIndexForDisplay and ContainmentHealthIndexForDisplay indexes. The final index would be the sum of all three. We could start simple by finding patterns between basic parameters, such as Rt, and this index. Perhaps we could apply trend() in the Rt-index plane to see how the reproduction number, and the parameters in general, change with respect to this index.
Or we could check retrospectively which single index was the most representative of a specific country's effectiveness during an older phase when the pandemic was successfully contained, for example Italy during June-September or China after March. In that way we could rely on only one index (in case we decide not to sum all three into a single one).
This index could then, for example, affect the weights of the estimator's cost function or something like that.
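As a rough illustration of the summed index proposed above, here is a minimal pandas sketch. The three column names are the ones mentioned in the comment; the DataFrame layout, the `CombinedIndex` column name and all values are made up for illustration and are not taken from the OxCGRT dataset.

```python
import pandas as pd

# Hypothetical OxCGRT records for one country (values are made up)
oxcgrt_df = pd.DataFrame(
    {
        "Date": pd.date_range("2020-03-01", periods=4, freq="D"),
        "StringencyIndexForDisplay": [10.0, 40.0, 70.0, 80.0],
        "GovernmentResponseIndexForDisplay": [15.0, 35.0, 60.0, 75.0],
        "ContainmentHealthIndexForDisplay": [12.0, 38.0, 65.0, 78.0],
    }
)

index_cols = [
    "StringencyIndexForDisplay",
    "GovernmentResponseIndexForDisplay",
    "ContainmentHealthIndexForDisplay",
]

# "CombinedIndex" is a hypothetical name: the row-wise sum of the three indexes
oxcgrt_df["CombinedIndex"] = oxcgrt_df[index_cols].sum(axis=1)
print(oxcgrt_df[["Date", "CombinedIndex"]])
```

Since each index ranges over 0-100, the sum ranges over 0-300; dividing by three would keep it on the original scale if that is preferred.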

Inglezos · 2020-11-24T10:06:06Z

For example, if a country applies quarantine measures on day X, we know that the daily new cases will most probably decrease around 2-3 weeks after that day. But the estimation analysis will give us increasing cases for that period, because the parameters will have changed and we simply cannot forecast that, unless we somehow feed this index into the simulation as an extra input parameter that affects the simulated cases. Otherwise the simulated cases are not realistic and do not reflect our knowledge that extra measures are in effect. We need to let the model know about the measures, and the best way to do that would be this index.
Do you have any ideas how something like this could be implemented soon? Could we start with a simple solution?

lisphilar · 2020-11-24T13:08:11Z

I intended to analyse the relationship with PolicyMeasures class, but the difficult issue #225 needs to be solved first.
Here is one solution with Scenario class: predict parameter values and simulate the number of cases with these predicted parameter values.

1. Perform parameter estimation and get a dataframe with index Date and columns theta, rho, sigma, kappa. (Please see Scenario._track_param() method.)
2. Calculate rolling mean values of the estimated parameter values because they are discrete values when viewed by date (cf. continuous when viewed by phase).
3. Combine OxCGRT records for the country and the rolled estimated parameter values.
4. Split this data into train/validation data.
5. Select one OxCGRT index for each parameter using the correlation coefficient, considering the delay you mentioned.
6. Predict the future values of each parameter from the selected OxCGRT index values, using fbprophet (with additional regressors) or the darts package.
7. Validate this prediction with the validation data.
8. Simulate the number of cases with the predicted parameter values.

What do you think about these steps? Sentences in bold will be the most difficult part.
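The data-preparation part of the steps above (rolling mean, combining with delayed OxCGRT scores, train/validation split and index selection by correlation) could be sketched as follows. This is only a sketch on synthetic data: the column names `Stringency_index` and `Testing_policy`, the 7-day window, the 14-day delay and the 80/20 split are all assumptions for illustration, and the forecasting step itself is left out.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2020-06-01", periods=120, freq="D")
delay = 14  # assumed delay in days

# Synthetic OxCGRT-like scores (hypothetical column names, made-up values)
oxcgrt_df = pd.DataFrame(
    {
        "Stringency_index": np.linspace(20, 80, 120),
        "Testing_policy": rng.uniform(0, 3, 120),
    },
    index=dates,
)

# Synthetic rho: decreases as the stringency of `delay` days earlier increases
lagged = oxcgrt_df["Stringency_index"].shift(delay)
param_df = pd.DataFrame(
    {"rho": 0.2 - 0.002 * lagged + rng.normal(0, 0.002, 120)}, index=dates
)

# Rolling mean of the estimated parameter values
rolled = param_df.rolling(window=7, min_periods=1).mean()

# Combine with OxCGRT scores shifted forward by the delay
merged = rolled.join(oxcgrt_df.shift(delay), how="inner").dropna()

# Chronological train/validation split (no shuffling for time series)
n_train = int(len(merged) * 0.8)
train, valid = merged.iloc[:n_train], merged.iloc[n_train:]

# Select the index with the strongest correlation to rho in the training data
corr = train.corr()["rho"].drop("rho")
best_index = corr.abs().idxmax()
print(corr)
print("Selected index:", best_index)
```

With real data the same selection would be repeated for theta, kappa and sigma, and the correlation would be recomputed for each candidate delay.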

Inglezos · 2020-11-25T00:12:08Z

> I intended to analyse the relationship with PolicyMeasures class, but solving difficult issue #225 is necessary.

Why is it necessary to have a web service/RESTful API for this relationship? Can't we run the estimation once in advance for a specific set of countries and then find the relationship on the fly?

Regarding the above algorithm, I think it would suffice as a standalone solution and would enable the model to take the government measures in effect into account. I think it is vital to implement this soon. If a complete solution is not possible, as a starting point it would be enough to apply this to a single future phase (one predicted set of parameters) or to the next month, for short-term impact.

Another, more general question: what is the physical meaning of the estimated parameters? Do they make sense, are the parameters logical? For example, for Greece the Rt is now 19.5. What does this mean? Is it logical that one individual can infect 19 other people? Or is it just a value with no realistic meaning that serves only to fit the data to the model?

lisphilar · 2020-11-25T14:31:47Z

Which do you want to use for this analysis, PolicyMeasures class or Scenario class? Does "standalone solution" mean that we will create a new class?

If PolicyMeasures class: we can use a small number of countries with the .countries setter (i.e. a property users can change), but it would be helpful if we could run many countries. I have not tried it, but machine learning needs a lot of data to predict the results while avoiding over-fitting. (However, we can try it. If you agree, please move forward to a discussion of the detailed code or algorithms.)

If Scenario class: we can implement the function with the steps I mentioned in the previous comment. Please discuss the code to implement.

If another class: please explain the detailed steps of your idea.

> Another question more general, what is the physical meaning of the estimated parameters?

The reproduction number is an index that tells us whether there is an outbreak (Rt > 1) or not.
Parameter values have physical/logical meanings and have units [1/min]. E.g. rho is the effective contact rate.
Please refer to the model description in my Kaggle notebook.
https://www.kaggle.com/lisphilar/covid-19-data-with-sir-model#SIR-to-SIR-F

Rho, sigma and kappa are functions of control factors as explained in Factors of model parameters section of my Kaggle Notebook.
https://www.kaggle.com/lisphilar/covid-19-data-with-sir-model#Factors-of-model-parameters
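To make the link between the parameters and Rt concrete, here is a small sketch. It assumes the SIR-F reproduction number is Rt = rho * (1 - theta) / (sigma + kappa), which is how I understand the model described in the linked notebook; please check it against the notebook. The function name and the parameter values are made up for illustration.

```python
def reproduction_number(theta: float, kappa: float, rho: float, sigma: float) -> float:
    """Rt of the SIR-F model, assuming Rt = rho * (1 - theta) / (sigma + kappa)."""
    return rho * (1 - theta) / (sigma + kappa)


# Made-up parameter values for illustration only
rt_base = reproduction_number(theta=0.002, kappa=0.005, rho=0.2, sigma=0.075)

# The same contact rate with a ten times smaller removal rate (sigma + kappa)
rt_slow_removal = reproduction_number(theta=0.002, kappa=0.0005, rho=0.2, sigma=0.0075)

print(rt_base, rt_slow_removal)
```

Under this assumption, a very large Rt can come from a small estimated sigma + kappa (for example, when recoveries are reported late) and not only from a large contact rate, which may be relevant to the Greece question above.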

Inglezos · 2020-12-08T08:17:07Z

I think a Scenario class/method would suffice for a first implementation. Simple pattern recognition, or even trend analysis in the plane of {Rt or parameter set} versus response index, could probably be enough to predict short-term future model parameters after some measures have been applied, based on an analysis of a few specific, representative countries.

lisphilar · 2020-12-28T14:20:31Z

[MEMO]
pre-test: https://gist.github.com/lisphilar/637d248376eb9fb7511ba9c037aae9b2
Updated idea:

1. User specification or time-series prediction of OxCGRT scores in the future phases
2. Linear regression: X = OxCGRT scores, y = rho values etc. (theta, kappa, sigma, rho)
3. Evaluation of the linear regression (RMSLE etc.)
4. Predict rho values etc. in the future phases with the linear regression above
5. Set future phases using the predicted parameter values
6. Simulate the number of cases
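The regression, evaluation and prediction items of the updated idea could look roughly like this. It is a sketch with NumPy least squares on made-up data for a single score and a single parameter (rho); the variable names, the future score of 90 and all numbers are assumptions, and the linked gist may do this differently.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up training data: one OxCGRT score and the estimated rho per phase
scores = np.array([20.0, 35.0, 50.0, 60.0, 75.0, 85.0])
rho = 0.25 - 0.002 * scores + rng.normal(0, 0.003, scores.size)

# Linear regression rho = a + b * score, fitted by ordinary least squares
X = np.column_stack([np.ones_like(scores), scores])
coef, *_ = np.linalg.lstsq(X, rho, rcond=None)
fitted = X @ coef


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error."""
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


print("intercept, slope:", coef)
print("RMSLE on training data:", rmsle(rho, fitted))

# User-specified OxCGRT score for a future phase (hypothetical value)
future_score = 90.0
rho_future = float(coef[0] + coef[1] * future_score)
print("Predicted rho for the future phase:", rho_future)
```

The predicted value would then be passed to the future phase before simulating the number of cases.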

Inglezos · 2020-12-28T15:22:27Z

May I suggest another way to predict future values?
What if we forget the OxCGRT index for a moment and focus solely on Rt (and the other parameters)? Essentially, we need to find a function that fits the estimated values of Rt (and of the other parameters). If we find such a fitting function, we can extrapolate the next values. We could use the index only if we want to estimate the delay period (if this is needed). What do you think?

lisphilar · 2020-12-28T15:32:43Z

Yes, time series forecasting using only the parameter values is an alternative. However, I tried a prototype of this solution at the bottom of the notebook I mentioned in the previous comment, and the forecasting failed, as shown in line 97. How can we improve it?

Inglezos · 2020-12-28T16:04:37Z

You mean a prototype of which solution, the alternative I described or the one you had in mind with the index?
As a first attempt I think it would be easier to try the alternative one.
Have you tried other values for the delay, or other parameters?
I think the major problem is that you used a linear regressor. I wouldn't expect the values to follow such a distribution.

Inglezos · 2020-12-28T16:08:22Z

Perhaps a time varying autoregressive model would be more appropriate for fitting
https://arxiv.org/pdf/1711.05204.pdf
https://icasas.github.io/tvReg/reference/tvAR.html
(I haven't searched into this yet)
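Before trying a time-varying model, a plain first-order autoregressive fit is an easy baseline to compare against. The sketch below is not the tvAR method from the links above; it is a simpler, fixed-coefficient AR(1) fitted by least squares with NumPy, and the function names and the Rt series are made up for illustration.

```python
import numpy as np


def fit_ar1(series):
    """Fit y[t] = c + phi * y[t-1] by ordinary least squares."""
    y = np.asarray(series, dtype=float)
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return float(c), float(phi)


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion `steps` times from the last observed value."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


# Made-up, noise-free Rt series generated by Rt[t] = 0.5 + 0.8 * Rt[t-1]
rt = [1.0]
for _ in range(29):
    rt.append(0.5 + 0.8 * rt[-1])

c, phi = fit_ar1(rt)
next_values = forecast_ar1(rt[-1], c, phi, steps=3)
print(c, phi, next_values)
```

Real estimated parameters change stepwise between phases, so a fixed-coefficient model like this may fit them poorly; that would be an argument for the time-varying variant.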

lisphilar · 2020-12-28T16:14:23Z

The linear regression part was for the idea with OxCGRT scores. It is not related to the alternative you mentioned.
The bottom lines with the darts package are for the alternative (time series forecasting of parameter values).

lisphilar · 2020-12-29T08:57:10Z

MEMO: https://gist.github.com/lisphilar/8f492770cd4c306b081873ca71b7871d
It will be required to predict OxCGRT scores using time series forecasting, but this is the next step.

lisphilar · 2020-12-29T10:13:31Z

I tried time series forecasting without OxCGRT scores, but it seems difficult to forecast parameter values with this solution because the parameter values show wild ups and downs.
https://gist.github.com/lisphilar/30cb8d615659948334fb3aa5faa20aca

Inglezos · 2020-12-29T11:21:20Z

> MEMO: https://gist.github.com/lisphilar/8f492770cd4c306b081873ca71b7871d
> It be required to predict OxCGRT scores using time series forcasting, but this is the next step.

This is very good I think!

> I tired time series forcasting without OxCGRT scores, but it seems difficult to forecast parameter values with this solution because parameter values show wild ups and downs.
> https://gist.github.com/lisphilar/30cb8d615659948334fb3aa5faa20aca

It is a nice first approach I think. Also try AutoARIMA in the first model selection (it has the same score as exponential smoothing).

As a general note, I think the RMSLE by itself is not very informative here, because the parameter values are very small.
Maybe these ups and downs cannot be forecast with good accuracy. They probably depend on the index.
Also, I don't think there is any point in predicting long-term. We should aim to predict the parameters for the next phase only, short-term, i.e. for 2-6 weeks at most into the future.
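The point about RMSLE and small values can be checked numerically. In this sketch the "true" and "predicted" values are made up; in both cases the prediction is exactly twice the true value, so the relative error is identical.

```python
import numpy as np


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error."""
    return float(np.sqrt(np.mean((np.log1p(y_pred) - np.log1p(y_true)) ** 2)))


# Parameter-sized values (made up): prediction is off by a factor of two
true_rho = np.array([0.010, 0.012, 0.011])
score_small = rmsle(true_rho, 2 * true_rho)

# Case-count-sized values (made up): same factor-of-two error
true_cases = np.array([1000.0, 1200.0, 1100.0])
score_large = rmsle(true_cases, 2 * true_cases)

print(score_small)  # about 0.01
print(score_large)  # about 0.69
```

Because log1p(x) is close to x for small x, RMSLE on parameter values behaves like an absolute error and looks good even when the forecast is far off in relative terms. A relative metric, or a comparison of the simulated number of cases, may be a fairer check.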

Inglezos · 2020-12-29T11:44:52Z

How is the OxCGRT index combined and used in forecasting?

lisphilar · 2020-12-29T14:14:39Z

Like this: https://gist.github.com/lisphilar/21d251e40822186a9c6490dac82ce988

Inglezos · 2020-12-29T15:37:05Z

This seems very promising indeed!!

lisphilar · 2020-12-29T15:51:25Z

I am trying to use the recovery period (= 17 days) rather than 14 days as the delay. Do you have any ideas?

Inglezos · 2020-12-29T16:10:41Z

To calculate the delay? Hmm... what if you compared, per country, the dates when the index changed rapidly or critical measures were imposed with the dates of the phases (from S-R trend analysis) or the dates when the parameters changed rapidly?

It would be like applying change point analysis, but in the parameters-index plane instead of S-R.

Averaging the time between these change points could then indicate the delay period.
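One simple way to get a first number for such a delay, without a full change point analysis, is to scan candidate lags and keep the one where the shifted index correlates most strongly with the parameter. This lagged-correlation scan is a simpler stand-in for the change point idea above, not an implementation of it; the function name `estimate_delay`, the 30-day search range and the step-shaped data are all made up for illustration.

```python
import numpy as np
import pandas as pd


def estimate_delay(index_series, param_series, max_delay=30):
    """Return the lag (in days) that maximises |correlation| between
    the index shifted forward by that lag and the parameter series."""
    corrs = {
        lag: abs(param_series.corr(index_series.shift(lag)))
        for lag in range(max_delay + 1)
    }
    return max(corrs, key=corrs.get)


rng = np.random.default_rng(2)
dates = pd.date_range("2020-09-01", periods=100, freq="D")
days = np.arange(100)

# Made-up data: measures tighten on day 40, rho drops 17 days later
stringency = pd.Series(np.where(days < 40, 30.0, 70.0), index=dates)
rho = pd.Series(
    np.where(days < 57, 0.2, 0.1) + rng.normal(0, 0.002, 100), index=dates
)

delay = estimate_delay(stringency, rho)
print("Estimated delay in days:", delay)
```

Repeating this per country, or per change in the index, and averaging the results would follow the averaging suggestion above.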

lisphilar · 2020-12-29T17:04:44Z

It seems a difficult issue and this will be solved in the future versions...

I created pull request #471 as the first step.
I will check the outputs for some countries tomorrow (UTC).

Usage:

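import covsirphy as cs

# jhu_data, population_data and oxcgrt_data are assumed to be loaded beforehand
snl = cs.Scenario(jhu_data, population_data, "Japan")
# Split the records into phases with S-R trend analysis
snl.trend()
# Estimate the SIR-F parameters for each phase
snl.estimate(cs.SIRF)
# Predict parameter values of a future phase from OxCGRT scores
snl.predict(oxcgrt_data)
snl.summary()
snl.simulate()
snl.history("Rt")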

I may rename .predict() to .fit_predict() and create .fit() and .predict().

lisphilar · 2020-12-30T11:12:30Z

#471 was merged and tutorial of .fit_predict() etc. was documented.
https://lisphilar.github.io/covid19-sir/usage_quick.html#Short-term-prediction-of-parameter-values

rebeccadavidsson · 2021-01-08T13:07:33Z

In this example notebook, this code was used to include the delay period of 14 days:

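from datetime import timedelta

# Assume OxCGRT score impact on parameter values with 14 days delay
delay = 14
# oxcgrt_df and param_df are assumed to be defined earlier in the notebook
df = oxcgrt_df.set_index("Date")
# Shift the score dates forward so each score lines up with the parameters it is assumed to affect
df.index += timedelta(days=delay)
merged_df = param_df.join(df, how="right")
merged_df.tail()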

However, this delay is different for each country and the 'end' of the effects from Policy Measures is also very different. I made a short overview at the bottom of this notebook to identify the 'ending' effect of policy measures:
https://github.com/rebeccadavidsson/SIR_LSTM/blob/main/corr_oxf.ipynb

Just wanted to share this for any new ideas of implementations.

Inglezos · 2021-01-08T13:18:10Z

Yes, as far as I know the delay should not be a fixed value, but should be calculated dynamically for each country. In this first implementation the delay is set to the recovery period just to have a first working version. This will have to be revised.
I refer you to my previous comment:

> In order to calculate the delay? Hmm.. if you compared the dates per country when the index was changed rapidly or critical measures were imposed to the dates of the phases (from S-R trend analysis) or the dates when parameters changed rapidly?
> Like applying change point analysis but in parameters-index plane instead of S-R.
> Averaging of these change points duration then could indicate such delay period.

Inglezos · 2021-01-08T14:42:38Z

The delay period will be reworked with issue #513.

lisphilar · 2021-01-08T14:50:58Z

Very very interesting. We will move forward to the new issue.

Inglezos added the documentation Improvements or additions to documentation label Oct 26, 2020

lisphilar added brainstorming Discussion to get creative ideas enhancement New feature or request labels Oct 27, 2020

lisphilar added this to the Release v2.14.0 milestone Dec 13, 2020

lisphilar changed the title ~~Relationship of OxCGRT index and parameter values~~ [New] Relationship of OxCGRT index and parameter values Dec 19, 2020

lisphilar mentioned this issue Dec 29, 2020

Issue280 prototype #471

Merged

lisphilar changed the title ~~[New] Relationship of OxCGRT index and parameter values~~ [New] Relationship of OxCGRT index and parameter values (short-term prediction) Dec 30, 2020

lisphilar mentioned this issue Dec 31, 2020

[New] Prediction of ODE parameters with indicators #477

Closed

lisphilar closed this as completed Dec 31, 2020

lisphilar mentioned this issue Jan 8, 2021

[New] calculate delay value per country in parameters forecast #513

Closed
