flexibility in estimate_delays #4

smwindecker · 2023-10-04T05:38:02Z

Currently the function uses the linelist data to estimate delays. We should have the flexibility to use the linelist for certain dates/states, while specifying other dates/states for which we instead use a national average, a disease literature average, or some other source.

We should not make it too easy to default to using bad data.
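One possible reading of the two points above, as a rough sketch only: use the linelist for a date/state cell when it has enough paired dates, and otherwise require the caller to name a fallback explicitly rather than silently picking one. The function name `choose_delay_source`, the `min_pairs` threshold, and the source labels are all hypothetical and not part of the current `estimate_delays` interface.

```python
from typing import Optional

# Hypothetical labels for the non-linelist sources mentioned above.
ALLOWED_FALLBACKS = {"national_average", "literature_average"}


def choose_delay_source(n_pairs: int, min_pairs: int,
                        fallback: Optional[str] = None) -> str:
    """Pick the delay source for one date/state cell.

    n_pairs   : number of paired dates available in the linelist for the cell
    min_pairs : assumed minimum number of pairs needed to trust the linelist
    fallback  : source to use when the linelist is too sparse; must be named
                explicitly so that bad data is never used by accident
    """
    if n_pairs >= min_pairs:
        return "linelist"
    if fallback not in ALLOWED_FALLBACKS:
        raise ValueError(
            "linelist too sparse for this cell; choose a fallback from "
            f"{sorted(ALLOWED_FALLBACKS)}"
        )
    return fallback
```

Raising an error when no fallback is given is a deliberate choice here: it forces a decision per cell instead of making a poor default easy to reach.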

smwindecker · 2023-10-04T05:38:20Z

Outcome of further discussion: implement multiple imputation for this task instead.


AugustHao · 2024-01-10T04:54:40Z

We need to estimate time-varying delays from paired-date data. The current approach constructs a rolling window over the paired-date delay data and then computes a CDF within each window. This is computationally expensive, so a long-term goal is to find a better way to implement it, but noting that we have something that works in the meantime.
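As a minimal sketch of the rolling-window idea (not the package's actual implementation), the empirical CDF for one target date can be computed from the pairs whose first date falls inside a window around it. The function name, the symmetric window, and the `max_delay` cap are assumptions made for illustration.

```python
from datetime import date, timedelta


def rolling_delay_cdf(pairs, target, half_width, max_delay):
    """Empirical delay CDF for `target` from a symmetric rolling window.

    pairs      : list of (first_date, second_date) tuples, e.g. onset and
                 notification dates (which two dates are paired is assumed)
    half_width : window half-width in days (assumed symmetric)
    max_delay  : longest delay, in days, kept in the CDF

    Returns a list where entry d is P(delay <= d) for d = 0..max_delay,
    or None when no pairs fall inside the window.
    """
    lo = target - timedelta(days=half_width)
    hi = target + timedelta(days=half_width)
    delays = [(b - a).days for a, b in pairs if lo <= a <= hi]
    # Drop negative or implausibly long delays.
    delays = [d for d in delays if 0 <= d <= max_delay]
    if not delays:
        return None
    n = len(delays)
    return [sum(1 for x in delays if x <= d) / n for d in range(max_delay + 1)]
```

Calling this once per date rescans every pair each time, which illustrates why the approach becomes expensive over long time series.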

Key points to consider:

- The goal is to estimate delays over a continuous time period, but paired-date data do not necessarily cover every date in that period, i.e. there are gaps in the time series where we observe no paired delays because one of the two dates is missing. This means we necessarily have to interpolate the delay distribution across some date ranges.
- If we can define a parametric form for the delay distribution, with the distribution parameters as time-varying variables, we can learn them from data using a modelling approach. But this relies on very strong assumptions about the shape of the delay distributions, which is undesirable.
- There may be a way to mix parametric and non-parametric densities in an informative way?
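One simple, non-parametric way to fill the gaps described in the first point is to mix the nearest observed CDFs on either side of a gap, weighted by distance in days. This is only an illustrative sketch: the function name is hypothetical, and linear mixing is an assumed choice, not something the thread has settled on. A pointwise weighted average of two CDFs is itself a valid CDF, which is why it is used here.

```python
def interpolate_cdfs(cdfs):
    """Fill missing daily CDFs by mixing the nearest observed neighbours.

    cdfs : list indexed by day; each entry is a CDF (list of floats, all the
           same length) or None where no paired delays were observed.

    Gaps between two observed days get a distance-weighted mixture of the
    two; gaps at either end copy the nearest observed CDF.
    """
    known = [i for i, c in enumerate(cdfs) if c is not None]
    if not known:
        raise ValueError("need at least one observed CDF")
    out = list(cdfs)
    for i, c in enumerate(cdfs):
        if c is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = list(cdfs[right])
        elif right is None:
            out[i] = list(cdfs[left])
        else:
            w = (i - left) / (right - left)  # 0 at left neighbour, 1 at right
            out[i] = [(1 - w) * a + w * b
                      for a, b in zip(cdfs[left], cdfs[right])]
    return out
```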

AugustHao · 2024-01-10T05:04:14Z

We should have a way to filter out recent days from the calculation of delays.

See Appendix A of this paper for a similar approach and justification: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-023-05428-4#appendices

In summary: because not all recent infections have been observed yet in the latest reported cases, those that have been observed will tend to have shorter delays than average. If we computed time-varying delays from these observations, we would erroneously underestimate the delay for the most recent time period. We should therefore ignore delay information from the most recent days and hold the delay distribution constant from about one maximum delay range before the present, as is done in the paper.
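A rough sketch of the clamping step, assuming the delay distributions are already available as one CDF per day and that "one maximum delay range" means the last `max_delay` days. The function name and this indexing convention are assumptions for illustration, not the paper's or the package's code.

```python
def clamp_recent(cdfs, max_delay):
    """Hold the delay CDF constant over the most recent `max_delay` days.

    cdfs      : list of per-day CDFs, oldest first, with no missing entries
    max_delay : number of most recent days whose estimates are treated as
                biased towards short delays and therefore discarded

    The last trusted day's CDF is copied forward over the discarded days.
    """
    n = len(cdfs)
    cutoff = n - max_delay  # index of the first day considered too recent
    if cutoff <= 0:
        raise ValueError("time series is shorter than max_delay")
    last_trusted = cdfs[cutoff - 1]
    return cdfs[:cutoff] + [list(last_trusted) for _ in range(n - cutoff)]
```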

AugustHao added the "time-varying delay" label (dev features to implement relating to time-varying delay (convolution) mass function) on Oct 26, 2023

smwindecker pushed a commit that referenced this issue Nov 16, 2023

Merge pull request #4 from AugustHao/test_type_split_smooth

95a2b4c

Test type split smooth
