
Change Getting started documentation to point users to faster models #730

Open
jamesmbaazam opened this issue Jul 31, 2024 · 5 comments · May be fixed by #695

jamesmbaazam (Contributor) commented Jul 31, 2024

This issue can be solved after merging the benchmarking vignette in #695.

I was curious about how the paper "Real-time estimation of the epidemic reproduction number: Scoping review of the applications and challenges" measured run times of the various R packages and came across this section in the supplementary material where more details are given (page 3):

Computational speed has been assessed by each author of this study with different computer specifications (see Table B). The main function in each package/tool was taken from provided examples (if available) and wrapped in the system.time() R function to measure the execution time. The main function estimated the reproduction number for each R package/tool except for epicontacts, which estimates the serial interval. We chose the following classifications: <10 seconds = very good, 10 seconds – 5 minutes = good, >5 minutes = poor. The classification allocated to each package was based on the agreement of at least 2 out of the 3 computers. We note that such direct comparisons of the runtimes of the different models may not be fair, as the examples provided by each package which we have used to assess speed vary in terms of the dataset used, model complexity, and dimensionality of the reproduction number to estimate. Nevertheless, we assume that examples will always be relatively simple and therefore their computational speed may be a good overall indicator of speed of reproduction number estimation in general using a given package.
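
As a side note for the vignette, the measurement boils down to something like the minimal sketch below (here `run_example()` is a hypothetical placeholder for whichever package's documented example call is being timed, not a real function):

```r
# Minimal sketch of the timing approach described above: wrap a package's
# documented example call in system.time() and read off the elapsed
# wall-clock time. run_example() is a hypothetical placeholder.
timing <- system.time({
  result <- run_example()
})

# Elapsed seconds, scored as: <10 s = very good, 10 s to 5 min = good,
# >5 min = poor.
timing[["elapsed"]]
```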

This has got me thinking about whether we should change our docs to use the quicker models that sacrifice some accuracy, since those examples are the first thing users will interact with (copy and paste to try out), with a clear caveat. We can then signpost the slower but more accurate models for real-world use cases.

I'm also making a note here to raise an issue in EpiEstim to re-assess the speed score in that table using the faster and relatively accurate models for which there is evidence in #695.

jamesmbaazam changed the title from "Change Getting started documentation to point use to faster models" to "Change Getting started documentation to point users to faster models" on Jul 31, 2024
jamesmbaazam linked a pull request on Jul 31, 2024 that will close this issue
seabbs (Contributor) commented Jul 31, 2024

Do we want users to use models that are less accurate? That is the implicit trade-off of showcasing faster, more approximate models as the first thing people see (as a lot of people will then just use those).

> I was curious about how the paper "Real-time estimation of the epidemic reproduction number: Scoping review of the applications and challenges" measured run times of the various R packages and came across this section in the supplementary material where more details are given (page 3):

I'm not sure this really represents what a real user would do when assessing a package, or, more generally, a very credible package review, so I am not super keen to make decisions that optimise for it.

All that being said, I really don't feel that strongly. I think the minimum we should do is clearly point people to the fact that there are different model formulations they could use that have different properties.
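
To make that concrete, the signposting could look roughly like the sketch below. This assumes we are talking about the EpiNow2 defaults and that the option names still match the current docs; they are from memory, not verified, so treat them as assumptions:

```r
# Rough sketch only; the option names are assumed and should be checked
# against the current documentation before going into the guide.
library(EpiNow2)

# Default Getting started configuration: full MCMC with a Gaussian process
# on Rt. More accurate, but slow.

# Faster, more approximate configurations we could signpost, each with a
# caveat about reduced accuracy:
# - approximate inference instead of MCMC:  stan = stan_opts(method = "vb")
# - drop the Gaussian process on Rt:        gp = NULL
# - non-mechanistic back-calculation:       rt = NULL
```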

jamesmbaazam (Contributor, Author):

All good points. I think more generally though, new users will use tools they have seen others use and may avoid ones that had a bad or unfair review. So, if people see elsewhere that our models are slow, they may not even try them. Moreover, I may be wrong in saying this, but users often do not take the time to try out different packages before making a choice.

seabbs (Contributor) commented Jul 31, 2024

> if people see elsewhere that our models are slow

To be honest, my view is that we need better multi-model evaluations run across groups, rather than optimising for the current status quo.

jamesmbaazam (Contributor, Author):

Alright. I'll close this issue.

seabbs (Contributor) commented Aug 1, 2024

Do we want to reopen this and, instead of changing the default, improve the signposting to faster model configurations?

jamesmbaazam reopened this on Aug 1, 2024
jamesmbaazam self-assigned this on Aug 7, 2024