Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve <multi_epidist> printing #326

Merged
merged 21 commits into from
Jun 10, 2024
Merged

Improve <multi_epidist> printing #326

merged 21 commits into from
Jun 10, 2024

Conversation

joshwlambert
Copy link
Member

This PR addresses #202 and #302 by improving the print method for the <multi_epidist> class (print.multi_epidist()). This class wraps a list of <epidist> objects to provide summary information of the data returned from the {epiparameter} database (epidist_db()) and a preview of some or all of the <epidist> objects returned.

The design of the new print method takes inspiration from tibbles (<tbl_df>) and specifically the {pillar} R package.

The print.multi_epidist() function prints a header with a description of the list and the number of elements in the list. It then provides the counts for the number of diseases and number of epidemiological parameters returned, and lists them. It then provides the head of the list of <epidist> objects. The number of <epidist> objects to print is determined by global options set by the package, but can be modified by a user by specifying print(n = ...). Then a footer is printing stating whether other list elements are stored but not show and hints to use print(n = ...) and parameter_tbl(). Lastly, a URL to the database vignette is provided as requested in #302.

Here is an example:

library(epiparameter)
ed = epidist_db()
#> Returning 122 results that match the criteria (99 are parameterised). 
#> Use subset to filter by entry variables or single_epidist to return a single entry. 
#> To retrieve the citation for each use the 'get_citation' function
ed
#> # List of <epidist> objects 
#> # A list:  122 elements
#> 
#> Number of diseases: 23
#> ❯ Adenovirus ❯ Chikungunya ❯ COVID-19 ❯ Dengue ❯ Ebola Virus Disease ❯ Hantavirus Pulmonary Syndrome ❯ Human Coronavirus ❯ Influenza ❯ Japanese Encephalitis ❯ Marburg Virus Disease ❯ Measles ❯ MERS ❯ Mpox ❯ Parainfluenza ❯ Pneumonic Plague ❯ Rhinovirus ❯ Rift Valley Fever ❯ RSV ❯ SARS ❯ Smallpox ❯ West Nile Fever ❯ Yellow Fever ❯ Zika Virus Disease
#> 
#> 
#> Number of epi distributions: 12
#> ❯ generation time ❯ hospitalisation to death ❯ hospitalisation to discharge ❯ incubation period ❯ notification to death ❯ notification to discharge ❯ offspring distribution ❯ onset to death ❯ onset to discharge ❯ onset to hospitalisation ❯ onset to ventilation ❯ serial interval
#> 
#> 
#> [[1]]
#> Disease: Adenovirus
#> Pathogen: Adenovirus
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-6
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-6>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 1.247
#>   sdlog: 0.975
#> 
#> [[2]]
#> Disease: Human Coronavirus
#> Pathogen: Human_Cov
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-7
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-7>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 0.742
#>   sdlog: 0.918
#> 
#> [[3]]
#> Disease: SARS
#> Pathogen: SARS-Cov-1
#> Epi Distribution: incubation period
#> Study: Lessler J, Reich N, Brookmeyer R, Perl T, Nelson K, Cummings D (2009).
#> "Incubation periods of acute respiratory viral infections: a systematic
#> review." _The Lancet Infectious Diseases_.
#> doi:10.1016/S1473-3099(09)70069-8
#> <https://doi.org/10.1016/S1473-3099%2809%2970069-8>.
#> Distribution: lnorm
#> Parameters:
#>   meanlog: 0.660
#>   sdlog: 1.205
#> 
#> # ℹ 119 more elements
#> # ℹ Use `print(n = ...)` to see more elements.
#> # ℹ Use `parameter_tbl()` to see a summary table of the parameters.
#> Explore database online at
#> <https://epiverse-trace.github.io/epiparameter/dev/articles/database.html>

Created on 2024-06-03 with reprex v2.1.0

This PR adds {cli} as a package dependency for using symbols, pluralisation and URL formatting.

An .onLoad() function is added to load the package options in the global environment.

This PR also restructures the package slightly by putting the print.multi_epidist() function it it's own file print.R. The tests and function documentation is also updated where necessary.

Copy link
Member

@Bisaloo Bisaloo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of setting / getting the options, it's technically correct but probably not very intuitive.

At the moment, the only way to change the options is to set them up after the R session has been restarted (since options are not preserved from one session to the other), but before epiparameter is loaded.

A better alternative would be to either:

  • get the user provided value from an environment variable, so it can be set across sessions. You can check how linelist does this.
  • read the option in print() rather than in .onLoad(), so it can be changed even after epiparameter has been loaded.

Comment on lines +8 to +14
epiparameter_options <- list(
# Maximum number of <epidist> objects to print for <multi_epidist>
print_max = getOption("epiparameter.print_max", default = 5L),
# Number of <epidist> objects to print if <multi_epidist> has more than
# `print_max`
print_min = getOption("epiparameter.print_min", default = 3L)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having two coupled options is a bit confusing IMO. It took me some time to understand how many elements will be printing in which situation with these settings.
We can also end up in a situation where a shorter object prints more elements than a longer object (when length == 4 with the default settings), which seems a bit counterintuitive to me.

What about about using a single value that would be the default for n?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This design pattern follows the {tibble} printing. My assumption is it is worth printing a slightly longer table/list if all the rows/elements can be shown up to a limit.

library(tibble)
tibble(a = 1:3, b = 1:3)
#> # A tibble: 3 × 2
#>       a     b
#>   <int> <int>
#> 1     1     1
#> 2     2     2
#> 3     3     3
tibble(a = 1:20, b = 1:20)
#> # A tibble: 20 × 2
#>        a     b
#>    <int> <int>
#>  1     1     1
#>  2     2     2
#>  3     3     3
#>  4     4     4
#>  5     5     5
#>  6     6     6
#>  7     7     7
#>  8     8     8
#>  9     9     9
#> 10    10    10
#> 11    11    11
#> 12    12    12
#> 13    13    13
#> 14    14    14
#> 15    15    15
#> 16    16    16
#> 17    17    17
#> 18    18    18
#> 19    19    19
#> 20    20    20
tibble(a = 1:50, b = 1:50)
#> # A tibble: 50 × 2
#>        a     b
#>    <int> <int>
#>  1     1     1
#>  2     2     2
#>  3     3     3
#>  4     4     4
#>  5     5     5
#>  6     6     6
#>  7     7     7
#>  8     8     8
#>  9     9     9
#> 10    10    10
#> # ℹ 40 more rows

Created on 2024-06-04 with reprex v2.1.0

Unless you feel the current printing options will lead to confusion I will keep as is for now.

R/print.R Outdated Show resolved Hide resolved
n_entries <- length(x) # nolint object_usage_linter used by cli {} syntax

diseases <- vapply(x, "[[", FUN.VALUE = character(1), "disease")
alpha_unique_diseases <- sort(unique(diseases))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default sort() is locale-dependent. Is it what we want? If not, sort(method = "radix") is a better, locale-independent alternative.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I am using sort() to alphabetise the vector of characters. I think locale-dependent sorting would be best for alphabetical order across different languages, even if that differs between locales, but please correct me if I've misunderstood this.

#'
#' @return Invisibly returns a `<multi_epidist>`. Called for side-effects.
#' @export
print.multi_epidist <- function(x, ..., n = NULL) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print.multi_epidist <- function(x, ..., n = NULL) {
print.multi_epidist <- function(x, n = NULL, ...) {

Or do you prefer to always force n to be named?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to force explicit argument naming. Again following the design from {tibble}: https://tibble.tidyverse.org/reference/formatting.html.

R/print.R Outdated Show resolved Hide resolved
R/print.R Outdated Show resolved Hide resolved
R/print.R Show resolved Hide resolved
R/print.R Outdated Show resolved Hide resolved
R/print.R Outdated Show resolved Hide resolved
@papsti
Copy link

papsti commented Jun 3, 2024

this is wonderful! thanks for taking my feedback (and the feedback of others) into account here.

one small point---the new database page is a bit hard to read with the current pkgdown site styling:

image

this hard-to-read theming seems to be applied to the dev site, but not the main site.

dev:
image

main:
image

just quickly searching through open issues, i haven't seen anyone raise this yet, but apologies if this is a known and documented issues (or intended behaviour!)

@joshwlambert
Copy link
Member Author

get the user provided value from an environment variable, so it can be set across sessions. You can check how linelist does this.

@Bisaloo I'm not sure how I can achieve this. Sys.getenv() unset argument takes a character string which is fine for {linelist} as the variable you want to store is a character. However, in this case for {epiparameter} we want the default to be a list of 2 numerics.

@joshwlambert
Copy link
Member Author

@papsti thanks for reporting. I haven't seen this before but will look into it. At the moment we have to link the development site as the database vignette has not been rendered on the main site (it will be once the next version is released). Once it is on the main site I will change the print message to link there.

Can I ask do you use dark-mode in your browser, and do other Epiverse websites have the same issue between dev and main site?

{epidemics}:

@papsti
Copy link

papsti commented Jun 4, 2024

@joshwlambert, yes, i noticed this issue while using microsoft edge with dark mode enabled in the settings:

image

i initially thought that might be it, but when i didn't see it on the main site, i thought it might just be that you were trying new styling on the dev site. i haven't changed my dark mode settings but i see both the main and dev site of epidemics with a white background.

perhaps you're using a newer version of pkgdown on the epiparameter dev site that tries to accommodate dark mode? maybe there's an option to switch this off?

@Bisaloo
Copy link
Member

Bisaloo commented Jun 4, 2024

Thanks for the report @papsti, I think I understand the source of the issue now. I'll try to fix it in the next couple of days. Can I ping you to test it once it's live?

@papsti
Copy link

papsti commented Jun 4, 2024

sure, happy to test a fix out @Bisaloo

@Bisaloo
Copy link
Member

Bisaloo commented Jun 5, 2024

@Bisaloo I'm not sure how I can achieve this. Sys.getenv() unset argument takes a character string which is fine for {linelist} as the variable you want to store is a character. However, in this case for {epiparameter} we want the default to be a list of 2 numerics.

This comment was mainly done with the idea we may switch to a single option. You could still control this via 2 environment variables, which indeed would be strings, but can be converted to a numeric.

However, that may not be necessary, we can maybe keep the current structure with some extra docs:

  • document in epiparameter_options you need to reload epiparameter for options() changes to be taken into account
  • document somewhere (either in epiparameter_options or in a vignette) that options() can be set in .Rprofile to persist across sessions

For the other points, even though it makes sense to match with tibble's design, I still not completely convinced but I'm happy to let you decide as the maintainer.

@joshwlambert joshwlambert merged commit 6e08856 into main Jun 10, 2024
10 checks passed
@joshwlambert joshwlambert deleted the print_multi_epidist branch June 10, 2024 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request S3-method
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants