Adding continuous tabulating function #953

Merged
merged 37 commits into from
Oct 11, 2021
ddsjoberg
Owner

@ddsjoberg ddsjoberg commented Jul 31, 2021

What changes are proposed in this pull request?

  • Added new function tbl_continuous() to summarize a continuous variable by one or more categorical variables.
  • Added tbl_strata(.stack_group_header=) argument to include/exclude the headers when tables are combined with tbl_stack()
  • Added tbl_strata(.quiet=) argument.

If there is a GitHub issue associated with this pull request, please provide a link.

# Example 1 ----------------------------------
tbl_continuous_ex1 <-
  tbl_continuous(
    data = trial,
    variable = age,
    by = trt,
    include = grade
  )

[image: rendered table for Example 1]

# Example 2 ----------------------------------
tbl_continuous_ex2 <-
  tbl_continuous(
    data = trial,
    variable = age,
    include = c(trt, grade)
  )

[image: rendered table for Example 2]


Checklist for PR reviewer

  • PR branch has pulled the most recent updates from master branch. Ensure the pull request branch and your local version match and both have the latest updates from the master branch.
  • If an update was made to tbl_summary(), was the same change implemented for tbl_svysummary()?
  • If a new function was added, was the function included in _pkgdown.yml?
  • If a bug was fixed, was a unit test added for the bug check?
  • Run pkgdown::build_site(). Check the R console for errors, and review the rendered website.
  • Code coverage is suitable for any new functions/features. Review coverage with covr::report(). Before you run, set Sys.setenv(NOT_CRAN="true") and begin in a fresh R session without any packages loaded.
  • R CMD Check runs without errors, warnings, and notes
  • usethis::use_spell_check() runs with no spelling errors in documentation

When the branch is ready to be merged into master:

  • Update NEWS.md with the changes from this pull request under the heading "# gtsummary (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end of the update (see NEWS.md for examples).
  • Increment the version number using usethis::use_version(which = "dev")
  • Run codemetar::write_codemeta()
  • Run usethis::use_spell_check() again
  • Approve Pull Request
  • Merge the PR. Please use "Squash and merge".

@ddsjoberg ddsjoberg marked this pull request as ready for review August 8, 2021 00:26
@ddsjoberg ddsjoberg mentioned this pull request Aug 30, 2021
@karissawhiting
Contributor

karissawhiting commented Sep 21, 2021

@ddsjoberg This feature is really useful. I'm looking forward to using it!

An initial thought about the statistic argument:

I was thinking it would be useful to accept a list of formulas and allow different continuous statistic types for different comparison variables.

E.g.

tbl_continuous(
  data = trial,
  variable = age,
  statistic = list(
    grade ~ "{mean}",
    trt ~ "{median}"
  ),
  include = c(trt, grade)
)

However, using it this way is a little inconsistent with tbl_summary(), where the left side of each formula is the variable being summarized and the right side is its statistic. Here the left side would be the comparison variable instead, so that could potentially be confusing?

Alternatively, you could achieve the same result this way:

x <- tbl_continuous(
  data = trial,
  variable = age,
  statistic = "{mean}",
  include = grade
)

y <- tbl_continuous(
  data = trial,
  variable = age,
  statistic = "{median}",
  include = trt
)

tbl_stack(list(x, y))

But it's a bit more cumbersome and the footnote needs adjustment.

Thoughts on this?

Also, is there a way to use continuous2 for multi-line continuous stats here? Could it potentially default to this if the user passes a vector of length 2 to the statistic argument (e.g. statistic = c("{median} ({p25}, {p75})", "{min}, {max}"))?
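
For reference, a minimal sketch of how multi-line statistics are requested in tbl_summary() today, which is the behavior the question above refers to. It only illustrates the existing "continuous2" summary type; whether tbl_continuous() accepts a length-2 statistic vector is not settled in this thread, and the object name tbl_multiline is just illustrative.

```r
library(gtsummary)

# Multi-line continuous summary with tbl_summary():
# declaring age as "continuous2" lets each string in the statistic
# vector print on its own row beneath the variable label.
tbl_multiline <-
  tbl_summary(
    data = trial,
    by = trt,
    include = age,
    type = age ~ "continuous2",
    statistic = age ~ c("{median} ({p25}, {p75})", "{min}, {max}")
  )
```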

@ddsjoberg
Owner Author

Thank you for the review @karissawhiting !!

You make a very good point, and this is something I planned very poorly, unfortunately. The tbl_summary(statistic=) argument takes a formula list while tbl_cross(statistic=) takes a string, so the two are not consistent. UGH

Since tbl_summary() is used more often, matching its syntax is preferable, I think. I'll see how difficult that update is and report back!

@larmarange
Collaborator

Just a quick comment: it is also possible to replicate tbl_continuous() with the drafted tbl_custom_summary(), which already accepts different statistics per variable.

cf. #976 (comment)

;-)

@ddsjoberg
Owner Author

FYI @larmarange @karissawhiting I am going to update this function to use Joseph's contribution tbl_custom_summary(): it'll make the internals sooo much nicer to read :)

@ddsjoberg
Owner Author

Hey @karissawhiting @larmarange !

I updated the back-end to use tbl_custom_summary().....so much less code! Very nice! Thanks @larmarange !

This allows users to select different statistics for different variables, and I also added a digits argument to control rounding of the continuous variable.
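
A minimal sketch of what the updated interface might look like in use, combining the formula-list statistic suggested earlier in the thread with the new digits argument. It assumes digits takes the same formula-list form as in tbl_summary(); the object name tbl_continuous_ex3 and the chosen rounding are illustrative only.

```r
library(gtsummary)

# A different statistic for each categorical variable,
# plus rounding of the age summary within grade to one decimal place.
# Assumption: digits follows the tbl_summary() formula-list convention.
tbl_continuous_ex3 <-
  tbl_continuous(
    data = trial,
    variable = age,
    include = c(trt, grade),
    statistic = list(trt ~ "{median}", grade ~ "{mean}"),
    digits = grade ~ 1
  )
```

Compared with building two tables and combining them with tbl_stack(), this keeps everything in a single call.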

@larmarange
Collaborator

That's great @ddsjoberg

  • Would it be relevant to add the option overall_row? Could be useful here.
  • Just curious, why duplicate continuous_summary()?

@ddsjoberg
Owner Author

@larmarange

Just curious, why duplicate continuous_summary()?

I forgot there was a copy already in the package 😆

@ddsjoberg ddsjoberg merged commit bcd602e into master Oct 11, 2021
@ddsjoberg ddsjoberg deleted the continuous_cross branch October 11, 2021 20:09
3 participants