Commit

Merge pull request #141 from atorus-research/84_xportr_deep_dive_vignette

Closes #84 xportr deep dive vignette
bms63 authored Jun 15, 2023
2 parents 79550f9 + 67b6925 commit 4692171
Showing 25 changed files with 867 additions and 147 deletions.
2 changes: 2 additions & 0 deletions .Rbuildignore
@@ -22,3 +22,5 @@
^advs\.xpt$
^advs_Define-Excel-Spec_match_admiral\.xlsx
^cran-comments\.md$
^example_data_specs$

2 changes: 2 additions & 0 deletions DESCRIPTION
@@ -81,3 +81,5 @@ Suggests:
metacore
Config/testthat/edition: 3
VignetteBuilder: knitr
Depends:
R (>= 3.5)
14 changes: 9 additions & 5 deletions NEWS.md
@@ -2,11 +2,11 @@

## New Features and Bug Fixes

* Fixed an issue where `xportr_type` would overwrite column labels, widths, and "sas.formats"
* Fixed messaging of `xportr_order`to give better visability of the number of variables being reordered.
* Add new argument to `xportr_write` to allow users to specify how xpt validation checks are handled.
* Fixed an issue where `xportr_type()` would overwrite column labels, widths, and "sas.formats"
* Fixed messaging of `xportr_order()` to give better visibility of the number of variables being reordered.
* Add new argument to `xportr_write()` to allow users to specify how xpt validation checks are handled.
* Fixed bug where character_types were case sensitive. They are now case insensitive.
* Updated `xportr_type` to make type coercion more explicit.
* Updated `xportr_type()` to make type coercion more explicit.
* `xpt_validate` updated to accept iso8601 date formats. (#76)
* Added function `xportr_metadata()` to explicitly set metadata at the start of a pipeline (#44)
* Metadata order columns are now coerced to numeric by default in `xportr_order()` to prevent character sorting (#149)
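
The `xportr_order()` entry above refers to a common pitfall: when the order column of a specification is read in as character, sorting is lexicographic, so "10" lands before "2". A minimal base R sketch of the difference, using a made-up `spec` data frame rather than the package's internals:

```r
# Hypothetical spec whose order column was read in as character
spec <- data.frame(
  variable = c("STUDYID", "USUBJID", "AGE"),
  order = c("1", "10", "2"),
  stringsAsFactors = FALSE
)

# Character sort: "10" sorts before "2", so AGE ends up last
char_sorted <- spec$variable[order(spec$order)]

# Numeric sort after coercion: 1, 2, 10, so AGE comes second
num_sorted <- spec$variable[order(as.numeric(spec$order))]
```

Coercing before sorting is what keeps a specification with ten or more variables in its intended order.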
@@ -16,8 +16,12 @@
## Documentation

* Moved `{pkgdown}` site to bootswatch. Enabled search and linked slack icon (#122).
* Additional Deep Dive vignette showcasing functions and quality of life utilities for processing `xpts` created (#84)
* Get Started vignette spruced up. Messages are now displayed and link to Deep Dive vignette (#150)

## Deprecation and Breaking Changes

## Deprecation
and Breaking Changes

* The `metacore` argument has been renamed to `metadata` in the following six xportr functions: `xportr_df_label()`, `xportr_format()`, `xportr_label()`, `xportr_length()`, `xportr_order()`, and `xportr_type()`. Please update your code to use the new `metadata` argument in place of `metacore`.
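
For code written against the old argument name, the change amounts to renaming the argument in each call. A minimal sketch, assuming xportr's default option names for the specification columns (`dataset`, `variable`, `label`); the two small data frames are made up for illustration:

```r
library(xportr)

# Made-up data and specification, for illustration only
adsl <- data.frame(USUBJID = c("01-001", "01-002"), AGE = c(63, 71))
var_spec <- data.frame(
  dataset = "ADSL",
  variable = c("USUBJID", "AGE"),
  label = c("Unique Subject Identifier", "Age")
)

# Before the rename: xportr_label(adsl, metacore = var_spec, domain = "ADSL")
# After the rename:
adsl <- xportr_label(adsl, metadata = var_spec, domain = "ADSL")
```

The same substitution applies to the other five functions named in the entry above.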

84 changes: 84 additions & 0 deletions R/data.R
@@ -0,0 +1,84 @@
#' Analysis Dataset Subject Level
#'
#' An example dataset containing subject level data
#'
#' @format ## `adsl`
#' A data frame with 254 rows and 48 columns:
#' \describe{
#' \item{STUDYID}{Study Identifier}
#' \item{USUBJID}{Unique Subject Identifier}
#' \item{SUBJID}{Subject Identifier for the Study}
#' \item{SITEID}{Study Site Identifier}
#' \item{SITEGR1}{Pooled Site Group 1}
#' \item{ARM}{Description of Planned Arm}
#' \item{TRT01P}{Planned Treatment for Period 01}
#' \item{TRT01PN}{Planned Treatment for Period 01 (N)}
#' \item{TRT01A}{Actual Treatment for Period 01}
#' \item{TRT01AN}{Actual Treatment for Period 01 (N)}
#' \item{TRTSDT}{Date of First Exposure to Treatment}
#' \item{TRTEDT}{Date of Last Exposure to Treatment}
#' \item{TRTDUR}{Duration of Treatment (days)}
#' \item{AVGDD}{Avg Daily Dose (as planned)}
#' \item{CUMDOSE}{Cumulative Dose (as planned)}
#' \item{AGE}{Age}
#' \item{AGEGR1}{Pooled Age Group 1}
#' \item{AGEGR1N}{Pooled Age Group 1 (N)}
#' \item{AGEU}{Age Units}
#' \item{RACE}{Race}
#' \item{RACEN}{Race (N)}
#' \item{SEX}{Sex}
#' \item{ETHNIC}{Ethnicity}
#' \item{SAFFL}{Safety Population Flag}
#' \item{ITTFL}{Intent-To-Treat Population Flag}
#' \item{EFFFL}{Efficacy Population Flag}
#' \item{COMP8FL}{Completers of Week 8 Population Flag}
#' \item{COMP16FL}{Completers of Week 16 Population Flag}
#' \item{COMP24FL}{Completers of Week 24 Population Flag}
#' \item{DISCONFL}{Did the Subject Discontinue the Study}
#' \item{DSRAEFL}{Discontinued due to AE}
#' \item{DTHFL}{Subject Died}
#' \item{BMIBL}{Baseline BMI (kg/m^2)}
#' \item{BMIBLGR1}{Pooled Baseline BMI Group 1}
#' \item{HEIGHTBL}{Baseline Height (cm)}
#' \item{WEIGHTBL}{Baseline Weight (kg)}
#' \item{EDUCLVL}{Years of Education}
#' \item{DISONSDT}{Date of Onset of Disease}
#' \item{DURDIS}{Duration of Disease (Months)}
#' \item{DURDSGR1}{Pooled Disease Duration Group 1}
#' \item{VISIT1DT}{Date of Visit 1}
#' \item{RFSTDTC}{Subject Reference Start Date/Time}
#' \item{RFENDTC}{Subject Reference End Date/Time}
#' \item{VISNUMEN}{End of Trt Visit (Vis 12 or Early Term.)}
#' \item{RFENDT}{Date of Discontinuation/Completion}
#' \item{DCDECOD}{Standardized Disposition Term}
#' \item{DCREASCD}{Reason for Discontinuation}
#' \item{MMSETOT}{MMSE Total}
#' }
"adsl"

#' Example Dataset Specification
#'
#' @format ## `var_spec`
#' A data frame with 216 rows and 19 columns:
#' \describe{
#' \item{Order}{Order of variable}
#' \item{Dataset}{Dataset}
#' \item{Variable}{Variable}
#' \item{Label}{Variable Label}
#' \item{Data Type}{Data Type}
#' \item{Length}{Variable Length}
#' \item{Significant Digits}{Significant Digits}
#' \item{Format}{Variable Format}
#' \item{Mandatory}{Mandatory Variable Flag}
#' \item{Assigned Value}{Variable Assigned Value}
#' \item{Codelist}{Variable Codelist}
#' \item{Common}{Common Variable Flag}
#' \item{Origin}{Variable Origin}
#' \item{Pages}{Pages}
#' \item{Method}{Variable Method}
#' \item{Predecessor}{Variable Predecessor}
#' \item{Role}{Variable Role}
#' \item{Comment}{Comment}
#' \item{Developer Notes}{Developer Notes}
#' }
"var_spec"
2 changes: 1 addition & 1 deletion R/messages.R
@@ -91,7 +91,7 @@ type_log <- function(meta_ordered, type_mismatch_ind, verbose) {

#' Utility for Lengths
#'
#' @param miss_vars Variables missing from metatdata
#' @param miss_vars Variables missing from metadata
#' @param verbose Provides additional messaging for user
#'
#' @return Output to Console
18 changes: 15 additions & 3 deletions README.Rmd
@@ -82,7 +82,7 @@ data sets (≤ 200)
- Coerces variables to only numeric or character types
- Display format support for numeric float and date/time values
- Variables names are ≤ 8 characters.
- Variable labels are ≤ 200 characters.
- Variable labels are ≤ 40 characters.
- Data set labels are ≤ 40 characters.
- Presence of non-ASCII characters in Variable Names, Labels or data set labels.
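
The length and character-set rules in the list above can be sketched in base R. This is only an illustration: `check_xpt_limits()` is a hypothetical helper, not an xportr function, and it says nothing about how the package performs its own validation.

```r
# Hypothetical helper (not part of xportr): flags variable names and labels
# that break the limits listed above.
check_xpt_limits <- function(var_names, var_labels) {
  # TRUE when every code point in a string is within the ASCII range
  is_ascii <- function(x) {
    vapply(x, function(s) all(utf8ToInt(s) <= 127L), logical(1), USE.NAMES = FALSE)
  }
  data.frame(
    variable = var_names,
    name_too_long = nchar(var_names) > 8,     # names: at most 8 characters
    label_too_long = nchar(var_labels) > 40,  # labels: at most 40 characters
    non_ascii = !(is_ascii(var_names) & is_ascii(var_labels)),
    stringsAsFactors = FALSE
  )
}

# Made-up inputs: one name that is too long, one label with a non-ASCII letter
res <- check_xpt_limits(
  var_names = c("USUBJID", "TRTDURATION", "AGE"),
  var_labels = c("Unique Subject Identifier", "Duration of Treatment (days)", "\u00c2ge")
)
```

Returning one row per variable makes it easy to filter for offenders before writing the file.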

@@ -103,7 +103,7 @@ To do this we will need to do the following:
- Apply a dataset label
- Write out a version 5 xpt file

All of which can be done using a well-defined specification file and the `xportr` package!
All of which can be done using a well-defined specification file and the `{xportr}` package!

First we will start with our `ADSL` dataset created in R. This example `ADSL` dataset is taken from the [`{admiral}`](https://pharmaverse.github.io/admiral/index.html) package. The script that generates this `ADSL` dataset can be created by using this command `admiral::use_ad_template("adsl")`. This `ADSL` dataset has 306 observations and 48 variables.

@@ -125,7 +125,19 @@ var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
rlang::set_names(tolower)
```

Each `xportr_` function has been written in a way to take in a part of the specification file and apply that piece to the dataset.
Each `xportr_` function has been written in a way to take in a part of the specification file and apply that piece to the dataset. Setting `verbose = "warn"` will send the appropriate warning messages to the console. We have suppressed the warnings for the sake of brevity.

```{r, warning = FALSE, message=FALSE, eval=TRUE}
adsl %>%
xportr_type(var_spec, "ADSL", verbose = "warn") %>%
xportr_length(var_spec, "ADSL", verbose = "warn") %>%
xportr_label(var_spec, "ADSL", verbose = "warn") %>%
xportr_order(var_spec, "ADSL", verbose = "warn") %>%
xportr_format(var_spec, "ADSL", verbose = "warn") %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
```

The `xportr_metadata()` function can reduce duplication by setting the variable specification and domain explicitly at the top of a pipeline. If you would like to use the `verbose` argument, you will need to set it in each function call.

```{r, message=FALSE, eval=FALSE}
adsl %>%
23 changes: 20 additions & 3 deletions README.md
@@ -76,7 +76,7 @@ to any validators or data reviewers.
- Coerces variables to only numeric or character types
- Display format support for numeric float and date/time values
- Variables names are ≤ 8 characters.
- Variable labels are ≤ 200 characters.
- Variable labels are ≤ 40 characters.
- Data set labels are ≤ 40 characters.
- Presence of non-ASCII characters in Variable Names, Labels or data set
labels.
@@ -99,7 +99,7 @@ To do this we will need to do the following:
- Write out a version 5 xpt file

All of which can be done using a well-defined specification file and the
`xportr` package!
`{xportr}` package!

First we will start with our `ADSL` dataset created in R. This example
`ADSL` dataset is taken from the
@@ -131,7 +131,24 @@ var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
```

Each `xportr_` function has been written in a way to take in a part of
the specification file and apply that piece to the dataset.
the specification file and apply that piece to the dataset. Setting
`verbose = "warn"` will send the appropriate warning messages to the
console. We have suppressed the warnings for the sake of brevity.

``` r
adsl %>%
xportr_type(var_spec, "ADSL", verbose = "warn") %>%
xportr_length(var_spec, "ADSL", verbose = "warn") %>%
xportr_label(var_spec, "ADSL", verbose = "warn") %>%
xportr_order(var_spec, "ADSL", verbose = "warn") %>%
xportr_format(var_spec, "ADSL", verbose = "warn") %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
```

The `xportr_metadata()` function can reduce duplication by setting the
variable specification and domain explicitly at the top of a pipeline.
If you would like to use the `verbose` argument, you will need to set it
in each function call.

``` r
adsl %>%
74 changes: 38 additions & 36 deletions _pkgdown.yml
@@ -5,7 +5,7 @@ template:
params:
bootswatch: sandstone
search:
exclude: ['news/index.html']
exclude: ["news/index.html"]
news:
cran_dates: true

@@ -18,39 +18,41 @@ navbar:
href: https://pharmaverse.slack.com/archives/C030EB2M4GM
aria-label: slack


reference:
- title: The six core xportr functions
- contents:
- xportr_type
- xportr_length
- xportr_label
- xportr_write
- xportr_format
- xportr_order

- title: xportr helper functions
- contents:
- label_log
- length_log
- type_log
- var_names_log
- var_ord_msg
- xportr_logger
- xportr_df_label
- xportr_metadata

- title: xportr
navbar: ~
contents:
- xportr

- title: internal
contents:
- cli_theme_tests
- expect_attr_width
- minimal_metadata
- minimal_table



- title: The six core xportr functions
- contents:
- xportr_type
- xportr_length
- xportr_label
- xportr_write
- xportr_format
- xportr_order

- title: xportr helper functions
- contents:
- label_log
- length_log
- type_log
- var_names_log
- var_ord_msg
- xportr_logger
- xportr_df_label
- xportr_metadata

- title: xportr example datasets and specification files
- contents:
- adsl
- var_spec

- title: internal
contents:
- cli_theme_tests
- expect_attr_width
- minimal_metadata
- minimal_table

articles:
- title: ~
navbar: ~
contents:
- deepdive
Binary file added data/adsl.rda
Binary file not shown.
Binary file added data/var_spec.rda
Binary file not shown.
Binary file not shown.
Binary file added example_data_specs/TDF_ADaM_Pilot3.xlsx
Binary file not shown.
Binary file added example_data_specs/adadas.xpt
Binary file not shown.
Binary file added example_data_specs/adae.xpt
Binary file not shown.
Binary file added example_data_specs/adlbc.xpt
Binary file not shown.
Binary file added example_data_specs/adtte.xpt
Binary file not shown.
1 change: 1 addition & 0 deletions example_data_specs/readme.md
@@ -0,0 +1 @@
Data taken from Pilot 3 Submission Study: https://github.com/RConsortium/submissions-pilot3-adam
22 changes: 18 additions & 4 deletions inst/WORDLIST
@@ -1,34 +1,48 @@
ADAE
ADSL
ADaM
AE
Atorus
BMI
CDISC
CDSIC
Codelist
Completers
DCREASCD
DM
GSK
JPT
Lifecycle
MMSE
ORCID
PHUSE
Pharma
Repostiory
SASformat
SASlength
SAStype
SDSP
SDTM
Standardisation
TRTDUR
Trt
Vignesh
Vis
XPT
acrf
adrg
bootswatch
chr
cli
deliverables
df
iso
magrittr
metacore
metatdata
pre
sas
repo
sdrg
validator
validators
visability
xportr's
xportr’s
xpt