Extend xportr_write to accept metadata and deprecate label #185

Merged
merged 13 commits into from
Dec 7, 2023
9 changes: 6 additions & 3 deletions NEWS.md
@@ -1,10 +1,13 @@
# xportr (development version)
# xportr 0.3.1.9001

## New Features and Bug Fixes

## Documentation
* `xportr_write()` now accepts a `metadata` argument, which can be used to set the dataset label, consistent with the other `xportr_*` functions. A dataset label already set with `xportr_df_label()` is retained by `xportr_write()`.
* Exporting a new dataset `dataset_spec` that contains the Dataset Specification for ADSL.
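
To make the shape of such a dataset-level specification concrete, here is a minimal base-R sketch. It builds a one-row data frame with the nine column names documented for `dataset_spec` in this PR, then applies the same rename and lower-casing steps the README changes use. The object name `spec_sketch` and every cell value are invented placeholders, not the contents of the shipped dataset.

```r
# Hypothetical one-row dataset-level specification; all values are placeholders.
spec_sketch <- data.frame(
  Dataset = "ADSL",
  Description = "Subject-Level Analysis Dataset",
  Class = "SUBJECT LEVEL ANALYSIS DATASET",
  Structure = NA,
  Purpose = "Analysis",
  `Key, Variables` = "STUDYID, USUBJID",
  Repeating = "No",
  `Reference Data` = NA,
  Comment = NA_character_,
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Mirror the README steps in base R: rename Description to label,
# then lower-case every column name.
names(spec_sketch)[names(spec_sketch) == "Description"] <- "label"
names(spec_sketch) <- tolower(names(spec_sketch))

spec_sketch[, c("dataset", "label")]
```

After these two steps the frame exposes the `dataset` and `label` columns that the example in `R/write.R` also uses when it builds a metadata frame by hand.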

## Deprecation and Breaking Changes
* The `label` argument from the `xportr_write()` function is deprecated in favor of the `metadata` argument.
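
The deprecation can be pictured as a small shim that folds the old argument into the new one. The sketch below is a base-R approximation under stated assumptions: `write_sketch` is a hypothetical function that writes no file, it uses `warning()` where the PR uses `lifecycle::deprecate_warn()`, and it requires `domain` explicitly instead of detecting it from the pipeline.

```r
# Hypothetical writer illustrating the deprecation pattern; it does not write a file.
write_sketch <- function(df, domain, metadata = NULL, label) {
  if (!missing(label)) {
    warning("`label` is deprecated; use `metadata` instead.", call. = FALSE)
    # Fold the deprecated argument into a one-row metadata frame.
    metadata <- data.frame(dataset = domain, label = label, stringsAsFactors = FALSE)
  }
  if (!is.null(metadata)) {
    # Pick the label registered for this domain.
    attr(df, "label") <- metadata$label[metadata$dataset == domain][1]
  }
  # With neither argument supplied, an existing label attribute is left untouched.
  df
}

adsl <- data.frame(USUBJID = c("01", "02"))
meta <- data.frame(
  dataset = "adsl",
  label = "Subject-Level Analysis Dataset",
  stringsAsFactors = FALSE
)

old_style <- suppressWarnings(
  write_sketch(adsl, domain = "adsl", label = "Subject-Level Analysis Dataset")
)
new_style <- write_sketch(adsl, domain = "adsl", metadata = meta)

identical(attr(old_style, "label"), attr(new_style, "label"))
```

Both call styles end with the same `label` attribute; only the deprecated one warns.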

## Documentation

# xportr 0.3.1

19 changes: 18 additions & 1 deletion R/data.R
@@ -56,7 +56,7 @@
#' }
"adsl"

#' Example Dataset Specification
#' Example Dataset Variable Specification
#'
#' @format ## `var_spec`
#' A data frame with 216 rows and 19 columns:
@@ -82,3 +82,20 @@
#' \item{Developer Notes}{Developer Notes}
#' }
"var_spec"

#' Example Dataset Specification
#'
#' @format ## `dataset_spec`
#' A data frame with 1 row and 9 columns:
#' \describe{
#' \item{Dataset}{<chr> Dataset}
#' \item{Description}{<chr> Dataset description}
#' \item{Class}{<chr> Dataset class}
#' \item{Structure}{<lgl> Logical, indicating if there's a specific structure}
#' \item{Purpose}{<chr> Purpose of the dataset}
#' \item{Key, Variables}{<chr> Join Key variables in the dataset}
#' \item{Repeating}{<chr> Indicates if the dataset is repeating}
#' \item{Reference Data}{<lgl> Reference Data}
#' \item{Comment}{<chr> Additional comment}
#' }
"dataset_spec"
4 changes: 4 additions & 0 deletions R/df_label.R
@@ -83,6 +83,10 @@ xportr_df_label <- function(.df,
abort("Length of dataset label must be 40 characters or less.")
}

if (stringr::str_detect(label, "[^[:ascii:]]")) {
abort("`label` cannot contain any non-ASCII, symbol or special characters.")
}

attr(.df, "label") <- label

.df
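
The two label checks that now live in `xportr_df_label()` can be approximated without any package dependencies. The following is a minimal sketch, not the package's implementation: `set_label_sketch` is a hypothetical helper, and it tests for non-ASCII characters by inspecting code points with `utf8ToInt()` where the PR uses `stringr::str_detect()`.

```r
# Hypothetical stand-in for the label checks; not the package's implementation.
set_label_sketch <- function(df, label) {
  if (nchar(label) > 40) {
    stop("Length of dataset label must be 40 characters or less.")
  }
  # Any code point above 127 lies outside the ASCII range.
  if (any(utf8ToInt(label) > 127)) {
    stop("`label` cannot contain any non-ASCII, symbol or special characters.")
  }
  attr(df, "label") <- label
  df
}

adsl <- data.frame(USUBJID = c("01", "02"))
adsl <- set_label_sketch(adsl, "Subject-Level Analysis Dataset")
attr(adsl, "label")

# A label containing an accented character is rejected.
rejected <- tryCatch(
  set_label_sketch(adsl, "Donn\u00e9es"),
  error = function(e) conditionMessage(e)
)
rejected
```

Keeping both checks in one place means any caller that sets the label through this path gets the same validation.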
46 changes: 31 additions & 15 deletions R/write.R
@@ -7,10 +7,12 @@
#' @param .df A data frame to write.
#' @param path Path where transport file will be written. The file name without
#' its extension will be used as the `xpt` name.
#' @param label Dataset label. It must be <=40 characters.
#' @param label `r lifecycle::badge("deprecated")` Previously used to set the Dataset label.
#' Use the `metadata` argument to set the dataset label.
#' @param strict_checks If TRUE, xpt validation will report errors and not write
#' out the dataset. If FALSE, xpt validation will report warnings and continue
#' with writing out the dataset. Defaults to FALSE
#' @inheritParams xportr_length
#'
#' @details
#' * Variable and dataset labels are stored in the "label" attribute.
@@ -32,17 +34,43 @@
#' Param = c("param1", "param2", "param3")
#' )
#'
#' var_spec <- data.frame(dataset = "adsl", label = "Subject-Level Analysis Dataset")
#' xportr_write(adsl,
#' path = paste0(tempdir(), "/adsl.xpt"),
#' label = "Subject-Level Analysis",
#' metadata = var_spec,
#' strict_checks = FALSE
#' )
#'
xportr_write <- function(.df, path, label = NULL, strict_checks = FALSE) {
xportr_write <- function(.df,
path,
metadata = NULL,
domain = NULL,
strict_checks = FALSE,
label = deprecated()) {
path <- normalizePath(path, mustWork = FALSE)

name <- tools::file_path_sans_ext(basename(path))

## Common section to detect domain from argument or pipes

df_arg <- tryCatch(as_name(enexpr(.df)), error = function(err) NULL)
domain <- get_domain(.df, df_arg, domain)
if (!is.null(domain)) attr(.df, "_xportr.df_arg_") <- domain

## End of common section

if (!missing(label)) {
lifecycle::deprecate_warn(
when = "0.3.2",
what = "xportr_write(label = )",
with = "xportr_write(metadata = )"
)
metadata <- data.frame(dataset = domain, label = label)
}
if (!is.null(metadata)) {
.df <- xportr_df_label(.df, metadata = metadata, domain = domain)
}

if (nchar(name) > 8) {
abort("`.df` file name must be 8 characters or less.")
}
@@ -51,18 +79,6 @@ xportr_write <- function(.df, path, label = NULL, strict_checks = FALSE) {
abort("`.df` cannot contain any non-ASCII, symbol or underscore characters.")
}

if (!is.null(label)) {
if (nchar(label) > 40) {
abort("`label` must be 40 characters or less.")
}

if (stringr::str_detect(label, "[^[:ascii:]]")) {
abort("`label` cannot contain any non-ASCII, symbol or special characters.")
}

attr(.df, "label") <- label
}

checks <- xpt_validate(.df)

if (length(checks) > 0) {
10 changes: 8 additions & 2 deletions README.Rmd
@@ -19,6 +19,7 @@ library(fontawesome)
# xportr <img src="man/figures/logo.png" align="right" alt="" width="120" />

<!-- badges: start -->
[<img src="https://img.shields.io/badge/Slack-RValidationHub-blue?style=flat&logo=slack">](https://RValidationHub.slack.com)
[![R build status](https://github.com/atorus-research/xportr/workflows/R-CMD-check/badge.svg)](https://github.com/atorus-research/xportr/actions?workflow=R-CMD-check)
[<img src="https://img.shields.io/codecov/c/gh/atorus-research/xportr">](https://app.codecov.io/gh/atorus-research/xportr)
[<img src="https://img.shields.io/badge/License-MIT-blue.svg">](https://github.com/atorus-research/xportr/blob/master/LICENSE)
@@ -121,6 +122,9 @@ spec_path <- system.file(paste0("specs/", "ADaM_admiral_spec.xlsx"), package = "
var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
dplyr::rename(type = "Data Type") %>%
rlang::set_names(tolower)
dataset_spec <- readxl::read_xlsx(spec_path, sheet = "Datasets") %>%
dplyr::rename(label = "Description") %>%
rlang::set_names(tolower)
```

Each `xportr_` function takes in a part of the specification file and applies that piece to the dataset. Setting `verbose = "warn"` sends the appropriate warning messages to the console. We have suppressed the warnings for the sake of brevity.
@@ -132,7 +136,8 @@ adsl %>%
xportr_label(var_spec, "ADSL", verbose = "warn") %>%
xportr_order(var_spec, "ADSL", verbose = "warn") %>%
xportr_format(var_spec, "ADSL") %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
xportr_df_label(dataset_spec, "ADSL") %>%
xportr_write("adsl.xpt")
```
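
In the pipeline above, the dataset label comes from the dataset-level metadata, not from a string passed to the writer. A rough base-R picture of that lookup, assuming the metadata has lower-cased `dataset` and `label` columns as prepared earlier in the README, is shown below. `meta_sketch` and `label_for_domain` are hypothetical names, and the rows are placeholders.

```r
# Hypothetical two-row dataset-level metadata; values are placeholders.
meta_sketch <- data.frame(
  dataset = c("ADSL", "ADAE"),
  label = c("Subject-Level Analysis Dataset", "Adverse Events Analysis Dataset"),
  stringsAsFactors = FALSE
)

# Return the single label registered for a domain, or fail loudly.
label_for_domain <- function(metadata, domain) {
  hit <- metadata$label[metadata$dataset == domain]
  if (length(hit) != 1) {
    stop("Expected exactly one label for domain ", domain, ", found ", length(hit), ".")
  }
  hit
}

adsl <- data.frame(USUBJID = c("01", "02"))
attr(adsl, "label") <- label_for_domain(meta_sketch, "ADSL")
attr(adsl, "label")
```

Failing when a domain has zero or several matching rows is a design choice of this sketch; the package may handle those cases differently.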

The `xportr_metadata()` function can reduce duplication by setting the variable specification and domain explicitly at the top of a pipeline. If you would like to use the `verbose` argument, you will need to set it in each function call.
@@ -145,7 +150,8 @@ adsl %>%
xportr_label() %>%
xportr_order() %>%
xportr_format() %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
xportr_df_label(dataset_spec) %>%
xportr_write("adsl.xpt")
```

That's it! We now have an xpt file created in R with all appropriate types, lengths, labels, ordering and formats. Please check out the [Get Started](https://atorus-research.github.io/xportr/articles/xportr.html) for more information and a detailed walkthrough of each `xportr_` function.
9 changes: 7 additions & 2 deletions README.md
@@ -126,6 +126,9 @@ spec_path <- system.file(paste0("specs/", "ADaM_admiral_spec.xlsx"), package = "
var_spec <- readxl::read_xlsx(spec_path, sheet = "Variables") %>%
dplyr::rename(type = "Data Type") %>%
rlang::set_names(tolower)
dataset_spec <- readxl::read_xlsx(spec_path, sheet = "Datasets") %>%
dplyr::rename(label = "Description") %>%
rlang::set_names(tolower)
```

Each `xportr_` function has been written in a way to take in a part of
@@ -140,7 +143,8 @@ adsl %>%
xportr_label(var_spec, "ADSL", verbose = "warn") %>%
xportr_order(var_spec, "ADSL", verbose = "warn") %>%
xportr_format(var_spec, "ADSL") %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
xportr_df_label(dataset_spec, "ADSL") %>%
xportr_write("adsl.xpt")
```

The `xportr_metadata()` function can reduce duplication by setting the
@@ -156,7 +160,8 @@ adsl %>%
xportr_label() %>%
xportr_order() %>%
xportr_format() %>%
xportr_write("adsl.xpt", label = "Subject-Level Analysis Dataset")
xportr_df_label(dataset_spec) %>%
xportr_write("adsl.xpt")
```

That’s it! We now have an xpt file created in R with all appropriate
1 change: 1 addition & 0 deletions _pkgdown.yml
@@ -48,6 +48,7 @@ reference:
- contents:
- adsl
- var_spec
- dataset_spec

articles:
- title: ~
Binary file added data/dataset_spec.rda
Binary file not shown.
30 changes: 30 additions & 0 deletions man/dataset_spec.Rd

4 changes: 2 additions & 2 deletions man/var_spec.Rd

22 changes: 19 additions & 3 deletions man/xportr_write.Rd
