Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testdata #142

Merged
merged 10 commits into from
Oct 20, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -39,12 +39,12 @@ Imports:
utils,
RColorBrewer,
stats,
rlang,
lubridate
Depends:
R (>= 3.5.0)
Suggests:
remotes,
rlang,
tidyverse,
knitr,
rmarkdown,
Expand Down
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -30,3 +30,4 @@ export(TransformCensoredData)
export(WQXTargetUnits)
export(readWQPwebservice)
importFrom(magrittr,"%>%")
importFrom(rlang,":=")
34 changes: 33 additions & 1 deletion R/DataDiscoveryRetrieval.R
Original file line number Diff line number Diff line change
Expand Up @@ -271,11 +271,43 @@ autoclean <- function(.data) {

# run MeasureValueSpecialCharacters function
.data <- MeasureValueSpecialCharacters(.data)


#convert 'meters' to 'm'
.data$ActivityDepthHeightMeasure.MeasureUnitCode[.data$ActivityDepthHeightMeasure.MeasureUnitCode == 'meters'] <- 'm'
.data$ActivityTopDepthHeightMeasure.MeasureUnitCode[.data$ActivityTopDepthHeightMeasure.MeasureUnitCode == 'meters'] <- 'm'
.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode[.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode == 'meters'] <- 'm'
.data$ResultDepthHeightMeasure.MeasureUnitCode[.data$ResultDepthHeightMeasure.MeasureUnitCode == 'meters'] <- 'm'
.data$ResultMeasure.MeasureUnitCode[.data$ResultMeasure.MeasureUnitCode == 'meters'] <- 'm'

return(.data)
}


#' autoclean
#'
#' Removes complex biological data. Removes non-water media samples.
#' Removes rows of data that are true duplicates. Capitalizes fields to harmonize
#' data. This function includes and runs the TADA "MeasureValueSpecialCharacters"
#' function as well.
#'
#' Within "BiologicalIntentName", only the allowable values "tissue", "toxicity",
#' and "NA" apply to non-biological data (the function removes all others).
#' Toxicity and fish tissue data will be kept, but other types of biological
#' monitoring data will not.
#'
#' We decided to make some fields uppercase that way they're more compatible
#' with the WQX validation reference tables and to avoid any issues with
#' case-sensitivity when joining data. Therefore, we might need to tack on any
#' immediate QA steps (removing true duplicates, converting result values to numeric,
#' capitalizing letters, etc.) to this function, as well as the other retrieval functions.
#'
#' @param .data TADA dataframe
#'
#' @return cleaned TADA data profile
#'



#' TADA Profile Check
#'
#' This function checks if the column names in a dataframe include the TADA
Expand Down
201 changes: 119 additions & 82 deletions R/ResultFlagsDependent.R
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ InvalidFraction <- function(.data, clean = TRUE) {
col.order <- colnames(.data)
# add WQX.SampleFractionValidity column to the list
col.order <- append(col.order, "WQX.SampleFractionValidity")
# reorder columns in flag.data
# reorder columns in check.data
check.data <- check.data[, col.order]
# place flag column next to relevant fields
check.data <- check.data %>%
Expand Down Expand Up @@ -156,7 +156,7 @@ InvalidSpeciation <- function(.data, clean = TRUE) {
col.order <- colnames(.data)
# add WQX.MethodSpeciationValidity column to the list
col.order <- append(col.order, "WQX.MethodSpeciationValidity")
# reorder columns in flag.data
# reorder columns in check.data
check.data <- check.data[, col.order]
# place flag columns next to relevant fields
check.data <- check.data %>%
Expand Down Expand Up @@ -254,7 +254,7 @@ InvalidResultUnit <- function(.data, clean = TRUE) {
col.order <- colnames(.data)
# add WQX.ResultUnitValidity column to the list
col.order <- append(col.order, "WQX.ResultUnitValidity")
# reorder columns in flag.data
# reorder columns in check.data
check.data <- check.data[, col.order]
# place flag columns next to relevant fields
check.data <- check.data %>%
Expand Down Expand Up @@ -294,14 +294,14 @@ InvalidResultUnit <- function(.data, clean = TRUE) {
#' Depth Profile Flag & Unit Conversion
#'
#' Function checks dataset for depth profile data. Where depth profile columns
#' are populated, the function appends 'Conversion Factor' columns
#' are populated, the function appends 'Conversion.Factor' columns
#' and populates those columns based on the original unit (MeasureUnitCode
#' columns) and the target unit, which is defined in the 'unit' argument. A
#' 'Depth Target Unit' column is also appended, indicating the unit all selected
#' 'WQX.Depth.TargetUnit' column is also appended, indicating the unit all selected
#' depth data is converted to. When transform = FALSE, the output includes all
#' 'Conversion Factor' columns and the 'Depth Target Unit' column. When transform
#' = TRUE, the output includes converted depth data and the 'Depth Target
#' Unit' column, which acts as a flag indicating which rows have been converted.
#' 'Conversion.Factor' columns and the 'WQX.Depth.TargetUnit' column. When transform
#' = TRUE, the output includes converted depth data and the 'WQX.Depth.TargetUnit'
#' column, which acts as a flag indicating which rows have been converted.
#' Default is transform = TRUE.
#'
#' @param .data TADA dataframe
Expand All @@ -315,15 +315,17 @@ InvalidResultUnit <- function(.data, clean = TRUE) {
#' 'ActivityBottomDepthHeightMeasure,' and 'ResultDepthHeightMeasure.'. Default
#' is to include all allowable values.
#' @param transform Boolean argument; When transform = FALSE, the output includes
#' all Conversion Factor' columns and the 'Depth Target Unit' column. When
#' all 'Conversion.Factor' columns and the 'WQX.Depth.TargetUnit' column. When
#' transform = TRUE, the output includes converted depth data and the 'Depth
#' Target Unit' column, which acts as a flag indicating which rows have been
#' converted. Default is transform = TRUE.
#'
#' @return Full dataset with converted uniform depth units and a 'Depth Target
#' Unit' column, which acts as a flag indicating which rows have been converted.
#' When transform = FALSE, the output is the full dataset with 'Conversion Factor'
#' columns and a 'Depth Target Unit' column.
#' @return Full dataset with converted uniform depth units and a 'WQX.Depth.TargetUnit'
#' column, which acts as a flag indicating which rows have been converted.
#' When transform = FALSE, the output is the full dataset with 'Conversion.Factor'
#' columns and a 'WQX.Depth.TargetUnit' column.
#'
#' @importFrom rlang :=
#'
#' @export
#'
Expand All @@ -349,7 +351,7 @@ DepthProfileData <- function(.data,
"ActivityDepthHeightMeasure.MeasureValue", "ActivityDepthHeightMeasure.MeasureUnitCode",
"ActivityTopDepthHeightMeasure.MeasureValue", "ActivityTopDepthHeightMeasure.MeasureUnitCode",
"ActivityBottomDepthHeightMeasure.MeasureValue", "ActivityBottomDepthHeightMeasure.MeasureUnitCode",
"ResultDepthHeightMeasure.MeasureValue", "ResultDepthHeightMeasure.MeasureUnitCode"
"ResultDepthHeightMeasure.MeasureValue", "ResultDepthHeightMeasure.MeasureUnitCode", "ActivityEndTime.TimeZoneCode"
)
%in% colnames(.data)) == FALSE) {
stop("The dataframe does not contain the required fields to use TADA. Use either the full physical/chemical profile downloaded from WQP or download the TADA profile template available on the EPA TADA webpage.")
Expand Down Expand Up @@ -456,8 +458,8 @@ DepthProfileData <- function(.data,
col.order <- append(col.order, appCol)
}
}
# reorder columns in flag.data
flag.data <- check.data[, col.order]
# reorder columns in check.data
check.data <- check.data[, col.order]
# place flag columns next to relevant fields
if ((appCols[1] %in% colnames(check.data)) == TRUE) {
check.data <- check.data %>%
Expand Down Expand Up @@ -489,92 +491,127 @@ DepthProfileData <- function(.data,
.after = "ActivityEndTime.TimeZoneCode"
)
}
# if transform = FALSE, output data
if (transform == FALSE) {
return(flag.data)
}

#function should always run all code above
}

# if transform = FALSE, output data
if (transform == FALSE) {
return(check.data)
}

# if transform = TRUE, apply conversion
if (transform == TRUE) {
# define clean.data
clean.data <- flag.data
if (transform == TRUE) {
# define clean.data
clean.data <- check.data

# if WQX.ActDepth.ConversionFactor exists...
if (("WQX.ActDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ActivityDepthHeightMeasure.MeasureValue by WQX.ActDepth.ConversionFactor
suppressWarnings(
clean.data$ActivityDepthHeightMeasure.MeasureValue ==
as.numeric(clean.data$ActivityDepthHeightMeasure.MeasureValue) *
clean.data$WQX.ActDepth.ConversionFactor
)
clean.data$ActivityDepthHeightMeasure.MeasureValue <- ((clean.data$ActivityDepthHeightMeasure.MeasureValue) * (clean.data$WQX.ActDepth.ConversionFactor))

# replace ActivityDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ActivityDepthHeightMeasure.MeasureUnitCode.Original <- clean.data$ActivityDepthHeightMeasure.MeasureUnitCode
clean.data <- clean.data %>%
dplyr::relocate("ActivityDepthHeightMeasure.MeasureUnitCode.Original",
.after = "WQX.ActDepth.ConversionFactor"
)

clean.data$ActivityDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ActivityDepthHeightMeasure.MeasureUnitCode)
)] <- unit
# delete ActDepth.Conversion.Unit column
clean.data <- dplyr::select(clean.data, -"WQX.ActDepth.ConversionFactor")
}

# if WQX.ActTopDepth.ConversionFactor exists...
if (("WQX.ActTopDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ActivityTopDepthHeightMeasure.MeasureValue by WQX.ActTopDepth.ConversionFactor
suppressWarnings(
clean.data$ActivityTopDepthHeightMeasure.MeasureValue ==
as.numeric(clean.data$ActivityTopDepthHeightMeasure.MeasureValue) *
clean.data$WQX.ActTopDepth.ConversionFactor
)
# replace ActivityTopDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode)
)] <- unit
# delete ActDepth.Conversion.Unit column
clean.data <- dplyr::select(clean.data, -"WQX.ActTopDepth.ConversionFactor")
}

# if WQX.ActBottomDepth.ConversionFactor exists...
if (("WQX.ActBottomDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ActivityBottomDepthHeightMeasure.MeasureValue by WQX.ActBottomDepth.ConversionFactor
suppressWarnings(
clean.data$ActivityBottomDepthHeightMeasure.MeasureValue ==
as.numeric(clean.data$ActivityBottomDepthHeightMeasure.MeasureValue) *
clean.data$WQX.ActBottomDepth.ConversionFactor
)
# replace ActivityBottomDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode)
)] <- unit
# delete ActBottomDepth.Conversion.Unit column
clean.data <- dplyr::select(clean.data, -"WQX.ActBottomDepth.ConversionFactor")
}

# if WQX.ResultDepth.ConversionFactor exists...
if (("WQX.ResultDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ResultDepthHeightMeasure.MeasureValue by WQX.ResultDepth.ConversionFactor
suppressWarnings(
clean.data$ResultDepthHeightMeasure.MeasureValue ==
as.numeric(clean.data$ResultDepthHeightMeasure.MeasureValue) *
clean.data$WQX.ResultDepth.ConversionFactor
)
# replace ResultDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ResultDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ResultDepthHeightMeasure.MeasureUnitCode)
)] <- unit
# delete ResultDepth.Conversion.Unit column
clean.data <- dplyr::select(clean.data, -"WQX.ResultDepth.ConversionFactor")

clean.data <- clean.data %>%
dplyr::relocate("ActivityDepthHeightMeasure.MeasureUnitCode",
.after = "ActivityDepthHeightMeasure.MeasureValue"
)
# uncoment below to delete ActDepth.Conversion.Unit column
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May want to uncomment to remove ActDepth.Conversion.Unit from clean dataset

#clean.data <- dplyr::select(clean.data, -"WQX.ActDepth.ConversionFactor")
}

# delete WQX.Depth.TargetUnit column
clean.data <- dplyr::select(clean.data, -"WQX.Depth.TargetUnit")
#WQX.ActTopDepth.ConversionFactor exists...
if (("WQX.ActTopDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ActivityTopDepthHeightMeasure.MeasureValue by WQX.ActTopDepth.ConversionFactor
clean.data$ActivityTopDepthHeightMeasure.MeasureValue <- ((clean.data$ActivityTopDepthHeightMeasure.MeasureValue) * (clean.data$WQX.ActTopDepth.ConversionFactor))

# replace ActivityTopDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode.Original <- clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode
clean.data <- clean.data %>%
dplyr::relocate("ActivityTopDepthHeightMeasure.MeasureUnitCode.Original",
.after = "WQX.ActTopDepth.ConversionFactor"
)

clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ActivityTopDepthHeightMeasure.MeasureUnitCode)
)] <- unit

clean.data <- clean.data %>%
dplyr::relocate("ActivityTopDepthHeightMeasure.MeasureUnitCode",
.after = "ActivityTopDepthHeightMeasure.MeasureValue"
)
# uncoment below to delete ActTopDepth.Conversion.Unit column
#clean.data <- dplyr::select(clean.data, -"WQX.ActTopDepth.ConversionFactor")
}

#WQX.ActBottomDepth.ConversionFactor exists...
if (("WQX.ActBottomDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ActivityBottomDepthHeightMeasure.MeasureValue by WQX.ActBottomDepth.ConversionFactor
clean.data$ActivityBottomDepthHeightMeasure.MeasureValue <- ((clean.data$ActivityBottomDepthHeightMeasure.MeasureValue) * (clean.data$WQX.ActBottomDepth.ConversionFactor))

# replace ActivityTopDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode.Original <- clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode
clean.data <- clean.data %>%
dplyr::relocate("ActivityBottomDepthHeightMeasure.MeasureUnitCode.Original",
.after = "WQX.ActBottomDepth.ConversionFactor"
)

clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ActivityBottomDepthHeightMeasure.MeasureUnitCode)
)] <- unit

clean.data <- clean.data %>%
dplyr::relocate("ActivityBottomDepthHeightMeasure.MeasureUnitCode",
.after = "ActivityBottomDepthHeightMeasure.MeasureValue"
)
# uncoment below to delete ActBottomDepth.Conversion.Unit column
#clean.data <- dplyr::select(clean.data, -"WQX.ActBottomDepth.ConversionFactor")
}

#WQX.ResultDepth.ConversionFactor exists...
if (("WQX.ResultDepth.ConversionFactor" %in% colnames(clean.data)) == TRUE) {
# multiply ResultDepthHeightMeasure.MeasureValue by WQX.ResultDepth.ConversionFactor
clean.data$ResultDepthHeightMeasure.MeasureValue <- ((clean.data$ResultDepthHeightMeasure.MeasureValue) * (clean.data$WQX.ResultDepth.ConversionFactor))

# replace ResultDepthHeightMeasure.MeasureUnitCode values with unit argument
clean.data$ResultDepthHeightMeasure.MeasureUnitCode.Original <- clean.data$ResultDepthHeightMeasure.MeasureUnitCode
clean.data <- clean.data %>%
dplyr::relocate("ResultDepthHeightMeasure.MeasureUnitCode.Original",
.after = "WQX.ResultDepth.ConversionFactor"
)

clean.data$ResultDepthHeightMeasure.MeasureUnitCode[which(
!is.na(clean.data$ResultDepthHeightMeasure.MeasureUnitCode)
)] <- unit

clean.data <- clean.data %>%
dplyr::relocate("ResultDepthHeightMeasure.MeasureUnitCode",
.after = "ResultDepthHeightMeasure.MeasureUnitCode"
)
# uncoment below to delete WQX.ResultDepth.ConversionFactor column
#clean.data <- dplyr::select(clean.data, -"WQX.ResultDepth.ConversionFactor")
}

# uncomment below to delete WQX.Depth.TargetUnit column
# clean.data <- dplyr::select(clean.data, -"WQX.Depth.TargetUnit")

return(clean.data)
} else {
stop("'transform' argument must be Boolean (TRUE or FALSE)")
}
}
}




#' Check for Special Characters in Measure Value Fields
#'
#' Function checks for special characters and non-numeric values in the
Expand Down
2 changes: 1 addition & 1 deletion R/Transformations.R
Original file line number Diff line number Diff line change
Expand Up @@ -312,7 +312,7 @@ WQXTargetUnits <- function(.data, transform = TRUE) {
#' Function generates a harmonization reference table that is specific to
#' the input dataset. Users can review how their input data relates to standard
#' TADA values for CharacteristicName, ResultSampleFractionText,
#' MethodSpecicationName, and ResultMeasure.MeasureUnitCode and they can optionally
#' MethodSpecificationName, and ResultMeasure.MeasureUnitCode and they can optionally
#' edit the reference file to meet their needs.
#'
#' @param .data TADA dataframe
Expand Down
2 changes: 1 addition & 1 deletion docs/articles/CONTRIBUTING.html

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading