Censored data 2 eh #224

ehinman · 2023-03-09T01:03:42Z

This branch separates the pre-existing censored data function into two: one that identifies and flags censored data, and another that applies simple handling rules to non-detects and over-detects. It also adds a summary function that reports the sample count, % censored, and a suggested statistical analysis for censored data handling for user-specified groups, plus three new tests for the censored data functions.

- pulled the censored data identification step out of simplecensoredmethods and created its own function, which can be used within other functions or on its own, like autoclean
- created more warnings for detection limit metadata that may be missing from the WQX domain table; found example
- created tests to ensure data are not dropped during censored data handling
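For readers who want a concrete picture of the flag-then-handle split described above, here is a minimal base R sketch. It is not the TADA implementation: the function names (`flag_censored`, `handle_censored`), the column names, and the substitution rule (a multiple of the detection limit) are all assumptions made for illustration.

```r
# Hypothetical toy data; column names are illustrative, not TADA's schema.
samples <- data.frame(
  result      = c(NA, 2.5, NA, 4.0, NA),
  detect_cond = c("Not Detected", NA, "Present Above Quantification Limit",
                  NA, "Not Detected"),
  limit       = c(1.0, NA, 10.0, NA, 0.5),
  stringsAsFactors = FALSE
)

# Step 1 (hypothetical helper): flag only. No rows are dropped or altered.
flag_censored <- function(df) {
  df$censor_flag <- "Uncensored"
  df$censor_flag[df$detect_cond %in% "Not Detected"] <- "Non-Detect"
  df$censor_flag[df$detect_cond %in%
                   "Present Above Quantification Limit"] <- "Over-Detect"
  # Warn when a censored row has no detection limit to work with.
  missing_limit <- df$censor_flag != "Uncensored" & is.na(df$limit)
  if (any(missing_limit)) {
    warning(sum(missing_limit), " censored result(s) have no detection limit")
  }
  df
}

# Step 2 (hypothetical helper): simple substitution at a multiple of the limit.
# The multipliers are assumed defaults, not values taken from this PR.
handle_censored <- function(df, nd_multiplier = 0.5, od_multiplier = 1) {
  nd <- df$censor_flag == "Non-Detect"
  od <- df$censor_flag == "Over-Detect"
  df$result[nd] <- df$limit[nd] * nd_multiplier
  df$result[od] <- df$limit[od] * od_multiplier
  df
}

flagged <- flag_censored(samples)
handled <- handle_censored(flagged)

# Share of censored results, as a percentage.
pct_censored <- 100 * mean(flagged$censor_flag != "Uncensored")

# Mirrors the intent of the new tests: handling must not drop rows.
stopifnot(nrow(handled) == nrow(samples))
```

Keeping the flag step separate means the flag column can be inspected or summarized before any values are substituted.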

cristinamullin · 2023-03-09T15:12:13Z

vignettes/WQPDataHarmonization.Rmd

@@ -61,7 +61,7 @@ ggplot2 from GitHub.

 # remotes::install_github("hadley/ggplot2", dependencies=TRUE)

-remotes::install_github("USEPA/TADA", dependencies=TRUE)
+remotes::install_github("USEPA/TADA", ref = "censored_data_2_eh",dependencies=TRUE)


Why is the ref included/needed in the install?

This is so pesky! When I add new functions, GitHub will throw an error when trying to use the develop branch to build the vignette, so I have to temporarily tell it to use my branch. Feel free to change it back to develop once those new functions are included in the develop branch.

Ah okay, good to know! I can do that

cristinamullin · 2023-03-09T15:16:26Z

vignettes/WQPDataHarmonization.Rmd

@@ -669,7 +669,7 @@ that we want to remove the "Quality Control Sample-Field Replicate" and
 field.

 ```{r}
-TADAProfileClean14 <- dplyr::filter(TADAProfileClean13, !(ActivityTypeCode %in% c("Quality Control Sample-Field Replicate", "Quality Control Sample-Field Blank", "Quality Control Sample-Lab Duplicate", "Quality Control Sample-Equipment Blank")))
+TADAProfileClean14 <- dplyr::filter(TADAProfileClean13, !(ActivityTypeCode %in% ActivityTypeCode[grepl("Quality",ActivityTypeCode)]))


Add a line for users: # this removes rows where the ActivityTypeCode value includes the string "Quality" (not case sensitive?). These quality control samples are not included in the analysis. Are we now including this step as an automated part of autoclean, or just including it here? If the latter, we will need to remember to do this step separately in the shiny app as well.

It doesn't happen in autoclean. Would it be better to do it there?

Let's wait for now. It may be better as a flag that can then be used to inform use of that data in the future.
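# A minimal base R sketch of the matching behaviour asked about above, using a
# toy vector rather than the real dataset. The lowercase entry is hypothetical
# and exists only to show case handling; this is not code from the vignette.
codes <- c("Sample-Routine",
           "Quality Control Sample-Field Replicate",
           "Quality Control Sample-Field Blank",
           "quality control (lowercase, hypothetical)")

# grepl() is case sensitive unless ignore.case = TRUE is passed, so the
# pattern "Quality" does not match the lowercase entry.
matches_default <- grepl("Quality", codes)
matches_nocase  <- grepl("quality", codes, ignore.case = TRUE)

# The filter in the diff keeps rows whose code is not among the matched codes.
# That selects the same rows as negating grepl() directly.
keep_as_written <- !(codes %in% codes[matches_default])
keep_direct     <- !matches_default
stopifnot(identical(keep_as_written, keep_direct))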

cristinamullin · 2023-03-09T15:17:55Z

vignettes/WQPDataHarmonization.Rmd

@@ -682,7 +682,7 @@ ResultStatusIdentifier field.
 FilterFieldReview("ActivityMediaSubdivisionName", TADAProfileClean14)
 ```

-The ActivityMediaSubdivisionName field has two unique values, "Surface
+[Not true of current test dataset] The ActivityMediaSubdivisionName field has two unique values, "Surface


Would it make sense to remove the specifics here (and elsewhere) and be more general so that it does not need updating each time we change the example data?

That sounds great. It's hard to keep this current when we occasionally change the test dataset!

Agreed, we can work on that in the future

cristinamullin

@ehinman Approved, but added a few questions and suggestions

ehinman added 4 commits March 7, 2023 16:56

updates to censored data summary

b0d01be

update censored data functions

97e47ca

update website

f5e36a5

ehinman requested a review from cristinamullin March 9, 2023 01:03

cristinamullin reviewed Mar 9, 2023

cristinamullin approved these changes Mar 9, 2023

cristinamullin merged commit 0f3178d into develop Mar 9, 2023

cristinamullin deleted the censored_data_2_eh branch March 9, 2023 15:40
