-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #142 issue_142_updated to account for DT, DTM, TM variables #145
Conversation
R/type.R
Outdated
attributes(.df[[i]]) <<- orig_attributes | ||
} else { | ||
attributes(.df[[i]]) <- NULL | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The data are still converted to numeric R class. The attribute should numeric but the R class should stay date/dttm/time. The numeric date variables should not go through ".df[[i]] <<- as.numeric(.df[[i]])"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your updates. Numeric dates are still coerced to dbl, it should not be the case.
@cpiraux can you give an example dataframe where this happens and what the expected result should be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fixed now @cpiraux
@elimillera can you take a look too? I've implemented your suggestion in today's meeting. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't use a metacore object but a dataframe for the metadata
metadata <- data.frame(
dataset = c("adsl", "adsl", "adsl", "adsl"),
variable = c("USUBJID", "DMDTC", "RFICDT", "RFICDTM"),
type = c("text", "date", "integer", "integer"),
format = c(NA, NA, "date9.", "datetime15.")
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jump in the conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through and everything looked ok from the code side. I would rely on others to make sure it behaves as expected. I'll be sure to update this in the type.R documentation
I updated the testthat for this function and I see a new issue where in line 205 of test-type.R, the original df and xpt file have different timezones.
── Failure (test-type.R:208:3): xportr_type: date variables are not converted to numeric ──
df$RFICDTM (actual
) not equal to df_xpt$RFICDTM (expected
).
actual
: 1490824800
expected
: 1490832000
I checked the documentation and it comes from this parameter in the write_xpt function of haven:
Is this something to be concerned about? @cpiraux @elimillera @kaz462
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this! Timezone should be ignored. As the default value in haven for adjust_tz = TRUE
, I think it is okay. I open dfdates.xpt in SAS and dates are okay. Is it possible to ignore the timezone when you compare the dates in testthat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cpiraux test updated to expect that the as_character() of each datetime are equal so that timezone is ignored.
Could you add a test for numeric date variables in test-type.R? |
@cpiraux testthat updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your updates. Numeric dates are still coerced to dbl, it should not be the case.
I did a test using a df iso metacore object as metadata. The numeric dates have been coerced to character without giving a message:
metacore does not contains any row: |
Another issue that @kaz462 pointed out is when a date in the data frame is numeric, but the type in the metadata belongs to the character type. In this case, the date is not coerced to a character, and no message is provided to inform the user that the types in the metadata and data are different.
In type.R, meta_ordered: |
@cpiraux Pushed new changes. Please let me know your feedback. @elimillera could you take a look when you get a chance? I'm finding the function more complicated more than I thought and I feel like I'm just putting 'workarounds' to ensure that Celine's issues are fixed while still passing the test-type.R. E.g. the meta_ordered logic is getting longer and longer to account for the discrepancies that Celine is finding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Went through and everything looked ok from the code side. I would rely on others to make sure it behaves as expected. I'll be sure to update this in the type.R documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks for the updates. It seems to work pretty well. I only see coerced message when no coercion is done. Could you check this?
adsl_original <- tibble::tribble(
~USUBJID, ~DMDTC, ~RFICDT, ~RFICDTM,
"test1", "2017-03-30", "2017-03-30", "2017-03-30",
"test2", "2017-01-08", "2017-01-08", "2017-01-08"
)
adsl_original$DMDTC <- as.Date(adsl_original$DMDTC)
adsl_original$RFICDT <- as.Date(adsl_original$RFICDT)
adsl_original$RFICDTM <- as.POSIXct(adsl_original$RFICDTM)
adsl_xpt <- adsl_original %>%
xportr_type(metacore)
@cpiraux When I run this code, I get one coercion that seems to make sense:
Can you provide the code for your metacore object that gives you this issue where coercion shouldn't happen and the expected outcome? |
Sorry I did not give the code with an issue. Please use this one:
|
@cpiraux I fixed this issue but now we get the other issue where if a DTC variable is a date class, it is not coerced to character. But is it necessary for DTC variables to be coerced to character? When I use xportr_write and read in the xpt file, the date value is still the same so I feel like it is ok to leave it as it is. Please let me know your thoughts. If you have any suggestions for improved logic in the code for xportr_type, that would be great as there is a lot of different scenarios with dates/datetimes that is making this quite confusing.
|
It would be great to have both. --DTC variables coerced to chr when they have an other type and no message when there is no coercion. If it is not possible to have it for this release, I suggest to have no message when no coercion and add an issue for --DTC variables to be done for the next release. What do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can keep it like this and resolve the coercion from --DTC variable to for the next release. Could you add an issue for this?
Done! Thanks for your review. |
LGTM! |
Thank you for your Pull Request!
We have developed a Pull Request template to aid you and our reviewers. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the xportr codebase remains robust and consistent.
The scope of
{xportr}
{xportr}
's scope is to enable R users to write out submission compliantxpt
files that can be delivered to a Health Authority or to downstream validation software programs. We see labels, lengths, types, ordering and formats from a dataset specification object (SDTM and ADaM) as being our primary focus. We also see messaging and warnings to users around applying information from the specification file as a primary focus. Please make sure your Pull Request meets this scope of {xportr}. If your Pull Request moves beyond this scope, please get in touch with the{xportr}
team on slack or create an issue to discuss.Please check off each task box as an acknowledgment that you completed the task. This checklist is part of the Github Action workflows and the Pull Request will not be merged into the
devel
branch until you have checked off each task.Changes Description
#142
do not convert DT, DT, and DTM variables with a format specified in the metacore specs (e.g. date9., datetime20.) to numeric, which will cause a 10 year difference when reading it back by read_xpt
Task List
devel
branch, Pull Requests tomain
should use the Release Pull Request Templatestyler
package and functions to style files accordingly.devtools::document()
so all.Rd
files in theman
folder and theNAMESPACE
file in the project root are updated appropriatelypkgdown::build_site()
and check that all affected examples are displayed correctly and that all new/updated functions occur on the "Reference" page.