-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add input validation for setter and read functions #231
Conversation
Add the 'verify.data.frame.column' to the miscellaneous utilities to check the existence and welltypedness of columns in a dataframe. This works towards fixing se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add call to 'verify.data.frame.columns' to relevant setters and read functions This works towards fixing se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Change most internal types from numeric to integer; also in related tests. Add sufficient checks for empty input-files in read functions. Close previously unlocked parenthesis in util-data. Change dummy fill of author email from NA to null-string to comply with character type of column. This fixes se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add copyright header to updated utility and test files. This fixes se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MaLoefUDS, this PR looks really good so far! I only spotted a few inconsistencies in comments that should be fixed, please see my individual comments below.
In addition, there is one data source we have not talked about yet: The custom event timestamps. We also should validate the inputs for its setter and read functions, although it is a little bit different than the other data sources:
read.custom.event.timestamps
: This function reads a data.frame with two columns – one is of type character, the other one of type POSIXct. We should check this directly after reading it. However, there are no constants and no column names, as this data format is not used outside of the read function. So there should be only hardcoded checks within this function. Because at the end, the read data.frame is transferred to a named list prior to returning.
set.custom.event.timestamps
: This function just takes a list as parameter. So we should just check whether the input is a list.
So, could you please add the validation for these two special cases?
And please don't forget to add the description for this PR (you don't need to add a changelog there). (description is already present, I just have overlooked it, sorry)
Add check that input is a list in 'set.custom.event.timestamps' and verify correct input structure in 'read.custom.event.timestamps'. Concretize unclear comments, fix punctuation and spelling mistakes. Update copyright headers. This fixes feedback on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just spotted two issues regarding the new checks for the custom event timestamps. Please fix them, everything else looks good to me.
(And don't worry about the failing CI pipeline on GitHub. There are some temporary problems at our CI provider that let the CI pipeline always run into a timeout before starting the tests. Just ignore the pipeline for the moment and just make sure that the tests run successfully on your end.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks @MaLoefUDS . I just found one line that has more than 120 characters. Once this is fixed and the annotations by @bockthom are processed, we can merge this.
Remove check on first column of timestamp data in 'read.custom.event.timestamps' as non-character fields will be cast to character anyways by call to 'names'. Simplify POSIXct format check on second column using 'get.date.from.string' function. Wrap overly long line in 'verify.data.frame.columns'. This fixes feedback on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add changes and additions reguarding dataframe verification to 'NEWS.md'. This completes checklist on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
(I only spotted one minor indentation inconsistency)
Indent line-wrapped string according to convention. This fixes feedback on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
In change issue number to pull-request number and add comma after. This fixes feedback on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just tried to run the test suite on my machine since the CI is still stuck. And I noticed that you are using an experimental feature of testthat in tests-misc.R
. See my comment.
Once this is fixed for all four occurences, I will merge the PR. Thanks again for the great work!
tests/test-misc.R
Outdated
data.frame = data.frame(user.names, age, is.male) | ||
|
||
## 1) Check base functionality (benign use-case) | ||
expect_no_error(verify.data.frame.columns( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please change this to expect_error(..., NA, message(...))
.
The expect_no_error
function is a new feature from testthat but is still experimental. So for now we want to stay at the "old" version. Also please put a comment behind the check with Expect no error
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't know about this, I will do the replacement 😇
Replace 'expect_no_error(expr, message)' by 'expect_error(expr, NA, message)'. This fixes feedback on se-sic#208. Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Prerequisites
showcase.R
with respect to my changes.dev
.Description
Changelog