Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add input validation for setter and read functions #231

Merged
merged 10 commits into from
Jan 18, 2023
Merged

Conversation

maxloeffler
Copy link
Contributor

@maxloeffler maxloeffler commented Jan 10, 2023

Prerequisites

  • I adhere to the coding conventions (described here) in my code.
  • I have updated the copyright headers of the files I have modified.
  • I have written appropriate commit messages, i.e., I have recorded the goal, the need, the needed changes, and the location of my code modifications for each commit. This includes also, e.g., referencing to relevant issues.
  • I have put signed-off tags in all commits.
  • I have updated the changelog file NEWS.md appropriately.
  • I have checked whether I need to adjust the showcase file showcase.R with respect to my changes.
  • The pull request is opened against the branch dev.

Description

  • Add input validation into setter functions and functions that read input from files.
  • Fix check for empty input files in almost all read functions

Changelog

Add the 'verify.data.frame.column' to the miscellaneous utilities to check the existence and welltypedness of columns in a dataframe.

This works towards fixing se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add call to 'verify.data.frame.columns' to relevant setters and read functions

This works towards fixing se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Change most internal types from numeric to integer; also in related tests.
Add sufficient checks for empty input-files in read functions.
Close previously unlocked parenthesis in util-data.
Change dummy fill of author email from NA to null-string
to comply with character type of column.

This fixes se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add copyright header to updated utility and test files.

This fixes se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MaLoefUDS, this PR looks really good so far! I only spotted a few inconsistencies in comments that should be fixed, please see my individual comments below.

In addition, there is one data source we have not talked about yet: The custom event timestamps. We also should validate the inputs for its setter and read functions, although it is a little bit different than the other data sources:

read.custom.event.timestamps: This function reads a data.frame with two columns – one is of type character, the other one of type POSIXct. We should check this directly after reading it. However, there are no constants and no column names, as this data format is not used outside of the read function. So there should be only hardcoded checks within this function. Because at the end, the read data.frame is transferred to a named list prior to returning.

set.custom.event.timestamps: This function just takes a list as parameter. So we should just check whether the input is a list.

So, could you please add the validation for these two special cases?

And please don't forget to add the description for this PR (you don't need to add a changelog there). (description is already present, I just have overlooked it, sorry)

util-misc.R Outdated Show resolved Hide resolved
util-misc.R Outdated Show resolved Hide resolved
util-misc.R Outdated Show resolved Hide resolved
util-misc.R Outdated Show resolved Hide resolved
util-misc.R Outdated Show resolved Hide resolved
tests/test-misc.R Outdated Show resolved Hide resolved
tests/test-misc.R Outdated Show resolved Hide resolved
tests/test-misc.R Outdated Show resolved Hide resolved
util-read.R Outdated Show resolved Hide resolved
tests/test-misc.R Outdated Show resolved Hide resolved
@bockthom bockthom added this to the v4.3 milestone Jan 12, 2023
Add check that input is a list in 'set.custom.event.timestamps' and verify
correct input structure in 'read.custom.event.timestamps'. Concretize unclear
comments, fix punctuation and spelling mistakes. Update copyright headers.

This fixes feedback on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just spotted two issues regarding the new checks for the custom event timestamps. Please fix them, everything else looks good to me.

(And don't worry about the failing CI pipeline on GitHub. There are some temporary problems at our CI provider that let the CI pipeline always run into a timeout before starting the tests. Just ignore the pipeline for the moment and just make sure that the tests run successfully on your end.)

util-read.R Outdated Show resolved Hide resolved
util-read.R Outdated Show resolved Hide resolved
hechtlC
hechtlC previously approved these changes Jan 17, 2023
Copy link
Contributor

@hechtlC hechtlC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @MaLoefUDS . I just found one line that has more than 120 characters. Once this is fixed and the annotations by @bockthom are processed, we can merge this.

util-misc.R Outdated Show resolved Hide resolved
Remove check on first column of timestamp data in 'read.custom.event.timestamps'
as non-character fields will be cast to character anyways by call to 'names'.
Simplify POSIXct format check on second column using 'get.date.from.string'
function. Wrap overly long line in 'verify.data.frame.columns'.

This fixes feedback on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Add changes and additions reguarding dataframe verification to 'NEWS.md'.

This completes checklist on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
Copy link
Collaborator

@bockthom bockthom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

(I only spotted one minor indentation inconsistency)

util-misc.R Outdated Show resolved Hide resolved
Indent line-wrapped string according to convention.

This fixes feedback on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
NEWS.md Outdated Show resolved Hide resolved
In  change issue number to pull-request number and add comma after.

This fixes feedback on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
bockthom
bockthom previously approved these changes Jan 18, 2023
Copy link
Contributor

@hechtlC hechtlC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have just tried to run the test suite on my machine since the CI is still stuck. And I noticed that you are using an experimental feature of testthat in tests-misc.R. See my comment.
Once this is fixed for all four occurences, I will merge the PR. Thanks again for the great work!

data.frame = data.frame(user.names, age, is.male)

## 1) Check base functionality (benign use-case)
expect_no_error(verify.data.frame.columns(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change this to expect_error(..., NA, message(...)).
The expect_no_error function is a new feature from testthat but is still experimental. So for now we want to stay at the "old" version. Also please put a comment behind the check with Expect no error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know about this, I will do the replacement 😇

Replace 'expect_no_error(expr, message)' by 'expect_error(expr, NA, message)'.

This fixes feedback on se-sic#208.

Signed-off-by: Maximilian Löffler <s8maloef@stud.uni-saarland.de>
@hechtlC hechtlC merged commit 3c452ca into se-sic:dev Jan 18, 2023
@maxloeffler maxloeffler deleted the dev branch January 18, 2023 14:34
@bockthom bockthom mentioned this pull request Apr 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants