-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DateTimeParser Validation and Performance Improvements #4593
Conversation
Improve the validation of DateTimeParser by validating the numbers parsed for day, month, year, hour, minute, second, millis and micros. Improve performance by caching the RegularExpressions created in DateTimeFormat::isValid.
Additional change to validate milliseconds field if the . or , exists.
Foundation/src/DateTimeFormat.cpp
Outdated
RegularExpression(DateTimeFormat::SORTABLE_REGEX) | ||
}; | ||
// make sure the regex list and the array of regexes are in sync | ||
poco_assert((sizeof(regs) / sizeof(regs[0])) == REGEX_LIST.size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
REGEX_LIST is used only in this place. I propose to remove it as a class member and have it as a static member in this function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left it where it is because it's part of the public API of this class, unfortunately.
Wasn't sure if this would be 1.13.x or 1.14 and didn't want to make a breaking change just in case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewauclair 1.14, you can break it
Foundation/src/DateTimeParser.cpp
Outdated
|
||
namespace Poco { | ||
[[nodiscard]] parse_iter skip_non_digits(parse_iter it, parse_iter end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall these helper functions be inlined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. inline
is used in headers to avoid ODR issues. Free functions in cpp files should be either static or in an anonymous namespace to avoid IFNDR issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewauclair please don't introduce new naming conventions, we have our own
Foundation/src/DateTimeParser.cpp
Outdated
|
||
namespace Poco { | ||
[[nodiscard]] parse_iter skip_non_digits(parse_iter it, parse_iter end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andrewauclair please don't introduce new naming conventions, we have our own
REGEX_LIST was only used in isValid, which now uses a cached list of RegularExpressions instead.
@andrewauclair Do you have any performance measurements with these changes compared to 1.13? |
Came up with the following when running in WSL with the code from issue #4592: 1.12.5: 0m0.063s Switching to a format string that doesn't match one of the expected ones, to avoid regex matches ( 1.12.5: 0m0.063s |
Improvements for #4592.
Improve the validation of
DateTimeParser
by validating the numbers parsed forday
,month
,year
,hour
,minute
,second
,millis
, andmicros
. The value 0 will be used if no digits were found formicros
.These changes create new instances of
SyntaxException
thrown byDateTimeParser::parse
andDateTimeParser::parseTZD
when the previously listed fields do not contain the required number of digits or the string of digits is not a valid number.Improve performance by caching the
RegularExpression
s created inDateTimeFormat::isValid
.