Improve date parsing #722

helrond · 2020-08-13T21:30:02Z

Attempts to use fromisoformat to parse date strings. The (buggy) strptime method is used as a fallback for Python versions < 3.7

fixes #685

codecov · 2020-08-13T21:32:09Z

Codecov Report

Merging #722 into master will decrease coverage by 0.17%.
The diff coverage is 54.54%.

@@            Coverage Diff             @@
##           master     #722      +/-   ##
==========================================
- Coverage   96.18%   96.01%   -0.18%     
==========================================
  Files          17       17              
  Lines        2648     2658      +10     
  Branches      308      310       +2     
==========================================
+ Hits         2547     2552       +5     
- Misses         84       87       +3     
- Partials       17       19       +2

Impacted Files	Coverage Δ
jsonschema/tests/test_jsonschema_test_suite.py	`93.02% <50.00%> (-6.98%)`	⬇️
jsonschema/_format.py	`77.27% <60.00%> (-0.57%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ff1ddcf...d9d5fe1. Read the comment docs.

Julian · 2020-08-18T17:14:34Z

@helrond now that these are merged upstream think you can now unskip the relevant tests and help get this over the finish line?

Appreciated!

helrond · 2020-08-18T18:52:59Z

@Julian sure - what exactly do I need to do?

Julian · 2020-08-18T19:07:24Z

@helrond so I basically pulled in the changes you made to the test suite, so now those tests are in here, but being skipped on master (where they'd otherwise fail).

Essentially what needs to be done now is to unskip them (delete e.g. these lines: https://github.com/Julian/jsonschema/blob/master/jsonschema/tests/test_jsonschema_test_suite.py#L103-L111), confirm that the tests now pass in this branch, and then we'd be good.

Lemme know if that makes sense!

helrond · 2020-08-18T19:56:56Z

OK, I think I did that right, but tests are failing on everything < Python 3.6, which I believe is a consequence of the implementation we chose, which falls back to the buggy strptime method if fromisoformat isn't available.

Julian · 2020-08-18T19:58:26Z

Got it so... probably the thing to do is to re-add a skip that just skips those tests on 3.6 then? Have a look at e.g. the narrow_unicode skipping thing, it should show you how to do that (though yours will now predicate on sys.version_info)

helrond · 2020-08-19T22:08:41Z

OK, so tests are passing now but I'm getting an error on test coverage. Any pointers on how to fix that?

Julian · 2020-08-19T22:37:59Z

Cool! We can likely ignore that, it's due to the fact that we don't yet combine coverage across all runs, so codecov is confused and thinks coverage has gone down, it won't block merging though.

Let me review! Will leave any comments.

Julian · 2020-08-19T22:41:59Z

jsonschema/_format.py

@@ -323,7 +323,10 @@ def is_regex(instance):
 def is_date(instance):
    if not isinstance(instance, str):
        return True
-    return datetime.datetime.strptime(instance, "%Y-%m-%d")
+    try:


Minor, but might as well -- it'll save doing the "wrong" thing on each call on Py36 if we do this only once rather than each time we validate.

I.e. if we move this into a try/except, and do:

if hasattr(datetime.date, "fromisoformat"): _is_date = datetime.date.fromisoformat else: def _is_date(instance): return datetime.datetime.strptime(instance, "%Y-%m-%d") def is_date(...): return _is_date(...)

I'm not sure I fully got what you were after here, but happy to take another crack at it if I messed up.

Sorry, I should have been less lazy in writing it out :) -- so the point is (which is true of your commit here too) -- as is, every time the is_date validation is applied, it checks to see which implementation should be used. So on Py36, that means you first try fromisoformat, then that fails, then you try the old way, but then you "forget" about the fact that it didn't work and the next time the function is called, you'll again try the fromisoformat, again find it doesn't work, etc.

Instead, what I was trying to say was to move the check outside of the is_date function entirely. In other words, globally at the top level in the module:

if hasattr(datetime.date, "fromisoformat"): _is_date = datetime.date.fromisoformat else: def _is_date(instance): return datetime.datetime.strptime(instance, "%Y-%m-%d") @_checks_drafts(draft3="date", draft7="date", raises=ValueError) def is_date(instance): if not isinstance(instance, str): return True return _is_date(instance)

Now, we only check once on import which implementation should be used, and then if you call that function 1000 times, we've already figured out the right implementation ahead of time.

Does that make more sense?

Ah yes, got it. If I'd looked around a bit more in that file I'd have seen you doing that kind of thing in other spots. This one should be resolved now.

jsonschema/tests/test_jsonschema_test_suite.py

Julian

Two minor nitpicks on the implementation that I should have pointed out earlier.

After that though think we'll be good to merge.

Much appreciated again...

Julian · 2020-08-19T23:59:49Z

All good, this looks great now, thanks again!

* origin/master: Remove docformatter, it's too unwieldy. Minor doc formatting. Sigh, this appears to be a regex, not exact match, so it was ignoring everything. Update pre-commit hooks. And two last ones... Remove the rest of the notebooks.ai related resources. Remove demo links from README file Move the latest validator CLI test to where it must go now. Simplify the schemas in the tests for CLI validator detection. add test case Delete the failed test case Fix the bug of processing arguments[validator] in parse_args use correct format check globally renames message and function better handling for Python 3.6 fix formatting error skip tests on python 3.6 unskip tests Temporarily skip the tests that will be unskipped by python-jsonschema#722. Squashed 'json/' changes from 86f52b87..fce9e9b3 Validate IP addresses using the ipaddress module. Skip more tests for python-jsonschema#686. Squashed 'json/' changes from ea415537..86f52b87 improve date parsing Bump the version in the notebooks.io requirements. pre-commit setup. Trailing whitespace Kill Py2 in the sphinx role. isorted. Unbreak non-format-setuptools-extra-installed jsonschema. Run CI against all setuptools extras separately. Pick a format checker that has no external dep now that fqdn exists. regenerated the requirements.txt Modify the code based on review comments fix bug about hostname by import fqdn

improve date parsing

d9c0da8

Julian added a commit that referenced this pull request Aug 14, 2020

Temporarily skip the tests that will be unskipped by #722.

ff1ddcf

helrond added 2 commits August 18, 2020 15:49

Merge remote-tracking branch 'upstream/master' into issue-685

f1b1ad0

unskip tests

6bb7f9b

helrond added 2 commits August 19, 2020 17:09

skip tests on python 3.6

3089143

fix formatting error

b8dd5a6

Julian reviewed Aug 19, 2020

View reviewed changes

jsonschema/tests/test_jsonschema_test_suite.py Outdated Show resolved Hide resolved

Julian requested changes Aug 19, 2020

View reviewed changes

helrond added 3 commits August 19, 2020 19:02

better handling for Python 3.6

2f1d5ef

renames message and function

69c6823

use correct format check globally

d9d5fe1

Julian merged commit 756de12 into python-jsonschema:master Aug 20, 2020

helrond deleted the issue-685 branch August 20, 2020 00:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve date parsing #722

Improve date parsing #722

helrond commented Aug 13, 2020

codecov bot commented Aug 13, 2020 •

edited

Loading

Julian commented Aug 18, 2020

helrond commented Aug 18, 2020

Julian commented Aug 18, 2020

helrond commented Aug 18, 2020

Julian commented Aug 18, 2020

helrond commented Aug 19, 2020

Julian commented Aug 19, 2020

Julian Aug 19, 2020

helrond Aug 19, 2020

Julian Aug 19, 2020

helrond Aug 19, 2020

Julian left a comment

Julian commented Aug 19, 2020

Improve date parsing #722

Improve date parsing #722

Conversation

helrond commented Aug 13, 2020

codecov bot commented Aug 13, 2020 • edited Loading

Codecov Report

Julian commented Aug 18, 2020

helrond commented Aug 18, 2020

Julian commented Aug 18, 2020

helrond commented Aug 18, 2020

Julian commented Aug 18, 2020

helrond commented Aug 19, 2020

Julian commented Aug 19, 2020

Julian Aug 19, 2020

Choose a reason for hiding this comment

helrond Aug 19, 2020

Choose a reason for hiding this comment

Julian Aug 19, 2020

Choose a reason for hiding this comment

helrond Aug 19, 2020

Choose a reason for hiding this comment

Julian left a comment

Choose a reason for hiding this comment

Julian commented Aug 19, 2020

codecov bot commented Aug 13, 2020 •

edited

Loading