Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance regression from 3.2.0 to 4.0.1 #853

Closed
tfranzel opened this issue Oct 5, 2021 · 18 comments
Closed

performance regression from 3.2.0 to 4.0.1 #853

tfranzel opened this issue Oct 5, 2021 · 18 comments
Labels
Bug Something doesn't work the way it should.

Comments

@tfranzel
Copy link

tfranzel commented Oct 5, 2021

Hi @Julian, first of all thanks for this library and all the hard work. We rely heavily on it and are delighted to see Draft 2020-12 support as it is required for OpenAPI 3.1.0

When updating, I noticed our test suite ran noticeably slower. I compiled a quick reproduction for you. On my machine I get a 5x slowdown with the new version. Is this performance hit to be expected due to the new capabilities or is it a regression?

import jsonschema
import yaml
import json
import time

# https://raw.githubusercontent.com/tfranzel/drf-spectacular/master/tests/test_basic.yml
with open('tests/test_basic.yml') as fh:
    data = yaml.load(fh.read(), Loader=yaml.SafeLoader)

# https://raw.githubusercontent.com/tfranzel/drf-spectacular/master/drf_spectacular/validation/openapi3_schema.json
# which comes from:
# https://github.com/OAI/OpenAPI-Specification/blob/6d17b631fff35186c495b9e7d340222e19d60a71/schemas/v3.0/schema.json
with open('drf_spectacular/validation/openapi3_schema.json') as fh:
    openapi3_schema_spec = json.load(fh)

t_acc = 0

for i in range(500):
    t0 = time.time()
    jsonschema.validate(instance=data, schema=openapi3_schema_spec)
    t1 = time.time()
    t_acc += t1 - t0

print(f'{t_acc} sec')
✗ python --version; pip freeze | grep json; python test.py
Python 3.9.5
jsonschema==3.2.0
5.254251718521118 sec

✗ python --version; pip freeze | grep json; python test.py
Python 3.9.5
jsonschema==4.0.1
28.189855813980103 sec

✗ python --version; pip freeze | grep json; python test.py
Python 3.9.7
jsonschema==4.2.1
27.27832531929016 sec

✗ python --version; pip freeze | grep json; python test.py
Python 3.9.7
jsonschema==4.3.1
8.10183048248291 sec

EDIT: included measurement for 4.2.1 release
EDIT: included measurement for 4.3.1 release

@tdamsma
Copy link

tdamsma commented Oct 5, 2021

I noticed this too, a set of tests I run for an application that uses jsonschema heavily the test suite now takes 60 minutes vs 14 before.

@Julian
Copy link
Member

Julian commented Oct 5, 2021

It will take some digging, but there is indeed a "known" performance chance in 4.0.0: https://github.com/Julian/jsonschema/blob/main/CHANGELOG.rst#v400

specifically:

False and 0 are now properly considered non-equal even recursively within a container (#686). As part of this change, uniqueItems validation may be slower in some cases. Please feel free to report any significant performance regressions, though in some cases they may be difficult to address given the specification requirement.

If someone could confirm (or deny) whether that's the culprit it'd be helpful, but yeah this may need some investigating (some I won't know I have time for for at least a few days).

CC @skamensky as well who I know was interested in doing some performance optimization -- here's another benchmark we may want to adopt or use to drive any change.

@edmundlam
Copy link

I was also looking into upgrading from jsonschema 3.2.0 to >4.0.0 in order to use the uuid string validation that came with the 2019-09 draft.

We also noticed around a 4.5x slower runtime when running our json schema validation tests.

I'd need to confirm, but I don't think we are using the uniqueItems validation in our schemas.

@jakevdp
Copy link

jakevdp commented Oct 13, 2021

In case it's helpful, here's a benchmark related to the Altair package (uses the %timeit magic in Jupyter/IPython):

from urllib.request import urlopen
import json
import jsonschema

with urlopen('https://raw.githubusercontent.com/vega/schema/master/vega-lite/v4.8.1.json') as f:
  schema = json.load(f)
with urlopen('https://raw.githubusercontent.com/vega/vega-lite/master/examples/specs/bar.vl.json') as f:
  instance = json.load(f)

print(jsonschema.__version__)
%timeit jsonschema.validate(instance, schema)

On a Google Colab CPU runtime, I get the following:

3.2.0
10 loops, best of 5: 146 ms per loop
4.1.0
1 loop, best of 5: 1.03 s per loop

@jvtm
Copy link
Contributor

jvtm commented Oct 20, 2021

In case it's helpful, here's a benchmark related to the Altair package (uses the %timeit magic in Jupyter/IPython):

from urllib.request import urlopen
import json
import jsonschema

with urlopen('https://raw.githubusercontent.com/vega/schema/master/vega-lite/v4.8.1.json') as f:
  schema = json.load(f)
with urlopen('https://raw.githubusercontent.com/vega/vega-lite/master/examples/specs/bar.vl.json') as f:
  instance = json.load(f)

print(jsonschema.__version__)
%timeit jsonschema.validate(instance, schema)

On a Google Colab CPU runtime, I get the following:

3.2.0
10 loops, best of 5: 146 ms per loop
4.1.0
1 loop, best of 5: 1.03 s per loop

I tried this same schema + instance with the usual optimization of creating Validator instance only once (ie. do what `jsonschema.validate() does: validate schema itself once, create validator instance...), and then calling it in a loop.

Short test on Python 3.9.x, forcing use of Draft7Validator (as spec'd by in the schema)

jsonschema.__version__='3.2.0'
iterations=1000 took 0.993949 seconds (1006.09 / sec)
jsonschema.__version__='4.1.2' 
iterations=1000 took 45.402565 seconds (22.03 / sec)

...which makes the impact even bigger: 45x slower.

Update / cProfile output (with 100 calls to validator.validate(instance)):

$ python -m cProfile -s cumtime ./temp.py                                                                                                                                                                    
jsonschema.__version__='3.2.0'                                                            
iterations=100 took 0.204604 seconds (488.75 / sec)                                                                                                                                                    
         932645 function calls (832611 primitive calls) in 0.616 seconds                                                                                                                                           
                                                                                                                                                                                                        
   Ordered by: cumulative time                                                                                                                                                              
                                                                                                                                                                                                  
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)                                                                                                                      
    117/1    0.000    0.000    0.617    0.617 {built-in method builtins.exec}                                                                                                                           
        1    0.000    0.000    0.617    0.617 temp.py:3(<module>)                                                                                                                                              
40059/101    0.102    0.000    0.484    0.005 validators.py:296(iter_errors)                                                                                                                 
39958/103    0.026    0.000    0.483    0.005 validators.py:343(descend)                                                                                                                   
11123/476    0.023    0.000    0.482    0.001 _validators.py:252(ref)                                                                                                                         
 5435/101    0.024    0.000    0.479    0.005 _validators.py:276(properties)                                                                                                                
1318/1101    0.006    0.000    0.291    0.000 _validators.py:41(additionalProperties)                                                                                                                              
        1    0.000    0.000    0.280    0.280 validators.py:291(check_schema)                                        
3702/2195    0.009    0.000    0.248    0.000 _validators.py:319(anyOf)                                                                                                                    
      100    0.000    0.000    0.204    0.002 validators.py:351(validate)                                                                                                                                          
4102/4003    0.008    0.000    0.180    0.000 _validators.py:69(items)                                               
        8    0.000    0.000    0.160    0.020 __init__.py:1(<module>)                                                                                                                                              
    18907    0.027    0.000    0.143    0.000 _validators.py:269(type)                                        
    148/2    0.001    0.000    0.123    0.061 <frozen importlib._bootstrap>:1002(_find_and_load)                                                                                                                   
    148/2    0.001    0.000    0.123    0.061 <frozen importlib._bootstrap>:967(_find_and_load_unlocked)           
    140/2    0.001    0.000    0.122    0.061 <frozen importlib._bootstrap>:659(_load_unlocked)                                                                                                                    
    115/2    0.000    0.000    0.122    0.061 <frozen importlib._bootstrap_external>:784(exec_module)               
    183/2    0.000    0.000    0.122    0.061 <frozen importlib._bootstrap>:220(_call_with_frames_removed)                                                                                                         
    28406    0.016    0.000    0.113    0.000 validators.py:355(is_type)                                           
    28406    0.021    0.000    0.097    0.000 _types.py:66(is_type)                                                                                                                                                
    15913    0.007    0.000    0.084    0.000 {built-in method builtins.any}          
    32012    0.011    0.000    0.080    0.000 _validators.py:272(<genexpr>)                                                                                                                                        
    20/13    0.000    0.000    0.079    0.006 {built-in method builtins.__import__}                                 
  159/140    0.000    0.000    0.077    0.001 <frozen importlib._bootstrap>:1033(_handle_fromlist)                                                                                                                 
        2    0.000    0.000    0.069    0.034 exceptions.py:1(<module>)                               
    28406    0.015    0.000    0.061    0.000 _pmap.py:69(__getitem__)                                                                                                                                             
    28407    0.018    0.000    0.046    0.000 _pmap.py:59(_getitem)                                                      
        1    0.000    0.000    0.043    0.043 _utils.py:1(<module>)                                                                                                                                                
        1    0.000    0.000    0.043    0.043 compat.py:1(<module>)                                                
        1    0.000    0.000    0.040    0.040 request.py:1(<module>)
    28410    0.021    0.000    0.029    0.000 _pmap.py:53(_get_bucket)
        1    0.000    0.000    0.027    0.027 client.py:1(<module>)
        1    0.000    0.000    0.026    0.026 _format.py:1(<module>)
      115    0.001    0.000    0.021    0.000 <frozen importlib._bootstrap_external>:856(get_code)
  448/441    0.007    0.000    0.019    0.000 {built-in method builtins.__build_class__}
     7518    0.012    0.000    0.018    0.000 exceptions.py:121(_set)
        1    0.000    0.000    0.017    0.017 rfc3987.py:19(<module>)
   138582    0.017    0.000    0.017    0.000 {method 'get' of 'dict' objects}
    14499    0.012    0.000    0.017    0.000 validators.py:684(push_scope)
   ....
$ python -m cProfile -s cumtime ./temp.py                                                                                 
jsonschema.__version__='4.1.2'                                                                                           
iterations=100 took 15.473336 seconds (6.46 / sec)                                                                 
         67530251 function calls (67430256 primitive calls) in 17.414 seconds                                           
                                                                                                        
   Ordered by: cumulative time                                                                                          
                                                                                                              
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)                                                 
    111/1    0.000    0.000   17.424   17.424 {built-in method builtins.exec}                                                     
        1    0.000    0.000   17.424   17.424 temp.py:3(<module>)                                                               
40059/101    0.126    0.000   17.282    0.171 validators.py:188(iter_errors)                                                    
11123/476    0.039    0.000   17.276    0.036 _validators.py:288(ref)                                                   
39958/103    0.058    0.000   16.717    0.162 validators.py:239(descend)                                      
    10064    0.058    0.000   16.396    0.002 validators.py:769(resolve)                                                
 5435/101    0.029    0.000   16.148    0.160 _validators.py:326(properties)                                                    
    35146   10.067    0.000   16.131    0.000 validators.py:759(_finditem)                                                        
3702/2195    0.016    0.000   15.918    0.007 _validators.py:365(anyOf)                                                           
      100    0.000    0.000   15.473    0.155 validators.py:247(validate)                                                       
 24516184    2.209    0.000    2.217    0.000 {built-in method builtins.isinstance}                                  
1318/1101    0.009    0.000    1.833    0.002 _validators.py:38(additionalProperties)
        1    0.000    0.000    1.810    1.810 validators.py:180(check_schema)
     2953    0.027    0.000    1.697    0.001 validators.py:803(resolve_fragment)
 24376180    1.640    0.000    1.640    0.000 {method 'pop' of 'collections.deque' objects}
  8582776    1.470    0.000    1.470    0.000 {method 'extendleft' of 'collections.deque' objects}
  8582776    0.766    0.000    0.766    0.000 {method 'values' of 'dict' objects}
4102/4003    0.009    0.000    0.644    0.000 _legacy_validators.py:102(items_draft6_draft7_draft201909)
       23    0.000    0.000    0.508    0.022 validators.py:788(resolve_from_url)
    32440    0.040    0.000    0.230    0.000 validators.py:185(evolve)
    32442    0.099    0.000    0.191    0.000 _funcs.py:317(evolve)
    18907    0.039    0.000    0.170    0.000 _validators.py:318(type)
    10087    0.027    0.000    0.153    0.000 parse.py:585(urldefrag)
        7    0.000    0.000    0.152    0.022 __init__.py:1(<module>)
    28406    0.019    0.000    0.144    0.000 validators.py:251(is_type)
    147/3    0.001    0.000    0.129    0.043 <frozen importlib._bootstrap>:1002(_find_and_load)
    147/3    0.001    0.000    0.129    0.043 <frozen importlib._bootstrap>:967(_find_and_load_unlocked)
    136/3    0.001    0.000    0.129    0.043 <frozen importlib._bootstrap>:659(_load_unlocked)
    110/3    0.000    0.000    0.129    0.043 <frozen importlib._bootstrap_external>:784(exec_module)
    179/3    0.000    0.000    0.128    0.043 <frozen importlib._bootstrap>:220(_call_with_frames_removed)
    28406    0.027    0.000    0.125    0.000 _types.py:65(is_type)
    15913    0.008    0.000    0.103    0.000 {built-in method builtins.any}
    ....

Might create a flame graph later, but no promises... hope this helps to narrow it down. At the moment this blocks updating few projects to 4.x series (and to draft 2020-12).

@Julian
Copy link
Member

Julian commented Oct 21, 2021

Thanks! That does help a bit, and possibly points to some code that was merged to support location independent identifiers ($refs like #foo) here which looked suspicious to me at the time but I couldn't put my finger on how important it'd be to fix it --

But specifically my broad guess from just seeing the profile is that each time a ref is being resolved we're repeatedly walking over schemas looking for location independent identifiers, when instead we should be doing that 0 or 1 time if a location-independent identifier is being dereferenced.

Will require a bit more investigation but yeah that's at least a lead I suspect. (Certainly PRs are welcome to help if someone gets to it first).

@jakevdp
Copy link

jakevdp commented Nov 3, 2021

Another quick note here about the impact of this: in altair_viewer, with jsonschema<4.0, the test suite runs in 30 seconds. With jsonschema 4.0 or newer, the test suite times out after 6 hours. We fixed this in altair-viz/altair_viewer#44 by pinning to jsonschema<4.0

@edufelipe
Copy link

My company also suffered a 19x slowdown when going from 3.2 to 4.1 which caused all sorts of issues.

With this much evidence of a serious performance regression, it might be wise to add a warning about it in the README or CHANGELOG.

@tdamsma
Copy link

tdamsma commented Nov 11, 2021

I wanted to have a quick look to see If I could spot something obviously wrong, but havent successed. But I do have some nice flamecharts to share:

jsonschema.__version__='3.2.0' completed in 0.77s
v3.2

jsonschema.__version__='4.2.0' completed in 46.77s
v4.2

Made using

from urllib.request import urlopen
import json
import jsonschema
from time import time

with urlopen('https://raw.githubusercontent.com/vega/schema/master/vega-lite/v4.8.1.json') as f:
  schema = json.load(f)
with urlopen('https://raw.githubusercontent.com/vega/vega-lite/master/examples/specs/bar.vl.json') as f:
  instance = json.load(f)

v = jsonschema.Draft7Validator(schema)

start = time()
for i in range(1000):
    v.validate(instance)

print(f"{jsonschema.__version__=} completed in {time() - start: 0.2f}s")

@Stranger6667
Copy link
Contributor

Stranger6667 commented Dec 15, 2021

Indeed, quite a lot of time is spent on finding subschemas inside self.referrer. Assuming that self.referrer doesn't change, then the following patch yields some performance improvements (in cost of some memory):

class RefResolver:

    ...
    @lru_cache()
    def _find_ids(self):
        return list(self._finditem(self.referrer, "$id"))

    def resolve(self, ref):
        """
        Resolve the given reference.
        """
        url = self._urljoin_cache(self.resolution_scope, ref).rstrip("/")

        uri, fragment = urldefrag(url)

        # Use `_find_ids` below
        for subschema in self._find_ids():
            target_uri = self._urljoin_cache(
                self.resolution_scope, subschema["$id"],
            )
            if target_uri.rstrip("/") == uri.rstrip("/"):
                if fragment:
                    subschema = self.resolve_fragment(subschema, fragment)
                return url, subschema

        return url, self._remote_cache(url)

All tests pass.

Running against code from the first message in this issue:
Before: 26.696181535720825 sec
After: 13.24342656135559 sec

@Julian
Copy link
Member

Julian commented Dec 15, 2021

Assuming that self.referrer doesn't change

This is definitely a reasonable assumption so sounds like such a patch would be very much appreciated. Thanks for the investigation.

@Stranger6667
Copy link
Contributor

This is definitely a reasonable assumption so sounds like such a patch would be very much appreciated. Thanks for the investigation.

Great! Happy to help :)

After some more adjustments:

class RefResolver:
    ...
    @lru_cache()
    def _find_ids(self):
        return list(self._finditem(self.referrer, "$id"))

    @lru_cache()
    def _find_in_subschema(self, url):
        uri, fragment = urldefrag(url)
        for subschema in self._find_ids():
            target_uri = self._urljoin_cache(
                self.resolution_scope, subschema["$id"],
            )
            if target_uri.rstrip("/") == uri.rstrip("/"):
                if fragment:
                    subschema = self.resolve_fragment(subschema, fragment)
                return url, subschema
        return None

    def resolve(self, ref):
        """
        Resolve the given reference.
        """
        url = self._urljoin_cache(self.resolution_scope, ref).rstrip("/")

        match = self._find_in_subschema(url)
        if match is not None:
            return match

        return url, self._remote_cache(url)

Cuts the execution time to 11.305032968521118 sec

I'll submit a patch shortly

@Julian
Copy link
Member

Julian commented Dec 17, 2021

v4.3.1 is out with great thanks to @Stranger6667 for putting in the time to make the fix. I haven't fully tested myself against the examples above but please do share feedback.

@tfranzel
Copy link
Author

updated my measurements in the OP. thank you guys for putting in the work! ❤️ it is a 3x improvement, but still a little bit slower than the 3.2 version. I think this is a manageable slowdown now and from my side the ticket could be closed.

@tdamsma
Copy link

tdamsma commented Dec 17, 2021

Seems like it improved a lot for the specific testcase I used, 80 or so times faster and only a few % slower than 3.2.0

jsonschema.__version__='4.3.1' completed in 0.83s
v4.3

@jakevdp
Copy link

jakevdp commented Dec 19, 2021

I just tried the altair_viewer test suite with jsonschema 4.3; it's back to executing under 30 seconds (altair-viz/altair_viewer#46). Thanks so much for the fix!

@Stranger6667
Copy link
Contributor

Awesome! :) There is a small follow-up patch that reduces the number of calls to the subschema search routine - not a dramatic improvement, but seems like a nice to have :)

In my use case, I'd be happy to upgrade jsonschema to the latest version in Schemathesis, so we're closer to the Open API 3.1 support :)

Thanks to everybody involved for such detailed reports & reviewing patches :)

@Julian
Copy link
Member

Julian commented Dec 19, 2021

Thanks all for the feedback and a big thanks again to @Stranger6667. Sounds like we can close this for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something doesn't work the way it should.
Projects
None yet
Development

No branches or pull requests

8 participants