-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New validator instances are built for each instancefile, discarding any internal caching #463
Comments
There's a secondary bug here, which is that the retrieve callable's in-memory caching is not working correctly because it's mixing absolute and relative URIs. I have a potential fix for this pair of issues which I need to think about for a little before I proceed: $ git diff HEAD
diff --git a/src/check_jsonschema/schema_loader/main.py b/src/check_jsonschema/schema_loader/main.py
index 4ce95c9..88ff6a0 100644
--- a/src/check_jsonschema/schema_loader/main.py
+++ b/src/check_jsonschema/schema_loader/main.py
@@ -1,5 +1,6 @@
from __future__ import annotations
+import functools
import pathlib
import typing as t
import urllib.error
@@ -130,11 +131,21 @@ class SchemaLoader(SchemaLoaderBase):
instance_doc: dict[str, t.Any],
format_opts: FormatOptions,
fill_defaults: bool,
+ ) -> jsonschema.protocols.Validator:
+ return self._get_validator(format_opts, fill_defaults)
+
+ @functools.lru_cache
+ def _get_validator(
+ self,
+ format_opts: FormatOptions,
+ fill_defaults: bool,
) -> jsonschema.protocols.Validator:
retrieval_uri = self.get_schema_retrieval_uri()
schema = self.get_schema()
schema_dialect = schema.get("$schema")
+ if schema_dialect is not None and not isinstance(schema_dialect, str):
+ schema_dialect = None
# format checker (which may be None)
format_checker = make_format_checker(format_opts, schema_dialect)
diff --git a/src/check_jsonschema/schema_loader/resolver.py b/src/check_jsonschema/schema_loader/resolver.py
index c63b7bb..5084328 100644
--- a/src/check_jsonschema/schema_loader/resolver.py
+++ b/src/check_jsonschema/schema_loader/resolver.py
@@ -79,8 +79,8 @@ def create_retrieve_callable(
else:
full_uri = uri
- if full_uri in cache._cache:
- return cache[uri]
+ if full_uri in cache:
+ return cache[full_uri]
full_uri_scheme = urllib.parse.urlsplit(full_uri).scheme
if full_uri_scheme in ("http", "https"):
@@ -100,8 +100,8 @@ def create_retrieve_callable(
else:
parsed_object = get_local_file(full_uri)
- cache[uri] = parsed_object
- return cache[uri]
+ cache[full_uri] = parsed_object
+ return cache[full_uri]
return retrieve_reference This makes validator creation cached internally for the main That said, I need to at the very least devise some tests for these to ensure there are no regressions in the future. |
It would be amazing to get this fixed - then I can integrate it efficiently in the CI setup I am planning to :) |
I wasn't able to work on this last week, but I've just put the above into a PR with a test to run against. I should be able to get this all merged and released soon, assuming that there are no surprises when CI runs. |
v0.29.1 has the above fix and is freshly available. I tested it against the usage described in #451 and saw execution times of around 0.8s on average. Please let me know (comment or fresh issue) if the new version is still performing in an unexpectedly bad way! |
I can confirm the execution time very fast now - thanks! |
First reported via #451
For each file, we get a validator from the "schema loader" objects. This interface is correct for the top-level checker. It allows, for example, metaschema checking to proceed with different schemas for different files.
However, this also is being applied to the remote and local class --
SchemaLoader
.As a result, a new validator is built for each file, and any internal caching done under the validator is discarded.
To resolve, one of two approaches should be used:
(schema, settings)
where there might not be any current settings)The text was updated successfully, but these errors were encountered: