-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-4181] schema registry json support validating drafts 5 6 and 7 #20817
[CORE-4181] schema registry json support validating drafts 5 6 and 7 #20817
Conversation
/dt |
3de0093
to
6d76551
Compare
skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51111#01907e92-321b-4476-b55a-e72a1d0bdc06: skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51111#01907e92-3215-4276-9644-9d82eecea467: skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51111#01907ec1-58de-4b86-921e-54abba990aad: skipped ducktape retry in https://buildkite.com/redpanda/redpanda/builds/51111#01907ec1-58e5-434e-b841-157f2c2dc075: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great.
The commit history seems a bit funky, though.
6d76551
to
d63c3d1
Compare
force push: review comments, added draft5 |
eventually, with default values for properties being "true/false" or "{}/{"not":{}}", we can form a call like is_superset("{}", ...) or is_superset(..., "{"not":{}") that can cause infinite recursion. this is because the type is assumed to mean every type, and this can lead to this recursive chain is_superset -> is_array_superset -> is_superset ... this commit breaks the chain by recognizing these atoms and returning early. normalization and/or AST could offer a more robust solution, by ensuring that implicit properties are explicit and removing extraneous fields. The test cases in this commit would lead to infinite recursion without the early return in is_superset.
boost test macros are not great at printing the exception. this commit adds a manual try/catch to intercept and print exceptions, for ease of development
will need to be visible to code in later commits
enum for the dialects supported by the implementation, meant to symbolize the URIs of the official dialects.
this function gets a dialect and a json::Document of the candidate schema, and will perform validation against the correct metaschema. to compile only once the Schema object, get_metaschema<json_schema_dialect>() will compile it lazily and save it in static const variable. to support validation of json without the "$schema" property, try_validate_json_schema will try to validate it against the known dialects, from the most recent to the oldest
this is an intermediate commit. split parsing and validating. the error messages and functionalities remain the same, but this way it's easier to wire in the new validating functions and to add support for new dialects
note: the diff of this commit is somewhat chaotic delegate schema validation to validate_json_schema. to do so, extract the dialect, if present, from the json document, and call validate_json_schema/try_validate_json_schema
draft5 is a cleanup of draft4 and reuses the metaschema, according to the official documentation. This is why the support here is implemented with the draft4 metaschema
adapt from_uri(uri) -> json_schema_dialect to recognize http://json-schema.org/draft-0N/schema# and http://json-schema.org/draft-0N/schema as draftN dialects. removed enum values for $schema in the metaschemas since the check for permitted values is already performed before validation
d4c39e0
to
161213c
Compare
force push merge conflict |
break recursion for
is_superset(get_true_schema(), ...)
andis_superset(..., get_false_schema())
these two cases can happens for properties that don't have a default value: for example{"type": "array"}
has no"items"
, so it gets the default value of{}
. without this early break, we get in an infinite recursion where{}
is passes throughis_array_superset
and that in turn will callis_superset
and so onadd support for draft5: according to the official documentation, this dialect reuses the metaschema for draft4, and so does the code
support for draft6 : the metaschema is adapted to draft4, because the current version of rapidjson implements only draft4
support for draft7 : the metaschema is adapted to draft4, same reasons
support for schemas uris without trailing #
Fixes https://redpandadata.atlassian.net/browse/CORE-4181
Backports Required
Release Notes