-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-3186] schema registry json schema: array compatibility checks #20137
Merged
michael-redpanda
merged 7 commits into
redpanda-data:dev
from
andijcr:feat/core-3186/schema_registry_array
Jun 28, 2024
Merged
Changes from 1 commit
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
701189e
schema_registry/json: move is_object_additional_properties_superset
andijcr 6b3df9c
schema_registry/json: refactor is_object_additional_properties_superset
andijcr 83962e5
schema_registry/json: is_array_superset size checks
andijcr 4c9789c
schema_registry/json: is_array_superset uniqueItems
andijcr 04edb4f
schema_registry/json: is_array_superset array schema
andijcr 8f0aadd
schema_registry/json: is_array_schema tuple schema
andijcr 3c9596f
schema_registry/json: fix is_numeric_property_value_superset
andijcr File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -494,7 +494,11 @@ extract_property_and_gate_check( | |
return {std::nullopt, &older_it->value, &newer_it->value}; | ||
} | ||
|
||
// helper for numeric property that fits into a double: | ||
// helper for numeric property that fits into a double. | ||
// if the property has a default value, it can be passed as last parameter. | ||
// This is necessary to make {"type": "string", "minLength": 0} equivalent to | ||
// {"type": "string"}. | ||
// if no default value is passed, the result is given by this table | ||
// older | newer | is_superset | ||
// ------- | ------- | ----------- | ||
// nothing | nothing | yes | ||
|
@@ -507,36 +511,51 @@ bool is_numeric_property_value_superset( | |
json::Value const& older, | ||
json::Value const& newer, | ||
std::string_view prop_name, | ||
VPred&& value_predicate) { | ||
auto [maybe_is_compatible, older_val_p, newer_val_p] | ||
= extract_property_and_gate_check(older, newer, prop_name); | ||
if (maybe_is_compatible.has_value()) { | ||
return maybe_is_compatible.value(); | ||
VPred&& value_predicate, | ||
std::optional<double> default_value = std::nullopt) { | ||
// get value or default_value | ||
auto get_value = [&](json::Value const& v) -> std::optional<double> { | ||
auto it = v.FindMember( | ||
json::Value{prop_name.data(), rapidjson::SizeType(prop_name.size())}); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It might be worth creating a helper from |
||
|
||
if (it == v.MemberEnd()) { | ||
return default_value; | ||
} | ||
|
||
// Gate on values that can't be represented with doubles. | ||
// rapidjson can serialize a uint64_t even thought it's not a widely | ||
// supported type, so deserializing that would trigger this. note also | ||
// that 0.1 is a valid json literal, but does not have an exact double | ||
// representation. this cannot be caught with this, and it would require | ||
// some sort of decimal type | ||
if (!it->value.IsLosslessDouble()) { | ||
throw as_exception(invalid_schema(fmt::format( | ||
R"(is_numeric_property_value_superset-{} not implemented for type {}. input: older: '{}', newer: '{}')", | ||
prop_name, | ||
it->value.GetType(), | ||
pj{older}, | ||
pj{newer}))); | ||
} | ||
|
||
return it->value.GetDouble(); | ||
}; | ||
|
||
auto older_value = get_value(older); | ||
auto newer_value = get_value(newer); | ||
if (older_value == newer_value) { | ||
// either both not set or with the same value | ||
return true; | ||
} | ||
|
||
// Gate on values that can't be represented with doubles. | ||
// rapidjson can serialize a uint64_t even thought it's not a widely | ||
// supported type, so deserializing that would trigger this. note also that | ||
// 0.1 is a valid json literal, but does not have an exact double | ||
// representation. this cannot be caught with this, and it would require | ||
// some sort of decimal type | ||
if (!older_val_p->IsLosslessDouble() || !newer_val_p->IsLosslessDouble()) { | ||
// both have value but they can't be decoded as T | ||
throw as_exception(invalid_schema(fmt::format( | ||
R"({}-{} not implemented for types [{},{}]. input: older: '{}', newer: '{}')", | ||
__FUNCTION__, | ||
prop_name, | ||
older_val_p->GetType(), | ||
newer_val_p->GetType(), | ||
pj{older}, | ||
pj{newer}))); | ||
if (older_value.has_value() && newer_value.has_value()) { | ||
return std::invoke( | ||
std::forward<VPred>(value_predicate), *older_value, *newer_value); | ||
} | ||
|
||
auto older_value = older_val_p->GetDouble(); | ||
auto newer_value = newer_val_p->GetDouble(); | ||
return older_value == newer_value | ||
|| std::invoke( | ||
std::forward<VPred>(value_predicate), older_value, newer_value); | ||
// (relevant only if default_value is not used) | ||
// only one is set. if older is not set then newer has a value that is more | ||
// restrictive, so older is a superset of newer | ||
return !older_value.has_value(); | ||
} | ||
|
||
enum class additional_field_for { object, array }; | ||
|
@@ -616,7 +635,7 @@ bool is_additional_superset( | |
bool is_string_superset(json::Value const& older, json::Value const& newer) { | ||
// note: "format" is not part of the checks | ||
if (!is_numeric_property_value_superset( | ||
older, newer, "minLength", std::less_equal<>{})) { | ||
older, newer, "minLength", std::less_equal<>{}, 0)) { | ||
// older is less strict | ||
return false; | ||
} | ||
|
@@ -744,7 +763,7 @@ bool is_array_superset(json::Value const& older, json::Value const& newer) { | |
|
||
// size checks are common to both types | ||
if (!is_numeric_property_value_superset( | ||
older, newer, "minItems", std::less_equal<>{})) { | ||
older, newer, "minItems", std::less_equal<>{}, 0)) { | ||
return false; | ||
} | ||
|
||
|
@@ -1023,7 +1042,7 @@ bool is_object_required_superset( | |
|
||
bool is_object_superset(json::Value const& older, json::Value const& newer) { | ||
if (!is_numeric_property_value_superset( | ||
older, newer, "minProperties", std::less_equal<>{})) { | ||
older, newer, "minProperties", std::less_equal<>{}, 0)) { | ||
// newer requires less properties to be set | ||
return false; | ||
} | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this only fixes the bug for the numeric properties but the same bug exists for other calls to
extract_property_and_gate_check
. If I understand it correctly, the bug is thatextract_property_and_gate_check
doesn't handle the default value of the field.For example, this test case would fail - I believe incorrectly:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, but
extract_property_and_gate_check
cannot handle this case cleanly."exclusiveMinimum" (or _maximum) is kind of tricky because it's a boolean with a default value in draft4 but a number without a default value from draft6.
Currently,
is_numeric_superset
doesn't know the schema dialect, so it cannot assign a meaningful default to the property.We have json schema normalization on the roadmap that should fix this problem at the root, by explicitly setting all the implicit default values based on the schema dialect in use.
So if you are ok with this, i'd move this fix to a follow-up or to the
json schema normalization
PR