
CORE-6861 Schema Registry: verbose compat checks for JSON #23208

Merged · 9 commits into redpanda-data:dev · Sep 12, 2024

Conversation

pgellert (Contributor) commented Sep 5, 2024:

This implements verbose compatibility reporting for the JSON schema registry. Verbose reporting has already been implemented for Protobuf and Avro, and with this patch it is supported for JSON as well.

This PR only changes the error message that is reported when there is an incompatibility; the logic of which sets of schemas are considered compatible is intended to be unchanged. The changes start with a few commits that introduce helpers and do non-logic-changing refactoring to make the main commit (47d543b) easier to review. While the main commit is large, the changes in it are mostly mechanical; most of its complexity lies in verifying that the reported compatibility error type matches the reference implementation. The existing compatibility test suite is extended to capture not just the (in)compatibility of the schemas but also the reason for the incompatibility, by checking the verbose error reported. This gives us reasonable coverage of this patch.

Note that because our JSON compatibility checks are expressed as forward-compatibility checks rather than backward-compatibility checks, the names of the error types are always the inverse of what the surrounding code and comments express. For example, we report the additional_items_removed error when newer has additionalItems but older does not.
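For illustration, here is a hypothetical schema pair (made up for this write-up, not taken from the PR's test suite) that shows the inverted naming:

```cpp
#include <string_view>

// Hypothetical example: "older" is the already-registered schema and
// "newer" is the candidate; only newer carries the additionalItems keyword.
constexpr std::string_view older_schema = R"({
  "type": "array",
  "items": [{"type": "integer"}]
})";

constexpr std::string_view newer_schema = R"({
  "type": "array",
  "items": [{"type": "integer"}],
  "additionalItems": false
})";

// Because the check runs as a forward-compatibility check, the verbose
// report for this pair names the error additional_items_removed, even
// though, read older-to-newer, the keyword was added.
```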

Known discrepancies

There are some Notes and TODOs left in the unit tests where I noticed that the error type we report is different to what the reference implementation reports.

  • The Notes highlight differences I don't intend to change because I consider them bugs in the reference implementation
  • The TODOs are ones I intend to file follow-up tickets for

Fixes https://redpandadata.atlassian.net/browse/CORE-6861

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.2.x
  • v24.1.x
  • v23.3.x

Release Notes

Features

  • Schema Registry: verbose compatibility error reporting is now supported for JSON as well

Adds a helper for constructing singleton `raw_compatibility_result`s.
Define `operator==` and `operator<<` for `compatibility_result` to allow
using it in `BOOST_REQUIRE_EQ` in tests.
Add all json incompatibility types that we expect to report with their
descriptions matching the reference implementation.
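As a minimal, self-contained sketch of what those operators enable in tests (the `result` struct below is a stand-in, not the real `compatibility_result`):

```cpp
#define BOOST_TEST_MODULE compat_result_sketch
#include <boost/test/included/unit_test.hpp>

#include <ostream>
#include <string>
#include <vector>

// Stand-in for compatibility_result: BOOST_REQUIRE_EQ needs operator== to
// compare the two values and operator<< to print them when the check fails.
struct result {
    std::vector<std::string> errors;

    friend bool operator==(result const&, result const&) = default;

    friend std::ostream& operator<<(std::ostream& os, result const& r) {
        for (auto const& e : r.errors) {
            os << e << "; ";
        }
        return os;
    }
};

BOOST_AUTO_TEST_CASE(compare_full_results) {
    result expected{.errors = {"TYPE_NARROWED at #/"}};
    result actual{.errors = {"TYPE_NARROWED at #/"}};
    BOOST_REQUIRE_EQ(actual, expected);
}
```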
@pgellert pgellert requested a review from a team September 5, 2024 13:19
@pgellert pgellert self-assigned this Sep 5, 2024
@pgellert pgellert requested review from michael-redpanda and removed request for a team September 5, 2024 13:19
@pgellert pgellert requested review from andijcr and BenPope September 5, 2024 13:19
michael-redpanda (Contributor) commented:
I don't think you need to backport this to v24.1 or v23.3, right?

BenPope (Member) left a review comment:
Looks pretty good

andijcr (Contributor) left a review comment:
Thanks for this monumental effort. I need to do a second pass to understand the PR better; I've left some bookmarks for snippets that I'd like to revisit.

Comment on lines 1164 to 1185
std::for_each(
newer_it, newer_tuple_schema.end(), [&](json::Value const& n) {
auto item_p = p / "items" / std::to_string(i);
auto sup_res = is_superset(ctx, older_additional_schema, n, item_p);
auto sup_err = sup_res.has_error();

if (sup_err) {
res.merge(std::move(sup_res));
res.emplace<json_incompatibility>(
std::move(item_p),
json_incompatibility_type::
item_removed_not_covered_by_partially_open_content_model);
}

++i;
});

// Check if older has excess elements that are not compatible with
// newer["additionalItems"]
auto newer_additional_schema = get_object_or_empty(
ctx.newer, newer, get_additional_items_kw(ctx.newer.dialect()));
if (!std::all_of(
older_it, older_tuple_schema.end(), [&](json::Value const& o) {
return is_superset(ctx, o, newer_additional_schema);
})) {
return false;
}
return true;
std::for_each(
older_it, older_tuple_schema.end(), [&](json::Value const& o) {
auto item_p = p / "items" / std::to_string(i);
auto sup_res = is_superset(ctx, o, newer_additional_schema, item_p);
auto sup_err = sup_res.has_error();

if (sup_err) {
res.merge(std::move(sup_res));
res.emplace<json_incompatibility>(
std::move(item_p),
json_incompatibility_type::
item_added_not_covered_by_partially_open_content_model);
}

++i;
});

return res;
Contributor:
With this change, the interesting difference (is_superset(ctx, older_additional_schema, n, item_p) vs is_superset(ctx, o, newer_additional_schema, item_p)) is drowned between otherwise-equal lines.
Not sure if factoring out the common part could result in more readable code; have you tried it?

pgellert (author) replied:
Yeah, good point. I've refactored this now; I think it made it more readable, but let me know what you think.
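For readers following along, a minimal sketch of one way the common part of those two loops could be factored out; the helper name and exact shape are assumptions, not necessarily what the refactor in this PR looks like:

```cpp
// Hypothetical helper: the two loops differ only in which side supplies the
// "additional" schema (i.e. the argument order passed to is_superset) and in
// which error type is reported when that check fails.
auto check_excess_items = [&](
                            auto first,
                            auto last,
                            auto&& check_item,
                            json_incompatibility_type err_type) {
    std::for_each(first, last, [&](json::Value const& v) {
        auto item_p = p / "items" / std::to_string(i);
        auto sup_res = check_item(v, item_p);
        if (sup_res.has_error()) {
            res.merge(std::move(sup_res));
            res.emplace<json_incompatibility>(std::move(item_p), err_type);
        }
        ++i;
    });
};

check_excess_items(
  newer_it,
  newer_tuple_schema.end(),
  [&](json::Value const& n, auto const& item_p) {
      return is_superset(ctx, older_additional_schema, n, item_p);
  },
  json_incompatibility_type::
    item_removed_not_covered_by_partially_open_content_model);

check_excess_items(
  older_it,
  older_tuple_schema.end(),
  [&](json::Value const& o, auto const& item_p) {
      return is_superset(ctx, o, newer_additional_schema, item_p);
  },
  json_incompatibility_type::
    item_added_not_covered_by_partially_open_content_model);
```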

&& !is_superset(ctx, older_additional_properties, schema)) {
// not compatible
return false;
if (!is_false_schema(older_additional_properties)) {
Contributor:
Note to me: not sure why is_false_schema(older_additional_properties) is brought in here; I need to re-read this part.

pgellert (author) replied:
This doesn't change the set of schema pairs that are compatible (because a false schema won't match against any schema anyway), but it's necessary in order to report the specific error property_removed_from_closed_content_model when older_additional_properties is false. Let me invert the condition here; that seems simpler.
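To make that concrete, here is a hypothetical schema pair (assuming the reading above) where the branch matters:

```cpp
#include <string_view>

// Hypothetical example: older's "additionalProperties": false is the false
// schema, i.e. a closed content model.
constexpr std::string_view older_schema = R"({
  "type": "object",
  "properties": {"a": {"type": "integer"}},
  "additionalProperties": false
})";

constexpr std::string_view newer_schema = R"({
  "type": "object",
  "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}}
})";

// The pair is incompatible either way, because the extra property "b" in
// newer can never satisfy the false schema. The dedicated branch lets the
// verbose report name the specific error
// property_removed_from_closed_content_model instead of a generic mismatch.
```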

Comment on lines 1813 to 1825
auto older_minus_newer = json_type_list{};
std::ranges::set_difference(
older_types, newer_types, std::back_inserter(older_minus_newer));

if (
older_minus_newer.empty()
|| (older_minus_newer == json_type_list{json_type::integer} && newer_minus_older == json_type_list{json_type::number})) {
res.emplace<json_incompatibility>(
std::move(p), json_incompatibility_type::type_narrowed);
} else {
res.emplace<json_incompatibility>(
std::move(p), json_incompatibility_type::type_changed);
}
return res;
Contributor:
Note to self: I need to break this part down on paper.

pgellert (author) replied:
I've added a comment now; maybe that helps, but yeah, you might still need paper for this.
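Independently of that comment, here is a small, self-contained sketch of the set-difference logic in that hunk with concrete inputs (stand-in types, not the PR's json_type_list):

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

int main() {
    using type_list = std::vector<std::string>; // stand-in for json_type_list

    // Inputs must be sorted for set_difference; these literals already are.
    auto diff = [](type_list const& a, type_list const& b) {
        type_list out;
        std::ranges::set_difference(a, b, std::back_inserter(out));
        return out;
    };

    // older allows {"integer"}, newer allows {"integer", "number"}:
    // older - newer is empty, so the hunk reports type_narrowed.
    assert(diff({"integer"}, {"integer", "number"}).empty());

    // older allows {"integer"}, newer allows {"number"}:
    // older - newer == {"integer"} and newer - older == {"number"}, which is
    // the special case in the hunk, so this is also reported as type_narrowed.
    assert((diff({"integer"}, {"number"}) == type_list{"integer"}));
    assert((diff({"number"}, {"integer"}) == type_list{"number"}));

    // older allows {"string"}, newer allows {"number"}: neither condition
    // holds, so the hunk reports type_changed.
    assert(!diff({"string"}, {"number"}).empty());
}
```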

@andijcr andijcr self-requested a review September 9, 2024 08:36
michael-redpanda (Contributor) left a review comment:
Wow... did an initial pass; will also need to run through this again.

@BenPope @andijcr would it be helpful to do a live review of this?

*newer_value)) {
return false;
}
} else if (older_value.has_value() /* && !newer_value.has_value() */) {
Contributor:
nit: commented out code

pgellert (author) replied:
I meant this as a helping comment to say that, in this case, newer_value doesn't have a value, but perhaps it's more confusing than helpful. I've removed it now.

context const& ctx,
json::Value const& older,
json::Value const& newer,
std::filesystem::path p);
Contributor:
question: should this be a const& to avoid possible copies?

Contributor:
I think it's probably pointless, since at every level we need to generate a new path anyway.
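A minimal sketch of the point being made, with hypothetical names: each recursion level derives a fresh child path with operator/, so taking the parameter by value (and letting callers move into it) does not cost anything a const& would save:

```cpp
#include <filesystem>
#include <iostream>
#include <string>

// Hypothetical shape of the recursion: the callee receives the path by
// value; every nested check constructs a new child path anyway.
void check_item(std::filesystem::path p, int index) {
    auto item_p = p / "items" / std::to_string(index); // new object either way
    std::cout << item_p.generic_string() << '\n';
}

int main() {
    std::filesystem::path root = "#";
    check_item(root, 0);            // prints "#/items/0"
    check_item(std::move(root), 1); // caller can move in, avoiding one copy
}
```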

Pure refactor, no behaviour change intended.

Make a small change to the method to make the diff in the following
commit simpler to read and reason about.
Pure refactor, no behaviour change intended.

Make a small change to the method to make the diff in the following
commit simpler to read and reason about.
Pure refactor, no behaviour change intended.

Make a small change to the method to make the diff in the following
commit simpler to read and reason about.

New tests are also added (they passed before too).
@pgellert pgellert force-pushed the sr/verbose-compat-4 branch 2 times, most recently from 6d86671 to 3c88300 Compare September 9, 2024 15:29
pgellert (author) commented Sep 9, 2024:

Force push:

  • First: address code review feedback
  • Second: start using json_compatibility_result alias

return false;
if (is_false_schema(older_additional_properties)) {
res.emplace<json_incompatibility>(
std::move(prop_path),
Contributor:
Weird that it triggers use-after-move, but it seems safe.

pgellert (author) replied:
Yeah, I went with a lambda here to avoid the move now.
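For context, a hypothetical sketch of a lambda-based workaround (prop and prop_name are made-up names; this is not necessarily the exact code that landed): building the path through a small lambda means no named local is moved-from, which sidesteps the clang-tidy use-after-move diagnostic.

```cpp
// Hypothetical sketch only. make_prop_path builds a fresh path on demand,
// so neither branch needs to std::move a variable that clang-tidy might
// later consider moved-from.
auto make_prop_path = [&]() { return p / "properties" / prop_name; };

if (is_false_schema(older_additional_properties)) {
    res.emplace<json_incompatibility>(
      make_prop_path(),
      json_incompatibility_type::property_removed_from_closed_content_model);
} else {
    res.merge(
      is_superset(ctx, older_additional_properties, prop, make_prop_path()));
}
```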

andijcr (Contributor) left a review comment:
Left some other comments, but overall this seems good. Waiting for the walkthrough before approving.

Comment on lines 1834 to 1835
if (res.has_error()) {
return res;
Contributor:
Why are we returning at the first json_type is_superset error, instead of merging the errors of the other is_[type]_superset checks and the enum/combinator checks? Is it to match the reference behaviour?

I think the final result will depend on the ordering of the "type": [...] array.

Not that this is highly critical, since I didn't see much use of the "type": [...] feature.

pgellert (author) replied:
I've fixed this now. It makes sense to continue to return as many errors as possible.
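A minimal sketch of the accumulate-instead-of-early-return shape being agreed on here; the per-type helper names are stand-ins inferred from the "is_[type]_superset" wording above:

```cpp
// Before: returning at the first per-type error means the verbose report
// only ever carries one of the possible incompatibilities.
//
//   auto res = is_numeric_superset(ctx, older, newer, p);
//   if (res.has_error()) {
//       return res;
//   }
//
// After: merging keeps collecting, so a single report can carry every
// incompatibility found across the per-type, enum and combinator checks.
auto res = json_compatibility_result{};
res.merge(is_numeric_superset(ctx, older, newer, p));
res.merge(is_string_superset(ctx, older, newer, p));
res.merge(is_array_superset(ctx, older, newer, p));
res.merge(is_object_superset(ctx, older, newer, p));
return res;
```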

Comment on lines +1412 to +1419
} else {
throw as_exception(invalid_schema(fmt::format(
"dependencies can only be an array or an object for valid "
"schemas but it was: {}",
pj{o})));
}
Contributor:
nit: JSON schema validation should prevent us from reaching this block, but at the same time it's not harmful to leave it here.

Comment on lines 1351 to 1414
auto path_dep = p / "dependencies" / as_string_view(older_dep.name);
auto const& o = older_dep.value;
auto n_it = newer_p.FindMember(older_dep.name);

if (o.IsObject()) {
if (n_it == newer_p.MemberEnd()) {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_schema_added);
return;
}

if (!n.IsObject()) {
return false;
}
auto const& n = n_it->value;

// schemas: o and n needs to be compatible
return is_superset(ctx, o, n);
} else if (o.IsArray()) {
if (n_it == newer_p.MemberEnd()) {
return false;
}
if (!n.IsObject()) {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_schema_added);
return;
}

auto const& n = n_it->value;
// schemas: o and n needs to be compatible
res.merge(is_superset(ctx, o, n, std::move(path_dep)));
} else if (o.IsArray()) {
if (n_it == newer_p.MemberEnd()) {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_array_added);
return;
}

if (!n.IsArray()) {
return false;
}
// string array: n needs to be a superset of o
// TODO: n^2 search
bool n_superset_of_o = std::ranges::all_of(
o.GetArray(), [n_array = n.GetArray()](json::Value const& p) {
return std::ranges::find(n_array, p) != n_array.End();
});
if (!n_superset_of_o) {
return false;
}
return true;
} else {
throw as_exception(invalid_schema(fmt::format(
"dependencies can only be an array or an object for valid "
"schemas but it was: {}",
pj{o})));
}
});
auto const& n = n_it->value;

if (!n.IsArray()) {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_array_added);
return;
}
// string array: n needs to be a superset of o
// TODO: n^2 search
bool n_superset_of_o = std::ranges::all_of(
o.GetArray(), [n_array = n.GetArray()](json::Value const& p) {
return std::ranges::find(n_array, p) != n_array.End();
});
if (!n_superset_of_o) {
bool o_superset_of_n = std::ranges::all_of(
n.GetArray(), [o_array = o.GetArray()](json::Value const& p) {
return std::ranges::find(o_array, p) != o_array.End();
});
if (o_superset_of_n) {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_array_extended);
} else {
res.emplace<json_incompatibility>(
std::move(path_dep),
json_incompatibility_type::dependency_array_changed);
}
}
return;
} else {
Contributor:
nit: this diff could be less jarring if the lambda were pulled outside before changing all_of to for_each + json_compatibility_result, but overall the code seems ok.

pgellert (author) replied:
I left this as is. You're right, I could move the lambda out and that would make the diff much nicer, but I would want to move it back to its place afterwards, so I'm going to leave it as is to reduce code motion.
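For readers untangling that hunk, here is a small, self-contained sketch of the array-dependency classification it implements (stand-in types; the real code works on rapidjson values and emplaces json_incompatibility errors instead of returning an enum):

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class dep_outcome { compatible, array_extended, array_changed };

// Mirrors the branch structure in the hunk: if every element of o (older's
// dependency array) is also in n (newer's), the arrays are compatible;
// otherwise, if every element of n is in o, the error is
// dependency_array_extended, else dependency_array_changed.
dep_outcome classify(
  std::vector<std::string> const& o, std::vector<std::string> const& n) {
    auto contains_all = [](auto const& haystack, auto const& needles) {
        return std::ranges::all_of(needles, [&](auto const& x) {
            return std::ranges::find(haystack, x) != haystack.end();
        });
    };
    if (contains_all(n, o)) {
        return dep_outcome::compatible;
    }
    if (contains_all(o, n)) {
        return dep_outcome::array_extended;
    }
    return dep_outcome::array_changed;
}

int main() {
    assert(classify({"a"}, {"a", "b"}) == dep_outcome::compatible);
    assert(classify({"a", "b"}, {"a"}) == dep_outcome::array_extended);
    assert(classify({"a", "b"}, {"a", "c"}) == dep_outcome::array_changed);
}
```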

@pgellert pgellert requested a review from oleiman September 10, 2024 15:14
andijcr
andijcr previously approved these changes Sep 10, 2024
andijcr (Contributor) left a review comment:
Waiting for a fix for the use-after-free (possibly a false positive), but it looks good overall.

Adapt the JSON compatibility check code to report the exact reason for the
incompatibility that is causing the compatibility check to fail.

Tests are adapted to now make checks explicitly on the reported
incompatibilities instead of just the boolean of whether the change is
compatible or not.

Note that because our json compatibility checks are expressed as forward
compatibility checks and not backward compatibility checks, the names of
the error types are always the inverse of what the surrounding code and
comments express. For example, we report the `additional_items_removed`
error when `newer` has `additionalItems` but `older` does not.

Some general patterns in this change are:
 * return false --> add a specific error
 * return true --> return a non-error result
 * return is_superset(...) --> depending on how the reference
   implementation works one of the following is chosen:
    - Forward the result:
        return is_superset(...)
    - Merge the result and continue to gather more:
        res.merge(is_superset(...))
    - Merge the result and add a more specific error:
        res.merge(is_superset(...));
        res.emplace(...)
    - Only add a specific error if the result has an error:
        res.emplace(...)
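For illustration, a minimal before/after sketch of those patterns, using stand-in variable names; the error type shown is one of the types quoted in the hunks above, picked only as an example:

```cpp
// Before (boolean result): the caller only learns that the schemas are
// incompatible, not why or where.
//
//   if (!is_superset(ctx, older_items, newer_items)) {
//       return false;
//   }
//   return true;
//
// After (verbose result): the sub-check's errors are merged in, and a more
// specific error is recorded together with the JSON path it applies to.
auto res = json_compatibility_result{};
auto sub_res = is_superset(ctx, older_items, newer_items, p / "items");
if (sub_res.has_error()) {
    res.merge(std::move(sub_res));
    res.emplace<json_incompatibility>(
      p / "items",
      json_incompatibility_type::
        item_removed_not_covered_by_partially_open_content_model);
}
return res;
```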
Add a smoketest to further test that multiple (as many as possible)
errors are reported for a complex schema. This is checked in both the
forward and backward direction.
Extend the verbose compatibility message test to include json as well.
pgellert (author) commented:
Force-push: fix (incorrect) use-after-move clang-tidy error, address more code review feedback.

@andijcr andijcr self-requested a review September 10, 2024 16:45
BenPope (Member) left a review comment:
Looks good. I think you missed one of my comments about appending "items" to the path.

nit? There are a few cases that don't have tests.

pgellert (author) replied:
> I think you missed one of my comments about appending "items" to the path.

I've responded to the "items" comment now. I thought I had fixed it, but it was just a similar line close to your comment: #23208 (comment)

> nit? There are a few cases that don't have tests.

Are there any in particular that you have in mind? I tried to increase test coverage with this PR, extending the existing tests and adding new ones for all the interesting verbose error types that hadn't been covered. For some very similar error types (e.g. maximum decreased vs minimum increased) I didn't bother adding new tests, but otherwise I tried to have at least one test for each error type.

@pgellert pgellert requested a review from BenPope September 11, 2024 12:35
@pgellert pgellert merged commit 74ae259 into redpanda-data:dev Sep 12, 2024
18 checks passed
vbotbuildovich (Collaborator) commented:
/backport v24.2.x
