-
Notifications
You must be signed in to change notification settings - Fork 592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sr/avro: fix missing default compat check for null (v2) #24032
Merged
pgellert
merged 3 commits into
redpanda-data:dev
from
pgellert:fix/sr-avro-default-added-2
Dec 4, 2024
Merged
sr/avro: fix missing default compat check for null (v2) #24032
pgellert
merged 3 commits into
redpanda-data:dev
from
pgellert:fix/sr-avro-default-added-2
Dec 4, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Make it easier to extend by simplifying the definition of a test case.
This adds more tests for the avro incompatibility `reader_field_missing_default_value`. All of these tests are already passing, while the tests added in the following commit show the change in behaviour.
This fixes a bug in the avro compatibility check for fields present in the reader but missing from the writer. In this case the writer and the reader are compatible only if the reader field has a specified default value. The Avro library sets the default value to `GenericDatum()` which has type `AVRO_NULL` when the default is not specified. This is what we are trying to detect in this check. However, the check was confusing an explicit null default for a union-type as a missing value. The former is represented as `GenericDatum(Union(Null))` and because of the way `GenericDatum::type()` delegates to the type of the first union leaf value when `GenericDatum` contains a union, the check thought that the value was missing, when it was in fact specified as null. To solve for the above, we also check that the type is not a union type to be able to detect when the default is not set. (The same check is used in the Avro library to detect unset default values, for example, when printing an Avro schema as JSON.) The caveat is that we are still overly restrictive for fields of type null, because in that case, the default value will be `GenericDatum()` regardless of whether the default value was explicitly set to null or not. This is a limitation of the Avro library's API and could only be fixed by making breaking changes to the Avro library's API. Also note that the checks against `reader.leafAt(...)` have been removed. They were attempting to compare the type of the field against the type of the default value. This is not necessary because this is ensured by the Avro library's built-in schema validation done while parsing the Avro schema.
BenPope
approved these changes
Dec 3, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
the below tests from https://buildkite.com/redpanda/redpanda/builds/59170#01938c93-bede-4dcd-960c-59cbb41188da have failed and will be retried
|
/backport v24.3.x |
/backport v24.2.x |
/backport v24.1.x |
This was referenced Dec 4, 2024
This was referenced Dec 4, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This fixes a bug in the avro compatibility check for fields present in
the reader but missing from the writer. In this case the writer and the
reader are compatible only if the reader field has a specified default
value.
The Avro library sets the default value to
GenericDatum()
which hastype
AVRO_NULL
when the default is not specified. This is what we aretrying to detect in this check. However, the check was confusing an
explicit null default for a union-type as a missing value. The former is
represented as
GenericDatum(Union(Null))
and because of the wayGenericDatum::type()
delegates to the type of the first union leafvalue when
GenericDatum
contains a union, the check thought that thevalue was missing, when it was in fact specified as null.
To solve for the above, we also check that the type is not a union type
to be able to detect when the default is not set. (The same check is
used in the Avro library to detect unset default values, for example,
when printing an Avro schema as JSON.)
The caveat is that we are still overly restrictive for fields of type
null, because in that case, the default value will be
GenericDatum()
regardless of whether the default value was explicitly set to null or
not. This is a limitation of the Avro library's API and could only be
fixed by making breaking changes to the Avro library's API.
Also note that the checks against
reader.leafAt(...)
have beenremoved. They were attempting to compare the type of the field against
the type of the default value. This is not necessary because this is
ensured by the Avro library's built-in schema validation done while
parsing the Avro schema.
Fixes https://redpandadata.atlassian.net/browse/CORE-1796
Fixes #16649
Backports Required
Release Notes
Bug Fixes