schema_registry/avro: Canonicalize AVRO schema #10786

Merged: piyushredpanda merged 5 commits into redpanda-data:dev on Jun 1, 2023

Conversation

BenPope (Member) commented May 15, 2023

Additional sanitization:

  • Sort members of all complex types
  • Sort members of record fields

Fix #7609

NOTE: This is not Parsing Canonical Form, but it is equivalent.
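
For illustration only, the effect of sorting members can be sketched in Python. This is not the code from this PR (which is C++), and the member ordering the PR actually uses is not shown in this conversation; alphabetical key order and the name `canonicalize` are assumptions. JSON object members are reordered, while array order (for example the order of a record's `fields`) is left untouched, because field order is significant in Avro.

```python
import json


def canonicalize(schema_text: str) -> str:
    """Return a normalized rendering of an Avro schema given as JSON text.

    Assumption: alphabetical key order stands in for whatever ordering
    the real implementation uses. Lists keep their original order.
    """
    parsed = json.loads(schema_text)
    # sort_keys sorts the members of every JSON object, at every depth;
    # compact separators remove insignificant whitespace.
    return json.dumps(parsed, sort_keys=True, separators=(",", ":"))


# Two spellings of the same schema: different member order and whitespace.
a = """
{"type": "record", "name": "User",
 "fields": [{"name": "id", "type": "long"},
            {"name": "email", "type": "string"}]}
"""
b = ('{"fields":[{"type":"long","name":"id"},'
     '{"type":"string","name":"email"}],"name":"User","type":"record"}')

# Same fields in a different order: a genuinely different schema.
c = ('{"type":"record","name":"User","fields":['
     '{"name":"email","type":"string"},{"name":"id","type":"long"}]}')

assert canonicalize(a) == canonicalize(b)
assert canonicalize(a) != canonicalize(c)
```

With this kind of normalization, a lookup by schema contents can compare normalized text instead of the text as submitted.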

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • Schema Registry: Canonicalize AVRO schema so that syntactic differences don't fail lookups by schema contents.

Additional sanitization:
* Sort members of all complex types
* Sort members of record fields

Fix redpanda-data#7609

Signed-off-by: Ben Pope <ben@redpanda.com>
BenPope added the area/schema-registry label (Schema Registry service within Redpanda) May 15, 2023
BenPope requested a review from michael-redpanda May 15, 2023 22:22
BenPope self-assigned this May 15, 2023
michael-redpanda (Contributor) left a comment

lgtm

dotnwat previously approved these changes May 26, 2023
dotnwat (Member) left a comment

lgtm

BenPope dismissed stale reviews from dotnwat and michael-redpanda via 46de1d5 May 26, 2023 16:08
BenPope (Member, Author) commented May 26, 2023

Improvements:

DONE:
The historical contents of the topic should now be considered as unparsed instead of canonical.

Read the contents of the topic as unparsed, and sanitize on upsert.
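
A rough Python sketch of "read as unparsed, sanitize on upsert", under stated assumptions: `SchemaStore`, `sanitize`, and `replay` are hypothetical names, not identifiers from this PR, and alphabetical key sorting stands in for the real sanitization. Records replayed from the topic are treated as raw text and normalized at the single point where they enter the store, so entries written before this change become findable by contents too.

```python
import json
from typing import Dict, Iterable, Optional, Tuple


def sanitize(raw_definition: str) -> str:
    # Stand-in for the real sanitization: sort object members, drop whitespace.
    return json.dumps(json.loads(raw_definition), sort_keys=True,
                      separators=(",", ":"))


class SchemaStore:
    """Hypothetical in-memory store keyed by sanitized schema text."""

    def __init__(self) -> None:
        self._id_by_definition: Dict[str, int] = {}

    def upsert(self, schema_id: int, raw_definition: str) -> None:
        # Sanitize here, on upsert, rather than trusting the stored text
        # to already be canonical. The first id seen for a definition wins.
        self._id_by_definition.setdefault(sanitize(raw_definition), schema_id)

    def lookup(self, raw_definition: str) -> Optional[int]:
        return self._id_by_definition.get(sanitize(raw_definition))


def replay(store: SchemaStore, records: Iterable[Tuple[int, str]]) -> None:
    # Historical topic contents are handled as unparsed text.
    for schema_id, raw_definition in records:
        store.upsert(schema_id, raw_definition)


store = SchemaStore()
replay(store, [(1, '{"type":"record","name":"R","fields":[]}')])
assert store.lookup('{"name": "R", "fields": [], "type": "record"}') == 1
assert store.lookup('"string"') is None
```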

src/v/pandaproxy/schema_registry/storage.h (outdated, resolved)
src/v/pandaproxy/schema_registry/sharded_store.cc (outdated, resolved)
BenPope added 4 commits May 30, 2023 15:40
Signed-off-by: Ben Pope <ben@redpanda.com>
Pure refactor to allow unparsed handling in a future commit.

Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
Signed-off-by: Ben Pope <ben@redpanda.com>
michael-redpanda (Contributor) left a comment

lgtm

michael-redpanda (Contributor) left a comment

lgtm

michael-redpanda (Contributor) commented

I liked it so much I approved it twice (somehow)

piyushredpanda merged commit dad9523 into redpanda-data:dev Jun 1, 2023
vbotbuildovich (Collaborator) commented

/backport v23.1.x

vbotbuildovich (Collaborator) commented

/backport v22.3.x

vbotbuildovich (Collaborator) commented

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-10786-v22.3.x-20 remotes/upstream/v22.3.x
git cherry-pick -x 50cd49dbcc59190f942c131bbe3b46b1e30eff29 6b7d89faee15923a7301230f3f093b8529173ea4 bbb65079afe9a31879621ed150cd6939185f8119 12f339c409a81ef51fd8800e6202fa412785a0ed c836d580ee12ffde9753e2ea716a3c2561aefd1d

Workflow run logs.

Labels
area/redpanda area/schema-registry Schema Registry service within Redpanda
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support 'Parsing Canonical Form' schema normalization in Avro
5 participants