Feat/kafka schema registry integration #1959

Merged

farbodahm
Contributor

@farbodahm farbodahm commented Aug 15, 2022

Summary of Changes

Added an extractor for KafkaSchemaRegistry schemas.

Tests

Added databuilder/tests/unit/extractor/test_kafka_schema_registry_extractor.py for testing the new extractor.

Documentation

Updated README.md and databuilder/README.md to reflect the addition of the new extractor.

CheckList

Make sure you have checked all steps below to ensure a timely review.

  • PR title addresses the issue accurately and concisely. Example: "Updates the version of Flask to v1.0.2"
  • PR includes a summary of changes.
  • PR adds unit tests, updates existing unit tests, OR documents why no test additions or modifications are needed.
  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and classes in the PR contain docstrings that explain what they do

@boring-cyborg boring-cyborg bot added the area:databuilder From databuilder folder label Aug 15, 2022
@boring-cyborg

boring-cyborg bot commented Aug 15, 2022

Congratulations on your first Pull Request and welcome to the Amundsen community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/amundsen-io/amundsen/blob/main/CONTRIBUTING.md)

@farbodahm farbodahm force-pushed the feat/kafka-schema-registry-integration branch 4 times, most recently from 3cd912a to 4144ce6 Compare August 16, 2022 05:55
@boring-cyborg boring-cyborg bot added area:all Related to all the project area:docs labels Aug 16, 2022
@farbodahm farbodahm marked this pull request as ready for review August 16, 2022 06:08
@farbodahm farbodahm requested a review from a team as a code owner August 16, 2022 06:08
@farbodahm farbodahm force-pushed the feat/kafka-schema-registry-integration branch 2 times, most recently from 0084d2b to 004cd6f Compare August 16, 2022 06:43
@feng-tao
Member

thanks for the contribution!

@feng-tao
Member

could you rebase the pr with master?

Signed-off-by: Farbod Ahmadian <farbodahmadian2014@gmail.com>
…tsManager

Signed-off-by: Farbod Ahmadian <farbodahmadian2014@gmail.com>
…s functions

@farbodahm farbodahm force-pushed the feat/kafka-schema-registry-integration branch from 004cd6f to ab22098 Compare August 17, 2022 06:23
@farbodahm
Contributor Author

@feng-tao Thank you for your fast response.
Done, I rebased it with master.

@mgorsk1
Contributor

mgorsk1 commented Aug 17, 2022

Thanks for awesome PR. Would you consider reusing existing schema registry client https://pypi.org/project/python-schema-registry-client/ instead of writing api calls from scratch?

@farbodahm
Contributor Author

> Thanks for awesome PR. Would you consider reusing existing schema registry client https://pypi.org/project/python-schema-registry-client/ instead of writing api calls from scratch?

Thank you for your review. I will start rewriting it now with the library that you mentioned.

@farbodahm farbodahm requested a review from mgorsk1 August 17, 2022 12:54
@farbodahm
Contributor Author

@mgorsk1 I replaced requests with SchemaRegistryClient. Would you mind reviewing it again?
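
For readers following the thread, here is a minimal sketch of what moving from hand-written HTTP calls to a registry client can look like. It assumes a client object exposing `get_subjects()` and `get_schema(subject)`, as python-schema-registry-client documents for its `SchemaRegistryClient`. The helper name `latest_schemas` and the `FakeRegistryClient` stand-in are hypothetical and are not taken from the merged extractor.

```python
from typing import Any, Dict, Iterator, List, Tuple


def latest_schemas(client: Any) -> Iterator[Tuple[str, Any]]:
    """Yield (subject, schema) pairs for every subject the client knows about.

    Assumption: `client` offers get_subjects() and get_schema(subject),
    and get_schema returns the latest version when no version is given.
    """
    for subject in client.get_subjects():
        schema_version = client.get_schema(subject)
        if schema_version is None:
            # Nothing could be fetched for this subject; skip it.
            continue
        yield subject, schema_version


class FakeRegistryClient:
    """Hypothetical in-memory stand-in so the sketch runs without a registry."""

    def __init__(self, schemas: Dict[str, Any]) -> None:
        self._schemas = schemas

    def get_subjects(self) -> List[str]:
        return list(self._schemas)

    def get_schema(self, subject: str) -> Any:
        return self._schemas.get(subject)
```

Against a real registry, the stand-in would be swapped for the library's client pointed at the registry URL; the iteration logic stays the same.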

@farbodahm farbodahm force-pushed the feat/kafka-schema-registry-integration branch 3 times, most recently from 313ca49 to c602b70 Compare August 17, 2022 14:00
Contributor

@mgorsk1 mgorsk1 left a comment

thanks for this refactor, minor nits added otherwise lgtm

databuilder/requirements.txt (outdated, resolved)
Signed-off-by: Farbod Ahmadian <farbodahmadian2014@gmail.com>
@farbodahm
Contributor Author

farbodahm commented Aug 18, 2022

@mgorsk1 Done. Would you mind reviewing it again?
Note: when I force-pushed, everyone was automatically requested for review; I don't know how 😅 Sorry about that :)

Contributor

@mgorsk1 mgorsk1 left a comment

lgtm! I assume it would work with Avro, Protobuf, and JSON schemas (as all three can be stored in SR)? https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-protobuf.html

@farbodahm
Contributor Author

farbodahm commented Aug 18, 2022

@mgorsk1 Thank you.
I fully tested it with the Avro format (as it is the one mainly used with Kafka). Adding the others shouldn't be painful: most likely only _get_property_type would need small changes to extract the correct types where the schema formats differ.
I think we can merge this as a base. I can then set up a development environment for the other formats and work on them in upcoming PRs to make sure the extractor fully supports all of them.
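
To illustrate why the type-mapping step is the format-specific part, the sketch below renders an Avro type declaration as a short display string. It is not the PR's `_get_property_type`; the function name `avro_type_name` and the output format are assumptions made for illustration. Protobuf or JSON Schema would need their own equivalent, because they describe field types differently.

```python
from typing import Any


def avro_type_name(avro_type: Any) -> str:
    """Render an Avro type declaration as a display string (hypothetical helper)."""
    if isinstance(avro_type, str):
        # Primitive or named reference, e.g. "string" or "long".
        return avro_type
    if isinstance(avro_type, list):
        # Union, e.g. ["null", "string"].
        return "union<" + ",".join(avro_type_name(t) for t in avro_type) + ">"
    if isinstance(avro_type, dict):
        kind = avro_type.get("type")
        if kind == "array":
            return "array<" + avro_type_name(avro_type["items"]) + ">"
        if kind == "map":
            # Avro map keys are always strings.
            return "map<string," + avro_type_name(avro_type["values"]) + ">"
        if kind in ("record", "enum", "fixed"):
            # Named types: show the declared name, falling back to the kind.
            return avro_type.get("name", kind)
        # Anything else, e.g. {"type": "long", "logicalType": "timestamp-millis"}:
        # fall back to the underlying type.
        return avro_type_name(kind)
    raise TypeError(f"Unsupported Avro type declaration: {avro_type!r}")
```

For example, `avro_type_name(["null", "long"])` yields `union<null,long>`, and a nested record field is shown by its declared name.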

@mgorsk1 mgorsk1 merged commit ca4a048 into amundsen-io:main Aug 18, 2022
@boring-cyborg

boring-cyborg bot commented Aug 18, 2022

Awesome work, congrats on your first merged pull request!

Labels
area:all Related to all the project area:databuilder From databuilder folder
3 participants