Model and store column lineage in Marquez DB #2096
Signed-off-by: mzareba <mzareba382@gmail.com>
Thanks for opening your first pull request in the Marquez project! Please check out our contributing guidelines (https://github.com/MarquezProject/marquez/blob/main/CONTRIBUTING.md).
…o DatasetRecord, write test for createLineageRow() invocation
Signed-off-by: mzareba <mzareba382@gmail.com>
Codecov Report
@@             Coverage Diff              @@
##               main    #2096      +/-   ##
============================================
+ Coverage     75.49%   75.78%   +0.29%
- Complexity     1045     1061      +16
============================================
  Files           206      209       +3
  Lines          4925     5006      +81
  Branches        399      403       +4
============================================
+ Hits           3718     3794      +76
  Misses          763      763
- Partials        444      449       +5
Force-pushed: 855f4fd to bfd7555
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
…marquez into add-column-level-lineage
@mzareba382 I was wondering if this PR is now outdated? /cc @mobuchowski, @pawel-big-lebowski
Force-pushed: 2c5151b to e18c13e
Force-pushed: e18c13e to 21dac22
Great work on modeling the column_lineage table and separating the column-level lineage facet into a separate table (similar to the design proposed in #2076). Also great to see extensive testing around this feature! 💯 🥇
Great job! Congrats on your first merged pull request in the Marquez project!
Problem & solution
This PR adds a column-level lineage representation and an API endpoint for retrieving column-level lineage data from Marquez's database. It is based on OpenLineage's column-level lineage facet.
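For orientation, the OpenLineage column-level lineage facet maps each output field to the input fields it was derived from. The following is a minimal sketch of how such a facet could be flattened into one row per (output column, input column) pair. It is written in Python for brevity and is not the code from this PR: `ColumnLineageRow`, `flatten_column_lineage`, and the row field names are hypothetical, and the sample dataset and column names are made up for illustration.

```python
from typing import List, NamedTuple


class ColumnLineageRow(NamedTuple):
    # Hypothetical row shape for illustration; not the actual Marquez schema.
    output_field: str
    input_namespace: str
    input_dataset: str
    input_field: str
    transformation_type: str
    transformation_description: str


def flatten_column_lineage(facet: dict) -> List[ColumnLineageRow]:
    """Emit one row per (output field, input field) pair found in the facet."""
    rows = []
    for output_field, detail in facet.get("fields", {}).items():
        for src in detail.get("inputFields", []):
            rows.append(
                ColumnLineageRow(
                    output_field=output_field,
                    input_namespace=src["namespace"],
                    input_dataset=src["name"],
                    input_field=src["field"],
                    # Both transformation attributes are treated as optional here.
                    transformation_type=detail.get("transformationType", ""),
                    transformation_description=detail.get(
                        "transformationDescription", ""
                    ),
                )
            )
    return rows


# Made-up facet payload following the facet's "fields" / "inputFields" layout.
example_facet = {
    "fields": {
        "order_id": {
            "inputFields": [
                {"namespace": "db", "name": "orders", "field": "id"},
            ],
            "transformationType": "IDENTITY",
        },
        "total": {
            "inputFields": [
                {"namespace": "db", "name": "orders", "field": "price"},
                {"namespace": "db", "name": "orders", "field": "quantity"},
            ],
            "transformationDescription": "price * quantity",
        },
    }
}

rows = flatten_column_lineage(example_facet)
```

Storing one row per column pair keeps lookups in either direction (which inputs feed a column, which outputs depend on a column) to a simple filter over the rows.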
Column-level lineage data is stored in a separate table with the following fields:
Relevant tickets are:
Solution:
Checklist
You've updated CHANGELOG.md with details about your change under the "Unreleased" section (if relevant; depending on the change, this may not be necessary)
You've versioned your .sql database schema migration according to Flyway's naming convention (if relevant)