Optimize Column Lineage Query Performance (#2821)
* Optimize Column Lineage Query Performance

Signed-off-by: Vinh Nguyen <phuvinh97ag@gmail.com>

* Optimize Column Lineage Query Performance
- Format query
- replace select * with uuid, namespace_name, name

Signed-off-by: Vinh Nguyen <phuvinh97ag@gmail.com>

---------

Signed-off-by: Vinh Nguyen <phuvinh97ag@gmail.com>
Co-authored-by: Peter Hicks <phixMe@users.noreply.github.com>
vinhnemo and phixMe authored Jun 3, 2024
1 parent e54ffca commit 7d0b290
Showing 1 changed file with 18 additions and 3 deletions.
21 changes: 18 additions & 3 deletions api/src/main/java/marquez/db/ColumnLineageDao.java
@@ -185,9 +185,24 @@ SELECT DISTINCT ON (cl.output_dataset_field_uuid, cl.input_dataset_field_uuid) c
   ORDER BY output_dataset_field_uuid, input_dataset_field_uuid, updated_at DESC, updated_at
 ),
 dataset_fields_view AS (
-  SELECT d.namespace_name as namespace_name, d.name as dataset_name, df.name as field_name, df.type, df.uuid
-  FROM dataset_fields df
-  INNER JOIN datasets_view d ON d.uuid = df.dataset_uuid
+  SELECT
+    d.namespace_name AS namespace_name,
+    d.name AS dataset_name,
+    df.name AS field_name,
+    df.type,
+    df.uuid
+  FROM dataset_fields df
+  INNER JOIN (
+    SELECT uuid, namespace_name, name
+    FROM datasets_view
+    WHERE current_version_uuid IN (
+      SELECT DISTINCT output_dataset_version_uuid
+      FROM selected_column_lineage
+      UNION
+      SELECT DISTINCT input_dataset_version_uuid
+      FROM selected_column_lineage
+    )
+  ) d ON d.uuid = df.dataset_uuid
 )
 SELECT
   output_fields.namespace_name,
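
The hunk above narrows `datasets_view` to the datasets whose `current_version_uuid` appears in `selected_column_lineage` (as either an input or an output version) before joining to `dataset_fields`, and selects only the three columns the join needs. A minimal in-memory sketch of that "filter first, then join" shape follows. It is not Marquez code: the class `PrefilterJoinSketch`, its nested types, and `joinPrefiltered` are hypothetical names invented for illustration, and the lists stand in for table rows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical stand-ins, not classes from Marquez.
class PrefilterJoinSketch {

  // Stand-in for a datasets_view row, limited to the columns the query touches.
  static final class Dataset {
    final String uuid;
    final String currentVersionUuid;
    final String namespaceName;
    final String name;

    Dataset(String uuid, String currentVersionUuid, String namespaceName, String name) {
      this.uuid = uuid;
      this.currentVersionUuid = currentVersionUuid;
      this.namespaceName = namespaceName;
      this.name = name;
    }
  }

  // Stand-in for a dataset_fields row.
  static final class Field {
    final String uuid;
    final String datasetUuid;
    final String name;
    final String type;

    Field(String uuid, String datasetUuid, String name, String type) {
      this.uuid = uuid;
      this.datasetUuid = datasetUuid;
      this.name = name;
      this.type = type;
    }
  }

  // Stand-in for a dataset_fields_view row.
  static final class FieldView {
    final String namespaceName;
    final String datasetName;
    final String fieldName;
    final String type;
    final String fieldUuid;

    FieldView(String namespaceName, String datasetName, String fieldName,
              String type, String fieldUuid) {
      this.namespaceName = namespaceName;
      this.datasetName = datasetName;
      this.fieldName = fieldName;
      this.type = type;
      this.fieldUuid = fieldUuid;
    }
  }

  // lineageVersionUuids plays the role of the UNION of input and output
  // dataset version UUIDs taken from selected_column_lineage.
  static List<FieldView> joinPrefiltered(
      List<Dataset> datasets, List<Field> fields, Set<String> lineageVersionUuids) {
    // Step 1: keep only datasets whose current version appears in the lineage.
    Map<String, Dataset> wanted = new HashMap<>();
    for (Dataset d : datasets) {
      if (lineageVersionUuids.contains(d.currentVersionUuid)) {
        wanted.put(d.uuid, d);
      }
    }
    // Step 2: inner join fields against the reduced dataset set.
    List<FieldView> rows = new ArrayList<>();
    for (Field f : fields) {
      Dataset d = wanted.get(f.datasetUuid);
      if (d != null) {
        rows.add(new FieldView(d.namespaceName, d.name, f.name, f.type, f.uuid));
      }
    }
    return rows;
  }

  public static void main(String[] args) {
    List<Dataset> datasets = List.of(
        new Dataset("d1", "v1", "ns", "orders"),
        new Dataset("d2", "v2", "ns", "customers"));
    List<Field> fields = List.of(
        new Field("f1", "d1", "id", "INT"),
        new Field("f2", "d2", "email", "VARCHAR"));
    // Only version v1 is referenced by the (hypothetical) selected lineage.
    for (FieldView row : joinPrefiltered(datasets, fields, Set.of("v1"))) {
      System.out.println(row.datasetName + "." + row.fieldName);
    }
  }
}
```

As in the SQL, a dataset whose current version is not referenced by the selected lineage contributes no rows to the join, so the join input shrinks from every dataset to only the referenced ones.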