Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SELECT * from CTE fails to resolve column lineage #630

Open
skada-coder opened this issue Jun 21, 2024 · 1 comment
Open

SELECT * from CTE fails to resolve column lineage #630

skada-coder opened this issue Jun 21, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@skada-coder
Copy link

skada-coder commented Jun 21, 2024

Describe the bug
Column lineage fails to resolve when referencing a SELECT * from a CTE.

Not sure if this is the same issue as #303 as it looks similar just 1 layer deeper with a secondary select * CTE.

Below is a simplified example, our production scenarios are more complicated but it boils down to this.

SQL

CREATE TABLE MAIN.FOOBAR AS (
    WITH FOO AS (
        SELECT COL1, COL2 FROM FROM MAIN.BAR
    )
    SELECT * FROM FOO
)

Column Lineage fails to resolve columns defined in FOO

To Reproduce

from sqllineage.runner import LineageRunner
sql1 = """
CREATE TABLE MAIN.FOOBAR AS (
    WITH FOO AS (
        SELECT COL1, COL2 FROM FROM MAIN.BAR
    )
    SELECT * FROM FOO
)
"""
LineageRunner(sql1).print_column_lineage()

actual output is

main.foobar.* <- foo.*

I'd expect it to return this instead

main.foobar.col1 <- main.bar.col1
main.foobar.col2 <- main.bar.col2

Python version (available via python --version)

  • 3.11.5

SQLLineage version (available via sqllineage --version):

  • 1.5.3

We noticed the issue using the snowflake dialect for our own production use cases

@skada-coder skada-coder added the bug Something isn't working label Jun 21, 2024
@reata
Copy link
Owner

reata commented Feb 7, 2025

We recently fixed a similar issue with subquery. So if you change the CTE to subquery:

CREATE TABLE MAIN.FOOBAR AS (
    SELECT * FROM (
        SELECT COL1, COL2 FROM MAIN.BAR
    ) AS FOO
);

It will give back what you expected.

I guess we should fix the CTE scenario as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants