Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(sql): remove temporary table creation when using inline sql #8149

Merged
merged 6 commits into from
Jan 31, 2024

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented Jan 30, 2024

This PR fixes a long-standing annoyance with our .sql methods. Previously we pooped a bunch of temporary tables or views (depending on the backend), but after this PR the various .sql methods replace this hack with much more vanilla CTEs.

@cpcloud cpcloud added refactor Issues or PRs related to refactoring the codebase ux User experience related issues sql Backends that generate SQL tes-required-for-release Things that must be addressed before *release* of `main` after merging in `the-epic-split` labels Jan 30, 2024
@cpcloud cpcloud requested a review from kszucs January 30, 2024 11:51
@cpcloud cpcloud force-pushed the clean-up-temp-table-poo branch 5 times, most recently from bbfee93 to 6041191 Compare January 30, 2024 12:09
@cpcloud
Copy link
Member Author

cpcloud commented Jan 30, 2024

@ssanderson I was finally able to clean this up ... years later 😂

@@ -157,7 +157,9 @@ def extract_ctes(node):

g = Graph.from_bfs(node, filter=(ops.Relation, ops.Subquery, ops.JoinLink))
for node, dependents in g.invert().items():
if len(dependents) > 1 and isinstance(node, cte_types):
if isinstance(node, ops.View) or (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we avoid having this special case here? IIC it is only to force turning a View into a CTE.

Without this the generated SQL would be a subselect, right?

Copy link
Member Author

@cpcloud cpcloud Jan 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change a cte won't be created, and View must compile to a cte so it's visible by any subsequent query.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the case of joining two aliased ibis expressions. The aliases need to be visible at the top level for a subsequent join that uses them. An aliased subselect would hide all aliases except the immediate parent. AFAIK only CTEs allow defining multiple named relations with the same scope.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code still confuses me, you add View here to the list of the relations which should be wrapped with a CTE, but you exclude the same type from the dict comprehension below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm adding it to the list to be able to generate CTEs later, the ibis CTE object is only used in a call to replace, the dict of which filters out the View objects. There's are a few separate components of that process, that are independent of this PR:

  1. The extraction of things that should become CTEs
  2. The replacement of existing nodes with those CTEs
  3. The construction of sqlglot CTEs based on step 1

I am reusing steps 1 and 3 for this PR, but operating on Views.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So basically we want to append the ops.View objects to the list of CTEs. That should be done by ctest + node.find(ops.View) instead, and outside of this function but rather at the compiler.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list needs to be assembled in topological order, we can't assemble the View CTEs separately from the others.

@cpcloud cpcloud force-pushed the clean-up-temp-table-poo branch 3 times, most recently from 953de2b to 6087e97 Compare January 30, 2024 20:04
@cpcloud cpcloud requested a review from kszucs January 30, 2024 20:26
@cpcloud cpcloud added this to the 9.0 milestone Jan 31, 2024
@kszucs kszucs force-pushed the clean-up-temp-table-poo branch from de35024 to 24effc3 Compare January 31, 2024 02:24
@cpcloud cpcloud force-pushed the clean-up-temp-table-poo branch from 24effc3 to b7c8e87 Compare January 31, 2024 09:34
@cpcloud cpcloud enabled auto-merge (squash) January 31, 2024 09:46
@cpcloud cpcloud merged commit f2f1158 into ibis-project:the-epic-split Jan 31, 2024
55 checks passed
@cpcloud cpcloud deleted the clean-up-temp-table-poo branch January 31, 2024 10:29
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 1, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 1, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 1, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 2, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 2, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit to kszucs/ibis that referenced this pull request Feb 2, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit to cpcloud/ibis that referenced this pull request Feb 4, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit to cpcloud/ibis that referenced this pull request Feb 5, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit that referenced this pull request Feb 5, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit that referenced this pull request Feb 6, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit that referenced this pull request Feb 6, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit to cpcloud/ibis that referenced this pull request Feb 12, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit that referenced this pull request Feb 12, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit to cpcloud/ibis that referenced this pull request Feb 12, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
cpcloud added a commit that referenced this pull request Feb 12, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
kszucs added a commit that referenced this pull request Feb 12, 2024
…8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
ncclementi pushed a commit to ncclementi/ibis that referenced this pull request Feb 21, 2024
…bis-project#8149)

This PR fixes a long-standing annoyance with our `.sql` methods.
Previously we pooped a bunch of temporary tables or views (depending on
the backend), but after this PR the various `.sql` methods replace this
hack with much more vanilla CTEs.

---------

Co-authored-by: Krisztián Szűcs <szucs.krisztian@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
refactor Issues or PRs related to refactoring the codebase sql Backends that generate SQL tes-required-for-release Things that must be addressed before *release* of `main` after merging in `the-epic-split` ux User experience related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants