Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically inferring dependencies in CTE feature #166

Closed
edublancas opened this issue Feb 24, 2023 · 4 comments · Fixed by #486
Closed

Automatically inferring dependencies in CTE feature #166

edublancas opened this issue Feb 24, 2023 · 4 comments · Fixed by #486
Assignees
Labels
stash Label used to categorize issues that will be worked on next

Comments

@edublancas
Copy link

edublancas commented Feb 24, 2023

Note: this is blocked by #158 - once that's merged, we can use sqlglot to parse SQL and find which tables a given SQL query is using.

Important: our guest blog posts might use the --with feature, so if they use it, we'll need to update them once we remove --with

improving the CTE feature

we should deprecate --with and automatically infer what are the dependencies for a given query.

%load_ext sql
%sql duckdb://

for example, if a user runs this:

%%sql
select *
from some_table

Console output (1/1):

*  duckdb://
(duckdb.CatalogException) Catalog Error: Table with name some_table does not exist!
Did you mean "pg_tables"?
LINE 2: from some_table
             ^
[SQL: select *
from some_table]
(Background on this error at: https://sqlalche.me/e/14/f405)

we should parse the sql query, detect that this is querying a some_table entity. Then check which queries the user has stored and if we find a match, we should automatically add the definition to form the CTE (currently, the user needs to pass --with some_table manually)

this introduces a side-effect problem where users might run--save {something} where {something} is also table in the database; hence masking the table and leading to all select * from something querying the snippet instead of the actual table. To alleviate this, we could print something like:

Forming CTE from saved snippets: A, B, C, and D

So it's clear that these are coming from the locally defined snippets, not from tables. Then, if a user wants to delete a definition (so it no longer masks the table), we could add a command like this:

%sqlcmd snippets something --delete

Keep in mind that there might be snippets that use something as definition, in such case, we should display a warning:

"A", "B", "C" depend on "something". Pass --delete-force to only delete "something", pass --delete-force-all to delete "A", "B", "C", and "something"

Finally, we can offer typo resolution. For example, If I try to run this:

%%sql
select * from somethingo

and something throws an error. we could check the stored snippets and see if there are any close matches, and if so, append something like this to the error raised by the DB driver:

There is a stored snippet called "something". Did you mean to use that one?

@edublancas
Copy link
Author

for the full context, see this notebook: cte-improvements.ipynb.zip

@edublancas edublancas added the stash Label used to categorize issues that will be worked on next label Mar 1, 2023
@sync-by-unito
Copy link

sync-by-unito bot commented Apr 17, 2023

➤ Neelasha Sen commented:

Should I start on this one now or finish Sqlalchemy error overriding

Ido Michael

@neelasha23
Copy link

neelasha23 commented Apr 19, 2023

Acceptance criteria:

  1. Deprecate --with, add deprecation warning
  2. track which queries user has stored and automatically infer
  3. Display clear message Forming CTE from saved snippets: A, B, C, and D
  4. Command for deleting existing definition with options --delete and --delete-force
  5. Typo resolutions
  6. Telemetry call
  7. Unit tests for the above 6 points
  8. Changelog entry
  9. User guide
  10. Remove --with from guest blog posts

Query: What if user actually wants to query a table some_table and there already exists a snippet named some_table which they don't want to delete?

@edublancas @idomic

@idomic
Copy link

idomic commented Apr 19, 2023

Don't forget telemetry into the AC.
You can take this once you're done with the errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stash Label used to categorize issues that will be worked on next
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants