Commit

Adding --with for CTE (#705)
neelasha23 authored Jul 11, 2023
1 parent f56dfab commit 27bef63
Showing 15 changed files with 253 additions and 244 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@
* [Fix] Refactored `ResultSet` to lazy loading (#470)
* [Fix] Removed `WITH` when a snippet does not have a dependency (#657)
* [Fix] Used display module when generating CTE (#649)
* [Fix] Adding `--with` back because of issues with sqlglot query parser (#684)

## 0.7.9 (2023-06-19)

48 changes: 47 additions & 1 deletion doc/compose.md
@@ -27,7 +27,7 @@ pip install jupysql matplotlib
```


```{versionchanged} 0.7.8
```{versionchanged} 0.7.10
```

```{note}
@@ -158,6 +158,52 @@ We can verify the retrieved query returns the same result:
{{final}}
```

#### `--with` argument

JupySQL also allows you to specify the snippet name explicitly by passing the `--with` argument. This is particularly useful when our parsing logic is unable to determine the table name due to dialect variations. Consider the following example:

```{code-cell} ipython3
%sql duckdb://
```

```{code-cell} ipython3
%%sql --save first_cte --no-execute
SELECT 1 AS column1, 2 AS column2
```

```{code-cell} ipython3
%%sql --save second_cte --no-execute
SELECT
sum(column1),
sum(column2) FILTER (column2 = 2)
FROM first_cte
```

```{code-cell} ipython3
:tags: [raises-exception]
%%sql
SELECT * FROM second_cte
```

Note that the query fails because the clause `FILTER (column2 = 2)` makes it difficult for the parser to extract the table name. While this syntax works on some dialects like `DuckDB`, the more common form includes the `WHERE` keyword, as in `FILTER (WHERE column2 = 2)`.

Now let's run the same query, this time specifying the `--with` argument.

```{code-cell} ipython3
%%sql --with first_cte --save second_cte --no-execute
SELECT
sum(column1),
sum(column2) FILTER (column2 = 2)
FROM first_cte
```

```{code-cell} ipython3
%%sql
SELECT * FROM second_cte
```
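
To make the effect of `--with` concrete, here is a minimal sketch of how a saved snippet could be prepended to a query as a CTE. It is an illustration only: `compose_with` and the `snippets` dictionary are hypothetical names, and JupySQL's own rendering logic may differ in details such as quoting and dependency ordering.

```python
def compose_with(sql, snippets, with_):
    """Prepend the named snippets to `sql` as common table expressions.

    Hypothetical helper for illustration; not JupySQL's implementation.
    """
    if not with_:
        # No snippet requested: leave the query untouched
        return sql
    ctes = ", ".join(f"{name} AS ({snippets[name]})" for name in with_)
    return f"WITH {ctes} {sql}"


# Saved snippets, keyed by the name given to --save (assumed storage shape)
snippets = {"first_cte": "SELECT 1 AS column1, 2 AS column2"}
query = "SELECT sum(column1) FROM first_cte"

print(compose_with(query, snippets, ["first_cte"]))
```

Because the snippet name is supplied by the user, this approach needs no parsing of the query itself, which is why it sidesteps the dialect problem described above.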


## Summary

In the given example, we demonstrated JupySQL's usage as a tool for managing large SQL queries in Jupyter Notebooks. It effectively broke down a complex query into smaller, organized parts, simplifying the process of analyzing a record store's sales database. By using JupySQL, users can easily maintain and reuse their queries, enhancing the overall data analysis experience.
7 changes: 7 additions & 0 deletions doc/plot.md
@@ -131,6 +131,13 @@ We can see the highest value is a bit over 6, that's expected since we set a 6.3

+++

If you wish to specify the saved snippet explicitly, please use the `--with` argument.
[Click here](../compose) for more details on when to specify `--with` explicitly.

```{code-cell} ipython3
%sqlplot boxplot --table short_trips --column trip_distance --with short_trips
```

## Histogram

To create a histogram, call `%sqlplot histogram`, and pass the name of the table, the column you want to plot, and the number of bins. Similarly to what we did in the [Boxplot](#boxplot) example, JupySQL detects a saved snippet and only plots such data subset.
2 changes: 0 additions & 2 deletions src/sql/__init__.py
@@ -1,5 +1,4 @@
from .magic import RenderMagic, SqlMagic, load_ipython_extension
from .error_message import SYNTAX_ERROR
from .connection import PLOOMBER_DOCS_LINK_STR

__version__ = "0.7.10dev"
@@ -9,6 +8,5 @@
"RenderMagic",
"SqlMagic",
"load_ipython_extension",
"SYNTAX_ERROR",
"PLOOMBER_DOCS_LINK_STR",
]
4 changes: 4 additions & 0 deletions src/sql/command.py
@@ -71,6 +71,10 @@ def __init__(self, magic, user_ns, line, cell) -> None:
if add_alias:
self.parsed["connection"] = self.args.line[0]

if self.args.with_:
final = store.render(self.parsed["sql"], with_=self.args.with_)
self.parsed["sql"] = str(final)

@property
def sql(self):
"""
85 changes: 27 additions & 58 deletions src/sql/error_message.py
@@ -1,70 +1,39 @@
import sqlglot
import sqlparse

SYNTAX_ERROR = "\nLooks like there is a syntax error in your query."
ORIGINAL_ERROR = "\nOriginal error message from DB driver:\n"
CTE_MSG = (
"If using snippets, you may pass the --with argument explicitly.\n"
"For more details please refer : "
"https://jupysql.ploomber.io/en/latest/compose.html#with-argument"
)


def parse_sqlglot_error(e, q):
def _is_syntax_error(error):
"""
Function to parse the error message from sqlglot
Parameters
----------
e: sqlglot.errors.ParseError, exception
while parsing through sqlglot
q : str, user query
Returns
-------
str
Formatted error message containing description
and positions
Function to detect whether error message from DB driver
is related to syntax error in user query.
"""
err = e.errors
position = ""
for item in err:
position += (
f"Syntax Error in {q}: {item['description']} at "
f"Line {item['line']}, Column {item['col']}\n"
)
msg = "Possible reason: \n" + position if position else ""
return msg
error_lower = error.lower()
return (
"syntax error" in error_lower
or ("catalog error" in error_lower and "does not exist" in error_lower)
or "error in your sql syntax" in error_lower
or "incorrect syntax" in error_lower
or "not found" in error_lower
)


def detail(original_error, query=None):
def detail(original_error):
original_error = str(original_error)
return_msg = SYNTAX_ERROR
if "syntax error" in original_error:
query_list = sqlparse.split(query)
for q in query_list:
try:
q = q.strip()
q = q[:-1] if q.endswith(";") else q
parse = sqlglot.transpile(q)
suggestions = ""
if q.upper() not in [suggestion.upper() for suggestion in parse]:
suggestions += f"Did you mean : {parse}\n"
return_msg = (
return_msg + "Possible reason: \n" + suggestions
if suggestions
else return_msg
)

except sqlglot.errors.ParseError as e:
parse_msg = parse_sqlglot_error(e, q)
return_msg = return_msg + parse_msg if parse_msg else return_msg

return return_msg + "\n" + ORIGINAL_ERROR + original_error + "\n"
if _is_syntax_error(original_error):
return f"{CTE_MSG}\n\n{ORIGINAL_ERROR}{original_error}\n"

if "fe_sendauth: no password supplied" in original_error:
return (
"\nLooks like you have run into some issues. "
"Review our DB connection via URL strings guide: "
"https://jupysql.ploomber.io/en/latest/connecting.html ."
" Using Ubuntu? Check out this guide: "
"https://help.ubuntu.com/community/PostgreSQL#fe_sendauth:_"
"no_password_supplied\n" + ORIGINAL_ERROR + original_error + "\n"
)
specific_error = """\nLooks like you have run into some issues.
Review our DB connection via URL strings guide:
https://jupysql.ploomber.io/en/latest/connecting.html .
Using Ubuntu? Check out this guide: "
https://help.ubuntu.com/community/PostgreSQL#fe_sendauth:_
no_password_supplied\n"""

return f"{specific_error}\n{ORIGINAL_ERROR}{original_error}\n"

return None
79 changes: 47 additions & 32 deletions src/sql/magic.py
@@ -31,7 +31,7 @@
from sql.magic_cmd import SqlCmdMagic
from sql._patch import patch_ipython_usage_error
from sql import query_util
from sql.util import get_suggestions_message, show_deprecation_warning
from sql.util import get_suggestions_message, pretty_print
from ploomber_core.dependencies import check_installed

from sql.error_message import detail
@@ -214,6 +214,35 @@ def check_random_arguments(self, line="", cell=""):
"Unrecognized argument(s): {}".format(check_argument)
)

def _error_handling(self, e, query):
detailed_msg = detail(e)
if self.short_errors:
if detailed_msg is not None:
err = exceptions.UsageError(detailed_msg)
raise err
# TODO: move to error_messages.py
# Added here due to circular dependency issue (#545)
elif "no such table" in str(e):
tables = query_util.extract_tables_from_query(query)
for table in tables:
suggestions = get_close_matches(table, list(self._store))
err_message = f"There is no table with name {table!r}."
# with_message = "Alternatively, please specify table
# name using --with argument"
if len(suggestions) > 0:
suggestions_message = get_suggestions_message(suggestions)
raise exceptions.TableNotFoundError(
f"{err_message}{suggestions_message}"
)
display.message(str(e))
else:
display.message(str(e))
else:
if detailed_msg is not None:
display.message(detailed_msg)
e.modify_exception = True
raise e

@no_var_expand
@needs_local_scope
@line_magic("sql")
@@ -364,12 +393,17 @@ def interactive_execute_wrapper(**kwargs):

args = command.args

with_ = self._store.infer_dependencies(command.sql_original, args.save)
if with_:
command.set_sql_with(with_)
display.message(f"Generating CTE with stored snippets: {', '.join(with_)}")
if args.with_:
with_ = args.with_
else:
with_ = None
with_ = self._store.infer_dependencies(command.sql_original, args.save)
if with_:
command.set_sql_with(with_)
display.message(
f"Generating CTE with stored snippets : {pretty_print(with_)}"
)
else:
with_ = None

# Create the interactive slider
if args.interact and not is_interactive_mode:
@@ -405,7 +439,7 @@ def interactive_execute_wrapper(**kwargs):
raw_args = raw_args[1:-1]
args.connection_arguments = json.loads(raw_args)
except Exception as e:
print(e)
display.message(str(e))
raise e
else:
args.connection_arguments = {}
@@ -458,8 +492,7 @@ def interactive_execute_wrapper(**kwargs):

if not command.sql:
return
if args.with_:
show_deprecation_warning()

# store the query if needed
if args.save:
if "-" in args.save:
@@ -514,30 +547,12 @@ def interactive_execute_wrapper(**kwargs):
# JA: added DatabaseError for MySQL
except (ProgrammingError, OperationalError, DatabaseError) as e:
# Sqlite apparently return all errors as OperationalError :/
detailed_msg = detail(e, command.sql)
if self.short_errors:
if detailed_msg is not None:
err = exceptions.UsageError(detailed_msg)
raise err
# TODO: move to error_messages.py
# Added here due to circular dependency issue (#545)
elif "no such table" in str(e):
tables = query_util.extract_tables_from_query(command.sql)
for table in tables:
suggestions = get_close_matches(table, list(self._store))
if len(suggestions) > 0:
err_message = f"There is no table with name {table!r}."
suggestions_message = get_suggestions_message(suggestions)
raise exceptions.TableNotFoundError(
f"{err_message}{suggestions_message}"
)
print(e)
else:
print(e)
self._error_handling(e, command.sql)
except Exception as e:
# handle DuckDB exceptions
if "Catalog Error" in str(e):
self._error_handling(e, command.sql)
else:
if detailed_msg is not None:
print(detailed_msg)
e.modify_exception = True
raise e

legal_sql_identifier = re.compile(r"^[A-Za-z0-9#_$]+")
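
The table-name suggestion in `_error_handling` calls `get_close_matches`, presumably the standard library's `difflib.get_close_matches`. A minimal sketch of that lookup follows, assuming a plain list of snippet names in place of `self._store`; `suggest` is a hypothetical helper, not part of the commit.

```python
from difflib import get_close_matches

# Stand-in for the names held by the snippet store (assumed contents)
stored_snippets = ["first_cte", "second_cte", "short_trips"]


def suggest(table, names):
    """Return stored names that closely resemble the missing table name."""
    return get_close_matches(table, names)


# A misspelled table name should surface its closest stored snippet
print(suggest("frist_cte", stored_snippets))

# A name with nothing similar in the store yields an empty list
print(suggest("zzz", stored_snippets))
```

With the default cutoff, names that share little with the missing table are dropped, so the suggestion list may be empty; in that case the handler in this diff falls back to displaying the original error message.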
19 changes: 9 additions & 10 deletions src/sql/magic_plot.py
@@ -80,14 +80,21 @@ def execute(self, line="", cell="", local_ns=None):
"Example: %sqlplot histogram"
)

if cmd.args.line[0] not in SUPPORTED_PLOTS + ["hist", "box"]:
plot_str = util.pretty_print(SUPPORTED_PLOTS, last_delimiter="or")
raise exceptions.UsageError(
f"Unknown plot {cmd.args.line[0]!r}. Must be any of: " f"{plot_str}"
)

column = util.sanitize_identifier(column)
table = util.sanitize_identifier(cmd.args.table)

if cmd.args.with_:
util.show_deprecation_warning()
if cmd.args.line[0] in {"box", "boxplot"}:
with_ = cmd.args.with_
else:
with_ = self._check_table_exists(table)

if cmd.args.line[0] in {"box", "boxplot"}:
return plot.boxplot(
table=table,
column=column,
@@ -96,7 +103,6 @@ def execute(self, line="", cell="", local_ns=None):
conn=None,
)
elif cmd.args.line[0] in {"hist", "histogram"}:
with_ = self._check_table_exists(table)
return plot.histogram(
table=table,
column=column,
@@ -105,7 +111,6 @@ def execute(self, line="", cell="", local_ns=None):
conn=None,
)
elif cmd.args.line[0] in {"bar"}:
with_ = self._check_table_exists(table)
return plot.bar(
table=table,
column=column,
@@ -115,19 +120,13 @@ def execute(self, line="", cell="", local_ns=None):
conn=None,
)
elif cmd.args.line[0] in {"pie"}:
with_ = self._check_table_exists(table)
return plot.pie(
table=table,
column=column,
with_=with_,
show_num=cmd.args.show_numbers,
conn=None,
)
else:
plot_str = util.pretty_print(SUPPORTED_PLOTS, last_delimiter="or")
raise exceptions.UsageError(
f"Unknown plot {cmd.args.line[0]!r}. Must be any of: " f"{plot_str}"
)

@staticmethod
def _check_table_exists(table):