Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQLachemy error overriding #400

Merged
merged 1 commit into from
May 18, 2023
Merged

Conversation

neelasha23
Copy link

@neelasha23 neelasha23 commented Apr 13, 2023

Describe your changes

Overriding sqlalchemy error handling for the case:

  1. Syntax errors in query (we are using sqlglot to parse the query and prompt possible errors/ suggestions).
  2. Password not provided/ in-accessible for postgres connections
  3. Incorrect URL connection string for DuckDB

Issue number

Closes #229

Checklist before requesting a review


📚 Documentation preview 📚: https://jupysql--400.org.readthedocs.build/en/400/

@neelasha23
Copy link
Author

I have added the Slack message link.
Regarding Point 1

Show a detailed error message - perhaps even show where the query faulted (analyze it on the fly).

I'm doubtful where the query faulted information is available from the underlying DB driver, I couldn't find any relevant resources. It also seems like Sqlite returns all errors as OperationalError. (Syntax errors should be returned as ProgrammingError).

Can we do something like :
Detect the most common errors from the error message, e.g., Syntax error. If Syntax error then provide a link to this page:
https://sqlite.org/lang.html
Similarly for other DBs.

@edublancas @idomic

@idomic
Copy link

idomic commented Apr 13, 2023

@neelasha23
Please update: Closes #X
Also, take a look at the first example: #229
It shows a syntax error as an operations error, perhaps we can see if the error str contains syntax?
I don't think dumping users to this page will be of any help in that case. Maybe some of our already-used packages have validation code? (sqlglot)

src/sql/magic.py Outdated Show resolved Hide resolved
src/sql/magic.py Outdated Show resolved Hide resolved
Copy link

@tonykploomber tonykploomber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor

@neelasha23 neelasha23 marked this pull request as draft April 17, 2023 03:44
@neelasha23 neelasha23 force-pushed the sql_error branch 3 times, most recently from 554d710 to bdd7f71 Compare April 18, 2023 14:28
@neelasha23
Copy link
Author

neelasha23 commented Apr 18, 2023

Have added error handling for the postgres issue mentioned in Asana.
Regarding the syntax errors, I tried out a few syntactically incorrect queries on sqlglot and these are my inferences:

  1. It is able to detect very simple errors like unexpected parantheses, incomplete keyword
  2. It detects typos but the error message is not very informative:
ParseError: Invalid expression / Unexpected token. Line 1, Col: 7.
INSET INTO foo VALUES (1, 'abc')
  1. Doesn't detect missing keywords or incomplete queries.
  2. Doesn't work on complicated queries.

Screenshot 2023-04-19 at 12 03 17 AM

I'm checking if there are any other libraries

@idomic @tonykploomber

@edublancas
Copy link

We'll not find any library that correctly parses all kinds of SQL dialects; this seems the wrong approach. We might end up with too many false negatives, I believe it's better to look for errors raised by the db driver and show complementary information: either a clearer error message or a link to our docs to fix it.

This issue needs better scoping: what errors do we want to catch, and what do we want to do?

@neelasha23
Copy link
Author

This issue needs better scoping: what errors do we want to catch, and what do we want to do?

I was going through stackoverflow posts on SQL errors for different db drivers. There are many errors which are use-case specific , e.g., using SQLAlchemy or Sqlite python client / Errors coming up with specific technologies like Ruby. Most of these errors are not relevant for our scenario. Majority of the problem still seems to be syntax errors (which the was the starting point of this issue), but there doesn't seem to be any library out there which can detect precisely where the syntax error occurred.
I'm unsure of which errors to include in the scope. Do we have any user data that can give some insights into which errors they are facing frequently?

@edublancas

@idomic
Copy link

idomic commented Apr 19, 2023

I assume 80% will be related to the main drivers, sqlalchemy, psycopg2, sqlite3 etc.

What if we cover the basics with sqlglot and just see if people are using it? We won't be able to catch everything (like mentioned above), but covering a big portion is a good resolution. If the research does show that syntax errors are the main issue, let's tackle this one first. I think also out of the types of errors you wrote, a lot of the issues will fall into 1/2. It's always a missing comma, or typos (I think for 2 like Eduardo mentioned we'll need to give better feedback).

@edublancas
Copy link

What if we cover the basics with sqlglot

I don't think we should rely on sqlglot as we've seen that the parser might make mistakes. we don't want to be in a position where we tell the user to "fix their query" when it's a perfectly valid SQL that will run just fine in the database.

a better approach is to let the code run, and only if it fails, proceed with some clear error messages.

@neelasha23
Copy link
Author

Yeah, so we are not not going to parse query with sqlglot initially. There is a try..except block in which the query is being executed. If an Exception raised we show the Slack community link. And only if the error message contains SyntaxError and sqloglot is able to detect some error we show it to the user along with the Slack message. Does this look fine?

@idomic @edublancas

Copy link

@idomic idomic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 comments.

CHANGELOG.md Outdated Show resolved Hide resolved
src/sql/error_message.py Outdated Show resolved Hide resolved
src/sql/error_message.py Outdated Show resolved Hide resolved
@idomic
Copy link

idomic commented Apr 21, 2023

When ready, tag reviewer and mark as ready for review.

@neelasha23 neelasha23 force-pushed the sql_error branch 3 times, most recently from 9a76ef4 to 0d7816c Compare April 21, 2023 19:35
@neelasha23 neelasha23 marked this pull request as ready for review April 21, 2023 19:51
@neelasha23
Copy link
Author

@tonykploomber Did you get a chance to review this?

Copy link

@tonykploomber tonykploomber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LG

@neelasha23 neelasha23 force-pushed the sql_error branch 3 times, most recently from 5e75fef to d87dba1 Compare May 8, 2023 11:50
@yafimvo
Copy link

yafimvo commented May 8, 2023

Added an exception handling when passing incorrect URL strings. This will look like:

I think it looks OK. Do we know if it's a bad URL issue? If so, I would consider removing the original error from the driver since it may confuse the user (he ran %sql duckdb://invalid and the error is referring to connect())

This would involve parsing the query and extracting the table names and column names. Also, missing column names can be used for any clause SELECT, GROUP BY etc.. so we need to cover all these cases and add tests. I think it would be better to handle this in another issue.

I agree

@neelasha23 @idomic

@neelasha23
Copy link
Author

neelasha23 commented May 8, 2023

I think it looks OK. Do we know if it's a bad URL issue? If so, I would consider removing the original error from the driver since it may confuse the user (he ran %sql duckdb://invalid and the error is referring to connect())

No we are trying to connect to the engine and if it throws any exception we raise UsageError with custom Ploomber links.
The original message gives some context host=invalid.

Any thoughts on point 3 here for syntax errors

@yafimvo

@yafimvo
Copy link

yafimvo commented May 8, 2023

Any thoughts on point 3 here for syntax errors

What if we change the order? Our error first, then the driver error, and then the reference to the Slack channel?

@neelasha23
Copy link
Author

What if we change the order? Our error first, then the driver error, and then the reference to the Slack channel?

Can be done . Maybe we should add a statement Original driver error before the driver error?
@idomic @yafimvo

@idomic
Copy link

idomic commented May 9, 2023

This would involve parsing the query and extracting the table names and column names. Also, missing column names can be used for any clause SELECT, GROUP BY etc.. so we need to cover all these cases and add tests. I think it would be better to handle this in another issue.

Yeah, let's keep it for later then.

Ok we can add this text so users know it the original driver error.

@yafimvo
Copy link

yafimvo commented May 9, 2023

Can be done . Maybe we should add a statement Original driver error before the driver error?

Yes, I think it's good and makes the error message more informative. Distinguishing between the errors will allow us to do something like this in the future.

2023-05-09.13-19-47.mp4

@neelasha23 @idomic

@neelasha23
Copy link
Author

Yes, I think it's good and makes the error message more informative. Distinguishing between the errors will allow us to do something like this in the future.

Yes, , nice one.

I have added the message for now and it looks like:
Screenshot 2023-05-09 at 4 05 21 PM

@yafimvo @idomic

@yafimvo
Copy link

yafimvo commented May 9, 2023

I have added the message for now and it looks like:

To keep it consistent with the connection error, can we include it in the UsageError under the same structure?

UsageError: {{Our message}}

{{Suggestions}}

{{Driver error}}

{{slack and docs references}}

@neelasha23
Copy link
Author

To keep it consistent with the connection error, can we include it in the UsageError under the same structure?

Can me done. Let me refactor the code.

@neelasha23 neelasha23 force-pushed the sql_error branch 2 times, most recently from 503b3b0 to 1c15365 Compare May 9, 2023 13:36
@neelasha23
Copy link
Author

neelasha23 commented May 9, 2023

Fixed the error messages as per the discussion with @yafimvo above.

Case1: sqlglot unable to detect any errors/ suggestions
Screenshot 2023-05-09 at 6 38 54 PM

Case2: Syntax error position detected
Screenshot 2023-05-09 at 6 38 46 PM

Case3: Alternate query suggestion
Screenshot 2023-05-09 at 6 38 29 PM

Does this look fine? Any other changes required?
@idomic

@idomic
Copy link

idomic commented May 15, 2023

Yes that looks fine, please fix the conflict and we're good to go.

Test

Lint

changelog

util

moved error_message

sqlglot

conflicts

tests

postgres

Error

changed import

added detail in connection

added detail in connection

test fix

Error msg modified

More tests

changelog

merge conflicts

Review cmts

Exceptions

usageerror

Doc note

changelog

printing ploomber link

printing ploomber link

err msg changed

Review comments

Integration test

Error msg

postgres test

runtimeerror

Empty commit

sqlalchemy version

ci fix test

ci fix test

moved modify_exceptions

skipping failing tests

Disabled modify_exception

fixed tests

revert changes

Revert

xfail

ipython usageerror

comment

comment

Empty commit

Fix

removed skip

win fix

ipython removed

setup

changed exception

drop table

error msg

sq brackets

format

fixed test

Empty-Commit

rebase

func naming

Revert xfail

moved strings

duckdb tests

changelog

File renamed

Integration tests

Lint

Lint

err msg

assert

Generic db

Revert test

revert space

integration

revert

sqlparse

added original msg

tests

original in connection

tests modified

test fix

redundant str
@neelasha23
Copy link
Author

Yes that looks fine, please fix the conflict and we're good to go.

Done
@idomic

@tonykploomber tonykploomber requested a review from yafimvo May 16, 2023 12:46
@tonykploomber
Copy link

Good to me

@tonykploomber
Copy link

@idomic
Is anything you want to review before we merge this one?

@idomic
Copy link

idomic commented May 16, 2023

@tonykploomber no go ahead.

@tonykploomber tonykploomber merged commit f827fcf into ploomber:master May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Overriding sqlalchemy error handling
5 participants