Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unchecking 'Allow DML' returns incorrect errors and blocks more than DML statements #20565

Closed
3 tasks done
yeachan153 opened this issue Jun 30, 2022 · 2 comments
Closed
3 tasks done
Labels
#bug Bug report

Comments

@yeachan153
Copy link
Contributor

yeachan153 commented Jun 30, 2022

The button Allow DML when unchecked blocks quite a bit more than DML. For example, DDL statements are blocked, as are non-modifying statements like DESCRIBE. It also returns some confusing errors, for example running a query with incorrect syntax:

with a as(
  select *
  from bla
),
select * from a

or

DESCRIBE table X

returns the error:
image

How to reproduce the bug

  1. Untick Allow DML in your DB setting
  2. Run queries above on a table that exists

Expected results

  • Unticking allow DML should block DML statements only. Alternatively, if we want to block any kind of edit access, we should rename that in the front end to something else.
  • Queries with syntax errors should return with an error highlighting there's an issue with the syntax (either from the DB or superset), instead of stating only select queries are allowed

Environment

  • browser type and version: chrome
  • superset version: latest & 1.4.1
  • python version: 3.8.12
  • postgres version: 13.4

Checklist

Make sure to follow these steps before submitting your issue - thank you!

  • I have checked the superset logs for python stacktraces and included it here as text if there are any.
  • I have reproduced the issue with at least the latest released version of superset.
  • I have checked the issue tracker for the same issue and I haven't found one similar.

Additional context

Error gets raised here by the check here. Examining the is_select method here, we rely on sqlparse's get_type method, which is implemented here.

We can see a few things:

  • The get_type method also checks up front whether the SQL syntax is correct, and if not, returns it as UNKNOWN. This is what is happening with the first query, because the token that follows the subquery is punctuation, not DML. This should probably be separate methods, one to check the type, one to check the syntax in sqlparse.
  • Running a DESCRIBE table X returns an 'UNKNOWN' type from that method, so that gets blocked from running
  • Unchecking Allow DML blocks far more than just DML statements, it will also block DDL statements for instance.

Proposed direction

I'm unsure whether we would want to support syntax validation in Superset, or if that's the responsibility of the DB engine. If we want to push the responsibility down to the engine for snytax, then we can replace the get_type method by just checking each token, and whether they contain DML or DDL. For example something like:

if not database.allow_dml:
    for cur_query in parsed_query._parsed:
        for token in cur_query.tokens:
            if token.ttype == DML and token.value.lower() != "select":
                raise SupersetErrorException(
                    SupersetError(
                        message=__(f"DML statement `{token.value}` found in query: {str(cur_query)}"),
                        error_type=SupersetErrorType.DML_NOT_ALLOWED_ERROR,
                        level=ErrorLevel.ERROR,
                    )
                )
if not database.allow_ddl:
    elif token.ttype == DDL:
        raise SupersetErrorException(
            SupersetError(
                message=__(f"DDL statement `{token.value}` found in query: {str(cur_query)}"),
                error_type=SupersetErrorType.DDL_NOT_ALLOWED_ERROR,
                level=ErrorLevel.ERROR,
            )
        )

or if the idea behind the feature was to block any kind of modification, we should replace that button with something like "database.allow_modifications" or something along those lines and block both DDL & DML in the same loop.

If we do want to support syntax validation in Superset, that might be difficult since every DB has different syntax. In any case, it probably shouldn't happen in get_type since that results in confusing errors for the end user.

@yeachan153 yeachan153 added the #bug Bug report label Jun 30, 2022
@yeachan153 yeachan153 changed the title 'Allow DML' returns confusing errors and blocks more than DML statements Unchecking 'Allow DML' returns incorrect errors and blocks more than DML statements Jun 30, 2022
@fcomuniz
Copy link
Contributor

fcomuniz commented Dec 8, 2022

I investigated a similar issue, and it seems to come from the check if the statement is select
On superset.sql_parse.ParsedQuery.is_select function, for the presented query, the type of the query is Unknown, and goes untl the check

if any(token.ttype == DDL for token in parsed[0]) or any(
      token.ttype == DML and token.value != "SELECT" for token in parsed[0]
):
    return False

Which checks for the token.value !="SELECT", which means it only checks for the same casing, so a lowercase "select" will incorrectly return a False for the is_select function

i believe the check should be token.normalized != "SELECT"

@rusackas
Copy link
Member

I'm assuming that the linked PR should have closed this. Correct me if I'm misinterpreting!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
#bug Bug report
Projects
None yet
Development

No branches or pull requests

3 participants