Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trino backend fails with Superset 3.0.0 #25518

Closed
hovaesco opened this issue Oct 4, 2023 · 24 comments
Closed

Trino backend fails with Superset 3.0.0 #25518

hovaesco opened this issue Oct 4, 2023 · 24 comments
Assignees

Comments

@hovaesco
Copy link

hovaesco commented Oct 4, 2023

Trino backend fails with the below error while running with Superset 3.0.0. The issue occurs in Preset as well.

Downgrading Superset to 2.1.0 version solves the issue.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/views/base_api.py", line 127, in wraps
    raise ex
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/views/base_api.py", line 121, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/utils/core.py", line 1526, in time_function
    response = func(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/views/base_api.py", line 93, in wraps
    return f(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/utils/log.py", line 255, in wrapper
    value = f(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/api.py", line 310, in execute_sql_query
    command_result: CommandResult = command.run()
                                    ^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/commands/execute.py", line 121, in run
    raise ex
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/commands/execute.py", line 103, in run
    status = self._run_sql_json_exec_from_scratch()
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
    raise ex
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
    return self._sql_json_executor.execute(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hovaesco/Desktop/repos/superset/.venv/lib/python3.11/site-packages/superset/sqllab/sql_json_executer.py", line 104, in execute
    raise SupersetGenericDBErrorException(
superset.exceptions.SupersetGenericDBErrorException: 'query_id'
@hovaesco
Copy link
Author

hovaesco commented Oct 4, 2023

This might be the root cause cfda30c

@rusackas
Copy link
Member

rusackas commented Oct 4, 2023

@giftig any ideas here, perchance? 🤞

(CC @john-bodley, @sujiplr, @cccs-tom, @villebro, @nytai) from the rolodex as a heads-up)

@hovaesco
Copy link
Author

hovaesco commented Oct 4, 2023

This might be the root cause cfda30c

Yup, switching to one commit pre this one fixes the issue.

@villebro
Copy link
Member

villebro commented Oct 4, 2023

I can reproduce this on master. @giftig I will ping you on Slack to see if we can collab on fixing this quickly.

@villebro
Copy link
Member

villebro commented Oct 5, 2023

Adding a reference to the offending PR: #24859

@giftig
Copy link
Contributor

giftig commented Oct 9, 2023

The mapped SupersetGenericDbErrorException obscures the underlying exception unfortunately but it looks to me like an AttributeError when accessing cursor.query_id, suggesting cursor isn't a trino client cursor like we expect it to be. That would mean execute_with_cursor is being given something other than a valid cursor somehow. I'll try to reproduce the issue and get ahold of the underlying exception today to figure out what's going on there.

@giftig
Copy link
Contributor

giftig commented Oct 9, 2023

I just tried this on the latest master but I'm not able to reproduce the issue; I've tried running against both trino version 403 and 428 (the latest docker image) and I see query results come back fine. I also see the debug logging surrounding the async code reporting correctly:

(supers→) superset-upstream (issue-25518) $ docker-compose logs | grep 'Running query'
superset_app             | 2023-10-09 10:17:20,472:DEBUG:superset.db_engine_specs.trino:Query 1: Running query: SELECT [...]
superset_app             | 2023-10-09 10:17:54,813:DEBUG:superset.db_engine_specs.trino:Query 2: Running query: SELECT [...]
superset_app             | 2023-10-09 10:18:03,445:DEBUG:superset.db_engine_specs.trino:Query 3: Running query: SELECT [...]
(supers→) superset-upstream (issue-25518) $ vim
(supers→) superset-upstream (issue-25518) $ docker-compose logs | grep 'Handling cursor'
superset_app             | 2023-10-09 10:17:20,571:DEBUG:superset.db_engine_specs.trino:Query 1: Handling cursor
superset_app             | 2023-10-09 10:17:54,912:DEBUG:superset.db_engine_specs.trino:Query 2: Handling cursor
superset_app             | 2023-10-09 10:18:03,545:DEBUG:superset.db_engine_specs.trino:Query 3: Handling cursor

This is using the version of the trino client library listed in superset's dependencies, which I installed by listing -e file:.[postgres,hive,trino] in local dependencies. @villebro did you use any other additional steps to reproduce? This seems to be the minimal setup.

I've also tried with async execution both on and off.

@giftig
Copy link
Contributor

giftig commented Oct 9, 2023

I've just tried against the 3.0 branch as well and it's working fine for me there too, so I'm a little stumped how to reproduce this. I'll try to catch up with @villebro when he's available.

@villebro
Copy link
Member

villebro commented Oct 9, 2023

@giftig I was reading through local stacktraces, and I saw this during query execution (I see the same on both async and sync execution):

sqlalchemy.exc.ProgrammingError: (sqlite3.ProgrammingError) SQLite objects created in a thread can only be used in that same thread. The object was created in thread id 6214594560 and this is thread id 8085426944.
[SQL: UPDATE "query" SET extra_json=?, tracking_url=?, changed_on=? WHERE "query".id = ?]
[parameters: [{'tracking_url': 'https://foobar.com/ui/query.html?20231009_144529_02675_g5bvj', 'extra_json': '{"columns": {}, "progress": "Running statement 1 out of 1", "cancel_query": "20231009_144529_02675_g5bvj"}', 'query_id': 1231}]]
(Background on this error at: https://sqlalche.me/e/14/f405)

I'm not able to try this locally with a proper metastore like Postgres, but the fact that it's happening on Preset (which I'm sure isn't running SQLite) would indicate to me that this is not only limited to SQLite.

@giftig
Copy link
Contributor

giftig commented Oct 9, 2023

I haven't tried with SQLite, only with postgres; it looks like the update error you're seeing there is preventing us from saving the query_id, and then query_id is missing from where it's assumed to be present in sql_lab.get_query_backoff_handler, so this'll be the root cause. What's unclear is why it's attempting to update this from a different thread; the code looks like this:

        execute_thread = threading.Thread(target=_execute, args=(execute_result,))
        execute_thread.start()

        # Wait for a query ID to be available before handling the cursor, as
        # it's required by that method; it may never become available on error.
        while not cursor.query_id and not execute_result.get("complete"):
            time.sleep(0.1)

        logger.debug("Query %d: Handling cursor", query.id)
        cls.handle_cursor(cursor, query, session)

and only attempts to run the actual cursor.execute call in a thread, then waits for query_id to appear and continues in the main thread. Hopefully we can dig out more information by looking at the full stack trace of the underlying query failure.

@giftig
Copy link
Contributor

giftig commented Oct 10, 2023

I haven't tried with SQLite, only with postgres

I actually just rechecked and I was wrong to assume the default docker-compose is using postgres; it starts a postgres container but doesn't use it, the default connection string is sqlite. So I was using the default sqlite setup and didn't see any issues. I've tried with postgres as well now, and I still haven't been able to reproduce it locally with either sqlite or postgres unfortunately.

Ville and I did some debugging with the errors he saw but the root cause (the sqlalchemy ProgrammingError quoted above) is odd since we don't run any sqlalchemy queries within our additional thread, it's only executing the trino query and coming back. We'll add some extra error handling / logging to try to figure out where this is happening as the one thing we're missing is the stack trace of the original error.

@giftig
Copy link
Contributor

giftig commented Oct 11, 2023

Hi @hovaesco can you share how you're running superset? OS, python version, docker or native, metastore database type, feature flags, anything else which may be relevant?

Currently I'm running the project's standard docker-compose up under fedora linux on a fresh clone of either 3.0 or master and I can't reproduce the issue, either with postgres or sqlite as my metastore db. @villebro has reproduced a similar issue on master but we're not clear what's different about our setup.

Would someone else be able to attempt to reproduce the issue as well and share any minimal steps required, along with platform information?

@hovaesco
Copy link
Author

@giftig

I cloned the repo and installed it directly from there. I run superset using the below command:

superset run -p 8088 --with-threads --reload --debugger

I'm able to reproduce the issue in sqllab just by running SELECT 1.

MacOS 13.5.1, Python 3.11.5 and Trino running in the other container.

Let me know how else I can assist.

@giftig
Copy link
Contributor

giftig commented Oct 11, 2023

I don't have access to a Mac to see if that's the issue, though I know @villebro is also on a mac. Are you able to try with python 3.9? That's the version the docker image is built against: https://github.com/apache/superset/blob/master/Dockerfile#L21

Alternatively could you try running with docker and see if you still have problems?

You said you're seeing the issue in Preset as well, and I'm guessing that's not going to be in a mac environment, so the most obvious difference is python version, though not sure why this would be making a difference here; the sqlalchemy version should be locked and it's sqlalchemy that's complaining technically sqlalchemy is just forwarding us the error from sqlite3 actually, so that may explain it.

I don't see any version-locking for sqlite3 though; can you check your version? I'm seeing 2.6.0 in the docker build:

>>> sqlite3.version
'2.6.0'

So far we've only been able to see an sqlite-specific error on @villebro 's setup so I'm also curious if there are different errors on your Preset setup, which presumably has a production backend like postgres.

@villebro
Copy link
Member

I'm on 3.10. Let me try with 3.9 to see if that changes things.

@villebro
Copy link
Member

Update: I see the same issue on 3.9 🙁

@giftig
Copy link
Contributor

giftig commented Oct 12, 2023

Can you check your sqlite3 version @villebro? And are you running this natively on your mac?

@villebro
Copy link
Member

@giftig it's the same as you:

>>> import sqlite3
>>> sqlite3.version
'2.6.0'

I'm running on Mac (I haven't been able to test on linux/k8s yet)

@giftig
Copy link
Contributor

giftig commented Oct 17, 2023

Hi @hovaesco, I've raised a new PR to reinstate the patch and add a small config fix for sqlite to resolve the issue you reported here, do you have time to try running with this change? If you're overriding SQLALCHEMY_DATABASE_URI make sure you have the ?check_same_thread=false config I've added to the default; that should solve the problem.

#25680

@hovaesco
Copy link
Author

@giftig I've just tested and it works fine ! thank you for the fix

@devery59
Copy link

devery59 commented Dec 4, 2023

Hi I have similar issue about trino Connection.
When i test Database Connection it looks fine. But when i query choose trino database, the Error occured.

Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.9/site-packages/flask_appbuilder/security/decorators.py", line 95, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/views/base_api.py", line 127, in wraps
    raise ex
  File "/app/superset/views/base_api.py", line 121, in wraps
    duration, response = time_function(f, self, *args, **kwargs)
  File "/app/superset/utils/core.py", line 1526, in time_function
    response = func(*args, **kwargs)
  File "/app/superset/views/base_api.py", line 93, in wraps
    return f(self, *args, **kwargs)
  File "/app/superset/utils/log.py", line 255, in wrapper
    value = f(*args, **kwargs)
  File "/app/superset/sqllab/api.py", line 310, in execute_sql_query
    command_result: CommandResult = command.run()
  File "/app/superset/sqllab/commands/execute.py", line 121, in run
    raise ex
  File "/app/superset/sqllab/commands/execute.py", line 103, in run
    status = self._run_sql_json_exec_from_scratch()
  File "/app/superset/sqllab/commands/execute.py", line 161, in _run_sql_json_exec_from_scratch
    raise ex
  File "/app/superset/sqllab/commands/execute.py", line 156, in _run_sql_json_exec_from_scratch
    return self._sql_json_executor.execute(
  File "/app/superset/sqllab/sql_json_executer.py", line 111, in execute
    raise SupersetErrorsException(
superset.exceptions.SupersetErrorsException: [SupersetError(message="trino error: 'Cursor' object has no attribute 'query_id'", error_type=<SupersetErrorType.GENERIC_DB_ENGINE_ERROR: 'GENERIC_DB_ENGINE_ERROR'>, level=<ErrorLevel.ERROR: 'error'>, extra={'engine_name': 'Trino', 'issue_codes': [{'code': 1002, 'message': 'Issue 1002 - The database returned an unexpected error.'}]})]

I'm trying to upgrade superset version(1.4 to 3.0.2), there is no other issue.
This is my version

Superset version : 3.0.2
DB_host : AWS RDS(Postgresql)
Deployment : EKS using Helm

Is any advice for me? Thanks

@giftig
Copy link
Contributor

giftig commented Dec 4, 2023

@devery59 try checking you're using the recommended trino driver version. Your error complains that 'Cursor' object has no attribute 'query_id'", but it does have that attribute in 0.324.0, the version used by superset: https://github.com/trinodb/trino-python-client/blob/11dd0bc5b136e3f943712fd3c6f1f9ac8d4b0c37/trino/dbapi.py#L350

@devery59
Copy link

devery59 commented Dec 4, 2023

@giftig You're right! That error is trino version issue. Now I'm fine after edit my version.
Thanks for your help!

@rusackas
Copy link
Member

rusackas commented Feb 1, 2024

From the sound of it, we're all well and good here? Hope I'm not being too optimistic, but I'm closing this one. Maybe we can regroup on a new issue if there's more to be done here.

@rusackas rusackas closed this as completed Feb 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants