Java py4j gateway server for trino queries still be open even after query finishes and in later makes queris Hang #3223

divincode · 2023-03-09T07:44:45Z

Is there an existing issue for this?

I have searched the existing issues

Description

The default connector for trino is jdbc as there is no sqlalchemy support as per my knowledge from sqlalchemy community.
When running hue with multiple clients, the jdbc class calls the py4j server, which still exist it does not close after we get the results, this leads to memory usage and hence the queries get slower as time passes and after certain time it still hangs . eg

hue 42923 27.1 1.2 2145458 191580 ? Sl 07:28 0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j
hue 32133 17.3 1.2 2534676 321546 ? Sl 07:30 0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j
hue 21692 17.2 1.2 6928376 216781 ? Sl 07:32 0:10 java -classpath /usr/lib/hue/build/env/lib/python2.7/site-packages/py4j-0.9-py2.7.egg/py4j/../share/py4j/py4j0.9.jar:/usr/lib/trino/trino-j

For executing the trino query - code flow goes like this -

The code flow goes like this in short -


1. There is a api call (notebook/api/execute/{dialect}  -  here is dialect -> trino
2. This call goes through function - get_intepreter() , the interpreter returns a class - notebook.connectors.jdbc.JdbcApi  lass from the file -  /usr/lib/hue/desktop/libs/notebook/src/notebook/connectors/jdbc.py
3. This class will run execute , which calls other class from the file  - desktop/libs/librdbms/src/librdbms/jdbc.py
4. This class uses py4j as wrapper using jdbc connector to run the queries .

I have added various debugging points , to check where the bottleneck is

the bottle neck is this point -

def query_and_fetch(db, statement, n=None):
  data = None
  try:
    db.connect()
    curs = db.cursor()

    try:
      if curs.execute(statement):
        data = curs.fetchmany(n)
      meta = curs.description
      return data, meta
    finally:
      curs.close()
  except Exception as e:
    message = force_unicode(smart_str(e))
    if 'Access denied' in message:
      raise AuthenticationRequired()
    raise
  finally:
    db.close()

data = curs.fetchmany(n) this line is the bottleneck usually.

The issue should be solved if add the following lines in the close fxn of this file in jdbc class -
desktop/libs/librdbms/src/librdbms/jdbc.py

  def close(self):
    if self.conn is not None:
      self.conn.close()
      self.conn = None

#lines to be added
    try:
        self.gateway.shutdown()
    except Exception as e:
        LOG.error(e)

After adding these lines the child process of py4j gets killed. Verified by the ps auxxx and pstree command.

This can be a good first issue to solve to raise pr . Nowadays work on Hadoop so not that band width to raise here . Just trying to contribute to oss. This flow is also same for presto.

@Harshg999 @bjornalm

Regards
Vinay Devadiga

Steps To Reproduce

As stated in descritption use trino with hue , create multiple hue clients and fire huge trino queries . In some time, the py4j servers will take the memory , hence queries get hangs.

Logs

Attached above.

Hue version

Open Source 4.10

The text was updated successfully, but these errors were encountered:

bjornalm · 2023-03-09T08:51:06Z

@divincode Thanks a lot for the detailed report and suggested solution. I've added the Good first issue but will also assign it to one of our core devs according to our new github issue process https://gethue.com/blog/hue-community-2023/

hackeryang · 2024-08-20T08:57:16Z

We also met this error, it will cause py4j jdbc connections leakage, and hue's memory grows up quickly

divincode · 2024-08-20T09:00:09Z

which engine are u using?

hackeryang · 2024-08-20T09:03:36Z

which engine are u using?

We are also using trino 433 jdbc with hue, you mentioned that we can use this code to fix this:

#lines to be added
    try:
        self.gateway.shutdown()
    except Exception as e:
        LOG.error(e)

Thanks a lot, we will try to modify the source code like this~

divincode · 2024-08-20T09:12:43Z

I think this wont work though completely ..

Do try but i think we should use https://github.com/trinodb/trino-python-client ,

make the code changes to use this

def query_and_fetch(db, statement, n=None):
  data = None
  try:
    db.connect()
    curs = db.cursor()

    try:
      if curs.execute(statement):
        data = curs.fetchmany(n)
      meta = curs.description
      return data, meta
    finally:
      curs.close()
  except Exception as e:
    message = force_unicode(smart_str(e))
    if 'Access denied' in message:
      raise AuthenticationRequired()
    raise
  finally:
    db.close()

from trino.dbapi import connect

conn = connect(
    host="<host>",
    port=<port>,
    user="<username>",
    catalog="<catalog>",
    schema="<schema>",
)
cur = conn.cursor()
cur.execute("SELECT * FROM system.runtime.nodes")
rows = cur.fetchall()

use this this should remove py4j competely

zhuzhengjun01 · 2024-08-20T09:28:06Z

@divincode we find this problem
"/usr/share/hue/build/env/lib/python3.8/site-packages/py4j/java_gateway.py", line 980, in _get_connection
raise Py4JNetworkError("Gateway is not connected.")
I will try trino python client again

divincode · 2024-08-21T04:26:36Z

If it works can u please contribute back to the community ? @hackeryang @zhuzhengjun01

zhuzhengjun01 · 2024-08-21T06:01:35Z

If it works can u please contribute back to the community ? @hackeryang @zhuzhengjun01

I tried use shutdown interface in the close function , but it does't work well, py4j gateway did not shutdown correctly

divincode · 2024-08-21T06:03:57Z

No i mean the trino python client, would solve the use case. Please try that.

zhuzhengjun01 · 2024-08-21T06:06:29Z

No i mean the trino python client, would solve the use case. Please try that.

Oh understand, we extended hue jdbc_trino, but use trino java jdbc client, change to python client will be a big change, I will try again

zhuzhengjun01 · 2024-08-21T06:07:30Z

The main thing is that the jdbc connector does not support set sessions, and some of our trino syntax depends on this

zhuzhengjun01 · 2024-08-27T04:10:29Z

No i mean the trino python client, would solve the use case. Please try that.

I have tried using python client to connect to trino, the rss memory is fine, no more memory leaks, I will contribute this connector to the community if needed

divincode · 2024-08-27T07:25:48Z

Nice to hear ! do contribute it back or at least paste the diff , so that it can be used by others.

zhuzhengjun01 · 2024-08-27T07:57:55Z

Nice to hear ! do contribute it back or at least paste the diff , so that it can be used by others.
I added a new PR
#3831

divincode added the BUG Issue type for reporting failure due to bug in functionality label Mar 9, 2023

bjornalm added the good first issue This issue is suitable for beginner developers or first time contributors label Mar 9, 2023

bjornalm assigned nidhibhatg Mar 9, 2023

zhuzhengjun01 mentioned this issue Sep 12, 2024

[feature][trino-python-client] add new trino python client && support trino set session. #3831

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Java py4j gateway server for trino queries still be open even after query finishes and in later makes queris Hang #3223

Java py4j gateway server for trino queries still be open even after query finishes and in later makes queris Hang #3223

divincode commented Mar 9, 2023 •

edited

Loading

bjornalm commented Mar 9, 2023

hackeryang commented Aug 20, 2024

divincode commented Aug 20, 2024

hackeryang commented Aug 20, 2024 •

edited

Loading

divincode commented Aug 20, 2024

zhuzhengjun01 commented Aug 20, 2024

divincode commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

divincode commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

zhuzhengjun01 commented Aug 27, 2024

divincode commented Aug 27, 2024

zhuzhengjun01 commented Aug 27, 2024

Java py4j gateway server for trino queries still be open even after query finishes and in later makes queris Hang #3223

Java py4j gateway server for trino queries still be open even after query finishes and in later makes queris Hang #3223

Comments

divincode commented Mar 9, 2023 • edited Loading

Is there an existing issue for this?

Description

Steps To Reproduce

Logs

Hue version

bjornalm commented Mar 9, 2023

hackeryang commented Aug 20, 2024

divincode commented Aug 20, 2024

hackeryang commented Aug 20, 2024 • edited Loading

divincode commented Aug 20, 2024

zhuzhengjun01 commented Aug 20, 2024

divincode commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

divincode commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

zhuzhengjun01 commented Aug 21, 2024

zhuzhengjun01 commented Aug 27, 2024

divincode commented Aug 27, 2024

zhuzhengjun01 commented Aug 27, 2024

divincode commented Mar 9, 2023 •

edited

Loading

hackeryang commented Aug 20, 2024 •

edited

Loading