
Slow introspection when using multiple custom types in a query #530

Closed
tristan opened this issue Jan 21, 2020 · 21 comments · Fixed by #1082

Comments

@tristan

tristan commented Jan 21, 2020

  • asyncpg version: 0.20.1
  • PostgreSQL version: 12.1
  • Do you use a PostgreSQL SaaS? If so, which? Can you reproduce the issue with a local PostgreSQL install?: PostgreSQL is installed locally on Arch Linux
  • Python version: 3.7.6 and 3.8.1
  • Platform: Linux 5.4.13-arch1-1
  • Do you use pgbouncer?: no
  • Did you install asyncpg with pip?: yes
  • If you built asyncpg locally, which version of Cython did you use?: N/A
  • Can the issue be reproduced under both asyncio and uvloop?: yes

I have a few custom types (CREATE TYPE ...) in my database, and I'm running into issues where asyncpg's introspection stage for queries that use two or more of these types takes > 1 second to complete.

For example:

Add log_min_duration_statement = 500 to the default postgresql.conf.

Create a database test with schema:

CREATE TYPE FOURBIGINTS AS (i0 BIGINT, i1 BIGINT, i2 BIGINT, i3 BIGINT);
CREATE TYPE NUMBERS_AS_WORDS AS ENUM (
    'zero', 'one', 'two', 'three', 'four'
);
CREATE TABLE bigthings (
    thing_id BIGSERIAL PRIMARY KEY,
    num FOURBIGINTS,
    words NUMBERS_AS_WORDS
);

and run the following code:

import asyncio
import asyncpg

async def main():
    con_args = {
        'dsn': 'postgres:///test',
        'ssl': None,
        'min_size': 10,
        'max_size': 10
    }
    pool = await asyncpg.create_pool(**con_args)

    async with pool.acquire() as con:
        await con.execute(
            """\
        INSERT INTO bigthings
        (num, words)
        VALUES ($1::FOURBIGINTS, $2::NUMBERS_AS_WORDS)
        """,
            (0, 0, 0, 0), 'one')

asyncio.run(main())

Watching the logs (on my system, sudo journalctl -u postgresql -f) will show something like:

2020-01-21 14:45:49.066 CET [118544] LOG:  duration: 1593.511 ms  execute __asyncpg_stmt_2__: WITH RECURSIVE typeinfo_tree(
            oid, ns, name, kind, basetype, has_bin_io, elemtype, elemdelim,
            range_subtype, elem_has_bin_io, attrtypoids, attrnames, depth)
        AS (
            SELECT
...
2020-01-21 14:45:49.066 CET [118544] DETAIL:  parameters: $1 = '{16584,16582}'

I've traced this back to the call to _introspect_types in connection.py.

From a bit of testing, it only happens if there are multiple custom types used in the query. For example, if I change the query to simply INSERT INTO bigthings (num) VALUES ($1::FOURBIGINTS), then everything is fast as expected, and if I change the bigthings.words column to a VARCHAR, there is no problem. But as soon as I include two or more custom types (e.g. two enum types, two composite types, or a mix), I see the slowdowns.

Is there anything I can do to either remove the need for this introspection (e.g. giving asyncpg some hints about these types), or maybe I'm doing something wrong that I can correct?

@elprans
Member

elprans commented Jan 21, 2020

This is most likely PostgreSQL's new JIT being slow on that query. Try turning it off (jit = off).

@tristan
Author

tristan commented Jan 22, 2020

Thanks @elprans, that does seem to be the issue.

Adding jit = off to postgresql.conf made everything fast again.

Since I can't change postgresql.conf in my deployed environment right now, I've worked around the issue in my own code by adding 'jit': 'off' to the server_settings when I create my pool:

await asyncpg.create_pool(..., server_settings={'jit': 'off'})
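When connect kwargs are assembled in several places, a small helper (hypothetical, not part of asyncpg) can force JIT off without clobbering other server settings:

```python
def with_jit_off(**connect_kwargs):
    """Return connect kwargs with server_settings['jit'] set to 'off'.

    Existing server_settings entries are preserved; an explicit
    'jit' value, if present, is overridden.
    """
    settings = dict(connect_kwargs.get("server_settings") or {})
    settings["jit"] = "off"
    return {**connect_kwargs, "server_settings": settings}
```

It would then be used as `pool = await asyncpg.create_pool(**with_jit_off(dsn='postgres:///test', min_size=10, max_size=10))`.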

@dmig-alarstudios

dmig-alarstudios commented Jan 30, 2020

I've been bitten by this bug as well.
Executing a simple query with name = ANY($1::varchar[]) triggers asyncpg to execute that huge introspection query. In some cases this query took seconds and even caused a command timeout.
I've been using this library without a pool, connecting and disconnecting as needed.

PostgreSQL 10 and 9.4 -- I experienced timeouts only on 9.4.

One more thing: it seems asyncpg doesn't use prepared statements for

SELECT
            t.oid,
            t.typelem     AS elemtype,
            t.typtype     AS kind
        FROM
            pg_catalog.pg_type AS t
            INNER JOIN pg_catalog.pg_namespace ns ON (ns.oid = t.typnamespace)
        WHERE
            t.typname = $1 AND ns.nspname = $2

I see this in logs repeating for each type encoder/decoder registration.

@otsuka

otsuka commented May 15, 2020

Hi, @elprans

Could you tell me the purpose for which the WITH RECURSIVE typeinfo_tree(... query is being issued?

I've been having trouble with this query because it's too slow; turning off JIT as you suggested did make it faster. But what can I do to stop asyncpg from issuing the query at all?
Which part of the SQL is making the query slow?

@elprans
Member

elprans commented May 15, 2020

I would like to know the purpose for which the query is being issued, could you tell me?

Postgres does not send enough information to describe query output at the protocol level. In order to decode complex data types, like composite types, ranges, arrays, etc, asyncpg needs to introspect those types (recursively) to determine what they are. Normally, the introspection is done once per connection, so if you use a pool you only pay the cost once.
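The recursion matters because a composite type can contain other composite types, arrays, and so on. As a toy illustration (not asyncpg's actual code, and with made-up OIDs), resolving a single result-column OID can fan out into a whole type graph:

```python
# Toy stand-in for pg_type: OID -> (type name, attribute type OIDs).
# All OIDs and names here are illustrative only.
CATALOG = {
    20: ("int8", []),
    25: ("text", []),
    16584: ("fourbigints", [20, 20, 20, 20]),  # composite of four int8s
    16600: ("pair", [16584, 25]),              # composite nesting a composite
}

def resolve(oid, seen=None):
    """Collect every type OID needed to decode values of type `oid`."""
    seen = set() if seen is None else seen
    if oid in seen:
        return seen
    seen.add(oid)
    _name, attr_oids = CATALOG[oid]
    for attr_oid in attr_oids:
        resolve(attr_oid, seen)
    return seen
```

asyncpg performs this resolution server-side in a single recursive CTE (the typeinfo_tree query seen in the logs) rather than issuing one round trip per type.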

@elprans
Member

elprans commented May 15, 2020

I see this in logs repeating for each type encoder/decoder registration.

That shouldn't be the case. That query is executed using regular fetchrow() calls. The fact that you see the query in the log does not mean that a prepared statement wasn't used.

@otsuka

otsuka commented May 16, 2020

Thank you very much for your answer, @elprans

@commonism

To turn off the JIT with SQLAlchemy 1.4.0b and asyncpg:

         self.engine = create_async_engine(
-            connection_string, pool_size=pool_size, **kwargs
+            connection_string, pool_size=pool_size, connect_args={'server_settings':{'jit': 'off'}}, **kwargs
         )

@liorchen

Or, if you're using GINO:

db = Gino(
    dsn=config.DB_DSN,
    pool_min_size=config.DB_POOL_MIN_SIZE,
    pool_max_size=config.DB_POOL_MAX_SIZE,
    echo=config.DB_ECHO,
    ssl=config.DB_SSL,
    use_connection_for_request=config.DB_USE_CONNECTION_FOR_REQUEST,
    retry_limit=config.DB_RETRY_LIMIT,
    retry_interval=config.DB_RETRY_INTERVAL,
    kwargs={'server_settings': {'jit': 'off'}}
)

achimnol added a commit to lablup/backend.ai-manager that referenced this issue Mar 24, 2021
…ries

* refs MagicStack/asyncpg#530: apply "jit: off" option to DB connections
  - It is specified in `ai.backend.manager.models.base.pgsql_connect_opts`
* Reuse the same single connection for all GraphQL resolvers and mutation methods
achimnol added a commit to lablup/backend.ai-manager that referenced this issue Mar 24, 2021
* fix: a long-standing transaction error
  - It is now reproducible reliably with the new SQLAlchemy + asyncpg combination!
  - Also fixed a hidden type conversion bug in AgentRegistry.set_kernel_status() due to
    a comma typo....
* fix/test: Update codes for population of DB fixtures
  - Eliminate manual primary key checks for duplicate entries by utilizing
    PostgreSQL's "on conflict" (upsert) support.
  - Fix up using special characters in database passwords by correctly escaping them
    using `urllib.parse.quote_plus()`.
* fix: Do not rely on `rowcount` for SELECT queries
  - rowcount in SQLAlchemy does NOT represent the number of fetched
    rows for SELECT queries, in contrast to the Python's standard DB API.
* fix: excessive DB connection establishment delay and optimize GQL queries
  - refs MagicStack/asyncpg#530: apply "jit: off" option to DB connections
  - It is specified in `ai.backend.manager.models.base.pgsql_connect_opts`
  - Reuse the same single connection for all GraphQL resolvers and mutation methods
* fix: consistently use urlquote for db passwords
* fix: all DB connections are now transactions
* refactor: Finally, rename "dbpool" to "db"
achimnol added a commit to lablup/backend.ai-manager that referenced this issue Mar 24, 2021

Backported-From: main
Backported-To: 20.09
michaelweiser added a commit to michaelweiser/PeekabooAV that referenced this issue Jan 4, 2022
In order to alleviate the serialisation caused by the single-threaded
REST API frontend server we switch the respective database functions to
asyncio. This allows the server to at least accept more requests while
database operations are in progress.

This works very well with MariaDB.

Postgres shows a slow response on initial operations up to about 15
seconds caused by an autovacuum operation triggered by an introspection
feature of asyncpg (MagicStack/asyncpg#530).
The workaround is to disable the JIT of newer postgresql versions
server-side for the time being.

sqlite flat-out runs into "database is locked" errors. More research is
required here.

We rewrite the database URL to specific asyncio dialects for now. The
plan is to switch to asyncio completely so we can leave them alone in
the end.

The SQLAlchemy dependency is raised to 1.4.24 because this is the first
version to support asyncmy, the new asyncio mysql driver.

Move sleeping for retries out of the critical section protected by
locking to allow for even more parallelism. Remove the lock on analysis
journal retrieval since reads should never conflict with each other. We
need to keep the locking in analysis_add() even with async because
multiple async calls of the routine may be in progress at various stages
of processing and conflict with and possibly deadlock each other,
particularly when using sqlite which will throw 'database is locked'
errors after a timeout. Having a threading and asyncio Lock protect
adding and updating of analyses from threading and asyncio context is
likely to not work as required. The only hope is to switch analysis
update to asyncio as well.

Start to move to the 2.0 SQLAlchemy API using session.execute() and
select(), delete() and update() statements. For the asyncio API this is
required since session.query() is not supported there. We switch some
non-asyncio users as well while we're at it. This also allows for
reusing of statements across retries.

Start using session context handlers to get rid of explicit session
closing.

The testsuite is updated to match.
michaelweiser added a commit to michaelweiser/PeekabooAV that referenced this issue Jan 5, 2022
michaelweiser added a commit to michaelweiser/PeekabooAV that referenced this issue Feb 18, 2022
michaelweiser added a commit to michaelweiser/PeekabooAV that referenced this issue Feb 22, 2022
michaelweiser added a commit to michaelweiser/PeekabooAV that referenced this issue Mar 3, 2022
michaelweiser added a commit to scVENUS/PeekabooAV that referenced this issue Mar 9, 2022
@mukhamux

  1. If I use connect_args={'server_settings': {'jit': 'off'}} with a connection pool, will the first query still generate the WITH RECURSIVE typeinfo_tree(...) AS (SELECT ... introspection query?

  2. I tried disabling JIT on the Postgres side, but my query still executes after the WITH RECURSIVE typeinfo_tree(...) AS (SELECT ... introspection query. How do I completely disable JIT?

@elprans
Member

elprans commented Sep 14, 2022

how to completely disable jit?

connect(..., server_settings={"jit": "off"})

@mukhamux

mukhamux commented Sep 15, 2022

@elprans this doesn't help; on every first query I still get WITH RECURSIVE typeinfo_tree(...) AS (SELECT ...
I use it in SQLAlchemy with:

        engine = create_async_engine(
            url,
            json_serializer=simplejson.dumps,
            json_deserializer=json_deserializer,
            connect_args={'server_settings': {'jit': 'off'}},
            **kwargs,
        )

@mukhamux

mukhamux commented Sep 18, 2022

It looks like neither approach works unless you also run
await conn.run_sync(Base.metadata.create_all) after creating the async engine.
Can someone confirm whether that's true?

@stalkerg

Has anyone reported this to the PostgreSQL mailing lists?

@stevanmilic

Is it possible to perform introspection manually, without waiting for a query to trigger it? Right now we need to execute the query manually in order to warm the introspection cache. jit = off does improve the performance of the introspection queries, but there's still a significant delay on the initial query.

@stalkerg

stalkerg commented Feb 7, 2023

@stevanmilic agreed, it would be good to have a way to provide a static schema to asyncpg to avoid such runtime overhead.

@stevanmilic

@elprans is it possible to execute a SQL query without going through the prepared-statement flow? It's a blocker for high-frequency apps using asyncpg with enums if every query must go through introspection, which is really slow.

Something like the execute function on the connection, but one that returns results.

@2-5

2-5 commented Feb 9, 2023

I worked around this problem by registering all enums on connection creation:

ENUM_TYPE = (
    ("public", "order_type_t"),
    ("orders", "user_type_t"),
)

async def register_enum_types(conn):
    for schema, typename in ENUM_TYPE:
        await conn.set_builtin_type_codec(typename, schema=schema, codec_name="text")

pool = await asyncpg.create_pool(..., init=register_enum_types)
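The two workarounds in this thread compose: the enum-registration init hook can be combined with the jit server setting at pool creation. A sketch, assuming a hypothetical list of (schema, typename) pairs:

```python
# Hypothetical (schema, typename) pairs; adjust to your own database.
ENUM_TYPES = (
    ("public", "order_type_t"),
    ("orders", "user_type_t"),
)

async def init_connection(conn):
    # Decode each enum as plain text so asyncpg never needs to
    # introspect it; this hook runs once per new pooled connection.
    for schema, typename in ENUM_TYPES:
        await conn.set_builtin_type_codec(
            typename, schema=schema, codec_name="text")
```

A pool would then be created with `asyncpg.create_pool(dsn, init=init_connection, server_settings={'jit': 'off'})`, so new connections both skip enum introspection and keep JIT disabled for any remaining introspection queries.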

@elprans
Member

elprans commented Feb 13, 2023

I have a WIP patch to automatically disable JIT for the introspection query.

is it possible to execute SQL query without going through the prepare statement flow?

Yes, via execute() and without any arguments. Not sure if useful in your use case.

elprans added a commit that referenced this issue Sep 23, 2023
The misapplication of JIT to asyncpg introspection queries has been a
constant source of user complaints.

Closes: #530
Closes: #1078
Previously: #875, #794, #782, #741, #727 (and probably more).
elprans added a commit that referenced this issue Oct 7, 2023
@saikumar1752

@elprans, I see that you added JIT disabling to the _introspect_types function in PR #1082.
Shouldn't it be done for the function below it as well?

@elprans
Member

elprans commented Feb 16, 2024

No, introspect_type is a very simple query and is not affected by the JIT pathology.
