Destination Postgres: Very poor performance #21557
Labels: area/connectors (Connector related issues), community, connectors/destinations-database, connectors/source/postgres, frozen (Not being actively worked on), releaseStage/alpha, team/destinations (Destinations team's backlog), type/bug (Something isn't working)
Environment
Current Behavior
For big tables it is impossible to run the normalization jobs. After an initial backfill that took 33 hours, the next sync, which added another ~200,000 rows, has now been running normalization for over 12 hours.
I am trying with append mode only, but deduped + history is even worse.
Expected Behavior
I would like the correct indexes to be created both by the ingestion step and by the dbt normalization step.
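As a sketch of what the missing indexes could look like, the statements below index the column each step actually filters or joins on. The table names (`_airbyte_raw_my_table`, `my_table`) are hypothetical placeholders for a single Airbyte stream; adjust them to your schema. This is an illustrative workaround, not what Airbyte currently generates.

```sql
-- Raw table: the incremental model filters on _airbyte_emitted_at,
-- so indexing that column avoids a seq scan of the whole raw table.
create index if not exists ix_raw_my_table_emitted_at
    on _airbyte_raw_my_table (_airbyte_emitted_at);

-- Normalized table: dbt's delete+insert matches rows on _airbyte_ab_id,
-- so indexing it avoids a seq scan during the delete.
create index if not exists ix_my_table_ab_id
    on my_table (_airbyte_ab_id);
```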
Steps to Reproduce
Looking at the generated tables and the compiled dbt code, it is clear that there is room for improvement:
1. The raw table has a unique index on _airbyte_ab_id, but the subsequent model that runs the incremental step queries the table using a filter on _airbyte_emitted_at, which means it will always run a seq scan of the whole table.
2. The normalized table has an index on _airbyte_emitted_at, which is good if we want to run any subsequent incremental models using that column, but it does not have an index on _airbyte_ab_id, which is the unique key dbt uses for updating the table.
3. The code generated by dbt for incremental models on Postgres runs a delete and then an insert. Without an index on _airbyte_ab_id, the delete too will run a seq scan on the table.
4. Finally, the incremental code generated for Postgres might not use an index even if one exists, because there is extra logic in the query (not sure about this, did not test it). It is unclear how good Postgres is at figuring out at run time whether the first part of the coalesce is true; if it cannot determine that during planning, it will not use an index either.
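For reference, a hypothetical reconstruction of the two query shapes described above, assuming a stream named `my_table` (with `my_table__dbt_tmp` as dbt's temp table and `_airbyte_raw_my_table` as the raw table). This is a sketch of the pattern, not the exact compiled SQL, which varies by dbt and normalization version.

```sql
-- Shape of the delete emitted by dbt's delete+insert strategy: without an
-- index on _airbyte_ab_id, the IN probe turns into a seq scan of my_table.
delete from my_table
where (_airbyte_ab_id) in (
    select (_airbyte_ab_id)
    from my_table__dbt_tmp
);

-- Shape of an incremental filter wrapped in extra logic: if the planner
-- cannot resolve the coalesce at plan time, it may fall back to a seq scan
-- even when _airbyte_emitted_at is indexed.
select *
from _airbyte_raw_my_table
where coalesce(
    _airbyte_emitted_at >= (select max(_airbyte_emitted_at) from my_table),
    true
);
```

Running `EXPLAIN` on the compiled queries against your own tables would confirm whether the planner actually chooses an index scan.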
Are you willing to submit a PR?
No.