sql: Django's table introspection query is slow (seconds to minutes, depending on number of tables) #57924
Comments
Hello, I am Blathers. I am here to help you get the issue triaged. I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
A starting point here would be to add a benchmark to bench/ddl_analysis (to be renamed) and see how many KV roundtrips are happening. |
I added benchmarks in ddl_analysis and tested with increasing number of tables:
So it seems that the number of roundtrips scales linearly with the number of tables in the cluster. Also, the fact that it does 278 roundtrips even when there is just one table seems like a lot. |
Here's the explain plan
I also captured a trace of running the query. I think part of the issue is that it's running the cockroach/pkg/sql/sem/builtins/pg_builtins.go Lines 1079 to 1089 in d86781c
I don't know if there's a way to reorder that filter condition ( cc @jordanlewis @rytaft tagging you since you're on-call at the moment -- I was wondering:
|
Yes, we can certainly do this, though I'm not completely sure whether it would be sufficiently useful if the virtual index didn't also force you to specify the database and schema name. Perhaps a more compelling way to fix this would be to make |
But ultimately that will still cause a descriptor to be fetched once for each table in the database, right? That's not so different from what ends up happening now. Or do you mean that doing it this way will allow us to add more caching? I've included the statement bundle since I forgot to attach it before. |
If we use the right kind of resolution, it will use leases (aka the cache). |
We don't currently support re-ordering filters in a single filter statement. But we could increase the cost of cockroach/pkg/sql/opt/xform/coster.go Line 154 in 3db4737
|
I agree that @jordanlewis's suggestion to just make pg_table_is_visible less expensive seems better. Thanks for the pointer though, @rytaft. I will play around with the costing to see if that does anything good for us here. In the meantime, until we fix this within the DB, @timgraham, could you update django-cockroachdb to issue this query instead:
I believe it is functionally equivalent, and it will avoid all of these extra lookups. My benchmark shows this only makes 4 KV roundtrips, no matter how many tables are in the database. |
It seems like some refactoring will be needed to make that happen. Separately, I was wondering why the |
Discussed in this internal thread: https://cockroachlabs.slack.com/archives/C0168LW5THS/p1611947857055600 The conclusion was to try to grow the The issue is that there are around 20 builtins that use the internal executor, so all of those might also have a similar problem as |
What is your situation?
Django uses this query for table introspection:
Observed performance
The query takes 2-5 seconds with tens of tables, or several minutes with hundreds of tables. The Django test suite issues the query around 175 times, totaling around 8 minutes of the test suite's 68-minute run time.
Also, it's currently infeasible to run Django's test suite all at once because the query takes about 4 minutes if the ~1400 tables for all of the test suite are present.
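To put the figures above in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the approximate numbers quoted in this report. The function and constant names are made up for illustration, and the per-table estimate assumes the query cost grows linearly with the number of tables, which is an assumption rather than a measured result.

```python
# Figures quoted in the report (all approximate, as stated there).
QUERY_COUNT = 175          # introspection queries issued by the test suite
QUERY_TOTAL_MIN = 8        # minutes spent in those queries
SUITE_TOTAL_MIN = 68       # total test suite run time, in minutes
FULL_RUN_TABLES = 1400     # tables present when running the whole suite at once
FULL_RUN_QUERY_MIN = 4     # minutes for a single query at that table count


def avg_seconds_per_query(total_min: float, count: int) -> float:
    """Mean latency of one introspection query, in seconds."""
    return total_min * 60 / count


def share_of_suite(query_min: float, suite_min: float) -> float:
    """Fraction of the suite's run time spent in the introspection query."""
    return query_min / suite_min


def seconds_per_table(query_min: float, tables: int) -> float:
    """Per-table cost of one query, ASSUMING cost is linear in table count."""
    return query_min * 60 / tables


if __name__ == "__main__":
    avg = avg_seconds_per_query(QUERY_TOTAL_MIN, QUERY_COUNT)
    share = share_of_suite(QUERY_TOTAL_MIN, SUITE_TOTAL_MIN)
    per_table = seconds_per_table(FULL_RUN_QUERY_MIN, FULL_RUN_TABLES)
    print(f"average per query: {avg:.2f} s")
    print(f"share of suite run time: {share:.1%}")
    print(f"per-table cost at {FULL_RUN_TABLES} tables: {per_table:.3f} s")
```

With these inputs the average works out to roughly 2.7 seconds per query, about 12% of the suite's run time, and on the order of 0.17 seconds per table when all ~1400 tables are present.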
Build Tag: v21.1.0-alpha.1-289-g960b4cfc54
Build Time: 2020/12/12 06:45:40
Build Commit ID: 960b4cf
Requested resolution