Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Slow query performance(?) in list users API due to naive join to profiles table #14254

Open
DMRobertson opened this issue Oct 21, 2022 · 2 comments
Labels
A-Admin-API A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Performance Performance, both client-facing and admin-facing O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.

Comments

@DMRobertson
Copy link
Contributor

Noticed in https://github.com/matrix-org/synapse/pull/14205/files#r1001678137

The join condition is users.name = '@' || profiles.user_id || ':' || 'matrix.example.com'. This isn't of the form profiles.user_id = X, and so postgres can't use the index on profiles.user_id to to lookup a profiles row.

I think the condition

    LEFT JOIN profiles AS p ON p.user_id = substring(u.name from ('[^@:]+'))

might work here. It seems to be standard sql (though I haven't been able to find a reference for exactly how that regex is matched). In my testing, the proposed query performs/is planned much better:

matrix=> EXPLAIN ANALYZE SELECT
    name, user_type, is_guest, admin, deactivated, shadow_banned,
    displayname, avatar_url, creation_ts * 1000 as creation_ts, approved
FROM users as u
    LEFT JOIN profiles AS p ON p.user_id = substring(u.name from ('[^@:]+'))
    LEFT JOIN erased_users AS eu ON u.name = eu.user_id
ORDER BY u.name ASC
LIMIT 1000;
                                                                     QUERY PLAN                                                               
       
══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
═══════
 Limit  (cost=1.13..668.34 rows=1000 width=148) (actual time=0.068..11.325 rows=1000 loops=1)
   ->  Nested Loop Left Join  (cost=1.13..25417598.82 rows=38095570 width=148) (actual time=0.067..11.150 rows=1000 loops=1)
         ->  Index Scan using users_name_key on users u  (cost=0.56..2012173.49 rows=38095570 width=72) (actual time=0.023..1.248 rows=1000 lo
ops=1)
         ->  Index Scan using profiles_user_id_key on profiles p  (cost=0.57..0.61 rows=1 width=64) (actual time=0.006..0.006 rows=1 loops=100
0)
               Index Cond: (user_id = "substring"(u.name, '[^@:]+'::text))
 Planning time: 0.327 ms
 Execution time: 11.543 ms
(7 rows)

Compared to the old one:

matrix=> EXPLAIN ANALYZE SELECT
matrix->     name, user_type, is_guest, admin, deactivated, shadow_banned,
matrix->     displayname, avatar_url, creation_ts * 1000 as creation_ts, approved
matrix-> FROM users as u
matrix->     LEFT JOIN profiles AS p ON u.name = '@' || p.user_id || ':' || 'matrix.org'
matrix->     LEFT JOIN erased_users AS eu ON u.name = eu.user_id
matrix-> ORDER BY u.name ASC
matrix-> LIMIT 1000;
                                                                   QUERY PLAN                                                                 
  
══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
══
 Limit  (cost=5884726.81..5884729.31 rows=1000 width=148) (actual time=48806.860..48806.961 rows=1000 loops=1)
   ->  Sort  (cost=5884726.81..5979965.74 rows=38095570 width=148) (actual time=48806.858..48806.913 rows=1000 loops=1)
         Sort Key: u.name
         Sort Method: top-N heapsort  Memory: 334kB
         ->  Hash Right Join  (cost=1693293.32..3795987.80 rows=38095570 width=148) (actual time=13079.431..42453.234 rows=37895655 loops=1)
               Hash Cond: (((('@'::text || p.user_id) || ':'::text) || 'matrix.org'::text) = u.name)
               ->  Seq Scan on profiles p  (cost=0.00..632205.41 rows=37935041 width=64) (actual time=0.009..4309.116 rows=37895652 loops=1)
               ->  Hash  (cost=770665.70..770665.70 rows=38095570 width=72) (actual time=13075.223..13075.223 rows=37895655 loops=1)
                     Buckets: 262144  Batches: 512  Memory Usage: 7337kB
                     ->  Seq Scan on users u  (cost=0.00..770665.70 rows=38095570 width=72) (actual time=0.015..5704.975 rows=37895655 loops=1
)
 Planning time: 0.278 ms
 Execution time: 48807.123 ms
(12 rows)

Time: 48808.020 ms (00:48.808)
@DMRobertson DMRobertson added A-Performance Performance, both client-facing and admin-facing A-Admin-API S-Minor Blocks non-critical functionality, workarounds exist. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Oct 21, 2022
@DMRobertson
Copy link
Contributor Author

TODO:

  • Test if this works on sqlite
  • Sanity check that the regex is sound
    • Are there problems with mxids containing multiple @ or : characters?
    • What are the SQL standard rules for how the regex matches---does substring return as greedy a substring as possible starting from the first match?
    • Do postgres/sqlite conform?

@dklimpel
Copy link
Contributor

dklimpel commented Apr 6, 2023

This is related to:

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Admin-API A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Performance Performance, both client-facing and admin-facing O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

No branches or pull requests

2 participants