-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Improve performance of _get_state_groups_from_groups_txn #7567
Improve performance of _get_state_groups_from_groups_txn #7567
Conversation
d600189
to
b53790f
Compare
b53790f
to
914be9f
Compare
Since this is failing on all worker tests but not monolith tests, I have a horrible feeling that SyTest was relying on this SQL to be slow. I know @erikjohnston did some race fixes recently. I'll try rebasing this PR to see if CI fares any better a second time. |
914be9f
to
6af29d4
Compare
6af29d4
to
277ff60
Compare
…7567) The query keeps showing up in my slow query log. This changes the plan under the top-level Sort node from WindowAgg (cost=280335.88..292963.15 rows=561212 width=80) (actual time=138.651..160.562 rows=27112 loops=1) -> Sort (cost=280335.88..281738.91 rows=561212 width=84) (actual time=138.597..140.622 rows=27112 loops=1) Sort Key: state_groups_state.type, state_groups_state.state_key, state_groups_state.state_group Sort Method: quicksort Memory: 4581kB -> Nested Loop (cost=2.83..226745.22 rows=561212 width=84) (actual time=21.548..47.657 rows=27112 loops=1) -> HashAggregate (cost=2.27..3.28 rows=101 width=8) (actual time=21.526..21.535 rows=20 loops=1) Group Key: state.state_group -> CTE Scan on state (cost=0.00..2.02 rows=101 width=8) (actual time=21.280..21.493 rows=20 loops=1) -> Index Scan using state_groups_state_type_idx on state_groups_state (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.005..0.991 rows=1356 loops=20) Index Cond: (state_group = state.state_group) to Nested Loop (cost=2.83..226745.22 rows=561212 width=84) (actual time=24.194..52.834 rows=27112 loops=1) -> HashAggregate (cost=2.27..3.28 rows=101 width=8) (actual time=24.130..24.138 rows=20 loops=1) Group Key: state.state_group -> CTE Scan on state (cost=0.00..2.02 rows=101 width=8) (actual time=23.887..24.113 rows=20 loops=1) -> Index Scan using state_groups_state_type_idx on state_groups_state (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.016..1.159 rows=1356 loops=20) Index Cond: (state_group = state.state_group) This cuts the execution time from ~190ms to ~130ms, i.e. a reduction of ~30%. The full plans are visualised at https://explain.depesz.com/s/WpbT and https://explain.depesz.com/s/KlEk Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
277ff60
to
f16584c
Compare
@anoadragon453 Turns out I messed up when adapting the query from my experimentation to the code, and was sorting the I've fixed it in the last force push |
Oooh, interesting. I didn't realise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do indeed believe this works! Awesome, thank you!
…rg-hotfixes Synapse 1.15.0rc1 (2020-06-09) ============================== Features -------- - Advertise support for Client-Server API r0.6.0 and remove related unstable feature flags. ([\#6585](#6585)) - Add an option to disable autojoining rooms for guest accounts. ([\#6637](#6637)) - For SAML authentication, add the ability to pass email addresses to be added to new users' accounts via SAML attributes. Contributed by Christopher Cooper. ([\#7385](#7385)) - Add admin APIs to allow server admins to manage users' devices. Contributed by @dklimpel. ([\#7481](#7481)) - Add support for generating thumbnails for WebP images. Previously, users would see an empty box instead of preview image. ([\#7586](#7586)) - Support the standardized `m.login.sso` user-interactive authentication flow. ([\#7630](#7630)) Bugfixes -------- - Allow new users to be registered via the admin API even if the monthly active user limit has been reached. Contributed by @dkimpel. ([\#7263](#7263)) - Fix email notifications not being enabled for new users when created via the Admin API. ([\#7267](#7267)) - Fix str placeholders in an instance of `PrepareDatabaseException`. Introduced in Synapse v1.8.0. ([\#7575](#7575)) - Fix a bug in automatic user creation during first time login with `m.login.jwt`. Regression in v1.6.0. Contributed by @olof. ([\#7585](#7585)) - Fix a bug causing the cross-signing keys to be ignored when resyncing a device list. ([\#7594](#7594)) - Fix metrics failing when there is a large number of active background processes. ([\#7597](#7597)) - Fix bug where returning rooms for a group would fail if it included a room that the server was not in. ([\#7599](#7599)) - Fix duplicate key violation when persisting read markers. ([\#7607](#7607)) - Prevent an entire iteration of the device list resync loop from failing if one server responds with a malformed result. ([\#7609](#7609)) - Fix exceptions when fetching events from a remote host fails. ([\#7622](#7622)) - Make `synctl restart` start synapse if it wasn't running. ([\#7624](#7624)) - Pass device information through to the login endpoint when using the login fallback. ([\#7629](#7629)) - Advertise the `m.login.token` login flow when OpenID Connect is enabled. ([\#7631](#7631)) - Fix bug in account data replication stream. ([\#7656](#7656)) Improved Documentation ---------------------- - Update the OpenBSD installation instructions. ([\#7587](#7587)) - Advertise Python 3.8 support in `setup.py`. ([\#7602](#7602)) - Add a link to `#synapse:matrix.org` in the troubleshooting section of the README. ([\#7603](#7603)) - Clarifications to the admin api documentation. ([\#7647](#7647)) Internal Changes ---------------- - Convert the identity handler to async/await. ([\#7561](#7561)) - Improve query performance for fetching state from a PostgreSQL database. ([\#7567](#7567)) - Speed up processing of federation stream RDATA rows. ([\#7584](#7584)) - Add comment to systemd example to show postgresql dependency. ([\#7591](#7591)) - Refactor `Ratelimiter` to limit the amount of expensive config value accesses. ([\#7595](#7595)) - Convert groups handlers to async/await. ([\#7600](#7600)) - Clean up exception handling in `SAML2ResponseResource`. ([\#7614](#7614)) - Check that all asynchronous tasks succeed and general cleanup of `MonthlyActiveUsersTestCase` and `TestMauLimit`. ([\#7619](#7619)) - Convert `get_user_id_by_threepid` to async/await. ([\#7620](#7620)) - Switch to upstream `dh-virtualenv` rather than our fork for Debian package builds. ([\#7621](#7621)) - Update CI scripts to check the number in the newsfile fragment. ([\#7623](#7623)) - Check if the localpart of a Matrix ID is reserved for guest users earlier in the registration flow, as well as when responding to requests to `/register/available`. ([\#7625](#7625)) - Minor cleanups to OpenID Connect integration. ([\#7628](#7628)) - Attempt to fix flaky test: `PhoneHomeStatsTestCase.test_performance_100`. ([\#7634](#7634)) - Fix typos of `m.olm.curve25519-aes-sha2` and `m.megolm.v1.aes-sha2` in comments, test files. ([\#7637](#7637)) - Convert user directory, state deltas, and stats handlers to async/await. ([\#7640](#7640)) - Remove some unused constants. ([\#7644](#7644)) - Fix type information on `assert_*_is_admin` methods. ([\#7645](#7645)) - Convert registration handler to async/await. ([\#7649](#7649))
…7567) The query keeps showing up in my slow query log. This changes the plan under the top-level Sort node from ``` WindowAgg (cost=280335.88..292963.15 rows=561212 width=80) (actual time=138.651..160.562 rows=27112 loops=1) -> Sort (cost=280335.88..281738.91 rows=561212 width=84) (actual time=138.597..140.622 rows=27112 loops=1) Sort Key: state_groups_state.type, state_groups_state.state_key, state_groups_state.state_group Sort Method: quicksort Memory: 4581kB -> Nested Loop (cost=2.83..226745.22 rows=561212 width=84) (actual time=21.548..47.657 rows=27112 loops=1) -> HashAggregate (cost=2.27..3.28 rows=101 width=8) (actual time=21.526..21.535 rows=20 loops=1) Group Key: state.state_group -> CTE Scan on state (cost=0.00..2.02 rows=101 width=8) (actual time=21.280..21.493 rows=20 loops=1) -> Index Scan using state_groups_state_type_idx on state_groups_state (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.005..0.991 rows=1356 loops=20) Index Cond: (state_group = state.state_group) ``` to ``` Nested Loop (cost=2.83..226745.22 rows=561212 width=84) (actual time=24.194..52.834 rows=27112 loops=1) -> HashAggregate (cost=2.27..3.28 rows=101 width=8) (actual time=24.130..24.138 rows=20 loops=1) Group Key: state.state_group -> CTE Scan on state (cost=0.00..2.02 rows=101 width=8) (actual time=23.887..24.113 rows=20 loops=1) -> Index Scan using state_groups_state_type_idx on state_groups_state (cost=0.56..2189.40 rows=5557 width=84) (actual time=0.016..1.159 rows=1356 loops=20) Index Cond: (state_group = state.state_group) ``` This cuts the execution time from ~190ms to ~130ms, i.e. a reduction of ~30%. The full plans are visualised at https://explain.depesz.com/s/WpbT and https://explain.depesz.com/s/KlEk Signed-off-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
The query keeps showing up in my slow query log.
This changes the plan under the top-level Sort node from
to
This cuts the execution time from ~190ms to ~130ms, i.e. a reduction
of ~30%.
The full plans are visualised at https://explain.depesz.com/s/WpbT and
https://explain.depesz.com/s/KlEk
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.