Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Improve performance of remove_hidden_devices_from_device_inbox #11420

Closed
wants to merge 5 commits into from

Conversation

babolivier
Copy link
Contributor

See #11401 for context.

By walking the devices table instead of the device_inbox one.

Here's the query details on abolivier.bzh's database:

synapse=# explain analyze SELECT user_id, device_id, stream_id
FROM devices
INNER JOIN device_inbox USING (user_id, device_id)
WHERE
    hidden = TRUE
    AND user_id >= '@foo:bar'
ORDER BY user_id
LIMIT 100;
                                                                                 QUERY PLAN                                                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=24.36..13861.73 rows=100 width=65) (actual time=106.046..106.047 rows=0 loops=1)
   ->  Merge Join  (cost=24.36..100898.78 rows=729 width=65) (actual time=106.044..106.046 rows=0 loops=1)
         Merge Cond: ((devices.user_id = device_inbox.user_id) AND (devices.device_id = device_inbox.device_id))
         ->  Index Scan using device_uniqueness on devices  (cost=0.28..180.59 rows=560 width=57) (actual time=0.075..0.911 rows=467 loops=1)
               Index Cond: (user_id >= '@foo:bar'::text)
               Filter: hidden
               Rows Removed by Filter: 147
         ->  Index Only Scan using device_inbox_user_stream_id on device_inbox  (cost=0.54..100217.89 rows=109879 width=42) (actual time=0.141..92.477 rows=101229 loops=1)
               Heap Fetches: 6968
 Planning time: 3.127 ms
 Execution time: 106.130 ms
(11 rows)

So not ideal but looks like an improvement over its previous incarnation.

@babolivier babolivier requested a review from a team as a code owner November 24, 2021 14:10
@richvdh richvdh self-assigned this Nov 24, 2021
ORDER BY stream_id
user_id >= ?
AND hidden = ?
ORDER BY user_id
LIMIT ?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this limit will mean that we may not find all the device_inbox rows for a given user - we may only end up deleting half of them before deciding to move onto the next user.

I think a better strategy is to remove the join against device_inbox, and just look for hidden devices, without worrying about whether they have device_inbox rows. Then, do a DELETE FROM device_inbox for each such device.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to do this initially, but the original docstring mentions that it might be a bit heavy if the device has tons of pending messages in its inbox. For example, on abolivier.bzh's database, I've got devices with over 60k rows in device_inbox (which aren't hidden devices but that's probably because I've already run the previous incarnation of this update), which sounds like a lot to delete in one go.

On top of that, I don't see how we might not delete entries for all devices for a given user, given the condition in the query is user_id >= ?, so if we don't do every device from a user we should just do the rest in the next run?

Copy link
Member

@richvdh richvdh Nov 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On top of that, I don't see how we might not delete entries for all devices for a given user, given the condition in the query is user_id >= ?, so if we don't do every device from a user we should just do the rest in the next run?

oh sorry, you're right.

However, from your analysis in the PR comment:

         ->  Index Only Scan using device_inbox_user_stream_id on device_inbox  (cost=0.54..100217.89 rows=109879 width=42) (actual time=0.141..92.477 rows=101229 loops=1)
               Heap Fetches: 6968

this is no good. It's a scan of the entirety of the device_inbox_user_stream_id index: there is no index cond here. It's acceptable for your local db with 100k rows, but won't be for a larger db.

Honestly, looking at this again, I think we're better off rewriting it again (sorry!) to do the same as #11421 (ie, walk through the device_inbox table for a sequence of stream_ids. Hell, why not combine it with #11421?

other ideas...

  • we can probably get away with deleting all the messages for a device at once, even if its in the 10s of thousands, though it gets really nasty given we'll be considering 100 devices on the first pass.
  • maybe a two-stage loop: first find a device, then work through a range of stream_ids for that device on each run of the bg update. That gets annoyingly fiddly in terms of tracking state, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think you're right here, let's combine it with #11421

@babolivier babolivier requested a review from richvdh November 24, 2021 15:08
@richvdh richvdh removed their request for review November 24, 2021 17:04
@richvdh richvdh removed their assignment Nov 24, 2021
@babolivier babolivier closed this Nov 24, 2021
@babolivier babolivier deleted the babolivier/hidden_devices_inbox branch January 14, 2022 15:32
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants