-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Remove get rooms for user with stream ordering #13991
Remove get rooms for user with stream ordering #13991
Conversation
By getting the joined rooms before the current token we avoid any reading history to confirm a user *was* in a room. We can then use any membership change events, which we already fetch during sync, to determine the final list of joined room IDs.
membership_change_events = [] | ||
if since_token: | ||
membership_change_events = await self.store.get_membership_changes_for_user( | ||
user_id, since_token.room_key, now_token.room_key, self.rooms_to_exclude |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied from old draft PR
Isn't since_token significantly older than we need? Can we take the current token before we fetch get_rooms_for_user and then only get the changes between the two?
Member Author
Fizzadar Fizzadar 2 days ago
Yes much older - if we switch to using deltas as above this would be far more efficient.
In the current state of this PR though it makes sense to call get_membership_changes_for_user between the since_token & now because we later fetch this anyway (twice), so passing it around as part of the SyncResultBuilder removes those duplicate lookups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. But doesn't that mean we'll be handling a bunch of spurious changes below? Specifically won't we hit the log line for every membership change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh nice spot yeah that'll potentially check tonnes of events 🤦 Have pushed up a82b8d7 which limits it to the specific window between get rooms + current token.
) | ||
# User joined a room - we have to then check the room state to ensure we | ||
# respect any bans if there's a race between the join and ban events. | ||
if event.membership == Membership.JOIN: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied from old draft PR
I think we need to be careful that we handle a join -> join transition? e.g. a display name update?
I think we could look at the current_state_deltas table to get a more accurate sense of how the state has changed?
erikjohnston erikjohnston 3 days ago
Actually, no we don't need to worry about join -> join transitions. I was thinking of a different un-applicable scenario.
current_state_deltas probably still the right table?
Member Author
Fizzadar Fizzadar 2 days ago
Was just trying to figure this out but I don't think current_state_deltas has all the information we need - specifically the membership field in event content means we can't determine whether an event was a join or invite.
Pulling in the events would solve this, but at that point it's essentially the same as the current get_membership_changes_for_user call I think, is there a reason this isn't suitable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_state_deltas
will give you the actual changes in current state between two tokens, whereas get_membership_changes_for_user
is a bit dodge and only gets an membership events that have been persisted in the range (these are technically not the same thing).
Pulling in the events would solve this, but at that point it's essentially the same as the current
get_membership_changes_for_user
call I think
You'd want to join against room_memberships
, so current_state_deltas
may be cheaper as there will likely be less state changes than events in the room.
Though it occurs to me that we only keep current_state_deltas
for a month, so we'd have to handle the case that the since token given was too old. Ideally tell the client to do a full resync, but at that point we're creeping the scope of this PR a bunch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to ask clients to re-init sync currently? Assuming this is possible I think all the membership change lookups in sync could be optimised out to use the current_stream_delta
x room_memberships
and drop the get_membership_changes_for_user
calls entirely. I wonder/hope if this could dramatically simplify the current membership change handling in sync.
I think this fits into a subsequent PR though, definitely out of scope here! As it stands I believe this change will behave as expected; the sync handler already uses the membership change events to calculate room changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the best that we can do currently is soft log the user out, which isn't the most ideal. Will double check if the sliding sync proposal has a mechanism
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, sliding sync supports it
membership_change_events = [] | ||
if since_token: | ||
membership_change_events = await self.store.get_membership_changes_for_user( | ||
user_id, since_token.room_key, now_token.room_key, self.rooms_to_exclude |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. But doesn't that mean we'll be handling a bunch of spurious changes below? Specifically won't we hit the log line for every membership change?
) | ||
# User joined a room - we have to then check the room state to ensure we | ||
# respect any bans if there's a race between the join and ban events. | ||
if event.membership == Membership.JOIN: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
current_state_deltas
will give you the actual changes in current state between two tokens, whereas get_membership_changes_for_user
is a bit dodge and only gets an membership events that have been persisted in the range (these are technically not the same thing).
Pulling in the events would solve this, but at that point it's essentially the same as the current
get_membership_changes_for_user
call I think
You'd want to join against room_memberships
, so current_state_deltas
may be cheaper as there will likely be less state changes than events in the room.
Though it occurs to me that we only keep current_state_deltas
for a month, so we'd have to handle the case that the since token given was too old. Ideally tell the client to do a full resync, but at that point we're creeping the scope of this PR a bunch
synapse/handlers/sync.py
Outdated
extrems = await self.store.get_forward_extremities_for_room_at_stream_ordering( | ||
room_id, event.internal_metadata.stream_ordering | ||
) | ||
user_ids_in_room = await self.state.get_current_user_ids_in_room( | ||
room_id, extrems | ||
) | ||
if user_id in user_ids_in_room: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an expensive check. Can we instead check if the given user is now in the room? I think that might be good enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think that works - ef2950a.
By getting the joined rooms before the current token we avoid any reading history to confirm a user *was* in a room. We can then use any membership change events, which we already fetch during sync, to determine the final list of joined room IDs. # Conflicts: # synapse/handlers/sync.py
Builds upon #13787. I have copied over some comments from Erik on the reference PR (I have re-applied this on
develop
for this PR).Aim here is complete removal of the
get_rooms_for_user_with_stream_ordering
function and it's join on the events table. The final use is in the sync handler which currently fetches a users room after the current token, leaving a window during which a user may be joined to a room. The ordering is used to check this.This approach flips the idea on it's head by getting the users rooms first, then the token, and then any membership changes between. Crucially the membership events are already fetched as part of sync and so this doesn't increase the queries required. Unfortunately due to race conditions with bans any rooms located during this still need a check against their state at membership-event-time, but this is identical to the current implementation.
I have also stored the membership events on the
SyncResultBuilder
object to avoid looking this up multiple times during a sync, which closes a TODO note in the sync handler.Signed off by Nick @ Beeper (@Fizzadar).
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)