-
Notifications
You must be signed in to change notification settings - Fork 228
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sliding Sync: Use stream_ordering
based timeline pagination for incremental sync
#17510
Sliding Sync: Use stream_ordering
based timeline pagination for incremental sync
#17510
Conversation
synapse/handlers/sliding_sync.py
Outdated
limit=room_sync_config.timeline_limit + 1, | ||
event_filter=None, | ||
timeline_events, new_room_key = ( | ||
await self.store.get_room_events_stream_for_room( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As an alternative, we could adapt paginate_room_events(...)
to conditionally use topological_ordering
vs stream_ordering
synapse/handlers/sync.py
Outdated
) | ||
# We want to return the events in ascending order (the last event is the | ||
# most recent). | ||
events = events.reverse() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we seem to need to reverse this everywhere it's used, we could just make it the default. To not create surprises, I tried to align with what paginate_room_events(...)
currently does.
synapse/handlers/sliding_sync.py
Outdated
# We add one so we can determine if there are enough events to saturate | ||
# the limit or not (see `limited`) | ||
limit=room_sync_config.timeline_limit + 1, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a possible question of, do we even want to do this?
Using stream_ordering
makes sense in terms of /sync
returning new items that the server received. it's also what the spec says we should do. We also make sure that if a new event is received, it's returned to the client. Otherwise, it seems like there is an edge case if we use topological_ordering
that a new event might not be returned because it's topological position is calculated to be older.
Using topological_ordering
makes sense in terms of matching /messages
so people don't have to worry about different orders of things between API's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the way that sync v2 works in synapse currently is:
- The first time a room is sent down
/sync
we usepaginate_room_events
to get a chunk of history for the client, this basically results in the same as if no history was sent down/messages
was used. - We use
get_room_events_stream_for_rooms
to get all updates for rooms since the last/sync
, which is in stream ordering.
I think this is probably the right way of looking at it, we use topological ordering when we paginate backwards, but stream ordering to fetch new updates to the room (even if those updates happened a while ago in the past).
What we probably want to do for sliding sync is to more or less replicate that behaviour: when we have an incremental fetch all updates (i.e. get_room_events_stream_for_rooms
for joined rooms and fetch all membership changes), and then only use paginate_room_events
for the new rooms.
(This also has the advantage we'd only need to sort and filter rooms we know have updates)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds pretty reasonable to me 👍
I've created a spec issue to track this: matrix-org/matrix-spec#1917
stream_ordering
based timeline paginatingstream_ordering
based timeline pagination
room_ids, from_key.stream | ||
) | ||
elif direction == Direction.BACKWARDS: | ||
if to_key is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to_key
is never none?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense but probably should just update the signature to what get_room_events_stream_for_room(...)
uses where it can be None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated and aligned ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this: we should enforce that to_key
is not none when we're going forwards, as we always need to have an upper bound when querying the events stream (as otherwise we'll potentially pick up events that are not fully persisted).
We could do this in the type system, but probably easier to doc and assert it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should do that in this PR (some future task) given that it looks like we don't actually do this in practice yet. For example /messages?dir=f
-> get_messages()
will call paginate_room_events_by_topological_ordering(from_key=xxx, to_key=None, Direction.FORWARDS)
. Making this change would require some more default tokens in various places that should have their own PR to think through.
Added some FIXME
comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh lord really? Bleurgh, then fair.
synapse/handlers/sliding_sync.py
Outdated
# We add one so we can determine if there are enough events to saturate | ||
# the limit or not (see `limited`) | ||
limit=room_sync_config.timeline_limit + 1, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so the way that sync v2 works in synapse currently is:
- The first time a room is sent down
/sync
we usepaginate_room_events
to get a chunk of history for the client, this basically results in the same as if no history was sent down/messages
was used. - We use
get_room_events_stream_for_rooms
to get all updates for rooms since the last/sync
, which is in stream ordering.
I think this is probably the right way of looking at it, we use topological ordering when we paginate backwards, but stream ordering to fetch new updates to the room (even if those updates happened a while ago in the past).
What we probably want to do for sliding sync is to more or less replicate that behaviour: when we have an incremental fetch all updates (i.e. get_room_events_stream_for_rooms
for joined rooms and fetch all membership changes), and then only use paginate_room_events
for the new rooms.
(This also has the advantage we'd only need to sort and filter rooms we know have updates)
…gical_ordering(...)`
…s_by_stream_ordering(...)`
stream_ordering
based timeline paginationstream_ordering
based timeline pagination for incremental sync
…imeline-paginating
): | ||
# Token selection matches what we do in `_paginate_room_events_txn` if there | ||
# are no rows | ||
return [], to_key if to_key else from_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep these outside, as if we hit the code paths we will save getting a DB connection and switching threads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have it in both places? I moved it because we have plenty of places that use _paginate_room_events_by_topological_ordering_txn(...)
directly but don't have the checks outside
(currently updated with the checks in both places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, having in both is fine, though ideally we'd always try and catch it before we went on a DB thread
room_ids, from_key.stream | ||
) | ||
elif direction == Direction.BACKWARDS: | ||
if to_key is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about this: we should enforce that to_key
is not none when we're going forwards, as we always need to have an upper bound when querying the events stream (as otherwise we'll potentially pick up events that are not fully persisted).
We could do this in the type system, but probably easier to doc and assert it
…imeline-paginating Conflicts: synapse/storage/databases/main/stream.py
Thanks for the review @erikjohnston 🐓 |
# Synapse 1.114.0 (2024-09-02) This release enables support for [MSC4186](matrix-org/matrix-spec-proposals#4186) — Simplified Sliding Sync. This allows using the upcoming releases of the Element X mobile apps without having to run a Sliding Sync Proxy. ### Features - Enable native sliding sync support ([MSC3575](matrix-org/matrix-spec-proposals#3575) and [MSC4186](matrix-org/matrix-spec-proposals#4186)) by default. ([\#17648](element-hq/synapse#17648)) # Synapse 1.114.0rc3 (2024-08-30) ### Bugfixes - Fix regression in v1.114.0rc2 that caused workers to fail to start. ([\#17626](element-hq/synapse#17626)) # Synapse 1.114.0rc2 (2024-08-30) ### Features - Improve cross-signing upload when using [MSC3861](matrix-org/matrix-spec-proposals#3861) to use a custom UIA flow stage, with web fallback support. ([\#17509](element-hq/synapse#17509)) - Make `hash_password` script accept password input from stdin. ([\#17608](element-hq/synapse#17608)) ### Bugfixes - Fix hierarchy returning 403 when room is accessible through federation. Contributed by Krishan (@kfiven). ([\#17194](element-hq/synapse#17194)) - Fix content-length on federation `/thumbnail` responses. ([\#17532](element-hq/synapse#17532)) - Fix authenticated media responses using a wrong limit when following redirects over federation. ([\#17543](element-hq/synapse#17543)) ### Internal Changes - MSC3861: load the issuer and account management URLs from OIDC discovery. ([\#17407](element-hq/synapse#17407)) - Refactor sliding sync class into multiple files. ([\#17595](element-hq/synapse#17595)) - Store sliding sync per-connection state in the database. ([\#17599](element-hq/synapse#17599)) - Make the sliding sync `PerConnectionState` class immutable. ([\#17600](element-hq/synapse#17600)) - Add support to `@tag_args` for standalone functions. ([\#17604](element-hq/synapse#17604)) - Speed up incremental syncs in sliding sync by adding some more caching. ([\#17606](element-hq/synapse#17606)) - Always return the user's own read receipts in sliding sync. ([\#17617](element-hq/synapse#17617)) - Replace `isort` and `black` with `ruff`. ([\#17620](element-hq/synapse#17620)) - Refactor sliding sync code to move room list logic out into a separate class. ([\#17622](element-hq/synapse#17622)) ### Updates to locked dependencies * Bump attrs from 23.2.0 to 24.2.0. ([\#17609](element-hq/synapse#17609)) * Bump cryptography from 42.0.8 to 43.0.0. ([\#17584](element-hq/synapse#17584)) * Bump phonenumbers from 8.13.43 to 8.13.44. ([\#17610](element-hq/synapse#17610)) * Bump pygithub from 2.3.0 to 2.4.0. ([\#17612](element-hq/synapse#17612)) * Bump pyyaml from 6.0.1 to 6.0.2. ([\#17611](element-hq/synapse#17611)) * Bump sentry-sdk from 2.12.0 to 2.13.0. ([\#17585](element-hq/synapse#17585)) * Bump serde from 1.0.206 to 1.0.208. ([\#17581](element-hq/synapse#17581)) * Bump serde from 1.0.208 to 1.0.209. ([\#17613](element-hq/synapse#17613)) * Bump serde_json from 1.0.124 to 1.0.125. ([\#17582](element-hq/synapse#17582)) * Bump serde_json from 1.0.125 to 1.0.127. ([\#17614](element-hq/synapse#17614)) * Bump types-jsonschema from 4.23.0.20240712 to 4.23.0.20240813. ([\#17583](element-hq/synapse#17583)) * Bump types-setuptools from 71.1.0.20240726 to 71.1.0.20240818. ([\#17586](element-hq/synapse#17586)) # Synapse 1.114.0rc1 (2024-08-20) ### Features - Add a flag to `/versions`, `org.matrix.simplified_msc3575`, to indicate whether experimental sliding sync support has been enabled. ([\#17571](element-hq/synapse#17571)) - Handle changes in `timeline_limit` in experimental sliding sync. ([\#17579](element-hq/synapse#17579)) - Correctly track read receipts that should be sent down in experimental sliding sync. ([\#17575](element-hq/synapse#17575), [\#17589](element-hq/synapse#17589), [\#17592](element-hq/synapse#17592)) ### Bugfixes - Start handlers for new media endpoints when media resource configured. ([\#17483](element-hq/synapse#17483)) - Fix timeline ordering (using `stream_ordering` instead of topological ordering) in experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17510](element-hq/synapse#17510)) - Fix experimental sliding sync implementation to remember any updates in rooms that were not sent down immediately. ([\#17535](element-hq/synapse#17535)) - Better exclude partially stated rooms if we must await full state in experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17538](element-hq/synapse#17538)) - Handle lower-case http headers in `_Mulitpart_Parser_Protocol`. ([\#17545](element-hq/synapse#17545)) - Fix fetching federation signing keys from servers that omit `old_verify_keys`. Contributed by @tulir @ Beeper. ([\#17568](element-hq/synapse#17568)) - Fix bug where we would respond with an error when a remote server asked for media that had a length of 0, using the new multipart federation media endpoint. ([\#17570](element-hq/synapse#17570)) ### Improved Documentation - Clarify default behaviour of the [`auto_accept_invites.worker_to_run_on`](https://element-hq.github.io/synapse/develop/usage/configuration/config_documentation.html#auto-accept-invites) option. ([\#17515](element-hq/synapse#17515)) - Improve docstrings for profile methods. ([\#17559](element-hq/synapse#17559)) ### Internal Changes - Add more tracing to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17514](element-hq/synapse#17514)) - Fixup comment in sliding sync implementation. ([\#17531](element-hq/synapse#17531)) - Replace override of deprecated method `HTTPAdapter.get_connection` with `get_connection_with_tls_context`. ([\#17536](element-hq/synapse#17536)) - Fix performance of device lists in `/key/changes` and sliding sync. ([\#17537](element-hq/synapse#17537), [\#17548](element-hq/synapse#17548)) - Bump setuptools from 67.6.0 to 72.1.0. ([\#17542](element-hq/synapse#17542)) - Add a utility function for generating random event IDs. ([\#17557](element-hq/synapse#17557)) - Speed up responding to media requests. ([\#17558](element-hq/synapse#17558), [\#17561](element-hq/synapse#17561), [\#17564](element-hq/synapse#17564), [\#17566](element-hq/synapse#17566), [\#17567](element-hq/synapse#17567), [\#17569](element-hq/synapse#17569)) - Test github token before running release script steps. ([\#17562](element-hq/synapse#17562)) - Reduce log spam of multipart files. ([\#17563](element-hq/synapse#17563)) - Refactor per-connection state in experimental sliding sync handler. ([\#17574](element-hq/synapse#17574)) - Add histogram metrics for sliding sync processing time. ([\#17593](element-hq/synapse#17593)) ### Updates to locked dependencies * Bump bytes from 1.6.1 to 1.7.1. ([\#17526](element-hq/synapse#17526)) * Bump lxml from 5.2.2 to 5.3.0. ([\#17550](element-hq/synapse#17550)) * Bump phonenumbers from 8.13.42 to 8.13.43. ([\#17551](element-hq/synapse#17551)) * Bump regex from 1.10.5 to 1.10.6. ([\#17527](element-hq/synapse#17527)) * Bump sentry-sdk from 2.10.0 to 2.12.0. ([\#17553](element-hq/synapse#17553)) * Bump serde from 1.0.204 to 1.0.206. ([\#17556](element-hq/synapse#17556)) * Bump serde_json from 1.0.122 to 1.0.124. ([\#17555](element-hq/synapse#17555)) * Bump sigstore/cosign-installer from 3.5.0 to 3.6.0. ([\#17549](element-hq/synapse#17549)) * Bump types-pyyaml from 6.0.12.20240311 to 6.0.12.20240808. ([\#17552](element-hq/synapse#17552)) * Bump types-requests from 2.31.0.20240406 to 2.32.0.20240712. ([\#17524](element-hq/synapse#17524)) # Synapse 1.113.0 (2024-08-13) No significant changes since 1.113.0rc1. # Synapse 1.113.0rc1 (2024-08-06) ### Features - Track which rooms have been sent to clients in the experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17447](element-hq/synapse#17447)) - Add Account Data extension support to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17477](element-hq/synapse#17477)) - Add receipts extension support to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17489](element-hq/synapse#17489)) - Add typing notification extension support to experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint. ([\#17505](element-hq/synapse#17505)) ### Bugfixes - Update experimental [MSC3575](matrix-org/matrix-spec-proposals#3575) Sliding Sync `/sync` endpoint to handle invite/knock rooms when filtering. ([\#17450](element-hq/synapse#17450)) - Fix a bug introduced in v1.110.0 which caused `/keys/query` to return incomplete results, leading to high network activity and CPU usage on Matrix clients. ([\#17499](element-hq/synapse#17499)) ### Improved Documentation - Update the [`allowed_local_3pids`](https://element-hq.github.io/synapse/v1.112/usage/configuration/config_documentation.html#allowed_local_3pids) config option's msisdn address to a working example. ([\#17476](element-hq/synapse#17476)) ### Internal Changes - Change sliding sync to use their own token format in preparation for storing per-connection state. ([\#17452](element-hq/synapse#17452)) - Ensure we don't send down negative `bump_stamp` in experimental sliding sync endpoint. ([\#17478](element-hq/synapse#17478)) - Do not send down empty room entries down experimental sliding sync endpoint. ([\#17479](element-hq/synapse#17479)) - Refactor Sliding Sync tests to better utilize the `SlidingSyncBase`. ([\#17481](element-hq/synapse#17481), [\#17482](element-hq/synapse#17482)) - Add some opentracing tags and logging to the experimental sliding sync implementation. ([\#17501](element-hq/synapse#17501)) - Split and move Sliding Sync tests so we have some more sane test file sizes. ([\#17504](element-hq/synapse#17504)) - Update the `limited` field description in the Sliding Sync response to accurately describe what it actually represents. ([\#17507](element-hq/synapse#17507)) - Easier to understand `timeline` assertions in Sliding Sync tests. ([\#17511](element-hq/synapse#17511)) - Reset the sliding sync connection if we don't recognize the per-connection state position. ([\#17529](element-hq/synapse#17529)) ### Updates to locked dependencies * Bump bcrypt from 4.1.3 to 4.2.0. ([\#17495](element-hq/synapse#17495)) * Bump black from 24.4.2 to 24.8.0. ([\#17522](element-hq/synapse#17522)) * Bump phonenumbers from 8.13.39 to 8.13.42. ([\#17521](element-hq/synapse#17521)) * Bump ruff from 0.5.4 to 0.5.5. ([\#17494](element-hq/synapse#17494)) * Bump serde_json from 1.0.120 to 1.0.121. ([\#17493](element-hq/synapse#17493)) * Bump serde_json from 1.0.121 to 1.0.122. ([\#17525](element-hq/synapse#17525)) * Bump towncrier from 23.11.0 to 24.7.1. ([\#17523](element-hq/synapse#17523)) * Bump types-pyopenssl from 24.1.0.20240425 to 24.1.0.20240722. ([\#17496](element-hq/synapse#17496)) * Bump types-setuptools from 70.1.0.20240627 to 71.1.0.20240726. ([\#17497](element-hq/synapse#17497))
Use
stream_ordering
basedtimeline
pagination for incremental/sync
in Sliding Sync. Previously, we were always using atopological_ordering
but we should only be using that for historical scenarios (initial/sync
, newly joined, or haven't sent the room down the connection before).This is slightly different than what the spec suggests
But we've had a discussion below in this PR and this matches what Sync v2 already does and seems like it makes sense. Created a spec issue matrix-org/matrix-spec#1917 to clarify this.
Related issues:
/sync
matrix-org/matrix-spec#1917/messages
to use stream ordering for events matrix-org/matrix-spec#852Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)