Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Faster joins: use initial list of servers if we don't have the full state yet #14408

Merged
merged 8 commits into from
Nov 24, 2022

Conversation

MatMaul
Copy link
Contributor

@MatMaul MatMaul commented Nov 10, 2022

Part of #14058.

Signed-off-by: Mathieu Velten mathieuv@matrix.org

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

@MatMaul MatMaul force-pushed the mv/partial-join-fed-send branch from 78114f9 to 8cc46ad Compare November 10, 2022 17:08
Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
@MatMaul MatMaul self-assigned this Nov 10, 2022
@MatMaul
Copy link
Contributor Author

MatMaul commented Nov 10, 2022

Also, is it worth adding a cache or is it ok ?

My point of view is that it should be ok to take the perf hit of not caching since it's a transient state that shouldn't last.

@MatMaul MatMaul marked this pull request as ready for review November 10, 2022 19:59
@MatMaul MatMaul requested a review from a team as a code owner November 10, 2022 19:59
synapse/federation/sender/__init__.py Outdated Show resolved Hide resolved
synapse/federation/sender/__init__.py Outdated Show resolved Hide resolved
changelog.d/14408.misc Outdated Show resolved Hide resolved
@squahtx
Copy link
Contributor

squahtx commented Nov 11, 2022

Also, is it worth adding a cache or is it ok ?

My point of view is that it should be ok to take the perf hit of not caching since it's a transient state that shouldn't last.

I think it's okay. We don't expect to stay in the partial state for long, as you rightly point out.

Mathieu Velten and others added 7 commits November 14, 2022 14:06
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
This reverts commit 87d1c1d.
@MatMaul MatMaul requested a review from squahtx November 14, 2022 14:48
Comment on lines +439 to +451
# During partial join we use the set of servers that we got
# when beginning the join. It's still possible that we send
# events to servers that left the room in the meantime, but
# we consider that an acceptable risk since it is only our own
# events that we leak and not other server's ones.
partial_state_destinations = (
await self.store.get_partial_state_servers_at_join(
event.room_id
)
)

if len(partial_state_destinations) > 0:
destinations = partial_state_destinations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that we won't send events to servers that join after us.

self._storage_controllers.state.get_current_hosts_in_room_or_partial_state_approximation(room_id) returns a list that does include those servers (apologies for the naming), but it never returns an empty list.
We may want to go back to the original structure (elif await self.store.is_partial_state_room(event.room_id):) but use that method instead.

Sorry for not catching this earlier!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose there's a question here: how much are we happy to trust this approximation? In prior discussions I gathered that we were just going to stick with the original list handed out over /send_join — is that no longer the case?
I guess the client will see the memberships for the 'extras', even if we end up rejecting them later on, it won't ever send events to servers that the user couldn't see, so maybe it's OK?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm missing / have forgotten context from prior discussions. If we've decided to go with only the original list, then please ignore the comment!

It's hard to say what memberships an end user might actually see, given that they'll be doing lazy loading syncs, which only provide memberships when a user sends a message. I think the member list in Element Web might not work until the room is fully synced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't merged that yet because I am unsure.

To my understanding, the approximation is initial list + updates from membership events that we received after.
There is a possibility that we can't validate those events out of missing state, hence I think we probably want to not use the approximation ?
We also can't be sure of the initial servers_in_room but at least we trust only one server there.

In the grand scheme of things it's not a big trouble because we potentially only leak our own events.

For now I am leaning towards merging it like that.

@MatMaul MatMaul merged commit 39cde58 into develop Nov 24, 2022
@MatMaul MatMaul deleted the mv/partial-join-fed-send branch November 24, 2022 17:09
H-Shay pushed a commit that referenced this pull request Dec 13, 2022
…tate yet (#14408)


Signed-off-by: Mathieu Velten <mathieuv@matrix.org>
Co-authored-by: Sean Quah <8349537+squahtx@users.noreply.github.com>
Fizzadar added a commit to beeper/synapse-legacy-fork that referenced this pull request Dec 15, 2022
Synapse 1.73.0 (2022-12-06)
===========================

Please note that legacy Prometheus metric names have been removed in this release; see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.73/docs/upgrade.md#legacy-prometheus-metric-names-have-now-been-removed) for more details.

No significant changes since 1.73.0rc2.

Synapse 1.73.0rc2 (2022-12-01)
==============================

Bugfixes
--------

- Fix a regression in Synapse 1.73.0rc1 where Synapse's main process would stop responding to HTTP requests when a user with a large number of devices logs in. ([\matrix-org#14582](matrix-org#14582))

Synapse 1.73.0rc1 (2022-11-29)
==============================

Features
--------

- Speed-up `/messages` with `filter_events_for_client` optimizations. ([\matrix-org#14527](matrix-org#14527))
- Improve DB performance by reducing amount of data that gets read in `device_lists_changes_in_room`. ([\matrix-org#14534](matrix-org#14534))
- Adds support for handling avatar in SSO OIDC login. Contributed by @ashfame. ([\matrix-org#13917](matrix-org#13917))
- Move MSC3030 `/timestamp_to_event` endpoints to stable `v1` location (`/_matrix/client/v1/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>`, `/_matrix/federation/v1/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>`). ([\matrix-org#14471](matrix-org#14471))
- Reduce database load of [Client-Server endpoints](https://spec.matrix.org/v1.5/client-server-api/#aggregations) which return bundled aggregations. ([\matrix-org#14491](matrix-org#14491), [\matrix-org#14508](matrix-org#14508), [\matrix-org#14510](matrix-org#14510))
- Add unstable support for an Extensible Events room version (`org.matrix.msc1767.10`) via [MSC1767](matrix-org/matrix-spec-proposals#1767), [MSC3931](matrix-org/matrix-spec-proposals#3931), [MSC3932](matrix-org/matrix-spec-proposals#3932), and [MSC3933](matrix-org/matrix-spec-proposals#3933). ([\matrix-org#14520](matrix-org#14520), [\matrix-org#14521](matrix-org#14521), [\matrix-org#14524](matrix-org#14524))
- Prune user's old devices on login if they have too many. ([\matrix-org#14038](matrix-org#14038), [\matrix-org#14580](matrix-org#14580))

Bugfixes
--------

- Fix a long-standing bug where paginating from the start of a room did not work. Contributed by @gnunicorn. ([\matrix-org#14149](matrix-org#14149))
- Fix a bug introduced in Synapse 1.58.0 where a user with presence state `org.matrix.msc3026.busy` would mistakenly be set to `online` when calling `/sync` or `/events` on a worker process. ([\matrix-org#14393](matrix-org#14393))
- Fix a bug introduced in Synapse 1.70.0 where a receipt's thread ID was not sent over federation. ([\matrix-org#14466](matrix-org#14466))
- Fix a long-standing bug where the [List media admin API](https://matrix-org.github.io/synapse/latest/admin_api/media_admin_api.html#list-all-media-in-a-room) would fail when processing an image with broken thumbnail information. ([\matrix-org#14537](matrix-org#14537))
- Fix a bug introduced in Synapse 1.67.0 where two logging context warnings would be logged on startup. ([\matrix-org#14574](matrix-org#14574))
- In application service transactions that include the experimental `org.matrix.msc3202.device_one_time_key_counts` key, include a duplicate key of `org.matrix.msc3202.device_one_time_keys_count` to match the name proposed by [MSC3202](matrix-org/matrix-spec-proposals#3202). ([\matrix-org#14565](matrix-org#14565))
- Fix a bug introduced in Synapse 0.9 where Synapse would fail to fetch server keys whose IDs contain a forward slash. ([\matrix-org#14490](matrix-org#14490))

Improved Documentation
----------------------

- Fixed link to 'Synapse administration endpoints'. ([\matrix-org#14499](matrix-org#14499))

Deprecations and Removals
-------------------------

- Remove legacy Prometheus metrics names. They were deprecated in Synapse v1.69.0 and disabled by default in Synapse v1.71.0. ([\matrix-org#14538](matrix-org#14538))

Internal Changes
----------------

- Improve type hinting throughout Synapse. ([\matrix-org#14055](matrix-org#14055), [\matrix-org#14412](matrix-org#14412), [\matrix-org#14529](matrix-org#14529), [\matrix-org#14452](matrix-org#14452)).
- Remove old stream ID tracking code. Contributed by Nick @beeper (@Fizzadar). ([\matrix-org#14376](matrix-org#14376), [\matrix-org#14468](matrix-org#14468))
- Remove the `worker_main_http_uri` configuration setting. This is now handled via internal replication. ([\matrix-org#14400](matrix-org#14400), [\matrix-org#14476](matrix-org#14476))
- Refactor `federation_sender` and `pusher` configuration loading. ([\matrix-org#14496](matrix-org#14496))
([\matrix-org#14509](matrix-org#14509), [\matrix-org#14573](matrix-org#14573))
- Faster joins: do not wait for full state when creating events to send. ([\matrix-org#14403](matrix-org#14403))
- Faster joins: filter out non local events when a room doesn't have its full state. ([\matrix-org#14404](matrix-org#14404))
- Faster joins: send events to initial list of servers if we don't have the full state yet. ([\matrix-org#14408](matrix-org#14408))
- Faster joins: use servers list approximation received during `send_join` (potentially updated with received membership events) in `assert_host_in_room`. ([\matrix-org#14515](matrix-org#14515))
- Fix type logic in TCP replication code that prevented correctly ignoring blank commands. ([\matrix-org#14449](matrix-org#14449))
- Remove option to skip locking of tables when performing emulated upserts, to avoid a class of bugs in future. ([\matrix-org#14469](matrix-org#14469))
- `scripts-dev/federation_client`: Fix routing on servers with `.well-known` files. ([\matrix-org#14479](matrix-org#14479))
- Reduce default third party invite rate limit to 216 invites per day. ([\matrix-org#14487](matrix-org#14487))
- Refactor conversion of device list changes in room to outbound pokes to track unconverted rows using a `(stream ID, room ID)` position instead of updating the `converted_to_destinations` flag on every row. ([\matrix-org#14516](matrix-org#14516))
- Add more prompts to the bug report form. ([\matrix-org#14522](matrix-org#14522))
- Extend editorconfig rules on indent and line length to `.pyi` files. ([\matrix-org#14526](matrix-org#14526))
- Run Rust CI when `Cargo.lock` changes. This is particularly useful for dependabot updates. ([\matrix-org#14571](matrix-org#14571))
- Fix a possible variable shadow in `create_new_client_event`. ([\matrix-org#14575](matrix-org#14575))
- Bump various dependencies in the `poetry.lock` file and in CI scripts. ([\matrix-org#14557](matrix-org#14557), [\matrix-org#14559](matrix-org#14559), [\matrix-org#14560](matrix-org#14560), [\matrix-org#14500](matrix-org#14500), [\matrix-org#14501](matrix-org#14501), [\matrix-org#14502](matrix-org#14502), [\matrix-org#14503](matrix-org#14503), [\matrix-org#14504](matrix-org#14504), [\matrix-org#14505](matrix-org#14505)).

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCgAdFiEE8SRSDO7gYkSP4chELS76LzL74EcFAmOPLnYACgkQLS76LzL7
# 4Edwpg/+KXpg2ZdiJ0Yaly9VHVeiqdHRi5D7WPS6n8YBsdRx9EQHzOBkD5HAW8hE
# oz0c+zDS01ORlEWD825NYXjgaE1ijtZFvGxsftYTVuTYlVRR2m+r9jhDv9pVHT53
# TKtQVKpG0IUsuyukRBrweDcEeO0MA0nGpvaaQUhmftzWgy4yD3AjZyIgx0Ckg8pg
# OwgrzGqA7FQs4MEeOxmk1H39fZg4dlo4nmI4whvAodgaGeS9sU8t+3Qj4PVod8v/
# AkVesJcruaTHuVMb+Xp8JKezb09SsIR94gmHalC5sL+41+6XAy9BtQ/cRDfCReG3
# U1I1x1h1+EQjTP6XzMmjQHLbfI2gUJBC4I2p3e2gZ4cMm9rVz94R1dBiRk8ZgRIC
# cJFD9BvaAtb2PSTvyFBoHsrrn/u12i8fYFWu4Z4rO6dOGI83dZHeZzVw4UsVeqIK
# 5+njQwcwQsrwL3AKLjbbdqmbmhXcF6LchIK2L+NuuvdiOfvXvkO0bdjBryVEbMqB
# IOtAAWzwYaoUwVucMbBtXt/EqQS7biGkbDxsL8CDvaBwM/JSsUWXBafsV1FmxF2A
# q6KAeKpfelefoegosTYD0Md+l39xdF8Z19XaKV3GeHZEY+HE3RJXJm+Pa8SJ+IF8
# Y1od9cB/H+fYSsWCWj1OJNqTIAozh6f1Pe2nFuFDxdBwABXc/pg=
# =IBEL
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue Dec  6 11:58:46 2022 GMT
# gpg:                using RSA key F124520CEEE062448FE1C8442D2EFA2F32FBE047
# gpg: Can't check signature: No public key

# Conflicts:
#	poetry.lock
#	synapse/push/bulk_push_rule_evaluator.py
#	synapse/storage/databases/main/account_data.py
#	synapse/storage/databases/main/receipts.py
realtyem added a commit to realtyem/synapse-unraid that referenced this pull request Dec 18, 2022
Synapse 1.73.0 (2022-12-06)
===========================

Please note that legacy Prometheus metric names have been removed in this release; see [the upgrade notes](https://github.com/matrix-org/synapse/blob/release-v1.73/docs/upgrade.md#legacy-prometheus-metric-names-have-now-been-removed) for more details.

No significant changes since 1.73.0rc2.

Synapse 1.73.0rc2 (2022-12-01)
==============================

Bugfixes
--------

- Fix a regression in Synapse 1.73.0rc1 where Synapse's main process would stop responding to HTTP requests when a user with a large number of devices logs in. ([\#14582](matrix-org/synapse#14582))

Synapse 1.73.0rc1 (2022-11-29)
==============================

Features
--------

- Speed-up `/messages` with `filter_events_for_client` optimizations. ([\#14527](matrix-org/synapse#14527))
- Improve DB performance by reducing amount of data that gets read in `device_lists_changes_in_room`. ([\#14534](matrix-org/synapse#14534))
- Adds support for handling avatar in SSO OIDC login. Contributed by @ashfame. ([\#13917](matrix-org/synapse#13917))
- Move MSC3030 `/timestamp_to_event` endpoints to stable `v1` location (`/_matrix/client/v1/rooms/<roomID>/timestamp_to_event?ts=<timestamp>&dir=<direction>`, `/_matrix/federation/v1/timestamp_to_event/<roomID>?ts=<timestamp>&dir=<direction>`). ([\#14471](matrix-org/synapse#14471))
- Reduce database load of [Client-Server endpoints](https://spec.matrix.org/v1.5/client-server-api/#aggregations) which return bundled aggregations. ([\#14491](matrix-org/synapse#14491), [\#14508](matrix-org/synapse#14508), [\#14510](matrix-org/synapse#14510))
- Add unstable support for an Extensible Events room version (`org.matrix.msc1767.10`) via [MSC1767](matrix-org/matrix-spec-proposals#1767), [MSC3931](matrix-org/matrix-spec-proposals#3931), [MSC3932](matrix-org/matrix-spec-proposals#3932), and [MSC3933](matrix-org/matrix-spec-proposals#3933). ([\#14520](matrix-org/synapse#14520), [\#14521](matrix-org/synapse#14521), [\#14524](matrix-org/synapse#14524))
- Prune user's old devices on login if they have too many. ([\#14038](matrix-org/synapse#14038), [\#14580](matrix-org/synapse#14580))

Bugfixes
--------

- Fix a long-standing bug where paginating from the start of a room did not work. Contributed by @gnunicorn. ([\#14149](matrix-org/synapse#14149))
- Fix a bug introduced in Synapse 1.58.0 where a user with presence state `org.matrix.msc3026.busy` would mistakenly be set to `online` when calling `/sync` or `/events` on a worker process. ([\#14393](matrix-org/synapse#14393))
- Fix a bug introduced in Synapse 1.70.0 where a receipt's thread ID was not sent over federation. ([\#14466](matrix-org/synapse#14466))
- Fix a long-standing bug where the [List media admin API](https://matrix-org.github.io/synapse/latest/admin_api/media_admin_api.html#list-all-media-in-a-room) would fail when processing an image with broken thumbnail information. ([\#14537](matrix-org/synapse#14537))
- Fix a bug introduced in Synapse 1.67.0 where two logging context warnings would be logged on startup. ([\#14574](matrix-org/synapse#14574))
- In application service transactions that include the experimental `org.matrix.msc3202.device_one_time_key_counts` key, include a duplicate key of `org.matrix.msc3202.device_one_time_keys_count` to match the name proposed by [MSC3202](matrix-org/matrix-spec-proposals#3202). ([\#14565](matrix-org/synapse#14565))
- Fix a bug introduced in Synapse 0.9 where Synapse would fail to fetch server keys whose IDs contain a forward slash. ([\#14490](matrix-org/synapse#14490))

Improved Documentation
----------------------

- Fixed link to 'Synapse administration endpoints'. ([\#14499](matrix-org/synapse#14499))

Deprecations and Removals
-------------------------

- Remove legacy Prometheus metrics names. They were deprecated in Synapse v1.69.0 and disabled by default in Synapse v1.71.0. ([\#14538](matrix-org/synapse#14538))

Internal Changes
----------------

- Improve type hinting throughout Synapse. ([\#14055](matrix-org/synapse#14055), [\#14412](matrix-org/synapse#14412), [\#14529](matrix-org/synapse#14529), [\#14452](matrix-org/synapse#14452)).
- Remove old stream ID tracking code. Contributed by Nick @beeper (@Fizzadar). ([\#14376](matrix-org/synapse#14376), [\#14468](matrix-org/synapse#14468))
- Remove the `worker_main_http_uri` configuration setting. This is now handled via internal replication. ([\#14400](matrix-org/synapse#14400), [\#14476](matrix-org/synapse#14476))
- Refactor `federation_sender` and `pusher` configuration loading. ([\#14496](matrix-org/synapse#14496))
([\#14509](matrix-org/synapse#14509), [\#14573](matrix-org/synapse#14573))
- Faster joins: do not wait for full state when creating events to send. ([\#14403](matrix-org/synapse#14403))
- Faster joins: filter out non local events when a room doesn't have its full state. ([\#14404](matrix-org/synapse#14404))
- Faster joins: send events to initial list of servers if we don't have the full state yet. ([\#14408](matrix-org/synapse#14408))
- Faster joins: use servers list approximation received during `send_join` (potentially updated with received membership events) in `assert_host_in_room`. ([\#14515](matrix-org/synapse#14515))
- Fix type logic in TCP replication code that prevented correctly ignoring blank commands. ([\#14449](matrix-org/synapse#14449))
- Remove option to skip locking of tables when performing emulated upserts, to avoid a class of bugs in future. ([\#14469](matrix-org/synapse#14469))
- `scripts-dev/federation_client`: Fix routing on servers with `.well-known` files. ([\#14479](matrix-org/synapse#14479))
- Reduce default third party invite rate limit to 216 invites per day. ([\#14487](matrix-org/synapse#14487))
- Refactor conversion of device list changes in room to outbound pokes to track unconverted rows using a `(stream ID, room ID)` position instead of updating the `converted_to_destinations` flag on every row. ([\#14516](matrix-org/synapse#14516))
- Add more prompts to the bug report form. ([\#14522](matrix-org/synapse#14522))
- Extend editorconfig rules on indent and line length to `.pyi` files. ([\#14526](matrix-org/synapse#14526))
- Run Rust CI when `Cargo.lock` changes. This is particularly useful for dependabot updates. ([\#14571](matrix-org/synapse#14571))
- Fix a possible variable shadow in `create_new_client_event`. ([\#14575](matrix-org/synapse#14575))
- Bump various dependencies in the `poetry.lock` file and in CI scripts. ([\#14557](matrix-org/synapse#14557), [\#14559](matrix-org/synapse#14559), [\#14560](matrix-org/synapse#14560), [\#14500](matrix-org/synapse#14500), [\#14501](matrix-org/synapse#14501), [\#14502](matrix-org/synapse#14502), [\#14503](matrix-org/synapse#14503), [\#14504](matrix-org/synapse#14504), [\#14505](matrix-org/synapse#14505)).
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants