-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Prevent account_data content from being sent over TCP replication #6333
Conversation
To be honest I'd be tempted to only send the Throwing huge JSON blobs to workers is probably not going to help the performance any. |
Not a bad point. I'll check what's involved in that. |
Looks like: synapse/synapse/replication/tcp/streams/_base.py Lines 169 to 173 in 998f7fe
is where the data is taken out of the database. This seems like it's fundamental to a lot of things that get sent over replication, and might be kinda scary to change. Will investigate more tomorrow. |
78d4067
to
c4bdf2d
Compare
…_account_data_sync * 'develop' of github.com:matrix-org/synapse: Blacklist PurgeRoomTestCase (#6361) Set room version default to 5
Synapse 1.7.0rc1 (2019-12-09) ============================= Features -------- - Implement per-room message retention policies. ([\#5815](#5815), [\#6436](#6436)) - Add etag and count fields to key backup endpoints to help clients guess if there are new keys. ([\#5858](#5858)) - Add `/admin/v2/users` endpoint with pagination. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#5925](#5925)) - Require User-Interactive Authentication for `/account/3pid/add`, meaning the user's password will be required to add a third-party ID to their account. ([\#6119](#6119)) - Implement the `/_matrix/federation/unstable/net.atleastfornow/state/<context>` API as drafted in MSC2314. ([\#6176](#6176)) - Configure privacy-preserving settings by default for the room directory. ([\#6354](#6354)) - Add ephemeral messages support by partially implementing [MSC2228](matrix-org/matrix-spec-proposals#2228). ([\#6409](#6409)) - Add support for [MSC 2367](matrix-org/matrix-spec-proposals#2367), which allows specifying a reason on all membership events. ([\#6434](#6434)) Bugfixes -------- - Transfer non-standard power levels on room upgrade. ([\#6237](#6237)) - Fix error from the Pillow library when uploading RGBA images. ([\#6241](#6241)) - Correctly apply the event filter to the `state`, `events_before` and `events_after` fields in the response to `/context` requests. ([\#6329](#6329)) - Fix caching devices for remote users when using workers, so that we don't attempt to refetch (and potentially fail) each time a user requests devices. ([\#6332](#6332)) - Prevent account data syncs getting lost across TCP replication. ([\#6333](#6333)) - Fix bug: TypeError in `register_user()` while using LDAP auth module. ([\#6406](#6406)) - Fix an intermittent exception when handling read-receipts. ([\#6408](#6408)) - Fix broken guest registration when there are existing blocks of numeric user IDs. ([\#6420](#6420)) - Fix startup error when http proxy is defined. ([\#6421](#6421)) - Fix error when using synapse_port_db on a vanilla synapse db. ([\#6449](#6449)) - Fix uploading multiple cross signing signatures for the same user. ([\#6451](#6451)) - Fix bug which lead to exceptions being thrown in a loop when a cross-signed device is deleted. ([\#6462](#6462)) - Fix `synapse_port_db` not exiting with a 0 code if something went wrong during the port process. ([\#6470](#6470)) - Improve sanity-checking when receiving events over federation. ([\#6472](#6472)) - Fix inaccurate per-block Prometheus metrics. ([\#6491](#6491)) - Fix small performance regression for sending invites. ([\#6493](#6493)) - Back out cross-signing code added in Synapse 1.5.0, which caused a performance regression. ([\#6494](#6494)) Improved Documentation ---------------------- - Update documentation and variables in user contributed systemd reference file. ([\#6369](#6369), [\#6490](#6490)) - Fix link in the user directory documentation. ([\#6388](#6388)) - Add build instructions to the docker readme. ([\#6390](#6390)) - Switch Ubuntu package install recommendation to use python3 packages in INSTALL.md. ([\#6443](#6443)) - Write some docs for the quarantine_media api. ([\#6458](#6458)) - Convert CONTRIBUTING.rst to markdown (among other small fixes). ([\#6461](#6461)) Deprecations and Removals ------------------------- - Remove admin/v1/users_paginate endpoint. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#5925](#5925)) - Remove fallback for federation with old servers which lack the /federation/v1/state_ids API. ([\#6488](#6488)) Internal Changes ---------------- - Add benchmarks for structured logging and improve output performance. ([\#6266](#6266)) - Improve the performance of outputting structured logging. ([\#6322](#6322)) - Refactor some code in the event authentication path for clarity. ([\#6343](#6343), [\#6468](#6468), [\#6480](#6480)) - Clean up some unnecessary quotation marks around the codebase. ([\#6362](#6362)) - Complain on startup instead of 500'ing during runtime when `public_baseurl` isn't set when necessary. ([\#6379](#6379)) - Add a test scenario to make sure room history purges don't break `/messages` in the future. ([\#6392](#6392)) - Clarifications for the email configuration settings. ([\#6423](#6423)) - Add more tests to the blacklist when running in worker mode. ([\#6429](#6429)) - Refactor data store layer to support multiple databases in the future. ([\#6454](#6454), [\#6464](#6464), [\#6469](#6469), [\#6487](#6487)) - Port synapse.rest.client.v1 to async/await. ([\#6482](#6482)) - Port synapse.rest.client.v2_alpha to async/await. ([\#6483](#6483)) - Port SyncHandler to async/await. ([\#6484](#6484))
Synapse 1.7.0 (2019-12-13) ========================== This release changes the default settings so that only local authenticated users can query the server's room directory. See the [upgrade notes](UPGRADE.rst#upgrading-to-v170) for details. Support for SQLite versions before 3.11 is now deprecated. A future release will refuse to start if used with an SQLite version before 3.11. Administrators are reminded that SQLite should not be used for production instances. Instructions for migrating to Postgres are available [here](docs/postgres.md). A future release of synapse will, by default, disable federation for servers using SQLite. No significant changes since 1.7.0rc2. Synapse 1.7.0rc2 (2019-12-11) ============================= Bugfixes -------- - Fix incorrect error message for invalid requests when setting user's avatar URL. ([\#6497](#6497)) - Fix support for SQLite 3.7. ([\#6499](#6499)) - Fix regression where sending email push would not work when using a pusher worker. ([\#6507](#6507), [\#6509](#6509)) Synapse 1.7.0rc1 (2019-12-09) ============================= Features -------- - Implement per-room message retention policies. ([\#5815](#5815), [\#6436](#6436)) - Add etag and count fields to key backup endpoints to help clients guess if there are new keys. ([\#5858](#5858)) - Add `/admin/v2/users` endpoint with pagination. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#5925](#5925)) - Require User-Interactive Authentication for `/account/3pid/add`, meaning the user's password will be required to add a third-party ID to their account. ([\#6119](#6119)) - Implement the `/_matrix/federation/unstable/net.atleastfornow/state/<context>` API as drafted in MSC2314. ([\#6176](#6176)) - Configure privacy-preserving settings by default for the room directory. ([\#6355](#6355)) - Add ephemeral messages support by partially implementing [MSC2228](matrix-org/matrix-spec-proposals#2228). ([\#6409](#6409)) - Add support for [MSC 2367](matrix-org/matrix-spec-proposals#2367), which allows specifying a reason on all membership events. ([\#6434](#6434)) Bugfixes -------- - Transfer non-standard power levels on room upgrade. ([\#6237](#6237)) - Fix error from the Pillow library when uploading RGBA images. ([\#6241](#6241)) - Correctly apply the event filter to the `state`, `events_before` and `events_after` fields in the response to `/context` requests. ([\#6329](#6329)) - Fix caching devices for remote users when using workers, so that we don't attempt to refetch (and potentially fail) each time a user requests devices. ([\#6332](#6332)) - Prevent account data syncs getting lost across TCP replication. ([\#6333](#6333)) - Fix bug: TypeError in `register_user()` while using LDAP auth module. ([\#6406](#6406)) - Fix an intermittent exception when handling read-receipts. ([\#6408](#6408)) - Fix broken guest registration when there are existing blocks of numeric user IDs. ([\#6420](#6420)) - Fix startup error when http proxy is defined. ([\#6421](#6421)) - Fix error when using synapse_port_db on a vanilla synapse db. ([\#6449](#6449)) - Fix uploading multiple cross signing signatures for the same user. ([\#6451](#6451)) - Fix bug which lead to exceptions being thrown in a loop when a cross-signed device is deleted. ([\#6462](#6462)) - Fix `synapse_port_db` not exiting with a 0 code if something went wrong during the port process. ([\#6470](#6470)) - Improve sanity-checking when receiving events over federation. ([\#6472](#6472)) - Fix inaccurate per-block Prometheus metrics. ([\#6491](#6491)) - Fix small performance regression for sending invites. ([\#6493](#6493)) - Back out cross-signing code added in Synapse 1.5.0, which caused a performance regression. ([\#6494](#6494)) Improved Documentation ---------------------- - Update documentation and variables in user contributed systemd reference file. ([\#6369](#6369), [\#6490](#6490)) - Fix link in the user directory documentation. ([\#6388](#6388)) - Add build instructions to the docker readme. ([\#6390](#6390)) - Switch Ubuntu package install recommendation to use python3 packages in INSTALL.md. ([\#6443](#6443)) - Write some docs for the quarantine_media api. ([\#6458](#6458)) - Convert CONTRIBUTING.rst to markdown (among other small fixes). ([\#6461](#6461)) Deprecations and Removals ------------------------- - Remove admin/v1/users_paginate endpoint. Contributed by Awesome Technologies Innovationslabor GmbH. ([\#5925](#5925)) - Remove fallback for federation with old servers which lack the /federation/v1/state_ids API. ([\#6488](#6488)) Internal Changes ---------------- - Add benchmarks for structured logging and improve output performance. ([\#6266](#6266)) - Improve the performance of outputting structured logging. ([\#6322](#6322)) - Refactor some code in the event authentication path for clarity. ([\#6343](#6343), [\#6468](#6468), [\#6480](#6480)) - Clean up some unnecessary quotation marks around the codebase. ([\#6362](#6362)) - Complain on startup instead of 500'ing during runtime when `public_baseurl` isn't set when necessary. ([\#6379](#6379)) - Add a test scenario to make sure room history purges don't break `/messages` in the future. ([\#6392](#6392)) - Clarifications for the email configuration settings. ([\#6423](#6423)) - Add more tests to the blacklist when running in worker mode. ([\#6429](#6429)) - Refactor data store layer to support multiple databases in the future. ([\#6454](#6454), [\#6464](#6464), [\#6469](#6469), [\#6487](#6487)) - Port synapse.rest.client.v1 to async/await. ([\#6482](#6482)) - Port synapse.rest.client.v2_alpha to async/await. ([\#6483](#6483)) - Port SyncHandler to async/await. ([\#6484](#6484))
There was an issue with changes to
account_data
not being reflected upon a/sync
request from a client, specifically changes tom.direct
. The client would send aPUT /.../m.direct
, but never receive the new version down/sync
.This seemed to only happen on servers running in worker mode for large user accounts. It turns out that
m.direct
for large user accounts is huge, literally a dictionary for every user_id and room_id for a direct chat that your account has.When a change is made on the master process, the updated account_data entry is replicated over to synchrotron workers, to tell them that they need to send down a new copy to clients when they ask for it. Unfortunately, the content of this replication was so large that twisted would deny it and drop the request:
This means the synchrotron would never get the update, and never know to send the updated account_data entry to clients.
Instead, since the account_data cache on the synchrotron is already invalidated through replication, we can simply stop sending the content of account_data.m.direct over instead. Replications will, the synchrotron will know to pull m.direct data out of the database a fresh, and everything works again.