From d19bccdbecfeee3e59666748db7fd971cd7978d2 Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Thu, 13 May 2021 14:37:20 -0400 Subject: [PATCH 01/52] Update SSO mapping providers documentation about unique IDs. (#9980) --- changelog.d/9980.doc | 1 + docs/sso_mapping_providers.md | 18 +++++++++++------- 2 files changed, 12 insertions(+), 7 deletions(-) create mode 100644 changelog.d/9980.doc diff --git a/changelog.d/9980.doc b/changelog.d/9980.doc new file mode 100644 index 000000000000..d30ed0601da6 --- /dev/null +++ b/changelog.d/9980.doc @@ -0,0 +1 @@ +Clarify documentation around SSO mapping providers generating unique IDs and localparts. diff --git a/docs/sso_mapping_providers.md b/docs/sso_mapping_providers.md index 50020d1a4ad1..6db2dc8be5b9 100644 --- a/docs/sso_mapping_providers.md +++ b/docs/sso_mapping_providers.md @@ -67,8 +67,8 @@ A custom mapping provider must specify the following methods: - Arguments: - `userinfo` - A `authlib.oidc.core.claims.UserInfo` object to extract user information from. - - This method must return a string, which is the unique identifier for the - user. Commonly the ``sub`` claim of the response. + - This method must return a string, which is the unique, immutable identifier + for the user. Commonly the `sub` claim of the response. * `map_user_attributes(self, userinfo, token, failures)` - This method must be async. - Arguments: @@ -87,7 +87,9 @@ A custom mapping provider must specify the following methods: `localpart` value, such as `john.doe1`. - Returns a dictionary with two keys: - `localpart`: A string, used to generate the Matrix ID. If this is - `None`, the user is prompted to pick their own username. + `None`, the user is prompted to pick their own username. This is only used + during a user's first login. Once a localpart has been associated with a + remote user ID (see `get_remote_user_id`) it cannot be updated. - `displayname`: An optional string, the display name for the user. * `get_extra_attributes(self, userinfo, token)` - This method must be async. @@ -153,8 +155,8 @@ A custom mapping provider must specify the following methods: information from. - `client_redirect_url` - A string, the URL that the client will be redirected to. - - This method must return a string, which is the unique identifier for the - user. Commonly the ``uid`` claim of the response. + - This method must return a string, which is the unique, immutable identifier + for the user. Commonly the `uid` claim of the response. * `saml_response_to_user_attributes(self, saml_response, failures, client_redirect_url)` - Arguments: - `saml_response` - A `saml2.response.AuthnResponse` object to extract user @@ -172,8 +174,10 @@ A custom mapping provider must specify the following methods: redirected to. - This method must return a dictionary, which will then be used by Synapse to build a new user. The following keys are allowed: - * `mxid_localpart` - The mxid localpart of the new user. If this is - `None`, the user is prompted to pick their own username. + * `mxid_localpart` - A string, the mxid localpart of the new user. If this is + `None`, the user is prompted to pick their own username. This is only used + during a user's first login. Once a localpart has been associated with a + remote user ID (see `get_remote_user_id`) it cannot be updated. * `displayname` - The displayname of the new user. If not provided, will default to the value of `mxid_localpart`. * `emails` - A list of emails for the new user. 
If not provided, will From 976216959b3a216a3403b953338f92852c45645c Mon Sep 17 00:00:00 2001 From: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com> Date: Fri, 14 May 2021 09:21:00 +0100 Subject: [PATCH 02/52] Update minimum supported version in postgres.md (#9988) --- changelog.d/9988.doc | 1 + docs/postgres.md | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 changelog.d/9988.doc diff --git a/changelog.d/9988.doc b/changelog.d/9988.doc new file mode 100644 index 000000000000..87cda376a678 --- /dev/null +++ b/changelog.d/9988.doc @@ -0,0 +1 @@ +Fix outdated minimum PostgreSQL version in postgres.md. diff --git a/docs/postgres.md b/docs/postgres.md index 680685d04ef4..b99fad8a6e24 100644 --- a/docs/postgres.md +++ b/docs/postgres.md @@ -1,6 +1,6 @@ # Using Postgres -Postgres version 9.5 or later is known to work. +Synapse supports PostgreSQL versions 9.6 or later. ## Install postgres client libraries From c14f99be461d8ac9a36ad548e8e463feeda6394c Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Fri, 14 May 2021 10:51:08 +0100 Subject: [PATCH 03/52] Support enabling opentracing by user (#9978) Add a config option which allows enabling opentracing by user id, eg for debugging requests made by a test user. --- changelog.d/9978.feature | 1 + docs/opentracing.md | 10 +++++----- docs/sample_config.yaml | 20 ++++++++++++++------ synapse/api/auth.py | 5 +++++ synapse/config/tracer.py | 37 +++++++++++++++++++++++++++++++------ 5 files changed, 56 insertions(+), 17 deletions(-) create mode 100644 changelog.d/9978.feature diff --git a/changelog.d/9978.feature b/changelog.d/9978.feature new file mode 100644 index 000000000000..851adb9f6ea8 --- /dev/null +++ b/changelog.d/9978.feature @@ -0,0 +1 @@ +Add a configuration option which allows enabling opentracing by user id. diff --git a/docs/opentracing.md b/docs/opentracing.md index 4c7a56a5d7f3..f91362f11221 100644 --- a/docs/opentracing.md +++ b/docs/opentracing.md @@ -42,17 +42,17 @@ To receive OpenTracing spans, start up a Jaeger server. This can be done using docker like so: ```sh -docker run -d --name jaeger +docker run -d --name jaeger \ -p 6831:6831/udp \ -p 6832:6832/udp \ -p 5778:5778 \ -p 16686:16686 \ -p 14268:14268 \ - jaegertracing/all-in-one:1.13 + jaegertracing/all-in-one:1 ``` Latest documentation is probably at - +https://www.jaegertracing.io/docs/latest/getting-started. ## Enable OpenTracing in Synapse @@ -62,7 +62,7 @@ as shown in the [sample config](./sample_config.yaml). For example: ```yaml opentracing: - tracer_enabled: true + enabled: true homeserver_whitelist: - "mytrustedhomeserver.org" - "*.myotherhomeservers.com" @@ -90,4 +90,4 @@ to two problems, namely: ## Configuring Jaeger Sampling strategies can be set as in this document: - +. diff --git a/docs/sample_config.yaml b/docs/sample_config.yaml index 67ad57b1aa9a..2952f2ba3201 100644 --- a/docs/sample_config.yaml +++ b/docs/sample_config.yaml @@ -2845,7 +2845,8 @@ opentracing: #enabled: true # The list of homeservers we wish to send and receive span contexts and span baggage. - # See docs/opentracing.rst + # See docs/opentracing.rst. + # # This is a list of regexes which are matched against the server_name of the # homeserver. # @@ -2854,19 +2855,26 @@ opentracing: #homeserver_whitelist: # - ".*" + # A list of the matrix IDs of users whose requests will always be traced, + # even if the tracing system would otherwise drop the traces due to + # probabilistic sampling. 
+ # + # By default, the list is empty. + # + #force_tracing_for_users: + # - "@user1:server_name" + # - "@user2:server_name" + # Jaeger can be configured to sample traces at different rates. # All configuration options provided by Jaeger can be set here. - # Jaeger's configuration mostly related to trace sampling which + # Jaeger's configuration is mostly related to trace sampling which # is documented here: - # https://www.jaegertracing.io/docs/1.13/sampling/. + # https://www.jaegertracing.io/docs/latest/sampling/. # #jaeger_config: # sampler: # type: const # param: 1 - - # Logging whether spans were started and reported - # # logging: # false diff --git a/synapse/api/auth.py b/synapse/api/auth.py index efc926d0941d..458306eba5ec 100644 --- a/synapse/api/auth.py +++ b/synapse/api/auth.py @@ -87,6 +87,7 @@ def __init__(self, hs: "HomeServer"): ) self._track_appservice_user_ips = hs.config.track_appservice_user_ips self._macaroon_secret_key = hs.config.macaroon_secret_key + self._force_tracing_for_users = hs.config.tracing.force_tracing_for_users async def check_from_context( self, room_version: str, event, context, do_sig_check=True @@ -208,6 +209,8 @@ async def get_user_by_req( opentracing.set_tag("authenticated_entity", user_id) opentracing.set_tag("user_id", user_id) opentracing.set_tag("appservice_id", app_service.id) + if user_id in self._force_tracing_for_users: + opentracing.set_tag(opentracing.tags.SAMPLING_PRIORITY, 1) return requester @@ -260,6 +263,8 @@ async def get_user_by_req( opentracing.set_tag("user_id", user_info.user_id) if device_id: opentracing.set_tag("device_id", device_id) + if user_info.token_owner in self._force_tracing_for_users: + opentracing.set_tag(opentracing.tags.SAMPLING_PRIORITY, 1) return requester except KeyError: diff --git a/synapse/config/tracer.py b/synapse/config/tracer.py index db22b5b19fb6..d0ea17261f87 100644 --- a/synapse/config/tracer.py +++ b/synapse/config/tracer.py @@ -12,6 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. +from typing import Set + from synapse.python_dependencies import DependencyException, check_requirements from ._base import Config, ConfigError @@ -32,6 +34,8 @@ def read_config(self, config, **kwargs): {"sampler": {"type": "const", "param": 1}, "logging": False}, ) + self.force_tracing_for_users: Set[str] = set() + if not self.opentracer_enabled: return @@ -48,6 +52,19 @@ def read_config(self, config, **kwargs): if not isinstance(self.opentracer_whitelist, list): raise ConfigError("Tracer homeserver_whitelist config is malformed") + force_tracing_for_users = opentracing_config.get("force_tracing_for_users", []) + if not isinstance(force_tracing_for_users, list): + raise ConfigError( + "Expected a list", ("opentracing", "force_tracing_for_users") + ) + for i, u in enumerate(force_tracing_for_users): + if not isinstance(u, str): + raise ConfigError( + "Expected a string", + ("opentracing", "force_tracing_for_users", f"index {i}"), + ) + self.force_tracing_for_users.add(u) + def generate_config_section(cls, **kwargs): return """\ ## Opentracing ## @@ -64,7 +81,8 @@ def generate_config_section(cls, **kwargs): #enabled: true # The list of homeservers we wish to send and receive span contexts and span baggage. - # See docs/opentracing.rst + # See docs/opentracing.rst. + # # This is a list of regexes which are matched against the server_name of the # homeserver. 
# @@ -73,19 +91,26 @@ def generate_config_section(cls, **kwargs): #homeserver_whitelist: # - ".*" + # A list of the matrix IDs of users whose requests will always be traced, + # even if the tracing system would otherwise drop the traces due to + # probabilistic sampling. + # + # By default, the list is empty. + # + #force_tracing_for_users: + # - "@user1:server_name" + # - "@user2:server_name" + # Jaeger can be configured to sample traces at different rates. # All configuration options provided by Jaeger can be set here. - # Jaeger's configuration mostly related to trace sampling which + # Jaeger's configuration is mostly related to trace sampling which # is documented here: - # https://www.jaegertracing.io/docs/1.13/sampling/. + # https://www.jaegertracing.io/docs/latest/sampling/. # #jaeger_config: # sampler: # type: const # param: 1 - - # Logging whether spans were started and reported - # # logging: # false """ From 498084228b89d30462df0a5adfcc737fdc21d314 Mon Sep 17 00:00:00 2001 From: Dan Callahan Date: Fri, 14 May 2021 10:58:46 +0100 Subject: [PATCH 04/52] Use Python's secrets module instead of random (#9984) Functionally identical, but more obviously cryptographically secure. ...Explicit is better than implicit? Avoids needing to know that SystemRandom() implies a CSPRNG, and complies with the big scary red box on the documentation for random: > Warning: > The pseudo-random generators of this module should not be used for > security purposes. For security or cryptographic uses, see the > secrets module. https://docs.python.org/3/library/random.html Signed-off-by: Dan Callahan --- changelog.d/9984.misc | 1 + synapse/util/stringutils.py | 19 +++++++++++-------- 2 files changed, 12 insertions(+), 8 deletions(-) create mode 100644 changelog.d/9984.misc diff --git a/changelog.d/9984.misc b/changelog.d/9984.misc new file mode 100644 index 000000000000..97bd747f2653 --- /dev/null +++ b/changelog.d/9984.misc @@ -0,0 +1 @@ +Simplify a few helper functions. diff --git a/synapse/util/stringutils.py b/synapse/util/stringutils.py index 4f25cd1d26ae..40cd51a8cac8 100644 --- a/synapse/util/stringutils.py +++ b/synapse/util/stringutils.py @@ -13,8 +13,8 @@ # See the License for the specific language governing permissions and # limitations under the License. import itertools -import random import re +import secrets import string from collections.abc import Iterable from typing import Optional, Tuple @@ -35,18 +35,21 @@ # MXC_REGEX = re.compile("^mxc://([^/]+)/([^/#?]+)$") -# random_string and random_string_with_symbols are used for a range of things, -# some cryptographically important, some less so. We use SystemRandom to make sure -# we get cryptographically-secure randoms. -rand = random.SystemRandom() - def random_string(length: int) -> str: - return "".join(rand.choice(string.ascii_letters) for _ in range(length)) + """Generate a cryptographically secure string of random letters. + + Drawn from the characters: `a-z` and `A-Z` + """ + return "".join(secrets.choice(string.ascii_letters) for _ in range(length)) def random_string_with_symbols(length: int) -> str: - return "".join(rand.choice(_string_with_symbols) for _ in range(length)) + """Generate a cryptographically secure string of random letters/numbers/symbols. 
+ + Drawn from the characters: `a-z`, `A-Z`, `0-9`, and `.,;:^&*-_+=#~@` + """ + return "".join(secrets.choice(_string_with_symbols) for _ in range(length)) def is_ascii(s: bytes) -> bool: From bd918d874f4eb459b9c2f9af6e8e994b6b19d264 Mon Sep 17 00:00:00 2001 From: Dan Callahan Date: Fri, 14 May 2021 10:58:52 +0100 Subject: [PATCH 05/52] Simplify exception handling in is_ascii. (#9985) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We can get away with just catching UnicodeError here. ⋮ +-- ValueError | +-- UnicodeError | +-- UnicodeDecodeError | +-- UnicodeEncodeError | +-- UnicodeTranslateError ⋮ https://docs.python.org/3/library/exceptions.html#exception-hierarchy Signed-off-by: Dan Callahan --- changelog.d/9985.misc | 1 + synapse/util/stringutils.py | 4 +--- 2 files changed, 2 insertions(+), 3 deletions(-) create mode 100644 changelog.d/9985.misc diff --git a/changelog.d/9985.misc b/changelog.d/9985.misc new file mode 100644 index 000000000000..97bd747f2653 --- /dev/null +++ b/changelog.d/9985.misc @@ -0,0 +1 @@ +Simplify a few helper functions. diff --git a/synapse/util/stringutils.py b/synapse/util/stringutils.py index 40cd51a8cac8..f02943219120 100644 --- a/synapse/util/stringutils.py +++ b/synapse/util/stringutils.py @@ -55,9 +55,7 @@ def random_string_with_symbols(length: int) -> str: def is_ascii(s: bytes) -> bool: try: s.decode("ascii").encode("ascii") - except UnicodeDecodeError: - return False - except UnicodeEncodeError: + except UnicodeError: return False return True From ebdef256b36ec67a8aac9721eea8971dd5cf361f Mon Sep 17 00:00:00 2001 From: Dan Callahan Date: Fri, 14 May 2021 10:58:57 +0100 Subject: [PATCH 06/52] Remove superfluous call to bool() (#9986) Our strtobool already returns a bool, so no need to re-cast here Signed-off-by: Dan Callahan --- changelog.d/9986.misc | 1 + synapse/config/registration.py | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 changelog.d/9986.misc diff --git a/changelog.d/9986.misc b/changelog.d/9986.misc new file mode 100644 index 000000000000..97bd747f2653 --- /dev/null +++ b/changelog.d/9986.misc @@ -0,0 +1 @@ +Simplify a few helper functions. diff --git a/synapse/config/registration.py b/synapse/config/registration.py index e6f52b4f40ea..d9dc55a0c3c5 100644 --- a/synapse/config/registration.py +++ b/synapse/config/registration.py @@ -349,4 +349,4 @@ def add_arguments(parser): def read_arguments(self, args): if args.enable_registration is not None: - self.enable_registration = bool(strtobool(str(args.enable_registration))) + self.enable_registration = strtobool(str(args.enable_registration)) From 52ed9655edf8849d68a178e1c76040c79824a353 Mon Sep 17 00:00:00 2001 From: Dan Callahan Date: Fri, 14 May 2021 10:59:10 +0100 Subject: [PATCH 07/52] Remove unnecessary SystemRandom from SQLBaseStore (#9987) It's not obvious that instances of SQLBaseStore each need their own instances of random.SystemRandom(); let's just use random directly. Introduced by 52839886d664576831462e033b88e5aba4c019e3 Signed-off-by: Dan Callahan --- changelog.d/9987.misc | 1 + synapse/storage/_base.py | 2 -- synapse/storage/databases/main/registration.py | 3 ++- 3 files changed, 3 insertions(+), 3 deletions(-) create mode 100644 changelog.d/9987.misc diff --git a/changelog.d/9987.misc b/changelog.d/9987.misc new file mode 100644 index 000000000000..02c088e3e6c2 --- /dev/null +++ b/changelog.d/9987.misc @@ -0,0 +1 @@ +Remove unnecessary property from SQLBaseStore. 
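
Patches 04–07 all draw the same line: reach for the `secrets` module when the randomness is security-sensitive, and plain module-level `random` when it only needs to be statistically well-spread. A minimal sketch of that split — the function names here are invented for illustration, not taken from Synapse:

```python
import random
import secrets
import string

def make_token(length: int = 32) -> str:
    # Security-sensitive: tokens must come from a CSPRNG, which the
    # secrets module guarantees (the same move as patch 04's random_string).
    return "".join(secrets.choice(string.ascii_letters) for _ in range(length))

def jittered_expiry(expiration_ts: int, max_delta: int) -> int:
    # Not security-sensitive: spreading account-expiry timestamps only needs
    # uniform jitter, so module-level random.randrange is enough — which is
    # what the registration store in the diff below switches to.
    return random.randrange(expiration_ts - max_delta, expiration_ts)
```
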
diff --git a/synapse/storage/_base.py b/synapse/storage/_base.py index 3d98d3f5f822..0623da9aa196 100644 --- a/synapse/storage/_base.py +++ b/synapse/storage/_base.py @@ -14,7 +14,6 @@ # See the License for the specific language governing permissions and # limitations under the License. import logging -import random from abc import ABCMeta from typing import TYPE_CHECKING, Any, Collection, Iterable, Optional, Union @@ -44,7 +43,6 @@ def __init__(self, database: DatabasePool, db_conn: Connection, hs: "HomeServer" self._clock = hs.get_clock() self.database_engine = database.engine self.db_pool = database - self.rand = random.SystemRandom() def process_replication_rows( self, diff --git a/synapse/storage/databases/main/registration.py b/synapse/storage/databases/main/registration.py index 6e5ee557d2ef..e5c5cf8ff065 100644 --- a/synapse/storage/databases/main/registration.py +++ b/synapse/storage/databases/main/registration.py @@ -14,6 +14,7 @@ # See the License for the specific language governing permissions and # limitations under the License. import logging +import random import re from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple, Union @@ -997,7 +998,7 @@ def set_expiration_date_for_user_txn(self, txn, user_id, use_delta=False): expiration_ts = now_ms + self._account_validity_period if use_delta: - expiration_ts = self.rand.randrange( + expiration_ts = random.randrange( expiration_ts - self._account_validity_startup_job_max_delta, expiration_ts, ) From 5090f26b636bf4439575767a2272d033fb33b2d5 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Fri, 14 May 2021 11:12:36 +0100 Subject: [PATCH 08/52] Minor `@cachedList` enhancements (#9975) - use a tuple rather than a list for the iterable that is passed into the wrapped function, for performance - test that we can pass an iterable and that keys are correctly deduped. --- changelog.d/9975.misc | 1 + synapse/storage/databases/main/devices.py | 2 +- .../storage/databases/main/end_to_end_keys.py | 4 ++-- .../databases/main/user_erasure_store.py | 13 +++++-------- synapse/util/caches/descriptors.py | 14 ++++++++------ tests/util/caches/test_descriptors.py | 17 ++++++++++++++--- 6 files changed, 31 insertions(+), 20 deletions(-) create mode 100644 changelog.d/9975.misc diff --git a/changelog.d/9975.misc b/changelog.d/9975.misc new file mode 100644 index 000000000000..28b1e40c2b99 --- /dev/null +++ b/changelog.d/9975.misc @@ -0,0 +1 @@ +Minor enhancements to the `@cachedList` descriptor. 
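
The contract this patch tightens — deduplicate the keys, then hand the wrapped function an immutable tuple of the misses — can be modelled without the descriptor machinery. A simplified, self-contained sketch of that behaviour, not Synapse's actual `cachedList` implementation:

```python
from typing import Callable, Dict, Iterable, Tuple

def batch_lookup(
    cache: Dict[str, bool],
    keys: Iterable[str],
    fetch: Callable[[Tuple[str, ...]], Dict[str, bool]],
) -> Dict[str, bool]:
    # Materialise the iterable once (it may be a single-use generator) and
    # deduplicate, then serve whatever the cache already holds.
    wanted = set(keys)
    results = {k: cache[k] for k in wanted if k in cache}

    # Hand the callee an immutable tuple of the misses so it cannot mutate
    # our bookkeeping, then fold its answers back into the cache.
    missing = tuple(wanted - results.keys())
    if missing:
        fetched = fetch(missing)
        cache.update(fetched)
        results.update(fetched)
    return results
```

The updated tests in the diff below assert exactly this shape: the mock is now called with a tuple (`assert_called_once_with((10, 20), 2)`), and a single-use generator with duplicate keys is deduplicated before the call.
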
diff --git a/synapse/storage/databases/main/devices.py b/synapse/storage/databases/main/devices.py index c9346de31673..a1f98b7e3854 100644 --- a/synapse/storage/databases/main/devices.py +++ b/synapse/storage/databases/main/devices.py @@ -665,7 +665,7 @@ async def get_device_list_last_stream_id_for_remote( cached_method_name="get_device_list_last_stream_id_for_remote", list_name="user_ids", ) - async def get_device_list_last_stream_id_for_remotes(self, user_ids: str): + async def get_device_list_last_stream_id_for_remotes(self, user_ids: Iterable[str]): rows = await self.db_pool.simple_select_many_batch( table="device_lists_remote_extremeties", column="user_id", diff --git a/synapse/storage/databases/main/end_to_end_keys.py b/synapse/storage/databases/main/end_to_end_keys.py index 398d6b6acb85..9ba5778a8826 100644 --- a/synapse/storage/databases/main/end_to_end_keys.py +++ b/synapse/storage/databases/main/end_to_end_keys.py @@ -473,7 +473,7 @@ def _get_bare_e2e_cross_signing_keys(self, user_id): num_args=1, ) async def _get_bare_e2e_cross_signing_keys_bulk( - self, user_ids: List[str] + self, user_ids: Iterable[str] ) -> Dict[str, Dict[str, dict]]: """Returns the cross-signing keys for a set of users. The output of this function should be passed to _get_e2e_cross_signing_signatures_txn if @@ -497,7 +497,7 @@ async def _get_bare_e2e_cross_signing_keys_bulk( def _get_bare_e2e_cross_signing_keys_bulk_txn( self, txn: Connection, - user_ids: List[str], + user_ids: Iterable[str], ) -> Dict[str, Dict[str, dict]]: """Returns the cross-signing keys for a set of users. The output of this function should be passed to _get_e2e_cross_signing_signatures_txn if diff --git a/synapse/storage/databases/main/user_erasure_store.py b/synapse/storage/databases/main/user_erasure_store.py index acf6b2fb643d..1ecdd40c38d2 100644 --- a/synapse/storage/databases/main/user_erasure_store.py +++ b/synapse/storage/databases/main/user_erasure_store.py @@ -12,6 +12,8 @@ # See the License for the specific language governing permissions and # limitations under the License. +from typing import Dict, Iterable + from synapse.storage._base import SQLBaseStore from synapse.util.caches.descriptors import cached, cachedList @@ -37,21 +39,16 @@ async def is_user_erased(self, user_id: str) -> bool: return bool(result) @cachedList(cached_method_name="is_user_erased", list_name="user_ids") - async def are_users_erased(self, user_ids): + async def are_users_erased(self, user_ids: Iterable[str]) -> Dict[str, bool]: """ Checks which users in a list have requested erasure Args: - user_ids (iterable[str]): full user id to check + user_ids: full user ids to check Returns: - dict[str, bool]: - for each user, whether the user has requested erasure. + for each user, whether the user has requested erasure. """ - # this serves the dual purpose of (a) making sure we can do len and - # iterate it multiple times, and (b) avoiding duplicates. - user_ids = tuple(set(user_ids)) - rows = await self.db_pool.simple_select_many_batch( table="erased_users", column="user_id", diff --git a/synapse/util/caches/descriptors.py b/synapse/util/caches/descriptors.py index ac4a078b260b..3a4d027095bb 100644 --- a/synapse/util/caches/descriptors.py +++ b/synapse/util/caches/descriptors.py @@ -322,8 +322,8 @@ def _wrapped(*args, **kwargs): class DeferredCacheListDescriptor(_CacheDescriptorBase): """Wraps an existing cache to support bulk fetching of keys. 
- Given a list of keys it looks in the cache to find any hits, then passes - the list of missing keys to the wrapped function. + Given an iterable of keys it looks in the cache to find any hits, then passes + the tuple of missing keys to the wrapped function. Once wrapped, the function returns a Deferred which resolves to the list of results. @@ -437,7 +437,9 @@ def errback(f): return f args_to_call = dict(arg_dict) - args_to_call[self.list_name] = list(missing) + # copy the missing set before sending it to the callee, to guard against + # modification. + args_to_call[self.list_name] = tuple(missing) cached_defers.append( defer.maybeDeferred( @@ -522,14 +524,14 @@ def cachedList( Used to do batch lookups for an already created cache. A single argument is specified as a list that is iterated through to lookup keys in the - original cache. A new list consisting of the keys that weren't in the cache - get passed to the original function, the result of which is stored in the + original cache. A new tuple consisting of the (deduplicated) keys that weren't in + the cache gets passed to the original function, the result of which is stored in the cache. Args: cached_method_name: The name of the single-item lookup method. This is only used to find the cache to use. - list_name: The name of the argument that is the list to use to + list_name: The name of the argument that is the iterable to use to do batch lookups in the cache. num_args: Number of arguments to use as the key in the cache (including list_name). Defaults to all named parameters. diff --git a/tests/util/caches/test_descriptors.py b/tests/util/caches/test_descriptors.py index 178ac8a68cc7..bbbc276697e2 100644 --- a/tests/util/caches/test_descriptors.py +++ b/tests/util/caches/test_descriptors.py @@ -666,18 +666,20 @@ async def list_fn(self, args1, arg2): with LoggingContext("c1") as c1: obj = Cls() obj.mock.return_value = {10: "fish", 20: "chips"} + + # start the lookup off d1 = obj.list_fn([10, 20], 2) self.assertEqual(current_context(), SENTINEL_CONTEXT) r = yield d1 self.assertEqual(current_context(), c1) - obj.mock.assert_called_once_with([10, 20], 2) + obj.mock.assert_called_once_with((10, 20), 2) self.assertEqual(r, {10: "fish", 20: "chips"}) obj.mock.reset_mock() # a call with different params should call the mock again obj.mock.return_value = {30: "peas"} r = yield obj.list_fn([20, 30], 2) - obj.mock.assert_called_once_with([30], 2) + obj.mock.assert_called_once_with((30,), 2) self.assertEqual(r, {20: "chips", 30: "peas"}) obj.mock.reset_mock() @@ -692,6 +694,15 @@ async def list_fn(self, args1, arg2): obj.mock.assert_not_called() self.assertEqual(r, {10: "fish", 20: "chips", 30: "peas"}) + # we should also be able to use a (single-use) iterable, and should + # deduplicate the keys + obj.mock.reset_mock() + obj.mock.return_value = {40: "gravy"} + iterable = (x for x in [10, 40, 40]) + r = yield obj.list_fn(iterable, 2) + obj.mock.assert_called_once_with((40,), 2) + self.assertEqual(r, {10: "fish", 40: "gravy"}) + @defer.inlineCallbacks def test_invalidate(self): """Make sure that invalidation callbacks are called.""" @@ -717,7 +728,7 @@ async def list_fn(self, args1, arg2): # cache miss obj.mock.return_value = {10: "fish", 20: "chips"} r1 = yield obj.list_fn([10, 20], 2, on_invalidate=invalidate0) - obj.mock.assert_called_once_with([10, 20], 2) + obj.mock.assert_called_once_with((10, 20), 2) self.assertEqual(r1, {10: "fish", 20: "chips"}) obj.mock.reset_mock() From 6482075c95957ad980d9c1323f9f982e6f7aaff4 Mon Sep 17 00:00:00 
2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Fri, 14 May 2021 11:46:35 +0100 Subject: [PATCH 09/52] Run `black` on the scripts (#9981) Turns out these scripts weren't getting linted. --- changelog.d/9981.misc | 1 + scripts-dev/build_debian_packages | 105 +++++++++++++++++++----------- scripts-dev/lint.sh | 18 ++++- scripts/export_signing_key | 13 +++- scripts/generate_config | 18 ++--- scripts/hash_password | 6 +- scripts/synapse_port_db | 46 +++++++------ tox.ini | 10 +++ 8 files changed, 141 insertions(+), 76 deletions(-) create mode 100644 changelog.d/9981.misc diff --git a/changelog.d/9981.misc b/changelog.d/9981.misc new file mode 100644 index 000000000000..677c9b4cbdc1 --- /dev/null +++ b/changelog.d/9981.misc @@ -0,0 +1 @@ +Run `black` on files in the `scripts` directory. diff --git a/scripts-dev/build_debian_packages b/scripts-dev/build_debian_packages index 07d018db9985..546724f89fea 100755 --- a/scripts-dev/build_debian_packages +++ b/scripts-dev/build_debian_packages @@ -21,18 +21,18 @@ DISTS = ( "debian:buster", "debian:bullseye", "debian:sid", - "ubuntu:bionic", # 18.04 LTS (our EOL forced by Py36 on 2021-12-23) - "ubuntu:focal", # 20.04 LTS (our EOL forced by Py38 on 2024-10-14) - "ubuntu:groovy", # 20.10 (EOL 2021-07-07) + "ubuntu:bionic", # 18.04 LTS (our EOL forced by Py36 on 2021-12-23) + "ubuntu:focal", # 20.04 LTS (our EOL forced by Py38 on 2024-10-14) + "ubuntu:groovy", # 20.10 (EOL 2021-07-07) "ubuntu:hirsute", # 21.04 (EOL 2022-01-05) ) -DESC = '''\ +DESC = """\ Builds .debs for synapse, using a Docker image for the build environment. By default, builds for all known distributions, but a list of distributions can be passed on the commandline for debugging. -''' +""" class Builder(object): @@ -46,7 +46,7 @@ class Builder(object): """Build deb for a single distribution""" if self._failed: - print("not building %s due to earlier failure" % (dist, )) + print("not building %s due to earlier failure" % (dist,)) raise Exception("failed") try: @@ -68,48 +68,65 @@ class Builder(object): # we tend to get source packages which are full of debs. (We could hack # around that with more magic in the build_debian.sh script, but that # doesn't solve the problem for natively-run dpkg-buildpakage). 
- debsdir = os.path.join(projdir, '../debs') + debsdir = os.path.join(projdir, "../debs") os.makedirs(debsdir, exist_ok=True) if self.redirect_stdout: - logfile = os.path.join(debsdir, "%s.buildlog" % (tag, )) + logfile = os.path.join(debsdir, "%s.buildlog" % (tag,)) print("building %s: directing output to %s" % (dist, logfile)) stdout = open(logfile, "w") else: stdout = None # first build a docker image for the build environment - subprocess.check_call([ - "docker", "build", - "--tag", "dh-venv-builder:" + tag, - "--build-arg", "distro=" + dist, - "-f", "docker/Dockerfile-dhvirtualenv", - "docker", - ], stdout=stdout, stderr=subprocess.STDOUT) + subprocess.check_call( + [ + "docker", + "build", + "--tag", + "dh-venv-builder:" + tag, + "--build-arg", + "distro=" + dist, + "-f", + "docker/Dockerfile-dhvirtualenv", + "docker", + ], + stdout=stdout, + stderr=subprocess.STDOUT, + ) container_name = "synapse_build_" + tag with self._lock: self.active_containers.add(container_name) # then run the build itself - subprocess.check_call([ - "docker", "run", - "--rm", - "--name", container_name, - "--volume=" + projdir + ":/synapse/source:ro", - "--volume=" + debsdir + ":/debs", - "-e", "TARGET_USERID=%i" % (os.getuid(), ), - "-e", "TARGET_GROUPID=%i" % (os.getgid(), ), - "-e", "DEB_BUILD_OPTIONS=%s" % ("nocheck" if skip_tests else ""), - "dh-venv-builder:" + tag, - ], stdout=stdout, stderr=subprocess.STDOUT) + subprocess.check_call( + [ + "docker", + "run", + "--rm", + "--name", + container_name, + "--volume=" + projdir + ":/synapse/source:ro", + "--volume=" + debsdir + ":/debs", + "-e", + "TARGET_USERID=%i" % (os.getuid(),), + "-e", + "TARGET_GROUPID=%i" % (os.getgid(),), + "-e", + "DEB_BUILD_OPTIONS=%s" % ("nocheck" if skip_tests else ""), + "dh-venv-builder:" + tag, + ], + stdout=stdout, + stderr=subprocess.STDOUT, + ) with self._lock: self.active_containers.remove(container_name) if stdout is not None: stdout.close() - print("Completed build of %s" % (dist, )) + print("Completed build of %s" % (dist,)) def kill_containers(self): with self._lock: @@ -117,9 +134,14 @@ class Builder(object): for c in active: print("killing container %s" % (c,)) - subprocess.run([ - "docker", "kill", c, - ], stdout=subprocess.DEVNULL) + subprocess.run( + [ + "docker", + "kill", + c, + ], + stdout=subprocess.DEVNULL, + ) with self._lock: self.active_containers.remove(c) @@ -130,31 +152,38 @@ def run_builds(dists, jobs=1, skip_tests=False): def sig(signum, _frame): print("Caught SIGINT") builder.kill_containers() + signal.signal(signal.SIGINT, sig) with ThreadPoolExecutor(max_workers=jobs) as e: res = e.map(lambda dist: builder.run_build(dist, skip_tests), dists) # make sure we consume the iterable so that exceptions are raised. - for r in res: + for _ in res: pass -if __name__ == '__main__': +if __name__ == "__main__": parser = argparse.ArgumentParser( description=DESC, ) parser.add_argument( - '-j', '--jobs', type=int, default=1, - help='specify the number of builds to run in parallel', + "-j", + "--jobs", + type=int, + default=1, + help="specify the number of builds to run in parallel", ) parser.add_argument( - '--no-check', action='store_true', - help='skip running tests after building', + "--no-check", + action="store_true", + help="skip running tests after building", ) parser.add_argument( - 'dist', nargs='*', default=DISTS, - help='a list of distributions to build for. Default: %(default)s', + "dist", + nargs="*", + default=DISTS, + help="a list of distributions to build for. 
Default: %(default)s", ) args = parser.parse_args() run_builds(dists=args.dist, jobs=args.jobs, skip_tests=args.no_check) diff --git a/scripts-dev/lint.sh b/scripts-dev/lint.sh index 9761e975943c..869eb2372d51 100755 --- a/scripts-dev/lint.sh +++ b/scripts-dev/lint.sh @@ -80,8 +80,22 @@ else # then lint everything! if [[ -z ${files+x} ]]; then # Lint all source code files and directories - # Note: this list aims the mirror the one in tox.ini - files=("synapse" "docker" "tests" "scripts-dev" "scripts" "contrib" "synctl" "setup.py" "synmark" "stubs" ".buildkite") + # Note: this list aims to mirror the one in tox.ini + files=( + "synapse" "docker" "tests" + # annoyingly, black doesn't find these so we have to list them + "scripts/export_signing_key" + "scripts/generate_config" + "scripts/generate_log_config" + "scripts/hash_password" + "scripts/register_new_matrix_user" + "scripts/synapse_port_db" + "scripts-dev" + "scripts-dev/build_debian_packages" + "scripts-dev/sign_json" + "scripts-dev/update_database" + "contrib" "synctl" "setup.py" "synmark" "stubs" ".buildkite" + ) fi fi diff --git a/scripts/export_signing_key b/scripts/export_signing_key index 0ed167ea855c..bf0139bd64b8 100755 --- a/scripts/export_signing_key +++ b/scripts/export_signing_key @@ -30,7 +30,11 @@ def exit(status: int = 0, message: Optional[str] = None): def format_plain(public_key: nacl.signing.VerifyKey): print( "%s:%s %s" - % (public_key.alg, public_key.version, encode_verify_key_base64(public_key),) + % ( + public_key.alg, + public_key.version, + encode_verify_key_base64(public_key), + ) ) @@ -50,7 +54,10 @@ if __name__ == "__main__": parser = argparse.ArgumentParser() parser.add_argument( - "key_file", nargs="+", type=argparse.FileType("r"), help="The key file to read", + "key_file", + nargs="+", + type=argparse.FileType("r"), + help="The key file to read", ) parser.add_argument( @@ -63,7 +70,7 @@ if __name__ == "__main__": parser.add_argument( "--expiry-ts", type=int, - default=int(time.time() * 1000) + 6*3600000, + default=int(time.time() * 1000) + 6 * 3600000, help=( "The expiry time to use for -x, in milliseconds since 1970. The default " "is (now+6h)." diff --git a/scripts/generate_config b/scripts/generate_config index 771cbf8d95ad..931b40c045f3 100755 --- a/scripts/generate_config +++ b/scripts/generate_config @@ -11,23 +11,22 @@ if __name__ == "__main__": parser.add_argument( "--config-dir", default="CONFDIR", - help="The path where the config files are kept. Used to create filenames for " - "things like the log config and the signing key. Default: %(default)s", + "things like the log config and the signing key. Default: %(default)s", ) parser.add_argument( "--data-dir", default="DATADIR", help="The path where the data files are kept. Used to create filenames for " - "things like the database and media store. Default: %(default)s", + "things like the database and media store. Default: %(default)s", ) parser.add_argument( "--server-name", default="SERVERNAME", help="The server name. Used to initialise the server_name config param, but also " - "used in the names of some of the config files. Default: %(default)s", + "used in the names of some of the config files. Default: %(default)s", ) parser.add_argument( @@ -41,21 +40,22 @@ if __name__ == "__main__": "--generate-secrets", action="store_true", help="Enable generation of new secrets for things like the macaroon_secret_key." - "By default, these parameters will be left unset." 
+ "By default, these parameters will be left unset.", ) parser.add_argument( - "-o", "--output-file", - type=argparse.FileType('w'), + "-o", + "--output-file", + type=argparse.FileType("w"), default=sys.stdout, help="File to write the configuration to. Default: stdout", ) parser.add_argument( "--header-file", - type=argparse.FileType('r'), + type=argparse.FileType("r"), help="File from which to read a header, which will be printed before the " - "generated config.", + "generated config.", ) args = parser.parse_args() diff --git a/scripts/hash_password b/scripts/hash_password index a30767f758a1..1d6fb0d70022 100755 --- a/scripts/hash_password +++ b/scripts/hash_password @@ -41,7 +41,7 @@ if __name__ == "__main__": parser.add_argument( "-c", "--config", - type=argparse.FileType('r'), + type=argparse.FileType("r"), help=( "Path to server config file. " "Used to read in bcrypt_rounds and password_pepper." @@ -72,8 +72,8 @@ if __name__ == "__main__": pw = unicodedata.normalize("NFKC", password) hashed = bcrypt.hashpw( - pw.encode('utf8') + password_pepper.encode("utf8"), + pw.encode("utf8") + password_pepper.encode("utf8"), bcrypt.gensalt(bcrypt_rounds), - ).decode('ascii') + ).decode("ascii") print(hashed) diff --git a/scripts/synapse_port_db b/scripts/synapse_port_db index 5fb5bb35f77d..7c7645c05a63 100755 --- a/scripts/synapse_port_db +++ b/scripts/synapse_port_db @@ -294,8 +294,7 @@ class Porter(object): return table, already_ported, total_to_port, forward_chunk, backward_chunk async def get_table_constraints(self) -> Dict[str, Set[str]]: - """Returns a map of tables that have foreign key constraints to tables they depend on. - """ + """Returns a map of tables that have foreign key constraints to tables they depend on.""" def _get_constraints(txn): # We can pull the information about foreign key constraints out from @@ -504,7 +503,9 @@ class Porter(object): return def build_db_store( - self, db_config: DatabaseConnectionConfig, allow_outdated_version: bool = False, + self, + db_config: DatabaseConnectionConfig, + allow_outdated_version: bool = False, ): """Builds and returns a database store using the provided configuration. @@ -740,7 +741,7 @@ class Porter(object): return col outrows = [] - for i, row in enumerate(rows): + for row in rows: try: outrows.append( tuple(conv(j, col) for j, col in enumerate(row) if j > 0) @@ -890,8 +891,7 @@ class Porter(object): await self.postgres_store.db_pool.runInteraction("setup_user_id_seq", r) async def _setup_events_stream_seqs(self) -> None: - """Set the event stream sequences to the correct values. - """ + """Set the event stream sequences to the correct values.""" # We get called before we've ported the events table, so we need to # fetch the current positions from the SQLite store. @@ -920,12 +920,14 @@ class Porter(object): ) await self.postgres_store.db_pool.runInteraction( - "_setup_events_stream_seqs", _setup_events_stream_seqs_set_pos, + "_setup_events_stream_seqs", + _setup_events_stream_seqs_set_pos, ) - async def _setup_sequence(self, sequence_name: str, stream_id_tables: Iterable[str]) -> None: - """Set a sequence to the correct value. 
- """ + async def _setup_sequence( + self, sequence_name: str, stream_id_tables: Iterable[str] + ) -> None: + """Set a sequence to the correct value.""" current_stream_ids = [] for stream_id_table in stream_id_tables: max_stream_id = await self.sqlite_store.db_pool.simple_select_one_onecol( @@ -939,14 +941,19 @@ class Porter(object): next_id = max(current_stream_ids) + 1 def r(txn): - sql = "ALTER SEQUENCE %s RESTART WITH" % (sequence_name, ) - txn.execute(sql + " %s", (next_id, )) + sql = "ALTER SEQUENCE %s RESTART WITH" % (sequence_name,) + txn.execute(sql + " %s", (next_id,)) - await self.postgres_store.db_pool.runInteraction("_setup_%s" % (sequence_name,), r) + await self.postgres_store.db_pool.runInteraction( + "_setup_%s" % (sequence_name,), r + ) async def _setup_auth_chain_sequence(self) -> None: curr_chain_id = await self.sqlite_store.db_pool.simple_select_one_onecol( - table="event_auth_chains", keyvalues={}, retcol="MAX(chain_id)", allow_none=True + table="event_auth_chains", + keyvalues={}, + retcol="MAX(chain_id)", + allow_none=True, ) def r(txn): @@ -968,8 +975,7 @@ class Porter(object): class Progress(object): - """Used to report progress of the port - """ + """Used to report progress of the port""" def __init__(self): self.tables = {} @@ -994,8 +1000,7 @@ class Progress(object): class CursesProgress(Progress): - """Reports progress to a curses window - """ + """Reports progress to a curses window""" def __init__(self, stdscr): self.stdscr = stdscr @@ -1020,7 +1025,7 @@ class CursesProgress(Progress): self.total_processed = 0 self.total_remaining = 0 - for table, data in self.tables.items(): + for data in self.tables.values(): self.total_processed += data["num_done"] - data["start"] self.total_remaining += data["total"] - data["num_done"] @@ -1111,8 +1116,7 @@ class CursesProgress(Progress): class TerminalProgress(Progress): - """Just prints progress to the terminal - """ + """Just prints progress to the terminal""" def update(self, table, num_done): super(TerminalProgress, self).update(table, num_done) diff --git a/tox.ini b/tox.ini index ecd609271d1d..da77d124fc0e 100644 --- a/tox.ini +++ b/tox.ini @@ -34,7 +34,17 @@ lint_targets = synapse tests scripts + # annoyingly, black doesn't find these so we have to list them + scripts/export_signing_key + scripts/generate_config + scripts/generate_log_config + scripts/hash_password + scripts/register_new_matrix_user + scripts/synapse_port_db scripts-dev + scripts-dev/build_debian_packages + scripts-dev/sign_json + scripts-dev/update_database stubs contrib synctl From 66609122260ad151359b9c0028634094cf51b5c5 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Fri, 14 May 2021 13:14:48 +0100 Subject: [PATCH 10/52] Update postgres docs (#9989) --- changelog.d/9988.doc | 2 +- changelog.d/9989.doc | 1 + docs/postgres.md | 198 +++++++++++++++++++++---------------------- 3 files changed, 98 insertions(+), 103 deletions(-) create mode 100644 changelog.d/9989.doc diff --git a/changelog.d/9988.doc b/changelog.d/9988.doc index 87cda376a678..25338c44c312 100644 --- a/changelog.d/9988.doc +++ b/changelog.d/9988.doc @@ -1 +1 @@ -Fix outdated minimum PostgreSQL version in postgres.md. +Updates to the PostgreSQL documentation (`postgres.md`). diff --git a/changelog.d/9989.doc b/changelog.d/9989.doc new file mode 100644 index 000000000000..25338c44c312 --- /dev/null +++ b/changelog.d/9989.doc @@ -0,0 +1 @@ +Updates to the PostgreSQL documentation (`postgres.md`). 
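
One change in the diff below folds the TCP keepalive options into a full `database.args` example. Those options are ordinary libpq connection parameters, so — hypothetically, outside Synapse — their effect can be exercised with a bare psycopg2 connection; the connection details here are placeholders, not part of the patch:

```python
import psycopg2

# The keepalive kwargs are the same libpq parameters the patched docs
# nest under `database.args`.
conn = psycopg2.connect(
    dbname="synapse",
    user="synapse_user",
    password="secret",
    host="localhost",
    keepalives_idle=10,      # seconds of inactivity before a keepalive probe
    keepalives_interval=10,  # seconds between unacknowledged probes
    keepalives_count=3,      # probes lost before the connection is considered dead
)
conn.close()
```
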
diff --git a/docs/postgres.md b/docs/postgres.md index b99fad8a6e24..f83155e52a30 100644 --- a/docs/postgres.md +++ b/docs/postgres.md @@ -33,28 +33,15 @@ Assuming your PostgreSQL database user is called `postgres`, first authenticate # Or, if your system uses sudo to get administrative rights sudo -u postgres bash -Then, create a user ``synapse_user`` with: +Then, create a postgres user and a database with: + # this will prompt for a password for the new user createuser --pwprompt synapse_user -Before you can authenticate with the `synapse_user`, you must create a -database that it can access. To create a database, first connect to the -database with your database user: + createdb --encoding=UTF8 --locale=C --template=template0 --owner=synapse_user synapse - su - postgres # Or: sudo -u postgres bash - psql - -and then run: - - CREATE DATABASE synapse - ENCODING 'UTF8' - LC_COLLATE='C' - LC_CTYPE='C' - template=template0 - OWNER synapse_user; - -This would create an appropriate database named `synapse` owned by the -`synapse_user` user (which must already have been created as above). +The above will create a user called `synapse_user`, and a database called +`synapse`. Note that the PostgreSQL database *must* have the correct encoding set (as shown above), otherwise it will not be able to store UTF8 strings. @@ -63,79 +50,6 @@ You may need to enable password authentication so `synapse_user` can connect to the database. See . -If you get an error along the lines of `FATAL: Ident authentication failed for -user "synapse_user"`, you may need to use an authentication method other than -`ident`: - -* If the `synapse_user` user has a password, add the password to the `database:` - section of `homeserver.yaml`. Then add the following to `pg_hba.conf`: - - ``` - host synapse synapse_user ::1/128 md5 # or `scram-sha-256` instead of `md5` if you use that - ``` - -* If the `synapse_user` user does not have a password, then a password doesn't - have to be added to `homeserver.yaml`. But the following does need to be added - to `pg_hba.conf`: - - ``` - host synapse synapse_user ::1/128 trust - ``` - -Note that line order matters in `pg_hba.conf`, so make sure that if you do add a -new line, it is inserted before: - -``` -host all all ::1/128 ident -``` - -### Fixing incorrect `COLLATE` or `CTYPE` - -Synapse will refuse to set up a new database if it has the wrong values of -`COLLATE` and `CTYPE` set, and will log warnings on existing databases. Using -different locales can cause issues if the locale library is updated from -underneath the database, or if a different version of the locale is used on any -replicas. - -The safest way to fix the issue is to take a dump and recreate the database with -the correct `COLLATE` and `CTYPE` parameters (as shown above). It is also possible to change the -parameters on a live database and run a `REINDEX` on the entire database, -however extreme care must be taken to avoid database corruption. - -Note that the above may fail with an error about duplicate rows if corruption -has already occurred, and such duplicate rows will need to be manually removed. - - -## Fixing inconsistent sequences error - -Synapse uses Postgres sequences to generate IDs for various tables. A sequence -and associated table can get out of sync if, for example, Synapse has been -downgraded and then upgraded again. - -To fix the issue shut down Synapse (including any and all workers) and run the -SQL command included in the error message. Once done Synapse should start -successfully. 
- - -## Tuning Postgres - -The default settings should be fine for most deployments. For larger -scale deployments tuning some of the settings is recommended, details of -which can be found at -. - -In particular, we've found tuning the following values helpful for -performance: - -- `shared_buffers` -- `effective_cache_size` -- `work_mem` -- `maintenance_work_mem` -- `autovacuum_work_mem` - -Note that the appropriate values for those fields depend on the amount -of free memory the database host has available. - ## Synapse config When you are ready to start using PostgreSQL, edit the `database` @@ -165,18 +79,42 @@ may block for an extended period while it waits for a response from the database server. Example values might be: ```yaml -# seconds of inactivity after which TCP should send a keepalive message to the server -keepalives_idle: 10 +database: + args: + # ... as above + + # seconds of inactivity after which TCP should send a keepalive message to the server + keepalives_idle: 10 -# the number of seconds after which a TCP keepalive message that is not -# acknowledged by the server should be retransmitted -keepalives_interval: 10 + # the number of seconds after which a TCP keepalive message that is not + # acknowledged by the server should be retransmitted + keepalives_interval: 10 -# the number of TCP keepalives that can be lost before the client's connection -# to the server is considered dead -keepalives_count: 3 + # the number of TCP keepalives that can be lost before the client's connection + # to the server is considered dead + keepalives_count: 3 ``` +## Tuning Postgres + +The default settings should be fine for most deployments. For larger +scale deployments tuning some of the settings is recommended, details of +which can be found at +. + +In particular, we've found tuning the following values helpful for +performance: + +- `shared_buffers` +- `effective_cache_size` +- `work_mem` +- `maintenance_work_mem` +- `autovacuum_work_mem` + +Note that the appropriate values for those fields depend on the amount +of free memory the database host has available. + + ## Porting from SQLite ### Overview @@ -185,9 +123,8 @@ The script `synapse_port_db` allows porting an existing synapse server backed by SQLite to using PostgreSQL. This is done in as a two phase process: -1. Copy the existing SQLite database to a separate location (while the - server is down) and running the port script against that offline - database. +1. Copy the existing SQLite database to a separate location and run + the port script against that offline database. 2. Shut down the server. Rerun the port script to port any data that has come in since taking the first snapshot. Restart server against the PostgreSQL database. @@ -245,3 +182,60 @@ PostgreSQL database configuration file `homeserver-postgres.yaml`: ./synctl start Synapse should now be running against PostgreSQL. + + +## Troubleshooting + +### Alternative auth methods + +If you get an error along the lines of `FATAL: Ident authentication failed for +user "synapse_user"`, you may need to use an authentication method other than +`ident`: + +* If the `synapse_user` user has a password, add the password to the `database:` + section of `homeserver.yaml`. Then add the following to `pg_hba.conf`: + + ``` + host synapse synapse_user ::1/128 md5 # or `scram-sha-256` instead of `md5` if you use that + ``` + +* If the `synapse_user` user does not have a password, then a password doesn't + have to be added to `homeserver.yaml`. 
But the following does need to be added + to `pg_hba.conf`: + + ``` + host synapse synapse_user ::1/128 trust + ``` + +Note that line order matters in `pg_hba.conf`, so make sure that if you do add a +new line, it is inserted before: + +``` +host all all ::1/128 ident +``` + +### Fixing incorrect `COLLATE` or `CTYPE` + +Synapse will refuse to set up a new database if it has the wrong values of +`COLLATE` and `CTYPE` set, and will log warnings on existing databases. Using +different locales can cause issues if the locale library is updated from +underneath the database, or if a different version of the locale is used on any +replicas. + +The safest way to fix the issue is to dump the database and recreate it with +the correct locale parameter (as shown above). It is also possible to change the +parameters on a live database and run a `REINDEX` on the entire database, +however extreme care must be taken to avoid database corruption. + +Note that the above may fail with an error about duplicate rows if corruption +has already occurred, and such duplicate rows will need to be manually removed. + +### Fixing inconsistent sequences error + +Synapse uses Postgres sequences to generate IDs for various tables. A sequence +and associated table can get out of sync if, for example, Synapse has been +downgraded and then upgraded again. + +To fix the issue shut down Synapse (including any and all workers) and run the +SQL command included in the error message. Once done Synapse should start +successfully. From 41ac128fd39b30fe33b6c871a8317ba833eb4ef7 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Mon, 17 May 2021 12:33:38 +0200 Subject: [PATCH 11/52] Split multiplart email sending into a dedicated handler (#9977) Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com> --- changelog.d/9977.misc | 1 + synapse/handlers/account_validity.py | 55 +++------------- synapse/handlers/send_email.py | 98 ++++++++++++++++++++++++++++ synapse/push/mailer.py | 53 +++------------ synapse/server.py | 5 ++ 5 files changed, 122 insertions(+), 90 deletions(-) create mode 100644 changelog.d/9977.misc create mode 100644 synapse/handlers/send_email.py diff --git a/changelog.d/9977.misc b/changelog.d/9977.misc new file mode 100644 index 000000000000..093dffc6beac --- /dev/null +++ b/changelog.d/9977.misc @@ -0,0 +1 @@ +Split multipart email sending into a dedicated handler. 
diff --git a/synapse/handlers/account_validity.py b/synapse/handlers/account_validity.py index 5b927f10b3a9..d752cf34f0cc 100644 --- a/synapse/handlers/account_validity.py +++ b/synapse/handlers/account_validity.py @@ -15,12 +15,9 @@ import email.mime.multipart import email.utils import logging -from email.mime.multipart import MIMEMultipart -from email.mime.text import MIMEText from typing import TYPE_CHECKING, List, Optional, Tuple from synapse.api.errors import StoreError, SynapseError -from synapse.logging.context import make_deferred_yieldable from synapse.metrics.background_process_metrics import wrap_as_background_process from synapse.types import UserID from synapse.util import stringutils @@ -36,9 +33,11 @@ def __init__(self, hs: "HomeServer"): self.hs = hs self.config = hs.config self.store = self.hs.get_datastore() - self.sendmail = self.hs.get_sendmail() + self.send_email_handler = self.hs.get_send_email_handler() self.clock = self.hs.get_clock() + self._app_name = self.hs.config.email_app_name + self._account_validity_enabled = ( hs.config.account_validity.account_validity_enabled ) @@ -63,23 +62,10 @@ def __init__(self, hs: "HomeServer"): self._template_text = ( hs.config.account_validity.account_validity_template_text ) - account_validity_renew_email_subject = ( + self._renew_email_subject = ( hs.config.account_validity.account_validity_renew_email_subject ) - try: - app_name = hs.config.email_app_name - - self._subject = account_validity_renew_email_subject % {"app": app_name} - - self._from_string = hs.config.email_notif_from % {"app": app_name} - except Exception: - # If substitution failed, fall back to the bare strings. - self._subject = account_validity_renew_email_subject - self._from_string = hs.config.email_notif_from - - self._raw_from = email.utils.parseaddr(self._from_string)[1] - # Check the renewal emails to send and send them every 30min. 
if hs.config.run_background_tasks: self.clock.looping_call(self._send_renewal_emails, 30 * 60 * 1000) @@ -159,38 +145,17 @@ async def _send_renewal_email(self, user_id: str, expiration_ts: int) -> None: } html_text = self._template_html.render(**template_vars) - html_part = MIMEText(html_text, "html", "utf8") - plain_text = self._template_text.render(**template_vars) - text_part = MIMEText(plain_text, "plain", "utf8") for address in addresses: raw_to = email.utils.parseaddr(address)[1] - multipart_msg = MIMEMultipart("alternative") - multipart_msg["Subject"] = self._subject - multipart_msg["From"] = self._from_string - multipart_msg["To"] = address - multipart_msg["Date"] = email.utils.formatdate() - multipart_msg["Message-ID"] = email.utils.make_msgid() - multipart_msg.attach(text_part) - multipart_msg.attach(html_part) - - logger.info("Sending renewal email to %s", address) - - await make_deferred_yieldable( - self.sendmail( - self.hs.config.email_smtp_host, - self._raw_from, - raw_to, - multipart_msg.as_string().encode("utf8"), - reactor=self.hs.get_reactor(), - port=self.hs.config.email_smtp_port, - requireAuthentication=self.hs.config.email_smtp_user is not None, - username=self.hs.config.email_smtp_user, - password=self.hs.config.email_smtp_pass, - requireTransportSecurity=self.hs.config.require_transport_security, - ) + await self.send_email_handler.send_email( + email_address=raw_to, + subject=self._renew_email_subject, + app_name=self._app_name, + html=html_text, + text=plain_text, ) await self.store.set_renewal_mail_status(user_id=user_id, email_sent=True) diff --git a/synapse/handlers/send_email.py b/synapse/handlers/send_email.py new file mode 100644 index 000000000000..e9f6aef06f01 --- /dev/null +++ b/synapse/handlers/send_email.py @@ -0,0 +1,98 @@ +# Copyright 2021 The Matrix.org C.I.C. Foundation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import email.utils +import logging +from email.mime.multipart import MIMEMultipart +from email.mime.text import MIMEText +from typing import TYPE_CHECKING + +from synapse.logging.context import make_deferred_yieldable + +if TYPE_CHECKING: + from synapse.server import HomeServer + +logger = logging.getLogger(__name__) + + +class SendEmailHandler: + def __init__(self, hs: "HomeServer"): + self.hs = hs + + self._sendmail = hs.get_sendmail() + self._reactor = hs.get_reactor() + + self._from = hs.config.email.email_notif_from + self._smtp_host = hs.config.email.email_smtp_host + self._smtp_port = hs.config.email.email_smtp_port + self._smtp_user = hs.config.email.email_smtp_user + self._smtp_pass = hs.config.email.email_smtp_pass + self._require_transport_security = hs.config.email.require_transport_security + + async def send_email( + self, + email_address: str, + subject: str, + app_name: str, + html: str, + text: str, + ) -> None: + """Send a multipart email with the given information. + + Args: + email_address: The address to send the email to. + subject: The email's subject. + app_name: The app name to include in the From header. 
+            html: The HTML content to include in the email.
+            text: The plain text content to include in the email.
+        """
+        try:
+            from_string = self._from % {"app": app_name}
+        except (KeyError, TypeError):
+            from_string = self._from
+
+        raw_from = email.utils.parseaddr(from_string)[1]
+        raw_to = email.utils.parseaddr(email_address)[1]
+
+        if raw_to == "":
+            raise RuntimeError("Invalid 'to' address")
+
+        html_part = MIMEText(html, "html", "utf8")
+        text_part = MIMEText(text, "plain", "utf8")
+
+        multipart_msg = MIMEMultipart("alternative")
+        multipart_msg["Subject"] = subject
+        multipart_msg["From"] = from_string
+        multipart_msg["To"] = email_address
+        multipart_msg["Date"] = email.utils.formatdate()
+        multipart_msg["Message-ID"] = email.utils.make_msgid()
+        multipart_msg.attach(text_part)
+        multipart_msg.attach(html_part)
+
+        logger.info("Sending email to %s", email_address)
+
+        await make_deferred_yieldable(
+            self._sendmail(
+                self._smtp_host,
+                raw_from,
+                raw_to,
+                multipart_msg.as_string().encode("utf8"),
+                reactor=self._reactor,
+                port=self._smtp_port,
+                requireAuthentication=self._smtp_user is not None,
+                username=self._smtp_user,
+                password=self._smtp_pass,
+                requireTransportSecurity=self._require_transport_security,
+            )
+        )
diff --git a/synapse/push/mailer.py b/synapse/push/mailer.py
index c4b43b0d3fc8..5f9ea5003a82 100644
--- a/synapse/push/mailer.py
+++ b/synapse/push/mailer.py
@@ -12,12 +12,8 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-import email.mime.multipart
-import email.utils
 import logging
 import urllib.parse
-from email.mime.multipart import MIMEMultipart
-from email.mime.text import MIMEText
 from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, TypeVar
 
 import bleach
@@ -27,7 +23,6 @@
 from synapse.api.errors import StoreError
 from synapse.config.emailconfig import EmailSubjectConfig
 from synapse.events import EventBase
-from synapse.logging.context import make_deferred_yieldable
 from synapse.push.presentable_names import (
     calculate_room_name,
     descriptor_from_member_events,
@@ -108,7 +103,7 @@ def __init__(
         self.template_html = template_html
         self.template_text = template_text
 
-        self.sendmail = self.hs.get_sendmail()
+        self.send_email_handler = hs.get_send_email_handler()
         self.store = self.hs.get_datastore()
         self.state_store = self.hs.get_storage().state
         self.macaroon_gen = self.hs.get_macaroon_generator()
@@ -310,17 +305,6 @@ async def send_email(
         self, email_address: str, subject: str, extra_template_vars: Dict[str, Any]
     ) -> None:
         """Send an email with the given information and template text"""
-        try:
-            from_string = self.hs.config.email_notif_from % {"app": self.app_name}
-        except TypeError:
-            from_string = self.hs.config.email_notif_from
-
-        raw_from = email.utils.parseaddr(from_string)[1]
-        raw_to = email.utils.parseaddr(email_address)[1]
-
-        if raw_to == "":
-            raise RuntimeError("Invalid 'to' address")
-
         template_vars = {
             "app_name": self.app_name,
             "server_name": self.hs.config.server.server_name,
@@ -329,35 +313,14 @@ async def send_email(
         template_vars.update(extra_template_vars)
 
         html_text = self.template_html.render(**template_vars)
-        html_part = MIMEText(html_text, "html", "utf8")
-
         plain_text = self.template_text.render(**template_vars)
-        text_part = MIMEText(plain_text, "plain", "utf8")
-
-        multipart_msg = MIMEMultipart("alternative")
-        multipart_msg["Subject"] = subject
-        multipart_msg["From"] = from_string
-        multipart_msg["To"] = email_address
-        multipart_msg["Date"] = 
email.utils.formatdate() - multipart_msg["Message-ID"] = email.utils.make_msgid() - multipart_msg.attach(text_part) - multipart_msg.attach(html_part) - - logger.info("Sending email to %s" % email_address) - - await make_deferred_yieldable( - self.sendmail( - self.hs.config.email_smtp_host, - raw_from, - raw_to, - multipart_msg.as_string().encode("utf8"), - reactor=self.hs.get_reactor(), - port=self.hs.config.email_smtp_port, - requireAuthentication=self.hs.config.email_smtp_user is not None, - username=self.hs.config.email_smtp_user, - password=self.hs.config.email_smtp_pass, - requireTransportSecurity=self.hs.config.require_transport_security, - ) + + await self.send_email_handler.send_email( + email_address=email_address, + subject=subject, + app_name=self.app_name, + html=html_text, + text=plain_text, ) async def _get_room_vars( diff --git a/synapse/server.py b/synapse/server.py index 2337d2d9b450..fec0024c89a5 100644 --- a/synapse/server.py +++ b/synapse/server.py @@ -104,6 +104,7 @@ from synapse.handlers.room_member import RoomMemberHandler, RoomMemberMasterHandler from synapse.handlers.room_member_worker import RoomMemberWorkerHandler from synapse.handlers.search import SearchHandler +from synapse.handlers.send_email import SendEmailHandler from synapse.handlers.set_password import SetPasswordHandler from synapse.handlers.space_summary import SpaceSummaryHandler from synapse.handlers.sso import SsoHandler @@ -549,6 +550,10 @@ def get_deactivate_account_handler(self) -> DeactivateAccountHandler: def get_search_handler(self) -> SearchHandler: return SearchHandler(self) + @cache_in_self + def get_send_email_handler(self) -> SendEmailHandler: + return SendEmailHandler(self) + @cache_in_self def get_set_password_handler(self) -> SetPasswordHandler: return SetPasswordHandler(self) From 9752849e2b41968613ca244e86311d805bdd27df Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Mon, 17 May 2021 09:01:19 -0400 Subject: [PATCH 12/52] Clarify comments in the space summary handler. (#9974) --- changelog.d/9974.misc | 1 + synapse/handlers/space_summary.py | 51 ++++++++++++++++++++++++++++--- 2 files changed, 47 insertions(+), 5 deletions(-) create mode 100644 changelog.d/9974.misc diff --git a/changelog.d/9974.misc b/changelog.d/9974.misc new file mode 100644 index 000000000000..9ddee2618e21 --- /dev/null +++ b/changelog.d/9974.misc @@ -0,0 +1 @@ +Update comments in the space summary handler. diff --git a/synapse/handlers/space_summary.py b/synapse/handlers/space_summary.py index e35d91832b42..953356f34d90 100644 --- a/synapse/handlers/space_summary.py +++ b/synapse/handlers/space_summary.py @@ -32,7 +32,6 @@ logger = logging.getLogger(__name__) # number of rooms to return. We'll stop once we hit this limit. -# TODO: allow clients to reduce this with a request param. MAX_ROOMS = 50 # max number of events to return per room. @@ -231,11 +230,15 @@ async def _summarize_local_room( Generate a room entry and a list of event entries for a given room. Args: - requester: The requesting user, or None if this is over federation. + requester: + The user requesting the summary, if it is a local request. None + if this is a federation request. room_id: The room ID to summarize. suggested_only: True if only suggested children should be returned. Otherwise, all children are returned. - max_children: The maximum number of children to return for this node. + max_children: + The maximum number of children rooms to include. This is capped + to a server-set limit. 
Returns:
            A tuple of:
@@ -278,6 +281,26 @@ async def _summarize_remote_room(
         max_children: Optional[int],
         exclude_rooms: Iterable[str],
     ) -> Tuple[Sequence[JsonDict], Sequence[JsonDict]]:
+        """
+        Request room entries and a list of event entries for a given room by querying a remote server.
+
+        Args:
+            room: The room to summarize.
+            suggested_only: True if only suggested children should be returned.
+                Otherwise, all children are returned.
+            max_children:
+                The maximum number of children rooms to include. This is capped
+                to a server-set limit.
+            exclude_rooms:
+                Room IDs which do not need to be summarized.
+
+        Returns:
+            A tuple of:
+                An iterable of rooms.
+
+                An iterable of the sorted children events. This may be limited
+                to a maximum size or may include all children.
+        """
         room_id = room.room_id
 
         logger.info("Requesting summary for %s via %s", room_id, room.via)
@@ -310,8 +333,26 @@ async def _summarize_remote_room(
         )
 
     async def _is_room_accessible(self, room_id: str, requester: Optional[str]) -> bool:
-        # if we have an authenticated requesting user, first check if they are in the
-        # room
+        """
+        Calculate whether the room should be shown in the spaces summary.
+
+        It should be included if:
+
+        * The requester is joined or invited to the room.
+        * The history visibility is set to world readable.
+
+        Args:
+            room_id: The room ID to summarize.
+            requester:
+                The user requesting the summary, if it is a local request. None
+                if this is a federation request.
+
+        Returns:
+            True if the room should be included in the spaces summary.
+        """
+
+        # if we have an authenticated requesting user, first check if they are able to view
+        # stripped state in the room.
         if requester:
             try:
                 await self._auth.check_user_in_room(room_id, requester)

From 206a7b5f12fd3d88ec24a1f53ce75e5b701faed8 Mon Sep 17 00:00:00 2001
From: Patrick Cloke
Date: Mon, 17 May 2021 09:59:17 -0400
Subject: [PATCH 13/52] Fix the allowed range of valid ordering characters for spaces. (#10002)

\x7F was meant to be \x7E (~); this was originally incorrect in MSC1772.
---
 changelog.d/10002.bugfix          | 1 +
 synapse/handlers/space_summary.py | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)
 create mode 100644 changelog.d/10002.bugfix

diff --git a/changelog.d/10002.bugfix b/changelog.d/10002.bugfix
new file mode 100644
index 000000000000..1fabdad22eee
--- /dev/null
+++ b/changelog.d/10002.bugfix
@@ -0,0 +1 @@
+Fix a validation bug introduced in v1.34.0 in the ordering of spaces in the space summary API.
diff --git a/synapse/handlers/space_summary.py b/synapse/handlers/space_summary.py
index 953356f34d90..eb80a5ad67db 100644
--- a/synapse/handlers/space_summary.py
+++ b/synapse/handlers/space_summary.py
@@ -471,8 +471,8 @@ def _is_suggested_child_event(edge_event: EventBase) -> bool:
     return False
 
 
-# Order may only contain characters in the range of \x20 (space) to \x7F (~).
-_INVALID_ORDER_CHARS_RE = re.compile(r"[^\x20-\x7F]")
+# Order may only contain characters in the range of \x20 (space) to \x7E (~) inclusive.
+_INVALID_ORDER_CHARS_RE = re.compile(r"[^\x20-\x7E]")
 
 
 def _child_events_comparison_key(child: EventBase) -> Tuple[bool, Optional[str], str]:

From 4d6e5a5e995590efe44855d10dcd2a89b841dae8 Mon Sep 17 00:00:00 2001
From: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
Date: Tue, 18 May 2021 14:13:45 +0100
Subject: [PATCH 14/52] Use a database table to hold the users that should have full presence sent to them, instead of something in-memory (#9823)

---
 changelog.d/9823.misc                         |   1 +
 docs/presence_router_module.md                |   6 +-
 synapse/handlers/presence.py                  | 136 ++++++--
 synapse/module_api/__init__.py                |  63 ++--
 synapse/replication/http/presence.py          |  11 +-
 synapse/rest/admin/server_notice_servlet.py   |   8 +-
 synapse/storage/databases/main/presence.py    |  58 +++-
 .../59/13users_to_send_full_presence_to.sql   |  34 ++
 tests/events/test_presence_router.py          |  15 +-
 tests/module_api/test_api.py                  | 303 +++++++++++++-----
 .../test_sharded_event_persister.py           |   2 +-
 11 files changed, 479 insertions(+), 158 deletions(-)
 create mode 100644 changelog.d/9823.misc
 create mode 100644 synapse/storage/schema/main/delta/59/13users_to_send_full_presence_to.sql

diff --git a/changelog.d/9823.misc b/changelog.d/9823.misc
new file mode 100644
index 000000000000..bf924ab68cc4
--- /dev/null
+++ b/changelog.d/9823.misc
@@ -0,0 +1 @@
+Allow sending full presence to users via workers other than the one that called `ModuleApi.send_local_online_presence_to`.
\ No newline at end of file
diff --git a/docs/presence_router_module.md b/docs/presence_router_module.md
index d6566d978d06..d2844915dffe 100644
--- a/docs/presence_router_module.md
+++ b/docs/presence_router_module.md
@@ -28,7 +28,11 @@ async def ModuleApi.send_local_online_presence_to(users: Iterable[str]) -> None
 which can be given a list of local or remote MXIDs to broadcast known, online user
 presence to (for those users that the receiving user is considered interested in).
 It does not include state for users who are currently offline, and it can only be
-called on workers that support sending federation.
+called on workers that support sending federation. Additionally, this method must
+only be called from the process that has been configured to write to
+the [presence stream](https://github.com/matrix-org/synapse/blob/master/docs/workers.md#stream-writers).
+By default, this is the main process, but another worker can be configured to do
+so.
 
 ### Module structure
diff --git a/synapse/handlers/presence.py b/synapse/handlers/presence.py
index 6fd1f34289f8..f5a049d75461 100644
--- a/synapse/handlers/presence.py
+++ b/synapse/handlers/presence.py
@@ -222,9 +222,21 @@ async def current_state_for_users(
 
     @abc.abstractmethod
     async def set_state(
-        self, target_user: UserID, state: JsonDict, ignore_status_msg: bool = False
+        self,
+        target_user: UserID,
+        state: JsonDict,
+        ignore_status_msg: bool = False,
+        force_notify: bool = False,
     ) -> None:
-        """Set the presence state of the user. """
+        """Set the presence state of the user.
+
+        Args:
+            target_user: The ID of the user to set the presence state of.
+            state: The presence state as a JSON dictionary.
+            ignore_status_msg: True to ignore the "status_msg" field of the `state` dict.
+                If False, the user's current status will be updated.
+            force_notify: Whether to force notification of the update to clients.
+        """
 
     @abc.abstractmethod
     async def bump_presence_active_time(self, user: UserID):
@@ -296,6 +308,51 @@ async def maybe_send_presence_to_interested_destinations(
         for destinations, states in hosts_and_states:
             self._federation.send_presence_to_destinations(states, destinations)
 
+    async def send_full_presence_to_users(self, user_ids: Collection[str]):
+        """
+        Adds to the list of users who should receive a full snapshot of presence
+        upon their next sync. Note that this only works for local users.
+
+        Then, grabs the current presence state for a given set of users and adds it
+        to the top of the presence stream.
+
+        Args:
+            user_ids: The IDs of the local users to send full presence to.
+        """
+        # Retrieve one of the users from the given set
+        if not user_ids:
+            raise Exception(
+                "send_full_presence_to_users must be called with at least one user"
+            )
+        user_id = next(iter(user_ids))
+
+        # Mark all users as receiving full presence on their next sync
+        await self.store.add_users_to_send_full_presence_to(user_ids)
+
+        # Add a new entry to the presence stream. Since we use stream tokens to determine whether a
+        # local user should receive a full snapshot of presence when they sync, we need to bump the
+        # presence stream so that subsequent syncs with no presence activity in between won't result
+        # in the client receiving multiple full snapshots of presence.
+        #
+        # If we bump the stream ID, then the user will get a higher stream token next sync, and thus
+        # correctly won't receive a second snapshot.
+
+        # Get the current presence state for one of the users (defaults to offline if not found)
+        current_presence_state = await self.get_state(UserID.from_string(user_id))
+
+        # Convert the UserPresenceState object into a serializable dict
+        state = {
+            "presence": current_presence_state.state,
+            "status_msg": current_presence_state.status_msg,
+        }
+
+        # Copy the presence state to the tip of the presence stream.
+
+        # We set force_notify=True here so that this presence update is guaranteed to
+        # increment the presence stream ID (which resending the current user's presence
+        # otherwise would not do).
+        await self.set_state(UserID.from_string(user_id), state, force_notify=True)
+
 
 class _NullContextManager(ContextManager[None]):
     """A context manager which does nothing."""
@@ -480,8 +537,17 @@ async def set_state(
         target_user: UserID,
         state: JsonDict,
         ignore_status_msg: bool = False,
+        force_notify: bool = False,
     ) -> None:
-        """Set the presence state of the user."""
+        """Set the presence state of the user.
+
+        Args:
+            target_user: The ID of the user to set the presence state of.
+            state: The presence state as a JSON dictionary.
+            ignore_status_msg: True to ignore the "status_msg" field of the `state` dict.
+                If False, the user's current status will be updated.
+            force_notify: Whether to force notification of the update to clients.
+        """
         presence = state["presence"]
 
         valid_presence = (
@@ -508,6 +574,7 @@ async def set_state(
             user_id=user_id,
             state=state,
             ignore_status_msg=ignore_status_msg,
+            force_notify=force_notify,
         )
 
     async def bump_presence_active_time(self, user: UserID) -> None:
@@ -677,13 +744,19 @@ async def _persist_unpersisted_changes(self) -> None:
             [self.user_to_current_state[user_id] for user_id in unpersisted]
         )
 
-    async def _update_states(self, new_states: Iterable[UserPresenceState]) -> None:
+    async def _update_states(
+        self, new_states: Iterable[UserPresenceState], force_notify: bool = False
+    ) -> None:
        """Updates presence of users. Sets the appropriate timeouts.
Pokes the notifier and federation if and only if the changed presence state
        should be sent to clients/servers.

        Args:
            new_states: The new user presence state updates to process.
+            force_notify: Whether to force notifying clients of this presence state update,
+                even if it doesn't change the state of a user's presence (e.g. online -> online).
+                This is currently used to bump the max presence stream ID without changing any
+                user's presence (see PresenceHandler.send_full_presence_to_users).
        """
        now = self.clock.time_msec()

@@ -720,6 +793,9 @@ async def _update_states(self, new_states: Iterable[UserPresenceState]) -> None:
                 now=now,
             )
 
+            if force_notify:
+                should_notify = True
+
             self.user_to_current_state[user_id] = new_state
 
             if should_notify:
@@ -1058,9 +1134,21 @@ async def incoming_presence(self, origin: str, content: JsonDict) -> None:
         await self._update_states(updates)
 
     async def set_state(
-        self, target_user: UserID, state: JsonDict, ignore_status_msg: bool = False
+        self,
+        target_user: UserID,
+        state: JsonDict,
+        ignore_status_msg: bool = False,
+        force_notify: bool = False,
     ) -> None:
-        """Set the presence state of the user."""
+        """Set the presence state of the user.
+
+        Args:
+            target_user: The ID of the user to set the presence state of.
+            state: The presence state as a JSON dictionary.
+            ignore_status_msg: True to ignore the "status_msg" field of the `state` dict.
+                If False, the user's current status will be updated.
+            force_notify: Whether to force notification of the update to clients.
+        """
         status_msg = state.get("status_msg", None)
         presence = state["presence"]
 
@@ -1091,7 +1179,9 @@ async def set_state(
         ):
             new_fields["last_active_ts"] = self.clock.time_msec()
 
-        await self._update_states([prev_state.copy_and_replace(**new_fields)])
+        await self._update_states(
+            [prev_state.copy_and_replace(**new_fields)], force_notify=force_notify
+        )
 
     async def is_visible(self, observed_user: UserID, observer_user: UserID) -> bool:
         """Returns whether a user can see another user's presence."""
@@ -1389,11 +1479,10 @@ def __init__(self, hs: "HomeServer"):
         #
         #   Presence -> Notifier -> PresenceEventSource -> Presence
         #
-        # Same with get_module_api, get_presence_router
+        # Same with get_presence_router:
         #
         #   AuthHandler -> Notifier -> PresenceEventSource -> ModuleApi -> AuthHandler
         self.get_presence_handler = hs.get_presence_handler
-        self.get_module_api = hs.get_module_api
         self.get_presence_router = hs.get_presence_router
         self.clock = hs.get_clock()
         self.store = hs.get_datastore()
@@ -1424,16 +1513,21 @@ async def get_new_events(
 
         stream_change_cache = self.store.presence_stream_cache
 
         with Measure(self.clock, "presence.get_new_events"):
-            if user_id in self.get_module_api()._send_full_presence_to_local_users:
-                # This user has been specified by a module to receive all current, online
-                # user presence. Removing from_key and setting include_offline to false
-                # will do effectively this.
-                from_key = None
-                include_offline = False
-
             if from_key is not None:
                 from_key = int(from_key)
 
+            # Check if this user should receive all current, online user presence. We only
+            # bother to do this if from_key is set, as otherwise the user will receive all
+            # user presence anyway.
+            if await self.store.should_user_receive_full_presence_with_token(
+                user_id, from_key
+            ):
+                # This user has been specified by a module to receive all current, online
+                # user presence. Removing from_key and setting include_offline to false
+                # will effectively do this.
+ from_key = None + include_offline = False + max_token = self.store.get_current_presence_token() if from_key == max_token: # This is necessary as due to the way stream ID generators work @@ -1467,12 +1561,6 @@ async def get_new_events( user_id, include_offline, from_key ) - # Remove the user from the list of users to receive all presence - if user_id in self.get_module_api()._send_full_presence_to_local_users: - self.get_module_api()._send_full_presence_to_local_users.remove( - user_id - ) - return presence_updates, max_token # Make mypy happy. users_interested_in should now be a set @@ -1522,10 +1610,6 @@ async def get_new_events( ) presence_updates = list(users_to_state.values()) - # Remove the user from the list of users to receive all presence - if user_id in self.get_module_api()._send_full_presence_to_local_users: - self.get_module_api()._send_full_presence_to_local_users.remove(user_id) - if not include_offline: # Filter out offline presence states presence_updates = self._filter_offline_presence_state(presence_updates) diff --git a/synapse/module_api/__init__.py b/synapse/module_api/__init__.py index a1a2b9aeccd3..cecdc96bf517 100644 --- a/synapse/module_api/__init__.py +++ b/synapse/module_api/__init__.py @@ -56,14 +56,6 @@ def __init__(self, hs, auth_handler): self._http_client = hs.get_simple_http_client() # type: SimpleHttpClient self._public_room_list_manager = PublicRoomListManager(hs) - # The next time these users sync, they will receive the current presence - # state of all local users. Users are added by send_local_online_presence_to, - # and removed after a successful sync. - # - # We make this a private variable to deter modules from accessing it directly, - # though other classes in Synapse will still do so. - self._send_full_presence_to_local_users = set() - @property def http_client(self): """Allows making outbound HTTP requests to remote resources. @@ -405,39 +397,44 @@ async def send_local_online_presence_to(self, users: Iterable[str]) -> None: Updates to remote users will be sent immediately, whereas local users will receive them on their next sync attempt. - Note that this method can only be run on the main or federation_sender worker - processes. + Note that this method can only be run on the process that is configured to write to the + presence stream. By default this is the main process. """ - if not self._hs.should_send_federation(): + if self._hs._instance_name not in self._hs.config.worker.writers.presence: raise Exception( "send_local_online_presence_to can only be run " - "on processes that send federation", + "on the process that is configured to write to the " + "presence stream (by default this is the main process)", ) + local_users = set() + remote_users = set() for user in users: if self._hs.is_mine_id(user): - # Modify SyncHandler._generate_sync_entry_for_presence to call - # presence_source.get_new_events with an empty `from_key` if - # that user's ID were in a list modified by ModuleApi somewhere. - # That user would then get all presence state on next incremental sync. - - # Force a presence initial_sync for this user next time - self._send_full_presence_to_local_users.add(user) + local_users.add(user) else: - # Retrieve presence state for currently online users that this user - # is considered interested in - presence_events, _ = await self._presence_stream.get_new_events( - UserID.from_string(user), from_key=None, include_offline=False - ) - - # Send to remote destinations. 
-
-                # We pull out the presence handler here to break a cyclic
-                # dependency between the presence router and module API.
-                presence_handler = self._hs.get_presence_handler()
-                await presence_handler.maybe_send_presence_to_interested_destinations(
-                    presence_events
-                )
+                remote_users.add(user)
+
+        # We pull out the presence handler here to break a cyclic
+        # dependency between the presence router and module API.
+        presence_handler = self._hs.get_presence_handler()
+
+        if local_users:
+            # Force a presence initial_sync for these users next time they sync.
+            await presence_handler.send_full_presence_to_users(local_users)
+
+        for user in remote_users:
+            # Retrieve presence state for currently online users that this user
+            # is considered interested in.
+            presence_events, _ = await self._presence_stream.get_new_events(
+                UserID.from_string(user), from_key=None, include_offline=False
+            )
+
+            # Send to remote destinations.
+            destination = UserID.from_string(user).domain
+            presence_handler.get_federation_queue().send_presence_to_destinations(
+                presence_events, [destination]
+            )
 
 
 class PublicRoomListManager:
diff --git a/synapse/replication/http/presence.py b/synapse/replication/http/presence.py
index f25307620d55..bb0024795349 100644
--- a/synapse/replication/http/presence.py
+++ b/synapse/replication/http/presence.py
@@ -73,6 +73,7 @@ class ReplicationPresenceSetState(ReplicationEndpoint):
         {
             "state": { ... },
             "ignore_status_msg": false,
+            "force_notify": false
         }
 
         200 OK
@@ -91,17 +92,23 @@ def __init__(self, hs: "HomeServer"):
         self._presence_handler = hs.get_presence_handler()
 
     @staticmethod
-    async def _serialize_payload(user_id, state, ignore_status_msg=False):
+    async def _serialize_payload(
+        user_id, state, ignore_status_msg=False, force_notify=False
+    ):
         return {
             "state": state,
             "ignore_status_msg": ignore_status_msg,
+            "force_notify": force_notify,
         }
 
     async def _handle_request(self, request, user_id):
         content = parse_json_object_from_request(request)
 
         await self._presence_handler.set_state(
-            UserID.from_string(user_id), content["state"], content["ignore_status_msg"]
+            UserID.from_string(user_id),
+            content["state"],
+            content["ignore_status_msg"],
+            content["force_notify"],
         )
 
         return (
diff --git a/synapse/rest/admin/server_notice_servlet.py b/synapse/rest/admin/server_notice_servlet.py
index cc3ab5854b0b..b5e4c474efc8 100644
--- a/synapse/rest/admin/server_notice_servlet.py
+++ b/synapse/rest/admin/server_notice_servlet.py
@@ -54,7 +54,6 @@ def __init__(self, hs: "HomeServer"):
         self.hs = hs
         self.auth = hs.get_auth()
         self.txns = HttpTransactionCache(hs)
-        self.snm = hs.get_server_notices_manager()
 
     def register(self, json_resource: HttpServer):
         PATTERN = "/send_server_notice"
@@ -77,7 +76,10 @@ async def on_POST(
         event_type = body.get("type", EventTypes.Message)
         state_key = body.get("state_key")
 
-        if not self.snm.is_enabled():
+        # We grab the server notices manager here as its initialisation has a check for worker processes,
+        # but worker processes still need to initialise SendServerNoticeServlet (as it is part of the
+        # admin API).
+        if not self.hs.get_server_notices_manager().is_enabled():
             raise SynapseError(400, "Server notices are not enabled on this server")
 
         user_id = body["user_id"]
@@ -85,7 +87,7 @@ async def on_POST(
         if not self.hs.is_mine_id(user_id):
             raise SynapseError(400, "Server notices can only be sent to local users")
 
-        event = await self.snm.send_notice(
+        event = await self.hs.get_server_notices_manager().send_notice(
             user_id=body["user_id"],
             type=event_type,
             state_key=state_key,
diff --git a/synapse/storage/databases/main/presence.py b/synapse/storage/databases/main/presence.py
index db22fab23ea0..669a2af8845a 100644
--- a/synapse/storage/databases/main/presence.py
+++ b/synapse/storage/databases/main/presence.py
@@ -12,7 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
-from typing import TYPE_CHECKING, Dict, List, Tuple
+from typing import TYPE_CHECKING, Dict, Iterable, List, Tuple
 
 from synapse.api.presence import PresenceState, UserPresenceState
 from synapse.replication.tcp.streams import PresenceStream
@@ -57,6 +57,7 @@ def __init__(
             db_conn, "presence_stream", "stream_id"
         )
 
+        self.hs = hs
         self._presence_on_startup = self._get_active_presence(db_conn)
 
         presence_cache_prefill, min_presence_val = self.db_pool.get_cache_dict(
@@ -210,6 +211,61 @@ async def get_presence_for_users(self, user_ids):
 
         return {row["user_id"]: UserPresenceState(**row) for row in rows}
 
+    async def should_user_receive_full_presence_with_token(
+        self,
+        user_id: str,
+        from_token: int,
+    ) -> bool:
+        """Check whether the given user should receive full presence using the stream token
+        they're updating from.
+
+        Args:
+            user_id: The ID of the user to check.
+            from_token: The stream token included in their /sync token.
+
+        Returns:
+            True if the user should have full presence sent to them, False otherwise.
+        """
+
+        def _should_user_receive_full_presence_with_token_txn(txn):
+            sql = """
+                SELECT 1 FROM users_to_send_full_presence_to
+                WHERE user_id = ?
+                AND presence_stream_id >= ?
+            """
+            txn.execute(sql, (user_id, from_token))
+            return bool(txn.fetchone())
+
+        return await self.db_pool.runInteraction(
+            "should_user_receive_full_presence_with_token",
+            _should_user_receive_full_presence_with_token_txn,
+        )
+
+    async def add_users_to_send_full_presence_to(self, user_ids: Iterable[str]):
+        """Adds to the list of users who should receive a full snapshot of presence
+        upon their next sync.
+
+        Args:
+            user_ids: An iterable of user IDs.
+        """
+        # Add user entries to the table, updating the presence_stream_id column if the user already
+        # exists in the table.
+        await self.db_pool.simple_upsert_many(
+            table="users_to_send_full_presence_to",
+            key_names=("user_id",),
+            key_values=[(user_id,) for user_id in user_ids],
+            value_names=("presence_stream_id",),
+            # We save the current presence stream ID token along with the user ID entry so
+            # that when a user /sync's, even if they are syncing multiple times across separate
+            # devices at different times, each device will receive full presence once - when
+            # the presence stream ID in their sync token is less than the one in the table
+            # for their user ID.
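+            #
+            # As an illustrative example (the numbers are assumed, not taken from
+            # this change): a user marked here while the presence stream is at ID 7
+            # satisfies `presence_stream_id >= from_token` for a /sync whose token
+            # embeds presence stream ID 5, so that device receives a full snapshot.
+            # The sync response then hands back a token at the new stream position,
+            # so a later sync from that token no longer matches the row, and the
+            # snapshot is delivered exactly once per device.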
+ value_values=( + (self._presence_id_gen.get_current_token(),) for _ in user_ids + ), + desc="add_users_to_send_full_presence_to", + ) + async def get_presence_for_all_users( self, include_offline: bool = True, diff --git a/synapse/storage/schema/main/delta/59/13users_to_send_full_presence_to.sql b/synapse/storage/schema/main/delta/59/13users_to_send_full_presence_to.sql new file mode 100644 index 000000000000..07b0f53ecfa1 --- /dev/null +++ b/synapse/storage/schema/main/delta/59/13users_to_send_full_presence_to.sql @@ -0,0 +1,34 @@ +/* Copyright 2021 The Matrix.org Foundation C.I.C + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +-- Add a table that keeps track of a list of users who should, upon their next +-- sync request, receive presence for all currently online users that they are +-- "interested" in. + +-- The motivation for a DB table over an in-memory list is so that this list +-- can be added to and retrieved from by any worker. Specifically, we don't +-- want to duplicate work across multiple sync workers. + +CREATE TABLE IF NOT EXISTS users_to_send_full_presence_to( + -- The user ID to send full presence to. + user_id TEXT PRIMARY KEY, + -- A presence stream ID token - the current presence stream token when the row was last upserted. + -- If a user calls /sync and this token is part of the update they're to receive, we also include + -- full user presence in the response. + -- This allows multiple devices for a user to receive full presence whenever they next call /sync. + presence_stream_id BIGINT, + FOREIGN KEY (user_id) + REFERENCES users (name) +); \ No newline at end of file diff --git a/tests/events/test_presence_router.py b/tests/events/test_presence_router.py index 01d257307c6b..875b0d0a114b 100644 --- a/tests/events/test_presence_router.py +++ b/tests/events/test_presence_router.py @@ -302,11 +302,18 @@ def test_send_local_online_presence_to_with_module(self): ) # Check that the expected presence updates were sent - expected_users = [ + # We explicitly compare using sets as we expect that calling + # module_api.send_local_online_presence_to will create a presence + # update that is a duplicate of the specified user's current presence. + # These are sent to clients and will be picked up below, thus we use a + # set to deduplicate. We're just interested that non-offline updates were + # sent out for each user ID. 
+ expected_users = { self.other_user_id, self.presence_receiving_user_one_id, self.presence_receiving_user_two_id, - ] + } + found_users = set() calls = ( self.hs.get_federation_transport_client().send_transaction.call_args_list @@ -326,12 +333,12 @@ def test_send_local_online_presence_to_with_module(self): # EDUs can contain multiple presence updates for presence_update in edu["content"]["push"]: # Check for presence updates that contain the user IDs we're after - expected_users.remove(presence_update["user_id"]) + found_users.add(presence_update["user_id"]) # Ensure that no offline states are being sent out self.assertNotEqual(presence_update["presence"], "offline") - self.assertEqual(len(expected_users), 0) + self.assertEqual(found_users, expected_users) def send_presence_update( diff --git a/tests/module_api/test_api.py b/tests/module_api/test_api.py index 742ad14b8c3a..2c68b9a13c85 100644 --- a/tests/module_api/test_api.py +++ b/tests/module_api/test_api.py @@ -13,6 +13,8 @@ # limitations under the License. from unittest.mock import Mock +from twisted.internet import defer + from synapse.api.constants import EduTypes from synapse.events import EventBase from synapse.federation.units import Transaction @@ -22,11 +24,13 @@ from synapse.types import create_requester from tests.events.test_presence_router import send_presence_update, sync_presence +from tests.replication._base import BaseMultiWorkerStreamTestCase from tests.test_utils.event_injection import inject_member_event -from tests.unittest import FederatingHomeserverTestCase, override_config +from tests.unittest import HomeserverTestCase, override_config +from tests.utils import USE_POSTGRES_FOR_TESTS -class ModuleApiTestCase(FederatingHomeserverTestCase): +class ModuleApiTestCase(HomeserverTestCase): servlets = [ admin.register_servlets, login.register_servlets, @@ -217,97 +221,16 @@ def test_public_rooms(self): ) self.assertFalse(is_in_public_rooms) - # The ability to send federation is required by send_local_online_presence_to. 
- @override_config({"send_federation": True}) def test_send_local_online_presence_to(self): - """Tests that send_local_presence_to_users sends local online presence to local users.""" - # Create a user who will send presence updates - self.presence_receiver_id = self.register_user("presence_receiver", "monkey") - self.presence_receiver_tok = self.login("presence_receiver", "monkey") - - # And another user that will send presence updates out - self.presence_sender_id = self.register_user("presence_sender", "monkey") - self.presence_sender_tok = self.login("presence_sender", "monkey") - - # Put them in a room together so they will receive each other's presence updates - room_id = self.helper.create_room_as( - self.presence_receiver_id, - tok=self.presence_receiver_tok, - ) - self.helper.join(room_id, self.presence_sender_id, tok=self.presence_sender_tok) - - # Presence sender comes online - send_presence_update( - self, - self.presence_sender_id, - self.presence_sender_tok, - "online", - "I'm online!", - ) - - # Presence receiver should have received it - presence_updates, sync_token = sync_presence(self, self.presence_receiver_id) - self.assertEqual(len(presence_updates), 1) - - presence_update = presence_updates[0] # type: UserPresenceState - self.assertEqual(presence_update.user_id, self.presence_sender_id) - self.assertEqual(presence_update.state, "online") - - # Syncing again should result in no presence updates - presence_updates, sync_token = sync_presence( - self, self.presence_receiver_id, sync_token - ) - self.assertEqual(len(presence_updates), 0) - - # Trigger sending local online presence - self.get_success( - self.module_api.send_local_online_presence_to( - [ - self.presence_receiver_id, - ] - ) - ) - - # Presence receiver should have received online presence again - presence_updates, sync_token = sync_presence( - self, self.presence_receiver_id, sync_token - ) - self.assertEqual(len(presence_updates), 1) - - presence_update = presence_updates[0] # type: UserPresenceState - self.assertEqual(presence_update.user_id, self.presence_sender_id) - self.assertEqual(presence_update.state, "online") - - # Presence sender goes offline - send_presence_update( - self, - self.presence_sender_id, - self.presence_sender_tok, - "offline", - "I slink back into the darkness.", - ) - - # Trigger sending local online presence - self.get_success( - self.module_api.send_local_online_presence_to( - [ - self.presence_receiver_id, - ] - ) - ) - - # Presence receiver should *not* have received offline state - presence_updates, sync_token = sync_presence( - self, self.presence_receiver_id, sync_token - ) - self.assertEqual(len(presence_updates), 0) + # Test sending local online presence to users from the main process + _test_sending_local_online_presence_to_local_user(self, test_with_workers=False) @override_config({"send_federation": True}) def test_send_local_online_presence_to_federation(self): """Tests that send_local_presence_to_users sends local online presence to remote users.""" # Create a user who will send presence updates - self.presence_sender_id = self.register_user("presence_sender", "monkey") - self.presence_sender_tok = self.login("presence_sender", "monkey") + self.presence_sender_id = self.register_user("presence_sender1", "monkey") + self.presence_sender_tok = self.login("presence_sender1", "monkey") # And a room they're a part of room_id = self.helper.create_room_as( @@ -374,3 +297,209 @@ def test_send_local_online_presence_to_federation(self): found_update = True 
         self.assertTrue(found_update)
+
+
+class ModuleApiWorkerTestCase(BaseMultiWorkerStreamTestCase):
+    """For testing ModuleApi functionality in a multi-worker setup"""
+
+    # Testing stream ID replication from the main to worker processes requires postgres
+    # (due to needing `MultiWriterIdGenerator`).
+    if not USE_POSTGRES_FOR_TESTS:
+        skip = "Requires Postgres"
+
+    servlets = [
+        admin.register_servlets,
+        login.register_servlets,
+        room.register_servlets,
+        presence.register_servlets,
+    ]
+
+    def default_config(self):
+        conf = super().default_config()
+        conf["redis"] = {"enabled": "true"}
+        conf["stream_writers"] = {"presence": ["presence_writer"]}
+        conf["instance_map"] = {
+            "presence_writer": {"host": "testserv", "port": 1001},
+        }
+        return conf
+
+    def prepare(self, reactor, clock, homeserver):
+        self.module_api = homeserver.get_module_api()
+        self.sync_handler = homeserver.get_sync_handler()
+
+    def test_send_local_online_presence_to_workers(self):
+        # Test sending local online presence to users from a worker process
+        _test_sending_local_online_presence_to_local_user(self, test_with_workers=True)
+
+
+def _test_sending_local_online_presence_to_local_user(
+    test_case: HomeserverTestCase, test_with_workers: bool = False
+):
+    """Tests that send_local_online_presence_to sends local online presence to local users.
+
+    This simultaneously tests two different use cases:
+        * Testing that this method works when either called from a worker or the main process.
+          - We test this by calling this method from both a TestCase that runs in monolith mode, and one that
+            runs with a main and generic_worker.
+        * Testing that multiple devices syncing simultaneously will all receive a snapshot of local,
+          online presence - but only once per device.
+
+    Args:
+        test_with_workers: If True, this method will call ModuleApi.send_local_online_presence_to on a
+            worker process. The test users will still sync with the main process. The purpose of testing
+            with a worker is to check whether a Synapse module running on a worker can inform other workers/
+            the main process that they should include additional presence when a user next syncs.
+    """
+    if test_with_workers:
+        # Create a worker process to make module_api calls against
+        worker_hs = test_case.make_worker_hs(
+            "synapse.app.generic_worker", {"worker_name": "presence_writer"}
+        )
+
+    # Create a user who will receive presence updates
+    test_case.presence_receiver_id = test_case.register_user(
+        "presence_receiver1", "monkey"
+    )
+    test_case.presence_receiver_tok = test_case.login("presence_receiver1", "monkey")
+
+    # And another user that will send presence updates out
+    test_case.presence_sender_id = test_case.register_user("presence_sender2", "monkey")
+    test_case.presence_sender_tok = test_case.login("presence_sender2", "monkey")
+
+    # Put them in a room together so they will receive each other's presence updates
+    room_id = test_case.helper.create_room_as(
+        test_case.presence_receiver_id,
+        tok=test_case.presence_receiver_tok,
+    )
+    test_case.helper.join(
+        room_id, test_case.presence_sender_id, tok=test_case.presence_sender_tok
+    )
+
+    # Presence sender comes online
+    send_presence_update(
+        test_case,
+        test_case.presence_sender_id,
+        test_case.presence_sender_tok,
+        "online",
+        "I'm online!",
+    )
+
+    # Presence receiver should have received it
+    presence_updates, sync_token = sync_presence(
+        test_case, test_case.presence_receiver_id
+    )
+    test_case.assertEqual(len(presence_updates), 1)
+
+    presence_update = presence_updates[0]  # type: UserPresenceState
+    test_case.assertEqual(presence_update.user_id, test_case.presence_sender_id)
+    test_case.assertEqual(presence_update.state, "online")
+
+    if test_with_workers:
+        # Replicate the current sync presence token from the main process to the worker process.
+        # We need to do this so that the worker process knows the current presence stream ID to
+        # insert into the database when we call ModuleApi.send_local_online_presence_to.
+        test_case.replicate()
+
+    # Syncing again should result in no presence updates
+    presence_updates, sync_token = sync_presence(
+        test_case, test_case.presence_receiver_id, sync_token
+    )
+    test_case.assertEqual(len(presence_updates), 0)
+
+    # We do an (initial) sync with a second "device" now, getting a new sync token.
+    # We'll use this in a moment.
+    _, sync_token_second_device = sync_presence(
+        test_case, test_case.presence_receiver_id
+    )
+
+    # Determine which process (main or worker) to call ModuleApi.send_local_online_presence_to on
+    if test_with_workers:
+        module_api_to_use = worker_hs.get_module_api()
+    else:
+        module_api_to_use = test_case.module_api
+
+    # Trigger sending local online presence. We expect this information
+    # to be saved to the database where all processes can access it.
+    # Note that we're syncing via the master.
+    d = module_api_to_use.send_local_online_presence_to(
+        [
+            test_case.presence_receiver_id,
+        ]
+    )
+    d = defer.ensureDeferred(d)
+
+    if test_with_workers:
+        # In order for the required presence_set_state replication request to occur between the
+        # worker and main process, we need to pump the reactor. Otherwise, the coordinator that
+        # reads the request on the main process won't do so, and the request will time out.
+        while not d.called:
+            test_case.reactor.advance(0.1)
+
+    test_case.get_success(d)
+
+    # The presence receiver should have received online presence again.
+ presence_updates, sync_token = sync_presence( + test_case, test_case.presence_receiver_id, sync_token + ) + test_case.assertEqual(len(presence_updates), 1) + + presence_update = presence_updates[0] # type: UserPresenceState + test_case.assertEqual(presence_update.user_id, test_case.presence_sender_id) + test_case.assertEqual(presence_update.state, "online") + + # We attempt to sync with the second sync token we received above - just to check that + # multiple syncing devices will each receive the necessary online presence. + presence_updates, sync_token_second_device = sync_presence( + test_case, test_case.presence_receiver_id, sync_token_second_device + ) + test_case.assertEqual(len(presence_updates), 1) + + presence_update = presence_updates[0] # type: UserPresenceState + test_case.assertEqual(presence_update.user_id, test_case.presence_sender_id) + test_case.assertEqual(presence_update.state, "online") + + # However, if we now sync with either "device", we won't receive another burst of online presence + # until the API is called again sometime in the future + presence_updates, sync_token = sync_presence( + test_case, test_case.presence_receiver_id, sync_token + ) + + # Now we check that we don't receive *offline* updates using ModuleApi.send_local_online_presence_to. + + # Presence sender goes offline + send_presence_update( + test_case, + test_case.presence_sender_id, + test_case.presence_sender_tok, + "offline", + "I slink back into the darkness.", + ) + + # Presence receiver should have received the updated, offline state + presence_updates, sync_token = sync_presence( + test_case, test_case.presence_receiver_id, sync_token + ) + test_case.assertEqual(len(presence_updates), 1) + + # Now trigger sending local online presence. + d = module_api_to_use.send_local_online_presence_to( + [ + test_case.presence_receiver_id, + ] + ) + d = defer.ensureDeferred(d) + + if test_with_workers: + # In order for the required presence_set_state replication request to occur between the + # worker and main process, we need to pump the reactor. Otherwise, the coordinator that + # reads the request on the main process won't do so, and the request will time out. + while not d.called: + test_case.reactor.advance(0.1) + + test_case.get_success(d) + + # Presence receiver should *not* have received offline state + presence_updates, sync_token = sync_presence( + test_case, test_case.presence_receiver_id, sync_token + ) + test_case.assertEqual(len(presence_updates), 0) diff --git a/tests/replication/test_sharded_event_persister.py b/tests/replication/test_sharded_event_persister.py index d739eb6b17f0..5eca5c165d0a 100644 --- a/tests/replication/test_sharded_event_persister.py +++ b/tests/replication/test_sharded_event_persister.py @@ -30,7 +30,7 @@ class EventPersisterShardTestCase(BaseMultiWorkerStreamTestCase): """Checks event persisting sharding works""" # Event persister sharding requires postgres (due to needing - # `MutliWriterIdGenerator`). + # `MultiWriterIdGenerator`). if not USE_POSTGRES_FOR_TESTS: skip = "Requires Postgres" From ac6bfcd52f03e9574324978f83a281cf35f4ea89 Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Tue, 18 May 2021 12:17:04 -0400 Subject: [PATCH 15/52] Refactor checking restricted join rules (#10007) To be more consistent with similar code. The check now automatically raises an AuthError instead of passing back a boolean. It also absorbs some shared logic between callers. 
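As a rough sketch of what this means for callers (illustrative only: the
handler and argument names below come from this patch, but the surrounding
code is simplified):

```python
from synapse.api.errors import AuthError

async def before(handler, state_ids, room_version, user_id):
    # Previously, each caller interpreted a boolean and raised itself.
    if not await handler.can_join_without_invite(state_ids, room_version, user_id):
        raise AuthError(
            403,
            "You do not belong to any of the required spaces to join this room.",
        )

async def after(handler, state_ids, room_version, user_id, prev_member_event):
    # Now the handler raises AuthError itself, and also returns early for
    # users who are already joined or invited (logic the callers previously
    # duplicated).
    await handler.check_restricted_join_rules(
        state_ids, room_version, user_id, prev_member_event
    )
```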
---
 changelog.d/10007.feature       |  1 +
 synapse/handlers/event_auth.py  | 51 ++++++++++++++++++++++-----------
 synapse/handlers/federation.py  | 29 ++++++-------------
 synapse/handlers/room_member.py | 20 ++++---------
 4 files changed, 50 insertions(+), 51 deletions(-)
 create mode 100644 changelog.d/10007.feature

diff --git a/changelog.d/10007.feature b/changelog.d/10007.feature
new file mode 100644
index 000000000000..2c655350c04f
--- /dev/null
+++ b/changelog.d/10007.feature
@@ -0,0 +1 @@
+Experimental support to allow a user who could join a restricted room to view it in the spaces summary.
diff --git a/synapse/handlers/event_auth.py b/synapse/handlers/event_auth.py
index eff639f40760..5b2fe103e742 100644
--- a/synapse/handlers/event_auth.py
+++ b/synapse/handlers/event_auth.py
@@ -11,10 +11,12 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-from typing import TYPE_CHECKING
+from typing import TYPE_CHECKING, Optional
 
-from synapse.api.constants import EventTypes, JoinRules
+from synapse.api.constants import EventTypes, JoinRules, Membership
+from synapse.api.errors import AuthError
 from synapse.api.room_versions import RoomVersion
+from synapse.events import EventBase
 from synapse.types import StateMap
 
 if TYPE_CHECKING:
@@ -29,44 +31,58 @@ class EventAuthHandler:
     def __init__(self, hs: "HomeServer"):
         self._store = hs.get_datastore()
 
-    async def can_join_without_invite(
-        self, state_ids: StateMap[str], room_version: RoomVersion, user_id: str
-    ) -> bool:
+    async def check_restricted_join_rules(
+        self,
+        state_ids: StateMap[str],
+        room_version: RoomVersion,
+        user_id: str,
+        prev_member_event: Optional[EventBase],
+    ) -> None:
         """
-        Check whether a user can join a room without an invite.
+        Check whether a user can join a room without an invite due to restricted join rules.
 
         When joining a room with restricted join rules (as defined in MSC3083),
-        the membership of spaces must be checked during join.
+        the membership of spaces must be checked during a room join.
 
         Args:
             state_ids: The state of the room as it currently is.
             room_version: The room version of the room being joined.
             user_id: The user joining the room.
+            prev_member_event: The current membership event for this user.
 
-        Returns:
-            True if the user can join the room, false otherwise.
+        Raises:
+            AuthError if the user cannot join the room.
         """
+        # If the member is invited or currently joined, then nothing to do.
+        if prev_member_event and (
+            prev_member_event.membership in (Membership.JOIN, Membership.INVITE)
+        ):
+            return
+
         # This only applies to room versions which support the new join rule.
         if not room_version.msc3083_join_rules:
-            return True
+            return
 
         # If there's no join rule, then it defaults to invite (so this doesn't apply).
         join_rules_event_id = state_ids.get((EventTypes.JoinRules, ""), None)
         if not join_rules_event_id:
-            return True
+            return
 
         # If the join rule is not restricted, this doesn't apply.
         join_rules_event = await self._store.get_event(join_rules_event_id)
         if join_rules_event.content.get("join_rule") != JoinRules.MSC3083_RESTRICTED:
-            return True
+            return
 
         # If allowed is of the wrong form, then only allow invited users.
         allowed_spaces = join_rules_event.content.get("allow", [])
         if not isinstance(allowed_spaces, list):
-            return False
+            allowed_spaces = ()
 
         # Get the list of joined rooms and see if there's an overlap.
- joined_rooms = await self._store.get_rooms_for_user(user_id) + if allowed_spaces: + joined_rooms = await self._store.get_rooms_for_user(user_id) + else: + joined_rooms = () # Pull out the other room IDs, invalid data gets filtered. for space in allowed_spaces: @@ -80,7 +96,10 @@ async def can_join_without_invite( # The user was joined to one of the spaces specified, they can join # this room! if space_id in joined_rooms: - return True + return # The user was not in any of the required spaces. - return False + raise AuthError( + 403, + "You do not belong to any of the required spaces to join this room.", + ) diff --git a/synapse/handlers/federation.py b/synapse/handlers/federation.py index 798ed75b30fa..678f6b770797 100644 --- a/synapse/handlers/federation.py +++ b/synapse/handlers/federation.py @@ -1668,28 +1668,17 @@ async def on_send_join_request(self, origin: str, pdu: EventBase) -> JsonDict: # Check if the user is already in the room or invited to the room. user_id = event.state_key prev_member_event_id = prev_state_ids.get((EventTypes.Member, user_id), None) - newly_joined = True - user_is_invited = False + prev_member_event = None if prev_member_event_id: prev_member_event = await self.store.get_event(prev_member_event_id) - newly_joined = prev_member_event.membership != Membership.JOIN - user_is_invited = prev_member_event.membership == Membership.INVITE - - # If the member is not already in the room, and not invited, check if - # they should be allowed access via membership in a space. - if ( - newly_joined - and not user_is_invited - and not await self._event_auth_handler.can_join_without_invite( - prev_state_ids, - event.room_version, - user_id, - ) - ): - raise AuthError( - 403, - "You do not belong to any of the required spaces to join this room.", - ) + + # Check if the member should be allowed access via membership in a space. + await self._event_auth_handler.check_restricted_join_rules( + prev_state_ids, + event.room_version, + user_id, + prev_member_event, + ) # Persist the event. await self._auth_and_persist_event(origin, event, context) diff --git a/synapse/handlers/room_member.py b/synapse/handlers/room_member.py index 9a092da71597..d6fc43e7984c 100644 --- a/synapse/handlers/room_member.py +++ b/synapse/handlers/room_member.py @@ -260,25 +260,15 @@ async def _local_membership_update( if event.membership == Membership.JOIN: newly_joined = True - user_is_invited = False + prev_member_event = None if prev_member_event_id: prev_member_event = await self.store.get_event(prev_member_event_id) newly_joined = prev_member_event.membership != Membership.JOIN - user_is_invited = prev_member_event.membership == Membership.INVITE - # If the member is not already in the room and is not accepting an invite, - # check if they should be allowed access via membership in a space. - if ( - newly_joined - and not user_is_invited - and not await self.event_auth_handler.can_join_without_invite( - prev_state_ids, event.room_version, user_id - ) - ): - raise AuthError( - 403, - "You do not belong to any of the required spaces to join this room.", - ) + # Check if the member should be allowed access via membership in a space. + await self.event_auth_handler.check_restricted_join_rules( + prev_state_ids, event.room_version, user_id, prev_member_event + ) # Only rate-limit if the user actually joined the room, otherwise we'll end # up blocking profile updates. 
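For reference, the join rules state that `check_restricted_join_rules` parses
looks roughly like the following (a sketch only: the shape of each `allow`
entry follows the MSC3083 draft of the time and is an assumption, since the
entry-parsing lines are elided from the hunks above):

```python
# Hypothetical content of an m.room.join_rules event for a restricted room.
restricted_join_rules_content = {
    # The value compared against JoinRules.MSC3083_RESTRICTED above.
    "join_rule": "restricted",
    # Membership in any listed space permits joining; entries of the wrong
    # form are ignored, matching the validation in the handler.
    "allow": [
        {"space": "!mods:example.com", "via": ["example.com"]},
    ],
}
```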
From 5bba1b49058a648197f217268a3978d8acf09c51 Mon Sep 17 00:00:00 2001
From: Savyasachee Jha
Date: Wed, 19 May 2021 16:14:16 +0530
Subject: [PATCH 16/52] Hardened systemd unit files (#9803)

Signed-off-by: Savyasachee Jha <savya.jha@hawkradius.com>
---
 changelog.d/9803.doc                   |  1 +
 contrib/systemd/override-hardened.conf | 71 ++++++++++++++++++++++++++
 docs/systemd-with-workers/README.md    | 30 +++++++++++
 3 files changed, 102 insertions(+)
 create mode 100644 changelog.d/9803.doc
 create mode 100644 contrib/systemd/override-hardened.conf

diff --git a/changelog.d/9803.doc b/changelog.d/9803.doc
new file mode 100644
index 000000000000..16c7ba703394
--- /dev/null
+++ b/changelog.d/9803.doc
@@ -0,0 +1 @@
+Add hardened systemd files as proposed in [#9760](https://github.com/matrix-org/synapse/issues/9760) and added them to `contrib/`. Change the docs to reflect the presence of these files.
diff --git a/contrib/systemd/override-hardened.conf b/contrib/systemd/override-hardened.conf
new file mode 100644
index 000000000000..b2fa3ae7c5db
--- /dev/null
+++ b/contrib/systemd/override-hardened.conf
@@ -0,0 +1,71 @@
+[Service]
+# The following directives give the synapse service R/W access to:
+# - /run/matrix-synapse
+# - /var/lib/matrix-synapse
+# - /var/log/matrix-synapse
+
+RuntimeDirectory=matrix-synapse
+StateDirectory=matrix-synapse
+LogsDirectory=matrix-synapse
+
+######################
+## Security Sandbox ##
+######################
+
+# Make sure that the service has its own unshared tmpfs at /tmp and that it
+# cannot see or change any real devices
+PrivateTmp=true
+PrivateDevices=true
+
+# We give no capabilities to a service by default
+CapabilityBoundingSet=
+AmbientCapabilities=
+
+# Protect the following from modification:
+# - The entire filesystem
+# - sysctl settings and loaded kernel modules
+# - No modifications allowed to Control Groups
+# - Hostname
+# - System Clock
+ProtectSystem=strict
+ProtectKernelTunables=true
+ProtectKernelModules=true
+ProtectControlGroups=true
+ProtectClock=true
+ProtectHostname=true
+
+# Prevent access to the following:
+# - /home directory
+# - Kernel logs
+ProtectHome=tmpfs
+ProtectKernelLogs=true
+
+# Make sure that the process can only see PIDs and process details of itself,
+# and the second option disables seeing details of things like system load and
+# I/O etc
+ProtectProc=invisible
+ProcSubset=pid
+
+# While not needed, we set these options explicitly
+# - This process has been given access to the host network
+# - It can also communicate with any IP Address
+PrivateNetwork=false
+RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
+IPAddressAllow=any
+
+# Restrict system calls to a sane bunch
+SystemCallArchitectures=native
+SystemCallFilter=@system-service
+SystemCallFilter=~@privileged @resources @obsolete
+
+# Misc restrictions
+# - Since the process is a python process it needs to be able to write and
+#   execute memory regions, so we set MemoryDenyWriteExecute to false
+RestrictSUIDSGID=true
+RemoveIPC=true
+NoNewPrivileges=true
+RestrictRealtime=true
+RestrictNamespaces=true
+LockPersonality=true
+PrivateUsers=true
+MemoryDenyWriteExecute=false
diff --git a/docs/systemd-with-workers/README.md b/docs/systemd-with-workers/README.md
index cfa36be7b4c5..a1135e9ed578 100644
--- a/docs/systemd-with-workers/README.md
+++ b/docs/systemd-with-workers/README.md
@@ -65,3 +65,33 @@ systemctl restart matrix-synapse-worker@federation_reader.service
 systemctl enable matrix-synapse-worker@federation_writer.service
 systemctl restart matrix-synapse.target
 ```
+
+## 
From 9c76d0561bdbe6741088fe8af1d336166058bb01 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Wed, 19 May 2021 11:47:16 +0100
Subject: [PATCH 17/52] Update the contrib grafana dashboard (#10001)

---
 changelog.d/10001.misc       |    1 +
 contrib/grafana/synapse.json | 5623 ++++++++++++++++++++++++++--------
 2 files changed, 4269 insertions(+), 1355 deletions(-)
 create mode 100644 changelog.d/10001.misc

diff --git a/changelog.d/10001.misc b/changelog.d/10001.misc
new file mode 100644
index 000000000000..8740cc478dda
--- /dev/null
+++ b/changelog.d/10001.misc
@@ -0,0 +1 @@
+Update the Grafana dashboard in `contrib/`.
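The bulk of this patch is the regenerated dashboard JSON that follows. As a usage note rather than part of the patch: one plausible way to load such an exported dashboard into a running Grafana instance is via Grafana's HTTP API, sketched below with placeholder `GRAFANA_URL` and `API_TOKEN` values; dashboards that declare template inputs may need Grafana's import UI instead.

```sh
GRAFANA_URL="http://localhost:3000"  # placeholder: your Grafana base URL
API_TOKEN="..."                      # placeholder: an API key with dashboard write access

# Wrap the exported JSON in the envelope the dashboard API expects, then POST it.
jq '{dashboard: ., overwrite: true}' contrib/grafana/synapse.json \
  | curl -X POST "$GRAFANA_URL/api/dashboards/db" \
      -H "Authorization: Bearer $API_TOKEN" \
      -H "Content-Type: application/json" \
      --data-binary @-
```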
diff --git a/contrib/grafana/synapse.json b/contrib/grafana/synapse.json index 539569b5b12a..0c4816b7cd53 100644 --- a/contrib/grafana/synapse.json +++ b/contrib/grafana/synapse.json @@ -14,7 +14,7 @@ "type": "grafana", "id": "grafana", "name": "Grafana", - "version": "6.7.4" + "version": "7.3.7" }, { "type": "panel", @@ -38,7 +38,6 @@ "annotations": { "list": [ { - "$$hashKey": "object:76", "builtIn": 1, "datasource": "$datasource", "enable": false, @@ -55,11 +54,12 @@ "gnetId": null, "graphTooltip": 0, "id": null, - "iteration": 1594646317221, + "iteration": 1621258266004, "links": [ { - "asDropdown": true, + "asDropdown": false, "icon": "external link", + "includeVars": true, "keepTime": true, "tags": [ "matrix" @@ -83,73 +83,255 @@ "title": "Overview", "type": "row" }, + { + "cards": { + "cardPadding": -1, + "cardRound": 0 + }, + "color": { + "cardColor": "#b4ff00", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "mode": "spectrum" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 1 + }, + "heatmap": {}, + "hideZeroBuckets": false, + "highlightCards": true, + "id": 189, + "legend": { + "show": false + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le)", + "format": "heatmap", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "title": "Event Send Time (excluding errors, all workers)", + "tooltip": { + "show": true, + "showHistogram": true + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": null, + "format": "s", + "logBase": 2, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "fill": 1, + "description": "", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, - "x": 0, + "x": 12, "y": 1 }, "hiddenSeries": false, - "id": 75, + "id": 152, "legend": { "avg": false, "current": false, "max": false, "min": false, + "rightSide": false, "show": true, "total": false, "values": false }, "lines": true, - "linewidth": 1, + "linewidth": 0, "links": [], - "nullPointMode": "null", + "nullPointMode": "connected", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], + "seriesOverrides": [ + { + "alias": "Avg", + "fill": 0, + "linewidth": 3 + }, + { + "alias": "99%", + "color": "#C4162A", + "fillBelowTo": "90%" + }, + { + "alias": "90%", + "color": "#FF7383", + "fillBelowTo": "75%" + }, + { + "alias": "75%", + "color": "#FFEE52", + "fillBelowTo": "50%" + }, + { + "alias": "50%", + "color": "#73BF69", + "fillBelowTo": "25%" + }, + { + "alias": "25%", + "color": "#1F60C4", + "fillBelowTo": "5%" + }, + { + "alias": "5%", + "lines": false + }, + { + "alias": "Average", + "color": "rgb(255, 255, 255)", + "lines": true, + "linewidth": 3 + }, + { + "alias": "Events", + "color": 
"#B877D9", + "hideTooltip": true, + "points": true, + "yaxis": 2, + "zindex": -3 + } + ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "rate(process_cpu_seconds_total{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.99, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", "format": "time_series", "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} ", + "legendFormat": "99%", + "refId": "D" + }, + { + "expr": "histogram_quantile(0.9, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "90%", "refId": "A" + }, + { + "expr": "histogram_quantile(0.75, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "75%", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.5, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "50%", + "refId": "B" + }, + { + "expr": "histogram_quantile(0.25, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", + "legendFormat": "25%", + "refId": "F" + }, + { + "expr": "histogram_quantile(0.05, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) by (le))", + "legendFormat": "5%", + "refId": "G" + }, + { + "expr": "sum(rate(synapse_http_server_response_time_seconds_sum{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size])) / sum(rate(synapse_http_server_response_time_seconds_count{servlet='RoomSendEventRestServlet',index=~\"$index\",instance=\"$instance\",code=~\"2..\"}[$bucket_size]))", + "legendFormat": "Average", + "refId": "H" + }, + { + "expr": "sum(rate(synapse_storage_events_persisted_events{instance=\"$instance\"}[$bucket_size]))", + "hide": false, + "instant": false, + "legendFormat": "Events", + "refId": "E" } ], "thresholds": [ { - "colorMode": "critical", - "fill": true, + "$$hashKey": "object:283", + "colorMode": "warning", + "fill": false, "line": true, "op": "gt", "value": 1, "yaxis": "left" + }, + { + "$$hashKey": "object:284", + "colorMode": "critical", + "fill": false, + "line": true, + "op": "gt", + "value": 2, + "yaxis": "left" } ], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "CPU usage", + "title": "Event Send Time Quantiles (excluding errors, all workers)", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -162,20 +344,22 @@ }, "yaxes": [ { + "$$hashKey": "object:255", "decimals": null, - "format": "percentunit", - "label": null, + "format": "s", + "label": "", "logBase": 1, - "max": "1.5", + "max": null, "min": "0", "show": true }, { - "format": "short", - "label": null, + "$$hashKey": 
"object:256", + "format": "hertz", + "label": "", "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true } ], @@ -190,37 +374,42 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", - "editable": true, - "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, - "grid": {}, "gridPos": { "h": 9, "w": 12, - "x": 12, - "y": 1 + "x": 0, + "y": 10 }, "hiddenSeries": false, - "id": 33, + "id": 75, "legend": { "avg": false, "current": false, "max": false, "min": false, - "show": false, + "show": true, "total": false, "values": false }, "lines": true, - "linewidth": 2, + "linewidth": 3, "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -230,24 +419,33 @@ "steppedLine": false, "targets": [ { - "expr": "sum(rate(synapse_storage_events_persisted_events{instance=\"$instance\"}[$bucket_size])) without (job,index)", + "expr": "rate(process_cpu_seconds_total{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", "format": "time_series", - "intervalFactor": 2, - "legendFormat": "", - "refId": "A", - "step": 20, - "target": "" + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} ", + "refId": "A" + } + ], + "thresholds": [ + { + "$$hashKey": "object:566", + "colorMode": "critical", + "fill": true, + "line": true, + "op": "gt", + "value": 1, + "yaxis": "left" } ], - "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Events Persisted", + "title": "CPU usage", "tooltip": { - "shared": true, + "shared": false, "sort": 0, - "value_type": "cumulative" + "value_type": "individual" }, "type": "graph", "xaxis": { @@ -259,14 +457,19 @@ }, "yaxes": [ { - "format": "hertz", + "$$hashKey": "object:538", + "decimals": null, + "format": "percentunit", + "label": null, "logBase": 1, - "max": null, - "min": null, + "max": "1.5", + "min": "0", "show": true }, { + "$$hashKey": "object:539", "format": "short", + "label": null, "logBase": 1, "max": null, "min": null, @@ -278,76 +481,24 @@ "alignLevel": null } }, - { - "cards": { - "cardPadding": 0, - "cardRound": null - }, - "color": { - "cardColor": "#b4ff00", - "colorScale": "sqrt", - "colorScheme": "interpolateSpectral", - "exponent": 0.5, - "mode": "spectrum" - }, - "dataFormat": "tsbuckets", - "datasource": "$datasource", - "gridPos": { - "h": 9, - "w": 12, - "x": 0, - "y": 10 - }, - "heatmap": {}, - "hideZeroBuckets": true, - "highlightCards": true, - "id": 85, - "legend": { - "show": false - }, - "links": [], - "reverseYBuckets": false, - "targets": [ - { - "expr": "sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\"}[$bucket_size])) by (le)", - "format": "heatmap", - "intervalFactor": 1, - "legendFormat": "{{le}}", - "refId": "A" - } - ], - "title": "Event Send Time", - "tooltip": { - "show": true, - "showHistogram": false - }, - "type": "heatmap", - "xAxis": { - "show": true - }, - "xBucketNumber": null, - "xBucketSize": null, - "yAxis": { - "decimals": null, - "format": "s", - "logBase": 2, - "max": null, - "min": null, - "show": true, - "splitFactor": null - }, - "yBucketBound": "auto", - "yBucketNumber": null, - "yBucketSize": null - }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "fill": 0, + 
"editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, "fillGradient": 0, + "grid": {}, "gridPos": { "h": 9, "w": 12, @@ -355,7 +506,7 @@ "y": 10 }, "hiddenSeries": false, - "id": 107, + "id": 198, "legend": { "avg": false, "current": false, @@ -366,76 +517,52 @@ "values": false }, "lines": true, - "linewidth": 1, + "linewidth": 3, "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "repeat": null, - "repeatDirection": "h", - "seriesOverrides": [ - { - "alias": "mean", - "linewidth": 2 - } - ], + "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "histogram_quantile(0.99, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method))", + "expr": "process_resident_memory_bytes{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", "format": "time_series", "interval": "", - "intervalFactor": 1, - "legendFormat": "99%", - "refId": "A" - }, - { - "expr": "histogram_quantile(0.95, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "95%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "90%", - "refId": "C" - }, - { - "expr": "histogram_quantile(0.50, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "50%", - "refId": "D" + "intervalFactor": 2, + "legendFormat": "{{job}} {{index}}", + "refId": "A", + "step": 20, + "target": "" }, { - "expr": "sum(rate(synapse_http_server_response_time_seconds_sum{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method) / sum(rate(synapse_http_server_response_time_seconds_count{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\"}[$bucket_size])) without (job, index, method)", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "mean", - "refId": "E" + "expr": "sum(process_resident_memory_bytes{instance=\"$instance\",job=~\"$job\",index=~\"$index\"})", + "hide": true, + "interval": "", + "legendFormat": "total", + "refId": "B" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Event send time quantiles", + "title": "Memory", "tooltip": { - "shared": true, + "shared": false, "sort": 0, - "value_type": "individual" + "value_type": "cumulative" }, + "transformations": [], "type": "graph", "xaxis": { "buckets": null, @@ -446,16 +573,16 @@ }, "yaxes": [ { - "format": "s", - "label": null, + "$$hashKey": "object:1560", + "format": "bytes", "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true }, { + "$$hashKey": "object:1561", "format": "short", - "label": null, 
"logBase": 1, "max": null, "min": null, @@ -473,16 +600,23 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", - "fill": 0, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 7, "w": 12, - "x": 0, + "x": 12, "y": 19 }, "hiddenSeries": false, - "id": 118, + "id": 37, "legend": { "avg": false, "current": false, @@ -497,18 +631,21 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "repeatDirection": "h", "seriesOverrides": [ { - "alias": "mean", - "linewidth": 2 + "$$hashKey": "object:639", + "alias": "/max$/", + "color": "#890F02", + "fill": 0, + "legend": false } ], "spaceLength": 10, @@ -516,49 +653,33 @@ "steppedLine": false, "targets": [ { - "expr": "histogram_quantile(0.99, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", + "expr": "process_open_fds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", "format": "time_series", + "hide": false, "interval": "", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 99%", - "refId": "A" - }, - { - "expr": "histogram_quantile(0.95, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 95%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 90%", - "refId": "C" - }, - { - "expr": "histogram_quantile(0.50, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 50%", - "refId": "D" + "intervalFactor": 2, + "legendFormat": "{{job}}-{{index}}", + "refId": "A", + "step": 20 }, { - "expr": "sum(rate(synapse_http_server_response_time_seconds_sum{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method) / sum(rate(synapse_http_server_response_time_seconds_count{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method)", + "expr": "process_max_fds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} mean", - "refId": "E" + "hide": true, + "interval": "", + "intervalFactor": 2, + "legendFormat": "{{job}}-{{index}} max", + "refId": "B", + "step": 20 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Event send time quantiles by worker", + "title": "Open FDs", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -572,14 +693,18 @@ }, "yaxes": 
[ { - "format": "s", - "label": null, + "$$hashKey": "object:650", + "decimals": null, + "format": "none", + "label": "", "logBase": 1, "max": null, "min": null, "show": true }, { + "$$hashKey": "object:651", + "decimals": null, "format": "short", "label": null, "logBase": 1, @@ -600,7 +725,7 @@ "h": 1, "w": 24, "x": 0, - "y": 28 + "y": 26 }, "id": 54, "panels": [ @@ -612,6 +737,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -619,7 +751,7 @@ "h": 7, "w": 12, "x": 0, - "y": 2 + "y": 25 }, "hiddenSeries": false, "id": 5, @@ -637,22 +769,25 @@ "values": false }, "lines": true, - "linewidth": 1, + "linewidth": 3, "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [ { + "$$hashKey": "object:1240", "alias": "/user/" }, { + "$$hashKey": "object:1241", "alias": "/system/" } ], @@ -682,20 +817,33 @@ ], "thresholds": [ { + "$$hashKey": "object:1278", "colorMode": "custom", "fillColor": "rgba(255, 255, 255, 1)", "line": true, "lineColor": "rgba(216, 200, 27, 0.27)", "op": "gt", - "value": 0.5 + "value": 0.5, + "yaxis": "left" }, { + "$$hashKey": "object:1279", "colorMode": "custom", "fillColor": "rgba(255, 255, 255, 1)", "line": true, - "lineColor": "rgba(234, 112, 112, 0.22)", + "lineColor": "rgb(87, 6, 16)", + "op": "gt", + "value": 0.8, + "yaxis": "left" + }, + { + "$$hashKey": "object:1498", + "colorMode": "critical", + "fill": true, + "line": true, "op": "gt", - "value": 0.8 + "value": 1, + "yaxis": "left" } ], "timeFrom": null, @@ -703,7 +851,7 @@ "timeShift": null, "title": "CPU", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -717,6 +865,7 @@ }, "yaxes": [ { + "$$hashKey": "object:1250", "decimals": null, "format": "percentunit", "label": "", @@ -726,6 +875,7 @@ "show": true }, { + "$$hashKey": "object:1251", "format": "short", "logBase": 1, "max": null, @@ -744,16 +894,25 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "description": "Shows the time in which the given percentage of reactor ticks completed, over the sampled timespan", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 2 + "y": 25 }, "hiddenSeries": false, - "id": 37, + "id": 105, + "interval": "", "legend": { "avg": false, "current": false, @@ -768,51 +927,57 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [ - { - "alias": "/max$/", - "color": "#890F02", - "fill": 0, - "legend": false - } - ], + "seriesOverrides": [], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "process_open_fds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "expr": "histogram_quantile(0.99, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", "format": "time_series", - "hide": false, + "interval": "", "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}}", + "legendFormat": "{{job}}-{{index}} 99%", "refId": "A", "step": 20 }, 
{ - "expr": "process_max_fds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "expr": "histogram_quantile(0.95, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", "format": "time_series", - "hide": true, - "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}} max", - "refId": "B", - "step": 20 + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 95%", + "refId": "B" + }, + { + "expr": "histogram_quantile(0.90, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 90%", + "refId": "C" + }, + { + "expr": "rate(python_twisted_reactor_tick_time_sum{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]) / rate(python_twisted_reactor_tick_time_count{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} mean", + "refId": "D" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Open FDs", + "title": "Reactor tick quantiles", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -826,7 +991,7 @@ }, "yaxes": [ { - "format": "none", + "format": "s", "label": null, "logBase": 1, "max": null, @@ -839,7 +1004,7 @@ "logBase": 1, "max": null, "min": null, - "show": true + "show": false } ], "yaxis": { @@ -855,6 +1020,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 0, "fillGradient": 0, "grid": {}, @@ -862,7 +1034,7 @@ "h": 7, "w": 12, "x": 0, - "y": 9 + "y": 32 }, "hiddenSeries": false, "id": 34, @@ -880,10 +1052,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -895,11 +1068,18 @@ { "expr": "process_resident_memory_bytes{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", "format": "time_series", + "interval": "", "intervalFactor": 2, "legendFormat": "{{job}} {{index}}", "refId": "A", "step": 20, "target": "" + }, + { + "expr": "sum(process_resident_memory_bytes{instance=\"$instance\",job=~\"$job\",index=~\"$index\"})", + "interval": "", + "legendFormat": "total", + "refId": "B" } ], "thresholds": [], @@ -908,10 +1088,11 @@ "timeShift": null, "title": "Memory", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "cumulative" }, + "transformations": [], "type": "graph", "xaxis": { "buckets": null, @@ -947,18 +1128,23 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", - "description": "Shows the time in which the given percentage of reactor ticks completed, over the sampled timespan", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 9 + "y": 32 }, "hiddenSeries": false, - "id": 105, - "interval": "", + "id": 49, "legend": { "avg": false, "current": false, @@ -973,54 +1159,40 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], 
+ "seriesOverrides": [ + { + "alias": "/^up/", + "legend": false, + "yaxis": 2 + } + ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "histogram_quantile(0.99, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", + "expr": "scrape_duration_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", "format": "time_series", "interval": "", "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}} 99%", + "legendFormat": "{{job}}-{{index}}", "refId": "A", "step": 20 - }, - { - "expr": "histogram_quantile(0.95, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 95%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, rate(python_twisted_reactor_tick_time_bucket{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} 90%", - "refId": "C" - }, - { - "expr": "rate(python_twisted_reactor_tick_time_sum{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size]) / rate(python_twisted_reactor_tick_time_count{index=~\"$index\",instance=\"$instance\",job=~\"$job\"}[$bucket_size])", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} mean", - "refId": "D" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Reactor tick quantiles", + "title": "Prometheus scrape time", "tooltip": { "shared": false, "sort": 0, @@ -1040,15 +1212,16 @@ "label": null, "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true }, { - "format": "short", - "label": null, + "decimals": 0, + "format": "none", + "label": "", "logBase": 1, - "max": null, - "min": null, + "max": "0", + "min": "-1", "show": false } ], @@ -1063,13 +1236,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 0, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 0, - "y": 16 + "y": 39 }, "hiddenSeries": false, "id": 53, @@ -1087,10 +1267,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -1113,7 +1294,7 @@ "timeShift": null, "title": "Up", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -1154,16 +1335,23 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 16 + "y": 39 }, "hiddenSeries": false, - "id": 49, + "id": 120, "legend": { "avg": false, "current": false, @@ -1176,43 +1364,56 @@ "lines": true, "linewidth": 1, "links": [], - "nullPointMode": "null", + "nullPointMode": "null as zero", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", - "seriesOverrides": [ - { - "alias": "/^up/", - "legend": false, - "yaxis": 2 - } - ], + "seriesOverrides": [], "spaceLength": 10, - "stack": false, + "stack": true, "steppedLine": false, 
"targets": [ { - "expr": "scrape_duration_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "expr": "rate(synapse_http_server_response_ru_utime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])+rate(synapse_http_server_response_ru_stime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "format": "time_series", + "hide": false, + "instant": false, + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", + "refId": "A" + }, + { + "expr": "rate(synapse_background_process_ru_utime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])+rate(synapse_background_process_ru_stime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", "format": "time_series", + "hide": false, + "instant": false, "interval": "", - "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}}", - "refId": "A", - "step": 20 + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{name}}", + "refId": "B" + } + ], + "thresholds": [ + { + "colorMode": "critical", + "fill": true, + "line": true, + "op": "gt", + "value": 1, + "yaxis": "left" } ], - "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Prometheus scrape time", + "title": "Stacked CPU usage", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -1226,21 +1427,22 @@ }, "yaxes": [ { - "format": "s", + "$$hashKey": "object:572", + "format": "percentunit", "label": null, "logBase": 1, "max": null, - "min": "0", + "min": null, "show": true }, { - "decimals": 0, - "format": "none", - "label": "", + "$$hashKey": "object:573", + "format": "short", + "label": null, "logBase": 1, - "max": "0", - "min": "-1", - "show": false + "max": null, + "min": null, + "show": true } ], "yaxis": { @@ -1254,13 +1456,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 0, - "y": 23 + "y": 46 }, "hiddenSeries": false, "id": 136, @@ -1278,9 +1487,10 @@ "linewidth": 1, "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 2, "points": false, "renderer": "flot", @@ -1306,7 +1516,7 @@ "timeShift": null, "title": "Outgoing HTTP request rate", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -1340,6 +1550,90 @@ "align": false, "alignLevel": null } + } + ], + "repeat": null, + "title": "Process info", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 27 + }, + "id": 56, + "panels": [ + { + "cards": { + "cardPadding": -1, + "cardRound": 0 + }, + "color": { + "cardColor": "#b4ff00", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "mode": "spectrum" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 21 + }, + "heatmap": {}, + "hideZeroBuckets": false, + "highlightCards": true, + "id": 85, + "legend": { + "show": false + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": 
"sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\"}[$bucket_size])) by (le)", + "format": "heatmap", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "title": "Event Send Time (Including errors, across all workers)", + "tooltip": { + "show": true, + "showHistogram": true + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": null, + "format": "s", + "logBase": 2, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null }, { "aliasColors": {}, @@ -1347,79 +1641,74 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "description": "", + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, + "grid": {}, "gridPos": { - "h": 7, + "h": 9, "w": 12, "x": 12, - "y": 23 + "y": 21 }, "hiddenSeries": false, - "id": 120, + "id": 33, "legend": { "avg": false, "current": false, "max": false, "min": false, - "show": true, + "show": false, "total": false, "values": false }, "lines": true, - "linewidth": 1, + "linewidth": 2, "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, + "paceLength": 10, "percentage": false, - "pointradius": 2, + "pluginVersion": "7.3.7", + "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, - "stack": true, + "stack": false, "steppedLine": false, "targets": [ { - "expr": "rate(synapse_http_server_response_ru_utime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])+rate(synapse_http_server_response_ru_stime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", - "format": "time_series", - "hide": false, - "instant": false, - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", - "refId": "A" - }, - { - "expr": "rate(synapse_background_process_ru_utime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])+rate(synapse_background_process_ru_stime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "sum(rate(synapse_storage_events_persisted_events{instance=\"$instance\"}[$bucket_size])) without (job,index)", "format": "time_series", - "hide": false, - "instant": false, "interval": "", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{name}}", - "refId": "B" - } - ], - "thresholds": [ - { - "colorMode": "critical", - "fill": true, - "line": true, - "op": "gt", - "value": 1, - "yaxis": "left" + "intervalFactor": 2, + "legendFormat": "", + "refId": "A", + "step": 20, + "target": "" } ], + "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Stacked CPU usage", + "title": "Events Persisted (all workers)", "tooltip": { - "shared": false, + "shared": true, "sort": 0, - "value_type": "individual" + "value_type": "cumulative" }, "type": "graph", "xaxis": { @@ -1431,16 +1720,16 @@ }, "yaxes": [ { - "format": "percentunit", - "label": null, + "$$hashKey": "object:102", + "format": "hertz", "logBase": 1, "max": null, "min": null, "show": true }, { + "$$hashKey": "object:103", "format": "short", - "label": null, "logBase": 1, "max": null, "min": null, @@ -1451,23 +1740,7 @@ "align": false, "alignLevel": null } - } - ], - "repeat": 
null, - "title": "Process info", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 29 - }, - "id": 56, - "panels": [ + }, { "aliasColors": {}, "bars": false, @@ -1475,13 +1748,21 @@ "dashes": false, "datasource": "$datasource", "decimals": 1, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, + "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 0, - "y": 58 + "y": 30 }, + "hiddenSeries": false, "id": 40, "legend": { "avg": false, @@ -1496,7 +1777,11 @@ "linewidth": 1, "links": [], "nullPointMode": "null", + "options": { + "alertThreshold": true + }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -1561,13 +1846,21 @@ "dashes": false, "datasource": "$datasource", "decimals": 1, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, + "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 58 + "y": 30 }, + "hiddenSeries": false, "id": 46, "legend": { "avg": false, @@ -1582,7 +1875,11 @@ "linewidth": 1, "links": [], "nullPointMode": "null", + "options": { + "alertThreshold": true + }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -1651,13 +1948,21 @@ "dashes": false, "datasource": "$datasource", "decimals": 1, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, + "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 0, - "y": 65 + "y": 37 }, + "hiddenSeries": false, "id": 44, "legend": { "alignAsTable": true, @@ -1675,7 +1980,11 @@ "linewidth": 1, "links": [], "nullPointMode": "null", + "options": { + "alertThreshold": true + }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -1741,13 +2050,21 @@ "dashes": false, "datasource": "$datasource", "decimals": 1, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, + "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 65 + "y": 37 }, + "hiddenSeries": false, "id": 45, "legend": { "alignAsTable": true, @@ -1765,7 +2082,11 @@ "linewidth": 1, "links": [], "nullPointMode": "null", + "options": { + "alertThreshold": true + }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -1823,52 +2144,35 @@ "align": false, "alignLevel": null } - } - ], - "repeat": null, - "title": "Event persist rates", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 30 - }, - "id": 57, - "panels": [ + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "decimals": null, - "editable": true, - "error": false, - "fill": 2, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, "fillGradient": 0, - "grid": {}, "gridPos": { - "h": 8, + "h": 9, "w": 12, "x": 0, - "y": 31 + "y": 44 }, "hiddenSeries": false, - "id": 4, + "id": 118, "legend": { - "alignAsTable": true, "avg": false, "current": false, - "hideEmpty": false, - "hideZero": true, "max": false, "min": false, - "rightSide": false, "show": true, "total": false, "values": false @@ -1878,50 +2182,212 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, + "paceLength": 10, "percentage": false, + "pluginVersion": 
"7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], + "repeatDirection": "h", + "seriesOverrides": [ + { + "alias": "mean", + "linewidth": 2 + } + ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "rate(synapse_http_server_requests_received{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.99, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", "format": "time_series", "interval": "", - "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", - "refId": "A", - "step": 20 - } - ], - "thresholds": [ + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 99%", + "refId": "A" + }, { - "colorMode": "custom", - "fill": true, - "fillColor": "rgba(216, 200, 27, 0.27)", - "op": "gt", - "value": 100 + "expr": "histogram_quantile(0.95, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 95%", + "refId": "B" }, { - "colorMode": "custom", - "fill": true, - "fillColor": "rgba(234, 112, 112, 0.22)", - "op": "gt", - "value": 250 - } - ], + "expr": "histogram_quantile(0.90, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 90%", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.50, sum(rate(synapse_http_server_response_time_seconds_bucket{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} 50%", + "refId": "D" + }, + { + "expr": "sum(rate(synapse_http_server_response_time_seconds_sum{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method) / sum(rate(synapse_http_server_response_time_seconds_count{servlet='RoomSendEventRestServlet',instance=\"$instance\",code=~\"2..\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (method)", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} mean", + "refId": "E" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Event send time quantiles by worker", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "repeat": null, + "title": "Event persistence", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 28 + }, + "id": 
57, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": null, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 2, + "fillGradient": 0, + "grid": {}, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 31 + }, + "hiddenSeries": false, + "id": 4, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "hideEmpty": false, + "hideZero": true, + "max": false, + "min": false, + "rightSide": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(synapse_http_server_requests_received{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "format": "time_series", + "interval": "", + "intervalFactor": 2, + "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", + "refId": "A", + "step": 20 + } + ], + "thresholds": [ + { + "colorMode": "custom", + "fill": true, + "fillColor": "rgba(216, 200, 27, 0.27)", + "op": "gt", + "value": 100, + "yaxis": "left" + }, + { + "colorMode": "custom", + "fill": true, + "fillColor": "rgba(234, 112, 112, 0.22)", + "op": "gt", + "value": 250, + "yaxis": "left" + } + ], "timeFrom": null, "timeRegions": [], "timeShift": null, "title": "Request Count by arrival time", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -1961,6 +2427,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -1986,9 +2459,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -2014,7 +2488,7 @@ "title": "Top 10 Request Counts", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -2055,6 +2529,13 @@ "decimals": null, "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 2, "fillGradient": 0, "grid": {}, @@ -2084,9 +2565,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -2129,7 +2611,7 @@ "title": "Total CPU Usage by Endpoint", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -2170,7 +2652,14 @@ "decimals": null, "editable": true, "error": false, - "fill": 2, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, "fillGradient": 0, "grid": {}, "gridPos": { @@ -2199,9 +2688,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -2214,7 +2704,7 @@ "expr": 
"(rate(synapse_http_server_in_flight_requests_ru_utime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])+rate(synapse_http_server_in_flight_requests_ru_stime_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) / rate(synapse_http_server_requests_received{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", "format": "time_series", "interval": "", - "intervalFactor": 2, + "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", "refId": "A", "step": 20 @@ -2226,14 +2716,16 @@ "fill": true, "fillColor": "rgba(216, 200, 27, 0.27)", "op": "gt", - "value": 100 + "value": 100, + "yaxis": "left" }, { "colorMode": "custom", "fill": true, "fillColor": "rgba(234, 112, 112, 0.22)", "op": "gt", - "value": 250 + "value": 250, + "yaxis": "left" } ], "timeFrom": null, @@ -2242,7 +2734,7 @@ "title": "Average CPU Usage by Endpoint", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -2282,6 +2774,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -2310,9 +2809,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -2325,7 +2825,7 @@ "expr": "rate(synapse_http_server_in_flight_requests_db_txn_duration_seconds{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", "format": "time_series", "interval": "", - "intervalFactor": 2, + "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}} {{tag}}", "refId": "A", "step": 20 @@ -2338,7 +2838,7 @@ "title": "DB Usage by endpoint", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -2379,6 +2879,13 @@ "decimals": null, "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 2, "fillGradient": 0, "grid": {}, @@ -2408,9 +2915,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -2424,7 +2932,7 @@ "format": "time_series", "hide": false, "interval": "", - "intervalFactor": 2, + "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}}", "refId": "A", "step": 20 @@ -2437,7 +2945,7 @@ "title": "Non-sync avg response time", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -2475,6 +2983,13 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -2499,13 +3014,21 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], + "seriesOverrides": [ + { + "alias": "Total", + "color": "rgb(255, 255, 255)", + "fill": 0, + "linewidth": 3 + } + ], "spaceLength": 10, "stack": false, "steppedLine": false, @@ -2517,6 +3040,12 @@ "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} {{method}} {{servlet}}", 
"refId": "A" + }, + { + "expr": "sum(avg_over_time(synapse_http_server_in_flight_requests_count{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size]))", + "interval": "", + "legendFormat": "Total", + "refId": "B" } ], "thresholds": [], @@ -2526,7 +3055,7 @@ "title": "Requests in flight", "tooltip": { "shared": false, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -2572,7 +3101,7 @@ "h": 1, "w": 24, "x": 0, - "y": 31 + "y": 29 }, "id": 97, "panels": [ @@ -2582,6 +3111,13 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -2605,11 +3141,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -2674,6 +3208,13 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -2697,11 +3238,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -2717,12 +3256,6 @@ "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} {{name}}", "refId": "A" - }, - { - "expr": "", - "format": "time_series", - "intervalFactor": 1, - "refId": "B" } ], "thresholds": [], @@ -2731,7 +3264,7 @@ "timeShift": null, "title": "DB usage by background jobs (including scheduling time)", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -2772,6 +3305,13 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -2794,10 +3334,8 @@ "lines": true, "linewidth": 1, "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 2, "points": false, "renderer": "flot", @@ -2864,7 +3402,7 @@ "h": 1, "w": 24, "x": 0, - "y": 32 + "y": 30 }, "id": 81, "panels": [ @@ -2874,13 +3412,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 0, - "y": 6 + "y": 33 }, "hiddenSeries": false, "id": 79, @@ -2897,11 +3442,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -2970,13 +3513,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 12, - "y": 6 + "y": 33 }, "hiddenSeries": false, "id": 83, @@ -2993,11 +3543,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -3068,13 +3616,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + 
"fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 0, - "y": 15 + "y": 42 }, "hiddenSeries": false, "id": 109, @@ -3091,11 +3646,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -3167,13 +3720,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 12, - "y": 15 + "y": 42 }, "hiddenSeries": false, "id": 111, @@ -3190,11 +3750,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -3258,18 +3816,25 @@ "bars": false, "dashLength": 10, "dashes": false, - "datasource": "$datasource", - "description": "Number of events queued up on the master process for processing by the federation sender", + "datasource": "${DS_PROMETHEUS}", + "description": "The number of events in the in-memory queues ", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, "x": 0, - "y": 24 + "y": 51 }, "hiddenSeries": false, - "id": 140, + "id": 142, "legend": { "avg": false, "current": false, @@ -3281,14 +3846,112 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "synapse_federation_transaction_queue_pending_pdus{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "interval": "", + "legendFormat": "pending PDUs {{job}}-{{index}}", + "refId": "A" + }, + { + "expr": "synapse_federation_transaction_queue_pending_edus{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "interval": "", + "legendFormat": "pending EDUs {{job}}-{{index}}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "In-memory federation transmission queues", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": "events", + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": "", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "Number of events queued up on the master process for processing by the federation sender", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 51 + }, + "hiddenSeries": false, + "id": 140, + "legend": { + "avg": false, + "current": false, + 
"max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.1.3", + "pointradius": 5, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -3391,68 +4054,243 @@ "alignLevel": null } }, + { + "cards": { + "cardPadding": -1, + "cardRound": null + }, + "color": { + "cardColor": "#b4ff00", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "min": 0, + "mode": "spectrum" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 59 + }, + "heatmap": {}, + "hideZeroBuckets": false, + "highlightCards": true, + "id": 166, + "legend": { + "show": false + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "sum(rate(synapse_event_processing_lag_by_event_bucket{instance=\"$instance\",name=\"federation_sender\"}[$bucket_size])) by (le)", + "format": "heatmap", + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ le }}", + "refId": "A" + } + ], + "title": "Federation send PDU lag", + "tooltip": { + "show": true, + "showHistogram": true + }, + "tooltipDecimals": 2, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": 0, + "format": "s", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, - "datasource": "${DS_PROMETHEUS}", - "description": "The number of events in the in-memory queues ", - "fill": 1, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, "fillGradient": 0, "gridPos": { - "h": 8, + "h": 9, "w": 12, "x": 12, - "y": 24 + "y": 60 }, "hiddenSeries": false, - "id": 142, + "id": 162, "legend": { "avg": false, "current": false, "max": false, "min": false, + "rightSide": false, "show": true, "total": false, "values": false }, "lines": true, - "linewidth": 1, - "nullPointMode": "null", - "options": { - "dataLinks": [] - }, + "linewidth": 0, + "links": [], + "nullPointMode": "connected", + "paceLength": 10, "percentage": false, - "pointradius": 2, + "pluginVersion": "7.1.3", + "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], + "seriesOverrides": [ + { + "alias": "Avg", + "fill": 0, + "linewidth": 3 + }, + { + "alias": "99%", + "color": "#C4162A", + "fillBelowTo": "90%" + }, + { + "alias": "90%", + "color": "#FF7383", + "fillBelowTo": "75%" + }, + { + "alias": "75%", + "color": "#FFEE52", + "fillBelowTo": "50%" + }, + { + "alias": "50%", + "color": "#73BF69", + "fillBelowTo": "25%" + }, + { + "alias": "25%", + "color": "#1F60C4", + "fillBelowTo": "5%" + }, + { + "alias": "5%", + "lines": false + }, + { + "alias": "Average", + "color": "rgb(255, 255, 255)", + "lines": true, + "linewidth": 3 + } + ], "spaceLength": 10, "stack": false, "steppedLine": false, "targets": [ { - "expr": "synapse_federation_transaction_queue_pending_pdus{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "expr": "histogram_quantile(0.99, 
sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "format": "time_series", "interval": "", - "legendFormat": "pending PDUs {{job}}-{{index}}", + "intervalFactor": 1, + "legendFormat": "99%", + "refId": "D" + }, + { + "expr": "histogram_quantile(0.9, sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "90%", "refId": "A" }, { - "expr": "synapse_federation_transaction_queue_pending_edus{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "expr": "histogram_quantile(0.75, sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "format": "time_series", "interval": "", - "legendFormat": "pending EDUs {{job}}-{{index}}", + "intervalFactor": 1, + "legendFormat": "75%", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.5, sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "50%", "refId": "B" + }, + { + "expr": "histogram_quantile(0.25, sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "interval": "", + "legendFormat": "25%", + "refId": "F" + }, + { + "expr": "histogram_quantile(0.05, sum(rate(synapse_event_processing_lag_by_event_bucket{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) by (le))", + "interval": "", + "legendFormat": "5%", + "refId": "G" + }, + { + "expr": "sum(rate(synapse_event_processing_lag_by_event_sum{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size])) / sum(rate(synapse_event_processing_lag_by_event_count{name='federation_sender',index=~\"$index\",instance=\"$instance\"}[$bucket_size]))", + "interval": "", + "legendFormat": "Average", + "refId": "H" + } + ], + "thresholds": [ + { + "colorMode": "warning", + "fill": false, + "line": true, + "op": "gt", + "value": 0.25, + "yaxis": "left" + }, + { + "colorMode": "critical", + "fill": false, + "line": true, + "op": "gt", + "value": 1, + "yaxis": "left" } ], - "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "In-memory federation transmission queues", + "title": "Federation send PDU lag quantiles", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -3465,21 +4303,20 @@ }, "yaxes": [ { - "$$hashKey": "object:317", - "format": "short", - "label": "events", + "decimals": null, + "format": "s", + "label": "", "logBase": 1, "max": null, "min": "0", "show": true }, { - "$$hashKey": "object:318", - "format": "short", + "format": "hertz", "label": "", "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true } ], @@ -3487,7 +4324,79 @@ "align": false, "alignLevel": null } - } + }, + { + "cards": { + "cardPadding": -1, + "cardRound": null + }, + "color": { + "cardColor": "#b4ff00", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "min": 0, + "mode": "spectrum" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + 
"overrides": [] + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 68 + }, + "heatmap": {}, + "hideZeroBuckets": false, + "highlightCards": true, + "id": 164, + "legend": { + "show": false + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "sum(rate(synapse_federation_server_pdu_process_time_bucket{instance=\"$instance\"}[$bucket_size])) by (le)", + "format": "heatmap", + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{ le }}", + "refId": "A" + } + ], + "title": "Handle inbound PDU time", + "tooltip": { + "show": true, + "showHistogram": true + }, + "tooltipDecimals": 2, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": 0, + "format": "s", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + } ], "title": "Federation", "type": "row" @@ -3499,7 +4408,7 @@ "h": 1, "w": 24, "x": 0, - "y": 33 + "y": 31 }, "id": 60, "panels": [ @@ -3509,6 +4418,13 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -3532,11 +4448,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -3611,6 +4525,13 @@ "dashes": false, "datasource": "$datasource", "description": "", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { @@ -3634,10 +4555,8 @@ "lines": true, "linewidth": 1, "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 2, "points": false, "renderer": "flot", @@ -3705,7 +4624,7 @@ "h": 1, "w": 24, "x": 0, - "y": 34 + "y": 32 }, "id": 58, "panels": [ @@ -3715,13 +4634,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 0, - "y": 79 + "y": 8 }, "hiddenSeries": false, "id": 48, @@ -3739,10 +4665,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -3809,13 +4736,20 @@ "dashes": false, "datasource": "$datasource", "description": "Shows the time in which the given percentage of database queries were scheduled, over the sampled timespan", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, "x": 12, - "y": 79 + "y": 8 }, "hiddenSeries": false, "id": 104, @@ -3834,10 +4768,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -3928,6 +4863,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 0, "fillGradient": 0, "grid": {}, @@ -3935,7 +4877,7 @@ "h": 7, "w": 12, "x": 
0, - "y": 86 + "y": 15 }, "hiddenSeries": false, "id": 10, @@ -3955,10 +4897,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -4024,6 +4967,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4031,7 +4981,7 @@ "h": 7, "w": 12, "x": 12, - "y": 86 + "y": 15 }, "hiddenSeries": false, "id": 11, @@ -4051,10 +5001,11 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -4078,7 +5029,7 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Top DB transactions by total txn time", + "title": "DB transactions by total txn time", "tooltip": { "shared": false, "sort": 0, @@ -4112,6 +5063,111 @@ "align": false, "alignLevel": null } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "grid": {}, + "gridPos": { + "h": 7, + "w": 12, + "x": 0, + "y": 22 + }, + "hiddenSeries": false, + "id": 180, + "legend": { + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": true, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": false + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(synapse_storage_transaction_time_sum{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])/rate(synapse_storage_transaction_time_count{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "format": "time_series", + "instant": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{desc}}", + "refId": "A", + "step": 20 + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Average DB txn time", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "cumulative" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } } ], "repeat": null, @@ -4125,7 +5181,7 @@ "h": 1, "w": 24, "x": 0, - "y": 35 + "y": 33 }, "id": 59, "panels": [ @@ -4137,6 +5193,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4144,7 +5207,7 @@ "h": 13, "w": 12, "x": 0, - "y": 80 + "y": 9 }, "hiddenSeries": false, "id": 12, @@ -4162,11 +5225,9 @@ "linewidth": 2, 
"links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4191,8 +5252,8 @@ "timeShift": null, "title": "Total CPU Usage by Block", "tooltip": { - "shared": false, - "sort": 0, + "shared": true, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -4232,6 +5293,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4239,7 +5307,7 @@ "h": 13, "w": 12, "x": 12, - "y": 80 + "y": 9 }, "hiddenSeries": false, "id": 26, @@ -4257,11 +5325,9 @@ "linewidth": 2, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4286,8 +5352,8 @@ "timeShift": null, "title": "Average CPU Time per Block", "tooltip": { - "shared": false, - "sort": 0, + "shared": true, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -4327,6 +5393,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4334,7 +5407,7 @@ "h": 13, "w": 12, "x": 0, - "y": 93 + "y": 22 }, "hiddenSeries": false, "id": 13, @@ -4352,11 +5425,9 @@ "linewidth": 2, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4381,8 +5452,8 @@ "timeShift": null, "title": "Total DB Usage by Block", "tooltip": { - "shared": false, - "sort": 0, + "shared": true, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -4423,6 +5494,13 @@ "description": "The time each database transaction takes to execute, on average, broken down by metrics block.", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4430,7 +5508,7 @@ "h": 13, "w": 12, "x": 12, - "y": 93 + "y": 22 }, "hiddenSeries": false, "id": 27, @@ -4448,11 +5526,9 @@ "linewidth": 2, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4477,8 +5553,8 @@ "timeShift": null, "title": "Average Database Transaction time, by Block", "tooltip": { - "shared": false, - "sort": 0, + "shared": true, + "sort": 2, "value_type": "cumulative" }, "type": "graph", @@ -4518,6 +5594,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4525,7 +5608,7 @@ "h": 13, "w": 12, "x": 0, - "y": 106 + "y": 35 }, "hiddenSeries": false, "id": 28, @@ -4542,11 +5625,9 @@ "linewidth": 2, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4612,6 +5693,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, 
"fillGradient": 0, "grid": {}, @@ -4619,7 +5707,7 @@ "h": 13, "w": 12, "x": 12, - "y": 106 + "y": 35 }, "hiddenSeries": false, "id": 25, @@ -4636,11 +5724,9 @@ "linewidth": 2, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -4697,49 +5783,33 @@ "align": false, "alignLevel": null } - } - ], - "repeat": null, - "title": "Per-block metrics", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 36 - }, - "id": 61, - "panels": [ + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "decimals": 2, - "editable": true, - "error": false, - "fill": 0, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 1, "fillGradient": 0, - "grid": {}, "gridPos": { - "h": 10, + "h": 15, "w": 12, "x": 0, - "y": 37 + "y": 48 }, "hiddenSeries": false, - "id": 1, + "id": 154, "legend": { "alignAsTable": true, "avg": false, "current": false, - "hideEmpty": true, - "hideZero": false, "max": false, "min": false, "show": true, @@ -4747,13 +5817,130 @@ "values": false }, "lines": true, - "linewidth": 2, - "links": [], + "linewidth": 1, "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "percentage": false, + "pluginVersion": "7.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(synapse_util_metrics_block_count{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "interval": "", + "legendFormat": "{{job}}-{{index}} {{block_name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Block count", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "repeat": null, + "title": "Per-block metrics", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 34 + }, + "id": 61, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "decimals": 2, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "grid": {}, + "gridPos": { + "h": 10, + "w": 12, + "x": 0, + "y": 84 + }, + "hiddenSeries": false, + "id": 1, + "legend": { + "alignAsTable": true, + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 2, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -4821,6 +6008,13 @@ "datasource": 
"$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4828,7 +6022,7 @@ "h": 10, "w": 12, "x": 12, - "y": 37 + "y": 84 }, "hiddenSeries": false, "id": 8, @@ -4848,9 +6042,10 @@ "links": [], "nullPointMode": "connected", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -4917,6 +6112,13 @@ "datasource": "$datasource", "editable": true, "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "grid": {}, @@ -4924,7 +6126,7 @@ "h": 10, "w": 12, "x": 0, - "y": 47 + "y": 94 }, "hiddenSeries": false, "id": 38, @@ -4944,9 +6146,10 @@ "links": [], "nullPointMode": "connected", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -5010,13 +6213,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 10, "w": 12, "x": 12, - "y": 47 + "y": 94 }, "hiddenSeries": false, "id": 39, @@ -5035,9 +6245,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -5102,13 +6313,20 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 0, - "y": 57 + "y": 104 }, "hiddenSeries": false, "id": 65, @@ -5127,9 +6345,10 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -5200,9 +6419,9 @@ "h": 1, "w": 24, "x": 0, - "y": 37 + "y": 35 }, - "id": 62, + "id": 148, "panels": [ { "aliasColors": {}, @@ -5210,16 +6429,23 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, "x": 0, - "y": 121 + "y": 29 }, "hiddenSeries": false, - "id": 91, + "id": 146, "legend": { "avg": false, "current": false, @@ -5231,26 +6457,24 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], "spaceLength": 10, - "stack": true, + "stack": false, "steppedLine": false, "targets": [ { - "expr": "rate(python_gc_time_sum{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[10m])", - "format": "time_series", - "instant": false, - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} gen {{gen}}", + "expr": "synapse_util_caches_response_cache:size{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "interval": "", + "legendFormat": "{{name}} {{job}}-{{index}}", "refId": "A" } ], @@ -5258,9 +6482,9 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Total GC time by 
bucket (10m smoothing)", + "title": "Response cache size", "tooltip": { - "shared": true, + "shared": false, "sort": 0, "value_type": "individual" }, @@ -5274,12 +6498,11 @@ }, "yaxes": [ { - "decimals": null, - "format": "percentunit", + "format": "short", "label": null, "logBase": 1, "max": null, - "min": "0", + "min": null, "show": true }, { @@ -5302,22 +6525,24 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", - "decimals": 3, - "editable": true, - "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, - "grid": {}, "gridPos": { - "h": 9, + "h": 8, "w": 12, "x": 12, - "y": 121 + "y": 29 }, "hiddenSeries": false, - "id": 21, + "id": 150, "legend": { - "alignAsTable": true, "avg": false, "current": false, "max": false, @@ -5327,14 +6552,14 @@ "values": false }, "lines": true, - "linewidth": 2, - "links": [], - "nullPointMode": "null as zero", + "linewidth": 1, + "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -5343,24 +6568,27 @@ "steppedLine": false, "targets": [ { - "expr": "rate(python_gc_time_sum{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])/rate(python_gc_time_count[$bucket_size])", - "format": "time_series", - "intervalFactor": 2, - "legendFormat": "{{job}} {{index}} gen {{gen}} ", - "refId": "A", - "step": 20, - "target": "" + "expr": "rate(synapse_util_caches_response_cache:hits{instance=\"$instance\", job=~\"$job\", index=~\"$index\"}[$bucket_size])/rate(synapse_util_caches_response_cache:total{instance=\"$instance\", job=~\"$job\", index=~\"$index\"}[$bucket_size])", + "interval": "", + "legendFormat": "{{name}} {{job}}-{{index}}", + "refId": "A" + }, + { + "expr": "", + "interval": "", + "legendFormat": "", + "refId": "B" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Average GC Time Per Collection", + "title": "Response cache hit rate", "tooltip": { "shared": false, "sort": 0, - "value_type": "cumulative" + "value_type": "individual" }, "type": "graph", "xaxis": { @@ -5372,14 +6600,17 @@ }, "yaxes": [ { - "format": "s", + "decimals": null, + "format": "percentunit", + "label": null, "logBase": 1, - "max": null, - "min": null, + "max": "1", + "min": "0", "show": true }, { "format": "short", + "label": null, "logBase": 1, "max": null, "min": null, @@ -5390,29 +6621,48 @@ "align": false, "alignLevel": null } - }, + } + ], + "title": "Response caches", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 36 + }, + "id": 62, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "description": "'gen 0' shows the number of objects allocated since the last gen0 GC.\n'gen 1' / 'gen 2' show the number of gen0/gen1 GCs since the last gen1/gen2 GC.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 9, "w": 12, "x": 0, - "y": 130 + "y": 30 }, "hiddenSeries": false, - "id": 89, + "id": 91, "legend": { "avg": false, "current": false, - "hideEmpty": true, - "hideZero": false, "max": false, "min": false, "show": true, @@ -5424,25 +6674,22 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] 
+ "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [ - { - "alias": "/gen 0$/", - "yaxis": 2 - } - ], + "seriesOverrides": [], "spaceLength": 10, - "stack": false, + "stack": true, "steppedLine": false, "targets": [ { - "expr": "python_gc_counts{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", + "expr": "rate(python_gc_time_sum{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[10m])", "format": "time_series", + "instant": false, "intervalFactor": 1, "legendFormat": "{{job}}-{{index}} gen {{gen}}", "refId": "A" @@ -5452,9 +6699,9 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Allocation counts", + "title": "Total GC time by bucket (10m smoothing)", "tooltip": { - "shared": false, + "shared": true, "sort": 0, "value_type": "individual" }, @@ -5468,17 +6715,17 @@ }, "yaxes": [ { - "format": "short", - "label": "Gen N-1 GCs since last Gen N GC", + "decimals": null, + "format": "percentunit", + "label": null, "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true }, { - "decimals": null, "format": "short", - "label": "Objects since last Gen 0 GC", + "label": null, "logBase": 1, "max": null, "min": null, @@ -5496,17 +6743,29 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "decimals": 3, + "editable": true, + "error": false, + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, + "grid": {}, "gridPos": { "h": 9, "w": 12, "x": 12, - "y": 130 + "y": 30 }, "hiddenSeries": false, - "id": 93, + "id": 21, "legend": { + "alignAsTable": true, "avg": false, "current": false, "max": false, @@ -5516,13 +6775,219 @@ "values": false }, "lines": true, - "linewidth": 1, + "linewidth": 2, "links": [], - "nullPointMode": "connected", + "nullPointMode": "null as zero", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(python_gc_time_sum{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])/rate(python_gc_time_count[$bucket_size])", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{job}} {{index}} gen {{gen}} ", + "refId": "A", + "step": 20, + "target": "" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Average GC Time Per Collection", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "cumulative" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "s", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "'gen 0' shows the number of objects allocated since the last gen0 GC.\n'gen 1' / 'gen 2' show the number of gen0/gen1 GCs since the last gen1/gen2 GC.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 39 + }, + 
"hiddenSeries": false, + "id": 89, + "legend": { + "avg": false, + "current": false, + "hideEmpty": true, + "hideZero": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [ + { + "alias": "/gen 0$/", + "yaxis": 2 + } + ], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "python_gc_counts{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} gen {{gen}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Allocation counts", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": "Gen N-1 GCs since last Gen N GC", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "decimals": null, + "format": "short", + "label": "Objects since last Gen 0 GC", + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 39 + }, + "hiddenSeries": false, + "id": 93, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "connected", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", @@ -5586,16 +7051,1579 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 48 + }, + "hiddenSeries": false, + "id": 95, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(python_gc_time_count{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} gen {{gen}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "GC frequency", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": 
[] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "cards": { + "cardPadding": 0, + "cardRound": null + }, + "color": { + "cardColor": "#b4ff00", + "colorScale": "sqrt", + "colorScheme": "interpolateSpectral", + "exponent": 0.5, + "max": null, + "min": 0, + "mode": "spectrum" + }, + "dataFormat": "tsbuckets", + "datasource": "${DS_PROMETHEUS}", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 48 + }, + "heatmap": {}, + "hideZeroBuckets": true, + "highlightCards": true, + "id": 87, + "legend": { + "show": true + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "sum(rate(python_gc_time_bucket[$bucket_size])) by (le)", + "format": "heatmap", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "title": "GC durations", + "tooltip": { + "show": true, + "showHistogram": false + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": null, + "format": "s", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + } + ], + "repeat": null, + "title": "GC", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 37 + }, + "id": 63, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 0, + "y": 13 + }, + "hiddenSeries": false, + "id": 42, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum (rate(synapse_replication_tcp_protocol_inbound_commands{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (name, conn_id)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{job}}-{{index}} {{command}}", + "refId": "A", + "step": 20 + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Rate of incoming commands", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + 
"datasource": "${DS_PROMETHEUS}", + "description": "", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 12, + "y": 13 + }, + "hiddenSeries": false, + "id": 144, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "synapse_replication_tcp_command_queue{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "interval": "", + "legendFormat": "{{stream_name}} {{job}}-{{index}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Queued incoming RDATA commands, by stream", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 0, + "y": 20 + }, + "hiddenSeries": false, + "id": 43, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum (rate(synapse_replication_tcp_protocol_outbound_commands{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (name, conn_id)", + "format": "time_series", + "intervalFactor": 2, + "legendFormat": "{{job}}-{{index}} {{command}}", + "refId": "A", + "step": 20 + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Rate of outgoing commands", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, 
+ "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 12, + "y": 20 + }, + "hiddenSeries": false, + "id": 41, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(synapse_replication_tcp_resource_stream_updates{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "format": "time_series", + "interval": "", + "intervalFactor": 2, + "legendFormat": "{{stream_name}}", + "refId": "A", + "step": 20 + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Outgoing stream updates", + "tooltip": { + "shared": false, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 0, + "y": 27 + }, + "hiddenSeries": false, + "id": 113, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "synapse_replication_tcp_resource_connections_per_stream{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{stream_name}}", + "refId": "A" + }, + { + "expr": "synapse_replication_tcp_resource_total_connections{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Replication connections", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + 
"datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 7, + "w": 12, + "x": 12, + "y": 27 + }, + "hiddenSeries": false, + "id": 115, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "rate(synapse_replication_tcp_protocol_close_reason{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{reason_type}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Replication connection close reasons", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "hertz", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "repeat": null, + "title": "Replication", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 38 + }, + "id": 69, + "panels": [ + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 41 + }, + "hiddenSeries": false, + "id": 67, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "connected", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "max(synapse_event_persisted_position{instance=\"$instance\"}) - on() group_right() synapse_event_processing_positions{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{name}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Event processing lag", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": "events", + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + 
"min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 12, + "y": 41 + }, + "hiddenSeries": false, + "id": 71, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "connected", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "time()*1000-synapse_event_processing_last_ts{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", + "format": "time_series", + "hide": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{name}}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Age of last processed event", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "ms", + "label": null, + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 9, + "w": 12, + "x": 0, + "y": 50 + }, + "hiddenSeries": false, + "id": 121, + "interval": "", + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "connected", + "options": { + "alertThreshold": true + }, + "paceLength": 10, + "percentage": false, + "pluginVersion": "7.3.7", + "pointradius": 5, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "deriv(synapse_event_processing_last_ts{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])/1000 - 1", + "format": "time_series", + "hide": false, + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{job}}-{{index}} {{name}}", + "refId": "B" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Event processing catchup rate", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "none", + "label": "fallbehind(-) / catchup(+): s/sec", + "logBase": 1, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + 
"min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + } + ], + "title": "Event processing loop positions", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 39 + }, + "id": 126, + "panels": [ + { + "cards": { + "cardPadding": 0, + "cardRound": null + }, + "color": { + "cardColor": "#B877D9", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "max": null, + "min": 0, + "mode": "opacity" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "description": "Colour reflects the number of rooms with the given number of forward extremities, or fewer.\n\nThis is only updated once an hour.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 42 + }, + "heatmap": {}, + "hideZeroBuckets": true, + "highlightCards": true, + "id": 122, + "legend": { + "show": true + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "synapse_forward_extremities_bucket{instance=\"$instance\"} and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", + "format": "heatmap", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Number of rooms, by number of forward extremities in room", + "tooltip": { + "show": true, + "showHistogram": true + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": 0, + "format": "short", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "Number of rooms with the given number of forward extremities or fewer.\n\nThis is only updated once an hour.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 0, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 42 + }, + "hiddenSeries": false, + "id": 124, + "interval": "", + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "connected", + "percentage": false, + "pluginVersion": "7.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "synapse_forward_extremities_bucket{instance=\"$instance\"} > 0", + "format": "heatmap", + "interval": "", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Room counts, by number of extremities", + "tooltip": { + "shared": true, + "sort": 2, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "decimals": null, + "format": "none", + "label": "Number of rooms", + "logBase": 10, + "max": null, + "min": null, + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": false + } + ], + "yaxis": { + 
"align": false, + "alignLevel": null + } + }, + { + "cards": { + "cardPadding": 0, + "cardRound": null + }, + "color": { + "cardColor": "#5794F2", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "min": 0, + "mode": "opacity" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "description": "Colour reflects the number of events persisted to rooms with the given number of forward extremities, or fewer.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 50 + }, + "heatmap": {}, + "hideZeroBuckets": true, + "highlightCards": true, + "id": 127, + "legend": { + "show": true + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", + "format": "heatmap", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Events persisted, by number of forward extremities in room (heatmap)", + "tooltip": { + "show": true, + "showHistogram": true + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": 0, + "format": "short", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "For a given percentage P, the number X where P% of events were persisted to rooms with X forward extremities or fewer.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 12, + "y": 50 + }, + "hiddenSeries": false, + "id": 128, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "links": [], + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "histogram_quantile(0.5, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "50%", + "refId": "A" + }, + { + "expr": "histogram_quantile(0.75, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "75%", + "refId": "B" + }, + { + "expr": "histogram_quantile(0.90, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "90%", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.99, 
rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "99%", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Events persisted, by number of forward extremities in room (quantiles)", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" + }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": "Number of extremities in room", + "logBase": 1, + "max": null, + "min": "0", + "show": true + }, + { + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "cards": { + "cardPadding": 0, + "cardRound": null + }, + "color": { + "cardColor": "#FF9830", + "colorScale": "sqrt", + "colorScheme": "interpolateInferno", + "exponent": 0.5, + "min": 0, + "mode": "opacity" + }, + "dataFormat": "tsbuckets", + "datasource": "$datasource", + "description": "Colour reflects the number of events persisted to rooms with the given number of stale forward extremities, or fewer.\n\nStale forward extremities are those that were in the previous set of extremities as well as the new.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 58 + }, + "heatmap": {}, + "hideZeroBuckets": true, + "highlightCards": true, + "id": 129, + "legend": { + "show": true + }, + "links": [], + "reverseYBuckets": false, + "targets": [ + { + "expr": "rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", + "format": "heatmap", + "intervalFactor": 1, + "legendFormat": "{{le}}", + "refId": "A" + } + ], + "timeFrom": null, + "timeShift": null, + "title": "Events persisted, by number of stale forward extremities in room (heatmap)", + "tooltip": { + "show": true, + "showHistogram": true + }, + "type": "heatmap", + "xAxis": { + "show": true + }, + "xBucketNumber": null, + "xBucketSize": null, + "yAxis": { + "decimals": 0, + "format": "short", + "logBase": 1, + "max": null, + "min": null, + "show": true, + "splitFactor": null + }, + "yBucketBound": "auto", + "yBucketNumber": null, + "yBucketSize": null + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "For given percentage P, the number X where P% of events were persisted to rooms with X stale forward extremities or fewer.\n\nStale forward extremities are those that were in the previous set of extremities as well as the new.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, - "x": 0, - "y": 139 + "x": 12, + "y": 58 }, "hiddenSeries": false, - "id": 95, + "id": 130, "legend": { "avg": false, "current": false, @@ -5609,11 +8637,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.1.3", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -5622,18 
+8648,39 @@ "steppedLine": false, "targets": [ { - "expr": "rate(python_gc_time_count{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.5, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", "format": "time_series", "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} gen {{gen}}", + "legendFormat": "50%", "refId": "A" + }, + { + "expr": "histogram_quantile(0.75, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "75%", + "refId": "B" + }, + { + "expr": "histogram_quantile(0.90, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "90%", + "refId": "C" + }, + { + "expr": "histogram_quantile(0.99, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", + "format": "time_series", + "intervalFactor": 1, + "legendFormat": "99%", + "refId": "D" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "GC frequency", + "title": "Events persisted, by number of stale forward extremities in room (quantiles)", "tooltip": { "shared": true, "sort": 0, @@ -5649,11 +8696,11 @@ }, "yaxes": [ { - "format": "hertz", - "label": null, + "format": "short", + "label": "Number of stale forward extremities in room", "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true }, { @@ -5676,26 +8723,32 @@ "cardRound": null }, "color": { - "cardColor": "#b4ff00", + "cardColor": "#73BF69", "colorScale": "sqrt", - "colorScheme": "interpolateSpectral", + "colorScheme": "interpolateInferno", "exponent": 0.5, - "max": null, "min": 0, - "mode": "spectrum" + "mode": "opacity" }, "dataFormat": "tsbuckets", - "datasource": "${DS_PROMETHEUS}", + "datasource": "$datasource", + "description": "Colour reflects the number of state resolution operations performed over the given number of state groups, or fewer.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "gridPos": { - "h": 9, + "h": 8, "w": 12, - "x": 12, - "y": 139 + "x": 0, + "y": 66 }, "heatmap": {}, "hideZeroBuckets": true, "highlightCards": true, - "id": 87, + "id": 131, "legend": { "show": true }, @@ -5703,17 +8756,20 @@ "reverseYBuckets": false, "targets": [ { - "expr": "sum(rate(python_gc_time_bucket[$bucket_size])) by (le)", + "expr": "rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", "format": "heatmap", + "interval": "", "intervalFactor": 1, "legendFormat": "{{le}}", "refId": "A" } ], - "title": "GC durations", + "timeFrom": null, + "timeShift": null, + "title": "Number of state resolution performed, by number of state groups involved (heatmap)", "tooltip": { "show": true, - "showHistogram": false + "showHistogram": true }, "type": "heatmap", "xAxis": { @@ -5722,8 +8778,8 @@ "xBucketNumber": null, "xBucketSize": null, "yAxis": { - "decimals": null, - "format": "s", + "decimals": 0, + 
"format": "short", "logBase": 1, "max": null, "min": null, @@ -5733,39 +8789,32 @@ "yBucketBound": "auto", "yBucketNumber": null, "yBucketSize": null - } - ], - "repeat": null, - "title": "GC", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 38 - }, - "id": 63, - "panels": [ + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "description": "For a given percentage P, the number X where P% of state resolution operations took place over X state groups or fewer.", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 7, + "h": 8, "w": 12, - "x": 0, + "x": 12, "y": 66 }, "hiddenSeries": false, - "id": 2, + "id": 132, + "interval": "", "legend": { "avg": false, "current": false, @@ -5779,12 +8828,9 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.1.3", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -5793,53 +8839,150 @@ "steppedLine": false, "targets": [ { - "expr": "rate(synapse_replication_tcp_resource_user_sync{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.5, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", "format": "time_series", - "intervalFactor": 2, - "legendFormat": "user started/stopped syncing", - "refId": "A", - "step": 20 + "interval": "", + "intervalFactor": 1, + "legendFormat": "50%", + "refId": "A" }, { - "expr": "rate(synapse_replication_tcp_resource_federation_ack{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.75, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", "format": "time_series", - "intervalFactor": 2, - "legendFormat": "federation ack", - "refId": "B", - "step": 20 + "interval": "", + "intervalFactor": 1, + "legendFormat": "75%", + "refId": "B" }, { - "expr": "rate(synapse_replication_tcp_resource_remove_pusher{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.90, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", "format": "time_series", - "intervalFactor": 2, - "legendFormat": "remove pusher", - "refId": "C", - "step": 20 + "interval": "", + "intervalFactor": 1, + "legendFormat": "90%", + "refId": "C" }, { - "expr": "rate(synapse_replication_tcp_resource_invalidate_cache{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "histogram_quantile(0.99, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", "format": "time_series", - "intervalFactor": 2, - "legendFormat": "invalidate cache", - "refId": "D", - "step": 20 + "interval": "", + "intervalFactor": 1, + "legendFormat": "99%", + "refId": "D" + } + ], + "thresholds": [], + "timeFrom": null, + "timeRegions": [], + "timeShift": null, + "title": "Number of state resolutions performed, by number of state groups involved (quantiles)", + "tooltip": { + "shared": true, + "sort": 0, + "value_type": "individual" 
+ }, + "type": "graph", + "xaxis": { + "buckets": null, + "mode": "time", + "name": null, + "show": true, + "values": [] + }, + "yaxes": [ + { + "format": "short", + "label": "Number of state groups", + "logBase": 1, + "max": null, + "min": "0", + "show": true }, { - "expr": "rate(synapse_replication_tcp_resource_user_ip_cache{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", - "format": "time_series", - "intervalFactor": 2, - "legendFormat": "user ip cache", - "refId": "E", - "step": 20 + "format": "short", + "label": null, + "logBase": 1, + "max": null, + "min": null, + "show": true + } + ], + "yaxis": { + "align": false, + "alignLevel": null + } + }, + { + "aliasColors": {}, + "bars": false, + "dashLength": 10, + "dashes": false, + "datasource": "$datasource", + "description": "When we do a state res while persisting events we try and see if we can prune any stale extremities.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 1, + "fillGradient": 0, + "gridPos": { + "h": 8, + "w": 12, + "x": 0, + "y": 74 + }, + "hiddenSeries": false, + "id": 179, + "legend": { + "avg": false, + "current": false, + "max": false, + "min": false, + "show": true, + "total": false, + "values": false + }, + "lines": true, + "linewidth": 1, + "nullPointMode": "null", + "percentage": false, + "pluginVersion": "7.1.3", + "pointradius": 2, + "points": false, + "renderer": "flot", + "seriesOverrides": [], + "spaceLength": 10, + "stack": false, + "steppedLine": false, + "targets": [ + { + "expr": "sum(rate(synapse_storage_events_state_resolutions_during_persistence{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", + "interval": "", + "legendFormat": "State res ", + "refId": "A" + }, + { + "expr": "sum(rate(synapse_storage_events_potential_times_prune_extremities{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", + "interval": "", + "legendFormat": "Potential to prune", + "refId": "B" + }, + { + "expr": "sum(rate(synapse_storage_events_times_pruned_extremities{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", + "interval": "", + "legendFormat": "Pruned", + "refId": "C" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Rate of events on replication master", + "title": "Stale extremity dropping", "tooltip": { - "shared": false, + "shared": true, "sort": 0, "value_type": "individual" }, @@ -5873,23 +9016,45 @@ "align": false, "alignLevel": null } - }, + } + ], + "title": "Extremities", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 40 + }, + "id": 158, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 7, + "h": 8, "w": 12, - "x": 12, - "y": 66 + "x": 0, + "y": 119 }, "hiddenSeries": false, - "id": 41, + "id": 156, "legend": { "avg": false, "current": false, @@ -5904,35 +9069,49 @@ "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 5, "points": false, "renderer": "flot", - "seriesOverrides": [], + "seriesOverrides": [ + { + "alias": "Max", + "color": "#bf1b00", + "fill": 0, + "linewidth": 2 + } + ], "spaceLength": 10, "stack": 
false, "steppedLine": false, "targets": [ { - "expr": "rate(synapse_replication_tcp_resource_stream_updates{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "expr": "synapse_admin_mau:current{instance=\"$instance\"}", "format": "time_series", "interval": "", - "intervalFactor": 2, - "legendFormat": "{{stream_name}}", - "refId": "A", - "step": 20 + "intervalFactor": 1, + "legendFormat": "Current", + "refId": "A" + }, + { + "expr": "synapse_admin_mau:max{instance=\"$instance\"}", + "format": "time_series", + "interval": "", + "intervalFactor": 1, + "legendFormat": "Max", + "refId": "B" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Outgoing stream updates", + "title": "MAU Limits", "tooltip": { - "shared": false, + "shared": true, "sort": 0, "value_type": "individual" }, @@ -5946,11 +9125,11 @@ }, "yaxes": [ { - "format": "hertz", + "format": "short", "label": null, "logBase": 1, "max": null, - "min": null, + "min": "0", "show": true }, { @@ -5973,16 +9152,22 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 7, + "h": 8, "w": 12, - "x": 0, - "y": 73 + "x": 12, + "y": 119 }, "hiddenSeries": false, - "id": 42, + "id": 160, "legend": { "avg": false, "current": false, @@ -5994,14 +9179,13 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -6010,21 +9194,19 @@ "steppedLine": false, "targets": [ { - "expr": "sum (rate(synapse_replication_tcp_protocol_inbound_commands{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (name, conn_id)", - "format": "time_series", - "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}} {{command}}", - "refId": "A", - "step": 20 + "expr": "synapse_admin_mau_current_mau_by_service{instance=\"$instance\"}", + "interval": "", + "legendFormat": "{{ app_service }}", + "refId": "A" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Rate of incoming commands", + "title": "MAU by Appservice", "tooltip": { - "shared": false, + "shared": true, "sort": 0, "value_type": "individual" }, @@ -6038,7 +9220,7 @@ }, "yaxes": [ { - "format": "hertz", + "format": "short", "label": null, "logBase": 1, "max": null, @@ -6058,23 +9240,45 @@ "align": false, "alignLevel": null } - }, + } + ], + "title": "MAU", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 41 + }, + "id": 177, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, - "x": 12, - "y": 73 + "x": 0, + "y": 1 }, "hiddenSeries": false, - "id": 43, + "id": 173, "legend": { "avg": false, "current": false, @@ -6088,11 +9292,8 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, - "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -6102,22 +9303,24 @@ "steppedLine": false, "targets": [ { - 
"expr": "sum (rate(synapse_replication_tcp_protocol_outbound_commands{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])) without (name, conn_id)", + "expr": "rate(synapse_notifier_users_woken_by_stream{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", "format": "time_series", + "hide": false, "intervalFactor": 2, - "legendFormat": "{{job}}-{{index}} {{command}}", + "legendFormat": "{{stream}} {{index}}", + "metric": "synapse_notifier", "refId": "A", - "step": 20 + "step": 2 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Rate of outgoing commands", + "title": "Notifier Streams Woken", "tooltip": { - "shared": false, - "sort": 0, + "shared": true, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6157,16 +9360,23 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {}, + "links": [] + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 7, "w": 12, - "x": 0, - "y": 80 + "x": 12, + "y": 1 }, "hiddenSeries": false, - "id": 113, + "id": 175, "legend": { "avg": false, "current": false, @@ -6180,11 +9390,8 @@ "linewidth": 1, "links": [], "nullPointMode": "null", - "options": { - "dataLinks": [] - }, - "paceLength": 10, "percentage": false, + "pluginVersion": "7.1.3", "pointradius": 5, "points": false, "renderer": "flot", @@ -6194,28 +9401,23 @@ "steppedLine": false, "targets": [ { - "expr": "synapse_replication_tcp_resource_connections_per_stream{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{stream_name}}", - "refId": "A" - }, - { - "expr": "synapse_replication_tcp_resource_total_connections{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}", + "expr": "rate(synapse_handler_presence_get_updates{job=~\"$job\",instance=\"$instance\"}[$bucket_size])", "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}}", - "refId": "B" + "interval": "", + "intervalFactor": 2, + "legendFormat": "{{type}} {{index}}", + "refId": "A", + "step": 2 } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Replication connections", + "title": "Presence Stream Fetch Type Rates", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6228,7 +9430,7 @@ }, "yaxes": [ { - "format": "short", + "format": "hertz", "label": null, "logBase": 1, "max": null, @@ -6248,23 +9450,44 @@ "align": false, "alignLevel": null } - }, + } + ], + "title": "Notifier", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 42 + }, + "id": 170, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 7, + "h": 8, "w": 12, - "x": 12, - "y": 80 + "x": 0, + "y": 73 }, "hiddenSeries": false, - "id": 115, + "id": 168, "legend": { "avg": false, "current": false, @@ -6276,14 +9499,13 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -6292,10 
+9514,9 @@ "steppedLine": false, "targets": [ { - "expr": "rate(synapse_replication_tcp_protocol_close_reason{job=~\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{reason_type}}", + "expr": "rate(synapse_appservice_api_sent_events{instance=\"$instance\"}[$bucket_size])", + "interval": "", + "legendFormat": "{{exported_service}}", "refId": "A" } ], @@ -6303,7 +9524,7 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Replication connection close reasons", + "title": "Sent Events rate", "tooltip": { "shared": true, "sort": 0, @@ -6339,39 +9560,29 @@ "align": false, "alignLevel": null } - } - ], - "repeat": null, - "title": "Replication", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 39 - }, - "id": 69, - "panels": [ + }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, - "x": 0, - "y": 40 + "x": 12, + "y": 73 }, "hiddenSeries": false, - "id": 67, + "id": 171, "legend": { "avg": false, "current": false, @@ -6383,14 +9594,13 @@ }, "lines": true, "linewidth": 1, - "links": [], - "nullPointMode": "connected", + "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -6399,11 +9609,9 @@ "steppedLine": false, "targets": [ { - "expr": "max(synapse_event_persisted_position{instance=\"$instance\"}) - ignoring(instance,index, job, name) group_right() synapse_event_processing_positions{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", - "format": "time_series", + "expr": "rate(synapse_appservice_api_sent_transactions{instance=\"$instance\"}[$bucket_size])", "interval": "", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{name}}", + "legendFormat": "{{exported_service}}", "refId": "A" } ], @@ -6411,7 +9619,7 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Event processing lag", + "title": "Transactions rate", "tooltip": { "shared": true, "sort": 0, @@ -6427,8 +9635,8 @@ }, "yaxes": [ { - "format": "short", - "label": "events", + "format": "hertz", + "label": null, "logBase": 1, "max": null, "min": null, @@ -6447,23 +9655,44 @@ "align": false, "alignLevel": null } - }, + } + ], + "title": "Appservices", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 43 + }, + "id": 188, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, - "x": 12, - "y": 40 + "x": 0, + "y": 44 }, "hiddenSeries": false, - "id": 71, + "id": 182, "legend": { "avg": false, "current": false, @@ -6475,14 +9704,13 @@ }, "lines": true, "linewidth": 1, - "links": [], - "nullPointMode": "connected", + "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": 
false, "renderer": "flot", "seriesOverrides": [], @@ -6491,23 +9719,44 @@ "steppedLine": false, "targets": [ { - "expr": "time()*1000-synapse_event_processing_last_ts{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}", - "format": "time_series", - "hide": false, + "expr": "rate(synapse_handler_presence_notified_presence{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", "interval": "", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{name}}", + "legendFormat": "Notified", + "refId": "A" + }, + { + "expr": "rate(synapse_handler_presence_federation_presence_out{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "interval": "", + "legendFormat": "Remote ping", "refId": "B" + }, + { + "expr": "rate(synapse_handler_presence_presence_updates{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "interval": "", + "legendFormat": "Total updates", + "refId": "C" + }, + { + "expr": "rate(synapse_handler_presence_federation_presence{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "interval": "", + "legendFormat": "Remote updates", + "refId": "D" + }, + { + "expr": "rate(synapse_handler_presence_bump_active_time{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", + "interval": "", + "legendFormat": "Bump active time", + "refId": "E" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Age of last processed event", + "title": "Presence", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6520,7 +9769,7 @@ }, "yaxes": [ { - "format": "ms", + "format": "hertz", "label": null, "logBase": 1, "max": null, @@ -6547,17 +9796,22 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { - "h": 9, + "h": 8, "w": 12, - "x": 0, - "y": 49 + "x": 12, + "y": 44 }, "hiddenSeries": false, - "id": 121, - "interval": "", + "id": 184, "legend": { "avg": false, "current": false, @@ -6569,14 +9823,13 @@ }, "lines": true, "linewidth": 1, - "links": [], - "nullPointMode": "connected", + "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, - "paceLength": 10, "percentage": false, - "pointradius": 5, + "pluginVersion": "7.3.7", + "pointradius": 2, "points": false, "renderer": "flot", "seriesOverrides": [], @@ -6585,23 +9838,20 @@ "steppedLine": false, "targets": [ { - "expr": "deriv(synapse_event_processing_last_ts{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])/1000 - 1", - "format": "time_series", - "hide": false, + "expr": "rate(synapse_handler_presence_state_transition{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", "interval": "", - "intervalFactor": 1, - "legendFormat": "{{job}}-{{index}} {{name}}", - "refId": "B" + "legendFormat": "{{from}} -> {{to}}", + "refId": "A" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Event processing catchup rate", + "title": "Presence state transitions", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6614,9 +9864,8 @@ }, "yaxes": [ { - "decimals": null, - "format": "none", - "label": "fallbehind(-) / catchup(+): s/sec", + "format": "hertz", + "label": null, "logBase": 1, "max": null, "min": null, @@ -6635,88 +9884,6 @@ "align": false, "alignLevel": null } - } - ], - 
"title": "Event processing loop positions", - "type": "row" - }, - { - "collapsed": true, - "datasource": "${DS_PROMETHEUS}", - "gridPos": { - "h": 1, - "w": 24, - "x": 0, - "y": 40 - }, - "id": 126, - "panels": [ - { - "cards": { - "cardPadding": 0, - "cardRound": null - }, - "color": { - "cardColor": "#B877D9", - "colorScale": "sqrt", - "colorScheme": "interpolateInferno", - "exponent": 0.5, - "max": null, - "min": 0, - "mode": "opacity" - }, - "dataFormat": "tsbuckets", - "datasource": "$datasource", - "description": "Colour reflects the number of rooms with the given number of forward extremities, or fewer.\n\nThis is only updated once an hour.", - "gridPos": { - "h": 8, - "w": 12, - "x": 0, - "y": 86 - }, - "heatmap": {}, - "hideZeroBuckets": true, - "highlightCards": true, - "id": 122, - "legend": { - "show": true - }, - "links": [], - "reverseYBuckets": false, - "targets": [ - { - "expr": "synapse_forward_extremities_bucket{instance=\"$instance\"} and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", - "format": "heatmap", - "intervalFactor": 1, - "legendFormat": "{{le}}", - "refId": "A" - } - ], - "timeFrom": null, - "timeShift": null, - "title": "Number of rooms, by number of forward extremities in room", - "tooltip": { - "show": true, - "showHistogram": true - }, - "type": "heatmap", - "xAxis": { - "show": true - }, - "xBucketNumber": null, - "xBucketSize": null, - "yAxis": { - "decimals": 0, - "format": "short", - "logBase": 1, - "max": null, - "min": null, - "show": true, - "splitFactor": null - }, - "yBucketBound": "auto", - "yBucketNumber": null, - "yBucketSize": null }, { "aliasColors": {}, @@ -6724,18 +9891,22 @@ "dashLength": 10, "dashes": false, "datasource": "$datasource", - "description": "Number of rooms with the given number of forward extremities or fewer.\n\nThis is only updated once an hour.", - "fill": 0, + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, + "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 12, - "x": 12, - "y": 86 + "x": 0, + "y": 52 }, "hiddenSeries": false, - "id": 124, - "interval": "", + "id": 186, "legend": { "avg": false, "current": false, @@ -6747,12 +9918,12 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 2, "points": false, "renderer": "flot", @@ -6762,11 +9933,9 @@ "steppedLine": false, "targets": [ { - "expr": "synapse_forward_extremities_bucket{instance=\"$instance\"} > 0", - "format": "time_series", + "expr": "rate(synapse_handler_presence_notify_reason{job=\"$job\",index=~\"$index\",instance=\"$instance\"}[$bucket_size])", "interval": "", - "intervalFactor": 1, - "legendFormat": "{{le}}", + "legendFormat": "{{reason}}", "refId": "A" } ], @@ -6774,10 +9943,10 @@ "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Room counts, by number of extremities", + "title": "Presence notify reason", "tooltip": { - "shared": false, - "sort": 1, + "shared": true, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6790,111 +9959,66 @@ }, "yaxes": [ { - "decimals": null, - "format": "none", - "label": "Number of rooms", + "$$hashKey": "object:165", + "format": "hertz", + "label": null, "logBase": 1, "max": null, "min": null, "show": true }, { + "$$hashKey": "object:166", "format": "short", "label": null, "logBase": 1, "max": null, "min": null, - "show": false + "show": true } ], "yaxis": { "align": false, 
"alignLevel": null } - }, - { - "cards": { - "cardPadding": 0, - "cardRound": null - }, - "color": { - "cardColor": "#5794F2", - "colorScale": "sqrt", - "colorScheme": "interpolateInferno", - "exponent": 0.5, - "min": 0, - "mode": "opacity" - }, - "dataFormat": "tsbuckets", - "datasource": "$datasource", - "description": "Colour reflects the number of events persisted to rooms with the given number of forward extremities, or fewer.", - "gridPos": { - "h": 8, - "w": 12, - "x": 0, - "y": 94 - }, - "heatmap": {}, - "hideZeroBuckets": true, - "highlightCards": true, - "id": 127, - "legend": { - "show": true - }, - "links": [], - "reverseYBuckets": false, - "targets": [ - { - "expr": "rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", - "format": "heatmap", - "intervalFactor": 1, - "legendFormat": "{{le}}", - "refId": "A" - } - ], - "timeFrom": null, - "timeShift": null, - "title": "Events persisted, by number of forward extremities in room (heatmap)", - "tooltip": { - "show": true, - "showHistogram": true - }, - "type": "heatmap", - "xAxis": { - "show": true - }, - "xBucketNumber": null, - "xBucketSize": null, - "yAxis": { - "decimals": 0, - "format": "short", - "logBase": 1, - "max": null, - "min": null, - "show": true, - "splitFactor": null - }, - "yBucketBound": "auto", - "yBucketNumber": null, - "yBucketSize": null - }, + } + ], + "title": "Presence", + "type": "row" + }, + { + "collapsed": true, + "datasource": "${DS_PROMETHEUS}", + "gridPos": { + "h": 1, + "w": 24, + "x": 0, + "y": 44 + }, + "id": 197, + "panels": [ { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "description": "For a given percentage P, the number X where P% of events were persisted to rooms with X forward extremities or fewer.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 12, - "x": 12, - "y": 94 + "x": 0, + "y": 1 }, "hiddenSeries": false, - "id": 128, + "id": 191, "legend": { "avg": false, "current": false, @@ -6906,12 +10030,12 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": "7.3.7", "pointradius": 2, "points": false, "renderer": "flot", @@ -6921,42 +10045,20 @@ "steppedLine": false, "targets": [ { - "expr": "histogram_quantile(0.5, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "50%", - "refId": "A" - }, - { - "expr": "histogram_quantile(0.75, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "75%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "90%", - "refId": "C" - }, - { - "expr": "histogram_quantile(0.99, 
rate(synapse_storage_events_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "99%", - "refId": "D" + "expr": "rate(synapse_external_cache_set{job=\"$job\", instance=\"$instance\", index=~\"$index\"}[$bucket_size])", + "interval": "", + "legendFormat": "{{ cache_name }} {{ index }}", + "refId": "A" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Events persisted, by number of forward extremities in room (quantiles)", + "title": "External Cache Set Rate", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -6969,14 +10071,16 @@ }, "yaxes": [ { - "format": "short", - "label": "Number of extremities in room", + "$$hashKey": "object:390", + "format": "hertz", + "label": null, "logBase": 1, "max": null, - "min": "0", + "min": null, "show": true }, { + "$$hashKey": "object:391", "format": "short", "label": null, "logBase": 1, @@ -6990,89 +10094,29 @@ "alignLevel": null } }, - { - "cards": { - "cardPadding": 0, - "cardRound": null - }, - "color": { - "cardColor": "#FF9830", - "colorScale": "sqrt", - "colorScheme": "interpolateInferno", - "exponent": 0.5, - "min": 0, - "mode": "opacity" - }, - "dataFormat": "tsbuckets", - "datasource": "$datasource", - "description": "Colour reflects the number of events persisted to rooms with the given number of stale forward extremities, or fewer.\n\nStale forward extremities are those that were in the previous set of extremities as well as the new.", - "gridPos": { - "h": 8, - "w": 12, - "x": 0, - "y": 102 - }, - "heatmap": {}, - "hideZeroBuckets": true, - "highlightCards": true, - "id": 129, - "legend": { - "show": true - }, - "links": [], - "reverseYBuckets": false, - "targets": [ - { - "expr": "rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0)", - "format": "heatmap", - "intervalFactor": 1, - "legendFormat": "{{le}}", - "refId": "A" - } - ], - "timeFrom": null, - "timeShift": null, - "title": "Events persisted, by number of stale forward extremities in room (heatmap)", - "tooltip": { - "show": true, - "showHistogram": true - }, - "type": "heatmap", - "xAxis": { - "show": true - }, - "xBucketNumber": null, - "xBucketSize": null, - "yAxis": { - "decimals": 0, - "format": "short", - "logBase": 1, - "max": null, - "min": null, - "show": true, - "splitFactor": null - }, - "yBucketBound": "auto", - "yBucketNumber": null, - "yBucketSize": null - }, { "aliasColors": {}, "bars": false, "dashLength": 10, "dashes": false, "datasource": "$datasource", - "description": "For given percentage P, the number X where P% of events were persisted to rooms with X stale forward extremities or fewer.\n\nStale forward extremities are those that were in the previous set of extremities as well as the new.", + "description": "", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "fill": 1, "fillGradient": 0, "gridPos": { "h": 8, "w": 12, "x": 12, - "y": 102 + "y": 1 }, "hiddenSeries": false, - "id": 130, + "id": 193, "legend": { "avg": false, "current": false, @@ -7084,12 +10128,12 @@ }, "lines": true, "linewidth": 1, - "links": [], "nullPointMode": "null", "options": { - "dataLinks": [] + "alertThreshold": true }, "percentage": false, + "pluginVersion": 
"7.3.7", "pointradius": 2, "points": false, "renderer": "flot", @@ -7099,42 +10143,20 @@ "steppedLine": false, "targets": [ { - "expr": "histogram_quantile(0.5, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "50%", + "expr": "rate(synapse_external_cache_get{job=\"$job\", instance=\"$instance\", index=~\"$index\"}[$bucket_size])", + "interval": "", + "legendFormat": "{{ cache_name }} {{ index }}", "refId": "A" - }, - { - "expr": "histogram_quantile(0.75, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "75%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "90%", - "refId": "C" - }, - { - "expr": "histogram_quantile(0.99, rate(synapse_storage_events_stale_forward_extremities_persisted_bucket{instance=\"$instance\"}[$bucket_size]) and on (index, instance, job) (synapse_storage_events_persisted_events > 0))", - "format": "time_series", - "intervalFactor": 1, - "legendFormat": "99%", - "refId": "D" } ], "thresholds": [], "timeFrom": null, "timeRegions": [], "timeShift": null, - "title": "Events persisted, by number of stale forward extremities in room (quantiles)", + "title": "External Cache Get Rate", "tooltip": { "shared": true, - "sort": 0, + "sort": 2, "value_type": "individual" }, "type": "graph", @@ -7147,14 +10169,16 @@ }, "yaxes": [ { - "format": "short", - "label": "Number of stale forward extremities in room", + "$$hashKey": "object:390", + "format": "hertz", + "label": null, "logBase": 1, "max": null, - "min": "0", + "min": null, "show": true }, { + "$$hashKey": "object:391", "format": "short", "label": null, "logBase": 1, @@ -7170,52 +10194,57 @@ }, { "cards": { - "cardPadding": 0, + "cardPadding": -1, "cardRound": null }, "color": { - "cardColor": "#73BF69", + "cardColor": "#b4ff00", "colorScale": "sqrt", "colorScheme": "interpolateInferno", "exponent": 0.5, "min": 0, - "mode": "opacity" + "mode": "spectrum" }, "dataFormat": "tsbuckets", "datasource": "$datasource", - "description": "Colour reflects the number of state resolution operations performed over the given number of state groups, or fewer.", + "fieldConfig": { + "defaults": { + "custom": {} + }, + "overrides": [] + }, "gridPos": { - "h": 8, + "h": 9, "w": 12, "x": 0, - "y": 110 + "y": 9 }, "heatmap": {}, - "hideZeroBuckets": true, + "hideZeroBuckets": false, "highlightCards": true, - "id": 131, + "id": 195, "legend": { - "show": true + "show": false }, "links": [], "reverseYBuckets": false, "targets": [ { - "expr": "rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size])", + "expr": "sum(rate(synapse_external_cache_response_time_seconds_bucket{index=~\"$index\",instance=\"$instance\",job=\"$job\"}[$bucket_size])) by (le)", "format": "heatmap", + "instant": false, "interval": "", "intervalFactor": 1, "legendFormat": "{{le}}", "refId": "A" } ], - "timeFrom": null, - 
"timeShift": null, - "title": "Number of state resolution performed, by number of state groups involved (heatmap)", + "title": "External Cache Response Time", "tooltip": { "show": true, "showHistogram": true }, + "tooltipDecimals": 2, "type": "heatmap", "xAxis": { "show": true @@ -7224,7 +10253,7 @@ "xBucketSize": null, "yAxis": { "decimals": 0, - "format": "short", + "format": "s", "logBase": 1, "max": null, "min": null, @@ -7234,131 +10263,14 @@ "yBucketBound": "auto", "yBucketNumber": null, "yBucketSize": null - }, - { - "aliasColors": {}, - "bars": false, - "dashLength": 10, - "dashes": false, - "datasource": "$datasource", - "description": "For a given percentage P, the number X where P% of state resolution operations took place over X state groups or fewer.", - "fill": 1, - "fillGradient": 0, - "gridPos": { - "h": 8, - "w": 12, - "x": 12, - "y": 110 - }, - "hiddenSeries": false, - "id": 132, - "interval": "", - "legend": { - "avg": false, - "current": false, - "max": false, - "min": false, - "show": true, - "total": false, - "values": false - }, - "lines": true, - "linewidth": 1, - "links": [], - "nullPointMode": "null", - "options": { - "dataLinks": [] - }, - "percentage": false, - "pointradius": 2, - "points": false, - "renderer": "flot", - "seriesOverrides": [], - "spaceLength": 10, - "stack": false, - "steppedLine": false, - "targets": [ - { - "expr": "histogram_quantile(0.5, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", - "format": "time_series", - "interval": "", - "intervalFactor": 1, - "legendFormat": "50%", - "refId": "A" - }, - { - "expr": "histogram_quantile(0.75, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", - "format": "time_series", - "interval": "", - "intervalFactor": 1, - "legendFormat": "75%", - "refId": "B" - }, - { - "expr": "histogram_quantile(0.90, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", - "format": "time_series", - "interval": "", - "intervalFactor": 1, - "legendFormat": "90%", - "refId": "C" - }, - { - "expr": "histogram_quantile(0.99, rate(synapse_state_number_state_groups_in_resolution_bucket{instance=\"$instance\",job=~\"$job\",index=~\"$index\"}[$bucket_size]))", - "format": "time_series", - "interval": "", - "intervalFactor": 1, - "legendFormat": "99%", - "refId": "D" - } - ], - "thresholds": [], - "timeFrom": null, - "timeRegions": [], - "timeShift": null, - "title": "Number of state resolutions performed, by number of state groups involved (quantiles)", - "tooltip": { - "shared": true, - "sort": 0, - "value_type": "individual" - }, - "type": "graph", - "xaxis": { - "buckets": null, - "mode": "time", - "name": null, - "show": true, - "values": [] - }, - "yaxes": [ - { - "format": "short", - "label": "Number of state groups", - "logBase": 1, - "max": null, - "min": "0", - "show": true - }, - { - "format": "short", - "label": null, - "logBase": 1, - "max": null, - "min": null, - "show": true - } - ], - "yaxis": { - "align": false, - "alignLevel": null - } } ], - "title": "Extremities", + "title": "External Cache", "type": "row" } ], - "refresh": "5m", - "schemaVersion": 22, + "refresh": false, + "schemaVersion": 26, "style": "dark", "tags": [ "matrix" @@ -7368,9 +10280,10 @@ { "current": { "selected": false, - "text": "Prometheus", - "value": "Prometheus" + "text": "default", + "value": 
"default" }, + "error": null, "hide": 0, "includeAll": false, "label": null, @@ -7378,6 +10291,7 @@ "name": "datasource", "options": [], "query": "prometheus", + "queryValue": "", "refresh": 1, "regex": "", "skipUrlSync": false, @@ -7387,13 +10301,14 @@ "allFormat": "glob", "auto": true, "auto_count": 100, - "auto_min": "30s", + "auto_min": "60s", "current": { "selected": false, "text": "auto", "value": "$__auto_interval_bucket_size" }, "datasource": null, + "error": null, "hide": 0, "includeAll": false, "label": "Bucket Size", @@ -7438,6 +10353,7 @@ } ], "query": "30s,1m,2m,5m,10m,15m", + "queryValue": "", "refresh": 2, "skipUrlSync": false, "type": "interval" @@ -7447,9 +10363,9 @@ "current": {}, "datasource": "$datasource", "definition": "", + "error": null, "hide": 0, "includeAll": false, - "index": -1, "label": null, "multi": false, "name": "instance", @@ -7458,7 +10374,7 @@ "refresh": 2, "regex": "", "skipUrlSync": false, - "sort": 0, + "sort": 1, "tagValuesQuery": "", "tags": [], "tagsQuery": "", @@ -7471,10 +10387,10 @@ "current": {}, "datasource": "$datasource", "definition": "", + "error": null, "hide": 0, "hideLabel": false, "includeAll": true, - "index": -1, "label": "Job", "multi": true, "multiFormat": "regex values", @@ -7498,10 +10414,10 @@ "current": {}, "datasource": "$datasource", "definition": "", + "error": null, "hide": 0, "hideLabel": false, "includeAll": true, - "index": -1, "label": "", "multi": true, "multiFormat": "regex values", @@ -7522,7 +10438,7 @@ ] }, "time": { - "from": "now-1h", + "from": "now-3h", "to": "now" }, "timepicker": { @@ -7554,8 +10470,5 @@ "timezone": "", "title": "Synapse", "uid": "000000012", - "variables": { - "list": [] - }, - "version": 32 + "version": 90 } \ No newline at end of file From 141b073c7bf649ac12b7e12b118770677a64f51f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Javier=20Junquera=20S=C3=A1nchez?= Date: Thu, 20 May 2021 15:24:19 +0200 Subject: [PATCH 18/52] Update user_directory.md (#10016) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Javier Junquera Sánchez --- changelog.d/10016.doc | 1 + docs/user_directory.md | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10016.doc diff --git a/changelog.d/10016.doc b/changelog.d/10016.doc new file mode 100644 index 000000000000..f9b615d7d7ab --- /dev/null +++ b/changelog.d/10016.doc @@ -0,0 +1 @@ +Fix broken link in user directory documentation. Contributed by @junquera. diff --git a/docs/user_directory.md b/docs/user_directory.md index 872fc2197968..d4f38d2cf11a 100644 --- a/docs/user_directory.md +++ b/docs/user_directory.md @@ -7,6 +7,6 @@ who are present in a publicly viewable room present on the server. The directory info is stored in various tables, which can (typically after DB corruption) get stale or out of sync. If this happens, for now the -solution to fix it is to execute the SQL [here](../synapse/storage/databases/main/schema/delta/53/user_dir_populate.sql) +solution to fix it is to execute the SQL [here](https://github.com/matrix-org/synapse/blob/master/synapse/storage/schema/main/delta/53/user_dir_populate.sql) and then restart synapse. This should then start a background task to flush the current tables and regenerate the directory. From 551d2c3f4b492d59b3c670c1a8e82869b16a594d Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Thu, 20 May 2021 11:10:36 -0400 Subject: [PATCH 19/52] Allow a user who could join a restricted room to see it in spaces summary. 
(#9922) This finishes up the experimental implementation of MSC3083 by showing the restricted rooms in the spaces summary (from MSC2946). --- changelog.d/9922.feature | 1 + synapse/federation/transport/server.py | 2 +- synapse/handlers/event_auth.py | 104 ++++++++++--- synapse/handlers/space_summary.py | 201 +++++++++++++++++++++---- 4 files changed, 254 insertions(+), 54 deletions(-) create mode 100644 changelog.d/9922.feature diff --git a/changelog.d/9922.feature b/changelog.d/9922.feature new file mode 100644 index 000000000000..2c655350c04f --- /dev/null +++ b/changelog.d/9922.feature @@ -0,0 +1 @@ +Experimental support to allow a user who could join a restricted room to view it in the spaces summary. diff --git a/synapse/federation/transport/server.py b/synapse/federation/transport/server.py index e1b746247469..c17a085a4f93 100644 --- a/synapse/federation/transport/server.py +++ b/synapse/federation/transport/server.py @@ -1428,7 +1428,7 @@ async def on_POST( ) return 200, await self.handler.federation_space_summary( - room_id, suggested_only, max_rooms_per_space, exclude_rooms + origin, room_id, suggested_only, max_rooms_per_space, exclude_rooms ) diff --git a/synapse/handlers/event_auth.py b/synapse/handlers/event_auth.py index 5b2fe103e742..a0df16a32f68 100644 --- a/synapse/handlers/event_auth.py +++ b/synapse/handlers/event_auth.py @@ -11,7 +11,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -from typing import TYPE_CHECKING, Optional +from typing import TYPE_CHECKING, Collection, Optional from synapse.api.constants import EventTypes, JoinRules, Membership from synapse.api.errors import AuthError @@ -59,32 +59,76 @@ async def check_restricted_join_rules( ): return + # This is not a room with a restricted join rule, so we don't need to do the + # restricted room specific checks. + # + # Note: We'll be applying the standard join rule checks later, which will + # catch the cases of e.g. trying to join private rooms without an invite. + if not await self.has_restricted_join_rules(state_ids, room_version): + return + + # Get the spaces which allow access to this room and check if the user is + # in any of them. + allowed_spaces = await self.get_spaces_that_allow_join(state_ids) + if not await self.is_user_in_rooms(allowed_spaces, user_id): + raise AuthError( + 403, + "You do not belong to any of the required spaces to join this room.", + ) + + async def has_restricted_join_rules( + self, state_ids: StateMap[str], room_version: RoomVersion + ) -> bool: + """ + Return if the room has the proper join rules set for access via spaces. + + Args: + state_ids: The state of the room as it currently is. + room_version: The room version of the room to query. + + Returns: + True if the proper room version and join rules are set for restricted access. + """ # This only applies to room versions which support the new join rule. if not room_version.msc3083_join_rules: - return + return False # If there's no join rule, then it defaults to invite (so this doesn't apply). join_rules_event_id = state_ids.get((EventTypes.JoinRules, ""), None) if not join_rules_event_id: - return + return False + + # If the join rule is not restricted, this doesn't apply. 
+ join_rules_event = await self._store.get_event(join_rules_event_id) + return join_rules_event.content.get("join_rule") == JoinRules.MSC3083_RESTRICTED + + async def get_spaces_that_allow_join( + self, state_ids: StateMap[str] + ) -> Collection[str]: + """ + Generate a list of spaces which allow access to a room. + + Args: + state_ids: The state of the room as it currently is. + + Returns: + A collection of spaces which provide membership to the room. + """ + # If there's no join rule, then it defaults to invite (so this doesn't apply). + join_rules_event_id = state_ids.get((EventTypes.JoinRules, ""), None) + if not join_rules_event_id: + return () # If the join rule is not restricted, this doesn't apply. join_rules_event = await self._store.get_event(join_rules_event_id) - if join_rules_event.content.get("join_rule") != JoinRules.MSC3083_RESTRICTED: - return # If allowed is of the wrong form, then only allow invited users. allowed_spaces = join_rules_event.content.get("allow", []) if not isinstance(allowed_spaces, list): - allowed_spaces = () - - # Get the list of joined rooms and see if there's an overlap. - if allowed_spaces: - joined_rooms = await self._store.get_rooms_for_user(user_id) - else: - joined_rooms = () + return () # Pull out the other room IDs, invalid data gets filtered. + result = [] for space in allowed_spaces: if not isinstance(space, dict): continue @@ -93,13 +137,31 @@ async def check_restricted_join_rules( if not isinstance(space_id, str): continue - # The user was joined to one of the spaces specified, they can join - # this room! - if space_id in joined_rooms: - return + result.append(space_id) + + return result + + async def is_user_in_rooms(self, room_ids: Collection[str], user_id: str) -> bool: + """ + Check whether a user is a member of any of the provided rooms. + + Args: + room_ids: The rooms to check for membership. + user_id: The user to check. + + Returns: + True if the user is in any of the rooms, false otherwise. + """ + if not room_ids: + return False + + # Get the list of joined rooms and see if there's an overlap. + joined_rooms = await self._store.get_rooms_for_user(user_id) + + # Check each room and see if the user is in it. + for room_id in room_ids: + if room_id in joined_rooms: + return True - # The user was not in any of the required spaces. - raise AuthError( - 403, - "You do not belong to any of the required spaces to join this room.", - ) + # The user was not in any of the rooms. 
+        return False
diff --git a/synapse/handlers/space_summary.py b/synapse/handlers/space_summary.py
index eb80a5ad67db..8d49ba81646e 100644
--- a/synapse/handlers/space_summary.py
+++ b/synapse/handlers/space_summary.py
@@ -16,11 +16,16 @@
 import logging
 import re
 from collections import deque
-from typing import TYPE_CHECKING, Iterable, List, Optional, Sequence, Set, Tuple, cast
+from typing import TYPE_CHECKING, Iterable, List, Optional, Sequence, Set, Tuple
 
 import attr
 
-from synapse.api.constants import EventContentFields, EventTypes, HistoryVisibility
+from synapse.api.constants import (
+    EventContentFields,
+    EventTypes,
+    HistoryVisibility,
+    Membership,
+)
 from synapse.api.errors import AuthError
 from synapse.events import EventBase
 from synapse.events.utils import format_event_for_client_v2
@@ -47,6 +52,7 @@ def __init__(self, hs: "HomeServer"):
         self._auth = hs.get_auth()
         self._room_list_handler = hs.get_room_list_handler()
         self._state_handler = hs.get_state_handler()
+        self._event_auth_handler = hs.get_event_auth_handler()
         self._store = hs.get_datastore()
         self._event_serializer = hs.get_event_client_serializer()
         self._server_name = hs.hostname
@@ -111,28 +117,88 @@ async def get_space_summary(
             max_children = max_rooms_per_space if processed_rooms else None
 
             if is_in_room:
-                rooms, events = await self._summarize_local_room(
-                    requester, room_id, suggested_only, max_children
+                room, events = await self._summarize_local_room(
+                    requester, None, room_id, suggested_only, max_children
                 )
+
+                logger.debug(
+                    "Query of local room %s returned events %s",
+                    room_id,
+                    ["%s->%s" % (ev["room_id"], ev["state_key"]) for ev in events],
+                )
+
+                if room:
+                    rooms_result.append(room)
             else:
-                rooms, events = await self._summarize_remote_room(
+                fed_rooms, fed_events = await self._summarize_remote_room(
                     queue_entry,
                     suggested_only,
                     max_children,
                     exclude_rooms=processed_rooms,
                 )
 
-                logger.debug(
-                    "Query of %s returned rooms %s, events %s",
-                    queue_entry.room_id,
-                    [room.get("room_id") for room in rooms],
-                    ["%s->%s" % (ev["room_id"], ev["state_key"]) for ev in events],
-                )
-
-                rooms_result.extend(rooms)
-
-                # any rooms returned don't need visiting again
-                processed_rooms.update(cast(str, room.get("room_id")) for room in rooms)
+                # The results over federation might include rooms that we,
+                # as the requesting server, are allowed to see, but the requesting
+                # user is not permitted to see.
+                #
+                # Filter the returned results to only what is accessible to the user.
+                room_ids = set()
+                events = []
+                for room in fed_rooms:
+                    fed_room_id = room.get("room_id")
+                    if not fed_room_id or not isinstance(fed_room_id, str):
+                        continue
+
+                    # The room should only be included in the summary if:
+                    #     a. the user is in the room;
+                    #     b. the room is world readable; or
+                    #     c. the user is in a space that has been granted access to
+                    #        the room.
+                    #
+                    # Note that we know the user is not in the root room (which is
+                    # why the remote call was made in the first place), but the user
+                    # could be in one of the children rooms and we just didn't know
+                    # about the link.
+                    include_room = room.get("world_readable") is True
+
+                    # Check if the user is a member of any of the allowed spaces
+                    # from the response.
+                    allowed_spaces = room.get("allowed_spaces")
+                    if (
+                        not include_room
+                        and allowed_spaces
+                        and isinstance(allowed_spaces, list)
+                    ):
+                        include_room = await self._event_auth_handler.is_user_in_rooms(
+                            allowed_spaces, requester
+                        )
+
+                    # Finally, if this isn't the requested room, check ourselves
+                    # if we can access the room.
+ if not include_room and fed_room_id != queue_entry.room_id: + include_room = await self._is_room_accessible( + fed_room_id, requester, None + ) + + # The user can see the room, include it! + if include_room: + rooms_result.append(room) + room_ids.add(fed_room_id) + + # All rooms returned don't need visiting again (even if the user + # didn't have access to them). + processed_rooms.add(fed_room_id) + + for event in fed_events: + if event.get("room_id") in room_ids: + events.append(event) + + logger.debug( + "Query of %s returned rooms %s, events %s", + room_id, + [room.get("room_id") for room in fed_rooms], + ["%s->%s" % (ev["room_id"], ev["state_key"]) for ev in fed_events], + ) # the room we queried may or may not have been returned, but don't process # it again, anyway. @@ -158,10 +224,16 @@ async def get_space_summary( ) processed_events.add(ev_key) + # Before returning to the client, remove the allowed_spaces key for any + # rooms. + for room in rooms_result: + room.pop("allowed_spaces", None) + return {"rooms": rooms_result, "events": events_result} async def federation_space_summary( self, + origin: str, room_id: str, suggested_only: bool, max_rooms_per_space: Optional[int], @@ -171,6 +243,8 @@ async def federation_space_summary( Implementation of the space summary Federation API Args: + origin: The server requesting the spaces summary. + room_id: room id to start the summary at suggested_only: whether we should only return children with the "suggested" @@ -205,14 +279,15 @@ async def federation_space_summary( logger.debug("Processing room %s", room_id) - rooms, events = await self._summarize_local_room( - None, room_id, suggested_only, max_rooms_per_space + room, events = await self._summarize_local_room( + None, origin, room_id, suggested_only, max_rooms_per_space ) processed_rooms.add(room_id) - rooms_result.extend(rooms) - events_result.extend(events) + if room: + rooms_result.append(room) + events_result.extend(events) # add any children to the queue room_queue.extend(edge_event["state_key"] for edge_event in events) @@ -222,10 +297,11 @@ async def federation_space_summary( async def _summarize_local_room( self, requester: Optional[str], + origin: Optional[str], room_id: str, suggested_only: bool, max_children: Optional[int], - ) -> Tuple[Sequence[JsonDict], Sequence[JsonDict]]: + ) -> Tuple[Optional[JsonDict], Sequence[JsonDict]]: """ Generate a room entry and a list of event entries for a given room. @@ -233,6 +309,9 @@ async def _summarize_local_room( requester: The user requesting the summary, if it is a local request. None if this is a federation request. + origin: + The server requesting the summary, if it is a federation request. + None if this is a local request. room_id: The room ID to summarize. suggested_only: True if only suggested children should be returned. Otherwise, all children are returned. @@ -247,8 +326,8 @@ async def _summarize_local_room( An iterable of the sorted children events. This may be limited to a maximum size or may include all children. 
""" - if not await self._is_room_accessible(room_id, requester): - return (), () + if not await self._is_room_accessible(room_id, requester, origin): + return None, () room_entry = await self._build_room_entry(room_id) @@ -272,7 +351,7 @@ async def _summarize_local_room( event_format=format_event_for_client_v2, ) ) - return (room_entry,), events_result + return room_entry, events_result async def _summarize_remote_room( self, @@ -332,13 +411,17 @@ async def _summarize_remote_room( or ev.event_type == EventTypes.SpaceChild ) - async def _is_room_accessible(self, room_id: str, requester: Optional[str]) -> bool: + async def _is_room_accessible( + self, room_id: str, requester: Optional[str], origin: Optional[str] + ) -> bool: """ Calculate whether the room should be shown in the spaces summary. It should be included if: * The requester is joined or invited to the room. + * The requester can join without an invite (per MSC3083). + * The origin server has any user that is joined or invited to the room. * The history visibility is set to world readable. Args: @@ -346,31 +429,75 @@ async def _is_room_accessible(self, room_id: str, requester: Optional[str]) -> b requester: The user requesting the summary, if it is a local request. None if this is a federation request. + origin: + The server requesting the summary, if it is a federation request. + None if this is a local request. Returns: True if the room should be included in the spaces summary. """ + state_ids = await self._store.get_current_state_ids(room_id) + + # If there's no state for the room, it isn't known. + if not state_ids: + logger.info("room %s is unknown, omitting from summary", room_id) + return False + + room_version = await self._store.get_room_version(room_id) # if we have an authenticated requesting user, first check if they are able to view # stripped state in the room. if requester: + member_event_id = state_ids.get((EventTypes.Member, requester), None) + + # If they're in the room they can see info on it. + member_event = None + if member_event_id: + member_event = await self._store.get_event(member_event_id) + if member_event.membership in (Membership.JOIN, Membership.INVITE): + return True + + # Otherwise, check if they should be allowed access via membership in a space. try: - await self._auth.check_user_in_room(room_id, requester) - return True + await self._event_auth_handler.check_restricted_join_rules( + state_ids, room_version, requester, member_event + ) except AuthError: + # The user doesn't have access due to spaces, but might have access + # another way. Keep trying. pass + else: + return True + + # If this is a request over federation, check if the host is in the room or + # is in one of the spaces specified via the join rules. + elif origin: + if await self._auth.check_host_in_room(room_id, origin): + return True + + # Alternately, if the host has a user in any of the spaces specified + # for access, then the host can see this room (and should do filtering + # if the requester cannot see it). 
+ if await self._event_auth_handler.has_restricted_join_rules( + state_ids, room_version + ): + allowed_spaces = ( + await self._event_auth_handler.get_spaces_that_allow_join(state_ids) + ) + for space_id in allowed_spaces: + if await self._auth.check_host_in_room(space_id, origin): + return True # otherwise, check if the room is peekable - hist_vis_ev = await self._state_handler.get_current_state( - room_id, EventTypes.RoomHistoryVisibility, "" - ) - if hist_vis_ev: + hist_vis_event_id = state_ids.get((EventTypes.RoomHistoryVisibility, ""), None) + if hist_vis_event_id: + hist_vis_ev = await self._store.get_event(hist_vis_event_id) hist_vis = hist_vis_ev.content.get("history_visibility") if hist_vis == HistoryVisibility.WORLD_READABLE: return True logger.info( - "room %s is unpeekable and user %s is not a member, omitting from summary", + "room %s is unpeekable and user %s is not a member / not allowed to join, omitting from summary", room_id, requester, ) @@ -395,6 +522,15 @@ async def _build_room_entry(self, room_id: str) -> JsonDict: if not room_type: room_type = create_event.content.get(EventContentFields.MSC1772_ROOM_TYPE) + room_version = await self._store.get_room_version(room_id) + allowed_spaces = None + if await self._event_auth_handler.has_restricted_join_rules( + current_state_ids, room_version + ): + allowed_spaces = await self._event_auth_handler.get_spaces_that_allow_join( + current_state_ids + ) + entry = { "room_id": stats["room_id"], "name": stats["name"], @@ -408,6 +544,7 @@ async def _build_room_entry(self, room_id: str) -> JsonDict: "guest_can_join": stats["guest_access"] == "can_join", "creation_ts": create_event.origin_server_ts, "room_type": room_type, + "allowed_spaces": allowed_spaces, } # Filter out Nones – rather omit the field altogether From 64887f06fcac63e069364d625d984b4951bf1ffc Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Thu, 20 May 2021 16:11:48 +0100 Subject: [PATCH 20/52] Use ijson to parse the response to `/send_join`, reducing memory usage. (#9958) Instead of parsing the full response to `/send_join` into Python objects (which can be huge for large rooms) and *then* parsing that into events, we instead use ijson to stream parse the response directly into `EventBase` objects. --- changelog.d/9958.feature | 1 + mypy.ini | 3 + synapse/federation/federation_client.py | 28 ++--- synapse/federation/transport/client.py | 85 ++++++++++++- synapse/http/client.py | 7 +- synapse/http/matrixfederationclient.py | 160 ++++++++++++++++++------ synapse/python_dependencies.py | 1 + 7 files changed, 227 insertions(+), 58 deletions(-) create mode 100644 changelog.d/9958.feature diff --git a/changelog.d/9958.feature b/changelog.d/9958.feature new file mode 100644 index 000000000000..d86ba36519f4 --- /dev/null +++ b/changelog.d/9958.feature @@ -0,0 +1 @@ +Reduce memory usage when joining very large rooms over federation. 
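To make the streaming approach concrete: ijson exposes a push-based interface in which the consumer feeds raw bytes into a parsing coroutine and receives each array element as soon as it has been fully parsed, so the response body is never materialised as one large Python dict. A minimal, self-contained sketch of that interface follows; the payload and prefix are illustrative stand-ins, not the full `/send_join` schema handling.

import ijson


@ijson.coroutine
def _collect(out):
    # Receives each fully-parsed object under the requested prefix.
    while True:
        out.append((yield))


state = []
coro = ijson.items_coro(_collect(state), "state.item")

# Feed the body in arbitrary chunks, as a streaming HTTP client would.
for chunk in (b'{"state": [{"type": "m.room.cre', b'ate"}, {"type": "m.room.member"}]}'):
    coro.send(chunk)
coro.close()

assert state == [{"type": "m.room.create"}, {"type": "m.room.member"}]

The `SendJoinParser` introduced below applies the same pattern, except that each parsed dict is immediately converted into an `EventBase`, so the only long-lived objects are the events themselves rather than an intermediate dict tree for the whole response.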
diff --git a/mypy.ini b/mypy.ini index ea655a0d4d92..1d1d1ea0f25f 100644 --- a/mypy.ini +++ b/mypy.ini @@ -174,3 +174,6 @@ ignore_missing_imports = True [mypy-pympler.*] ignore_missing_imports = True + +[mypy-ijson.*] +ignore_missing_imports = True diff --git a/synapse/federation/federation_client.py b/synapse/federation/federation_client.py index a5b6a611952b..e0e9f5d0beeb 100644 --- a/synapse/federation/federation_client.py +++ b/synapse/federation/federation_client.py @@ -55,6 +55,7 @@ ) from synapse.events import EventBase, builder from synapse.federation.federation_base import FederationBase, event_from_pdu_json +from synapse.federation.transport.client import SendJoinResponse from synapse.logging.context import make_deferred_yieldable, preserve_fn from synapse.logging.utils import log_function from synapse.types import JsonDict, get_domain_from_id @@ -665,19 +666,10 @@ async def send_join( """ async def send_request(destination) -> Dict[str, Any]: - content = await self._do_send_join(destination, pdu) + response = await self._do_send_join(room_version, destination, pdu) - logger.debug("Got content: %s", content) - - state = [ - event_from_pdu_json(p, room_version, outlier=True) - for p in content.get("state", []) - ] - - auth_chain = [ - event_from_pdu_json(p, room_version, outlier=True) - for p in content.get("auth_chain", []) - ] + state = response.state + auth_chain = response.auth_events pdus = {p.event_id: p for p in itertools.chain(state, auth_chain)} @@ -752,11 +744,14 @@ async def send_request(destination) -> Dict[str, Any]: return await self._try_destination_list("send_join", destinations, send_request) - async def _do_send_join(self, destination: str, pdu: EventBase) -> JsonDict: + async def _do_send_join( + self, room_version: RoomVersion, destination: str, pdu: EventBase + ) -> SendJoinResponse: time_now = self._clock.time_msec() try: return await self.transport_layer.send_join_v2( + room_version=room_version, destination=destination, room_id=pdu.room_id, event_id=pdu.event_id, @@ -771,17 +766,14 @@ async def _do_send_join(self, destination: str, pdu: EventBase) -> JsonDict: logger.debug("Couldn't send_join with the v2 API, falling back to the v1 API") - resp = await self.transport_layer.send_join_v1( + return await self.transport_layer.send_join_v1( + room_version=room_version, destination=destination, room_id=pdu.room_id, event_id=pdu.event_id, content=pdu.get_pdu_json(time_now), ) - # We expect the v1 API to respond with [200, content], so we only return the - # content. 
- return resp[1] - async def send_invite( self, destination: str, diff --git a/synapse/federation/transport/client.py b/synapse/federation/transport/client.py index 497848a2b75b..e93ab83f7ff4 100644 --- a/synapse/federation/transport/client.py +++ b/synapse/federation/transport/client.py @@ -17,13 +17,19 @@ import urllib from typing import Any, Dict, List, Optional +import attr +import ijson + from synapse.api.constants import Membership from synapse.api.errors import Codes, HttpResponseException, SynapseError +from synapse.api.room_versions import RoomVersion from synapse.api.urls import ( FEDERATION_UNSTABLE_PREFIX, FEDERATION_V1_PREFIX, FEDERATION_V2_PREFIX, ) +from synapse.events import EventBase, make_event_from_dict +from synapse.http.matrixfederationclient import ByteParser from synapse.logging.utils import log_function from synapse.types import JsonDict @@ -240,21 +246,36 @@ async def make_membership_event( return content @log_function - async def send_join_v1(self, destination, room_id, event_id, content): + async def send_join_v1( + self, + room_version, + destination, + room_id, + event_id, + content, + ) -> "SendJoinResponse": path = _create_v1_path("/send_join/%s/%s", room_id, event_id) response = await self.client.put_json( - destination=destination, path=path, data=content + destination=destination, + path=path, + data=content, + parser=SendJoinParser(room_version, v1_api=True), ) return response @log_function - async def send_join_v2(self, destination, room_id, event_id, content): + async def send_join_v2( + self, room_version, destination, room_id, event_id, content + ) -> "SendJoinResponse": path = _create_v2_path("/send_join/%s/%s", room_id, event_id) response = await self.client.put_json( - destination=destination, path=path, data=content + destination=destination, + path=path, + data=content, + parser=SendJoinParser(room_version, v1_api=False), ) return response @@ -1053,3 +1074,59 @@ def _create_v2_path(path, *args): str """ return _create_path(FEDERATION_V2_PREFIX, path, *args) + + +@attr.s(slots=True, auto_attribs=True) +class SendJoinResponse: + """The parsed response of a `/send_join` request.""" + + auth_events: List[EventBase] + state: List[EventBase] + + +@ijson.coroutine +def _event_list_parser(room_version: RoomVersion, events: List[EventBase]): + """Helper function for use with `ijson.items_coro` to parse an array of + events and add them to the given list. + """ + + while True: + obj = yield + event = make_event_from_dict(obj, room_version) + events.append(event) + + +class SendJoinParser(ByteParser[SendJoinResponse]): + """A parser for the response to `/send_join` requests. + + Args: + room_version: The version of the room. + v1_api: Whether the response is in the v1 format. + """ + + CONTENT_TYPE = "application/json" + + def __init__(self, room_version: RoomVersion, v1_api: bool): + self._response = SendJoinResponse([], []) + + # The V1 API has the shape of `[200, {...}]`, which we handle by + # prefixing with `item.*`. + prefix = "item." 
if v1_api else "" + + self._coro_state = ijson.items_coro( + _event_list_parser(room_version, self._response.state), + prefix + "state.item", + ) + self._coro_auth = ijson.items_coro( + _event_list_parser(room_version, self._response.auth_events), + prefix + "auth_chain.item", + ) + + def write(self, data: bytes) -> int: + self._coro_state.send(data) + self._coro_auth.send(data) + + return len(data) + + def finish(self) -> SendJoinResponse: + return self._response diff --git a/synapse/http/client.py b/synapse/http/client.py index 5f40f16e24d6..1ca6624fd5da 100644 --- a/synapse/http/client.py +++ b/synapse/http/client.py @@ -813,7 +813,12 @@ def dataReceived(self, data: bytes) -> None: if self.deferred.called: return - self.stream.write(data) + try: + self.stream.write(data) + except Exception: + self.deferred.errback() + return + self.length += len(data) # The first time the maximum size is exceeded, error and cancel the # connection. dataReceived might be called again if data was received diff --git a/synapse/http/matrixfederationclient.py b/synapse/http/matrixfederationclient.py index bb837b7b1979..f5503b394b37 100644 --- a/synapse/http/matrixfederationclient.py +++ b/synapse/http/matrixfederationclient.py @@ -11,6 +11,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. +import abc import cgi import codecs import logging @@ -19,13 +20,24 @@ import typing import urllib.parse from io import BytesIO, StringIO -from typing import Callable, Dict, List, Optional, Tuple, Union +from typing import ( + Callable, + Dict, + Generic, + List, + Optional, + Tuple, + TypeVar, + Union, + overload, +) import attr import treq from canonicaljson import encode_canonical_json from prometheus_client import Counter from signedjson.sign import sign_json +from typing_extensions import Literal from twisted.internet import defer from twisted.internet.error import DNSLookupError @@ -48,6 +60,7 @@ BlacklistingAgentWrapper, BlacklistingReactorWrapper, BodyExceededMaxSize, + ByteWriteable, encode_query_args, read_body_with_max_size, ) @@ -88,6 +101,27 @@ QueryArgs = Dict[str, Union[str, List[str]]] +T = TypeVar("T") + + +class ByteParser(ByteWriteable, Generic[T], abc.ABC): + """A `ByteWriteable` that has an additional `finish` function that returns + the parsed data. + """ + + CONTENT_TYPE = abc.abstractproperty() # type: str # type: ignore + """The expected content type of the response, e.g. `application/json`. If + the content type doesn't match we fail the request. + """ + + @abc.abstractmethod + def finish(self) -> T: + """Called when response has finished streaming and the parser should + return the final result (or error). 
+ """ + pass + + @attr.s(slots=True, frozen=True) class MatrixFederationRequest: method = attr.ib(type=str) @@ -148,15 +182,32 @@ def get_json(self) -> Optional[JsonDict]: return self.json -async def _handle_json_response( +class JsonParser(ByteParser[Union[JsonDict, list]]): + """A parser that buffers the response and tries to parse it as JSON.""" + + CONTENT_TYPE = "application/json" + + def __init__(self): + self._buffer = StringIO() + self._binary_wrapper = BinaryIOWrapper(self._buffer) + + def write(self, data: bytes) -> int: + return self._binary_wrapper.write(data) + + def finish(self) -> Union[JsonDict, list]: + return json_decoder.decode(self._buffer.getvalue()) + + +async def _handle_response( reactor: IReactorTime, timeout_sec: float, request: MatrixFederationRequest, response: IResponse, start_ms: int, -) -> JsonDict: + parser: ByteParser[T], +) -> T: """ - Reads the JSON body of a response, with a timeout + Reads the body of a response with a timeout and sends it to a parser Args: reactor: twisted reactor, for the timeout @@ -164,23 +215,21 @@ async def _handle_json_response( request: the request that triggered the response response: response to the request start_ms: Timestamp when request was made + parser: The parser for the response Returns: - The parsed JSON response + The parsed response """ + try: - check_content_type_is_json(response.headers) + check_content_type_is(response.headers, parser.CONTENT_TYPE) - buf = StringIO() - d = read_body_with_max_size(response, BinaryIOWrapper(buf), MAX_RESPONSE_SIZE) + d = read_body_with_max_size(response, parser, MAX_RESPONSE_SIZE) d = timeout_deferred(d, timeout=timeout_sec, reactor=reactor) - def parse(_len: int): - return json_decoder.decode(buf.getvalue()) - - d.addCallback(parse) + length = await make_deferred_yieldable(d) - body = await make_deferred_yieldable(d) + value = parser.finish() except BodyExceededMaxSize as e: # The response was too big. logger.warning( @@ -193,9 +242,9 @@ def parse(_len: int): ) raise RequestSendFailed(e, can_retry=False) from e except ValueError as e: - # The JSON content was invalid. + # The content was invalid. logger.warning( - "{%s} [%s] Failed to parse JSON response - %s %s", + "{%s} [%s] Failed to parse response - %s %s", request.txn_id, request.destination, request.method, @@ -225,16 +274,17 @@ def parse(_len: int): time_taken_secs = reactor.seconds() - start_ms / 1000 logger.info( - "{%s} [%s] Completed request: %d %s in %.2f secs - %s %s", + "{%s} [%s] Completed request: %d %s in %.2f secs, got %d bytes - %s %s", request.txn_id, request.destination, response.code, response.phrase.decode("ascii", errors="replace"), time_taken_secs, + length, request.method, request.uri.decode("ascii"), ) - return body + return value class BinaryIOWrapper: @@ -671,6 +721,7 @@ def build_auth_headers( ) return auth_headers + @overload async def put_json( self, destination: str, @@ -683,7 +734,41 @@ async def put_json( ignore_backoff: bool = False, backoff_on_404: bool = False, try_trailing_slash_on_400: bool = False, + parser: Literal[None] = None, ) -> Union[JsonDict, list]: + ... 
+
+    @overload
+    async def put_json(
+        self,
+        destination: str,
+        path: str,
+        args: Optional[QueryArgs] = None,
+        data: Optional[JsonDict] = None,
+        json_data_callback: Optional[Callable[[], JsonDict]] = None,
+        long_retries: bool = False,
+        timeout: Optional[int] = None,
+        ignore_backoff: bool = False,
+        backoff_on_404: bool = False,
+        try_trailing_slash_on_400: bool = False,
+        parser: Optional[ByteParser[T]] = None,
+    ) -> T:
+        ...
+
+    async def put_json(
+        self,
+        destination: str,
+        path: str,
+        args: Optional[QueryArgs] = None,
+        data: Optional[JsonDict] = None,
+        json_data_callback: Optional[Callable[[], JsonDict]] = None,
+        long_retries: bool = False,
+        timeout: Optional[int] = None,
+        ignore_backoff: bool = False,
+        backoff_on_404: bool = False,
+        try_trailing_slash_on_400: bool = False,
+        parser: Optional[ByteParser] = None,
+    ):
         """Sends the specified json data using PUT
 
         Args:
@@ -716,6 +801,8 @@ async def put_json(
             of the request. Workaround for #3622 in Synapse <= v0.99.3. This
             will be attempted before backing off if backing off has been
             enabled.
+            parser: The parser to use to decode the response. Defaults to
+                parsing as JSON.
 
         Returns:
             Succeeds when we get a 2xx HTTP response. The
@@ -756,8 +843,16 @@ async def put_json(
         else:
             _sec_timeout = self.default_timeout
 
-        body = await _handle_json_response(
-            self.reactor, _sec_timeout, request, response, start_ms
+        if parser is None:
+            parser = JsonParser()
+
+        body = await _handle_response(
+            self.reactor,
+            _sec_timeout,
+            request,
+            response,
+            start_ms,
+            parser=parser,
         )
 
         return body
@@ -830,12 +925,8 @@ async def post_json(
         else:
             _sec_timeout = self.default_timeout
 
-        body = await _handle_json_response(
-            self.reactor,
-            _sec_timeout,
-            request,
-            response,
-            start_ms,
+        body = await _handle_response(
+            self.reactor, _sec_timeout, request, response, start_ms, parser=JsonParser()
         )
 
         return body
@@ -907,8 +998,8 @@ async def get_json(
         else:
             _sec_timeout = self.default_timeout
 
-        body = await _handle_json_response(
-            self.reactor, _sec_timeout, request, response, start_ms
+        body = await _handle_response(
+            self.reactor, _sec_timeout, request, response, start_ms, parser=JsonParser()
         )
 
         return body
@@ -975,10 +1066,8 @@ async def delete_json(
         else:
             _sec_timeout = self.default_timeout
 
-        body = await _handle_json_response(
-            self.reactor, _sec_timeout, request, response, start_ms
+        body = await _handle_response(
+            self.reactor, _sec_timeout, request, response, start_ms, parser=JsonParser()
         )
 
         return body
@@ -1068,16 +1159,16 @@ def _flatten_response_never_received(e):
         return repr(e)
 
 
-def check_content_type_is_json(headers: Headers) -> None:
+def check_content_type_is(headers: Headers, expected_content_type: str) -> None:
     """
     Check that a set of HTTP headers have a Content-Type header, and that it
-    is application/json.
+    is the expected value.
 
     Args:
         headers: headers to check
 
     Raises:
-        RequestSendFailed: if the Content-Type header is missing or isn't JSON
+        RequestSendFailed: if the Content-Type header is missing or doesn't match
 
     """
     content_type_headers = headers.getRawHeaders(b"Content-Type")
@@ -1089,11 +1180,10 @@ def check_content_type_is_json(headers: Headers) -> None:
 
     c_type = content_type_headers[0].decode("ascii")  # only the first header
     val, options = cgi.parse_header(c_type)
-    if val != "application/json":
+    if val != expected_content_type:
         raise RequestSendFailed(
             RuntimeError(
-                "Remote server sent Content-Type header of '%s', not 'application/json'"
-                % c_type,
+                f"Remote server sent Content-Type header of '{c_type}', not '{expected_content_type}'",
             ),
             can_retry=False,
         )
diff --git a/synapse/python_dependencies.py b/synapse/python_dependencies.py
index 989523c82374..546231bec0c9 100644
--- a/synapse/python_dependencies.py
+++ b/synapse/python_dependencies.py
@@ -87,6 +87,7 @@
     # We enforce that we have a `cryptography` version that bundles an `openssl`
     # with the latest security patches.
     "cryptography>=3.4.7",
+    "ijson>=3.0",
 ]
 
 CONDITIONAL_REQUIREMENTS = {
From 1c6a19002cb56cf93bc920854c81ae88bd7308ac Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Thu, 20 May 2021 16:25:11 +0100
Subject: [PATCH 21/52] Add `Keyring.verify_events_for_server` and reduce memory usage (#10018)

Also add support for giving a callback to generate the JSON object to
verify. This should reduce memory usage, as we no longer have the event
in memory in dict form (which has a large memory footprint) for extended
periods of time.
---
 changelog.d/10018.misc                |  1 +
 synapse/crypto/keyring.py             | 98 ++++++++++++++++++++++++---
 synapse/federation/federation_base.py | 17 ++---
 3 files changed, 94 insertions(+), 22 deletions(-)
 create mode 100644 changelog.d/10018.misc

diff --git a/changelog.d/10018.misc b/changelog.d/10018.misc
new file mode 100644
index 000000000000..eaf9f64867a6
--- /dev/null
+++ b/changelog.d/10018.misc
@@ -0,0 +1 @@
+Reduce memory usage when verifying signatures on large numbers of events at once.
diff --git a/synapse/crypto/keyring.py b/synapse/crypto/keyring.py
index 5f18ef77489d..6fc07129788f 100644
--- a/synapse/crypto/keyring.py
+++ b/synapse/crypto/keyring.py
@@ -17,7 +17,7 @@
 import logging
 import urllib
 from collections import defaultdict
-from typing import TYPE_CHECKING, Dict, Iterable, List, Optional, Set, Tuple
+from typing import TYPE_CHECKING, Callable, Dict, Iterable, List, Optional, Set, Tuple
 
 import attr
 from signedjson.key import (
@@ -42,6 +42,8 @@
     SynapseError,
 )
 from synapse.config.key import TrustedKeyServer
+from synapse.events import EventBase
+from synapse.events.utils import prune_event_dict
 from synapse.logging.context import (
     PreserveLoggingContext,
     make_deferred_yieldable,
@@ -69,7 +71,11 @@ class VerifyJsonRequest:
 
     Attributes:
         server_name: The name of the server to verify against.
 
-        json_object: The JSON object to verify.
+        get_json_object: A callback to fetch the JSON object to verify.
+            A callback is used to allow deferring the creation of the JSON
+            object to verify until needed, e.g. for events we can defer
+            creating the redacted copy. This reduces the memory usage when
+            there are large numbers of in flight requests.
 
         minimum_valid_until_ts: time at which we require the signing key to
            be valid.
(0 implies we don't care) @@ -88,14 +94,50 @@ class VerifyJsonRequest: """ server_name = attr.ib(type=str) - json_object = attr.ib(type=JsonDict) + get_json_object = attr.ib(type=Callable[[], JsonDict]) minimum_valid_until_ts = attr.ib(type=int) request_name = attr.ib(type=str) - key_ids = attr.ib(init=False, type=List[str]) + key_ids = attr.ib(type=List[str]) key_ready = attr.ib(default=attr.Factory(defer.Deferred), type=defer.Deferred) - def __attrs_post_init__(self): - self.key_ids = signature_ids(self.json_object, self.server_name) + @staticmethod + def from_json_object( + server_name: str, + json_object: JsonDict, + minimum_valid_until_ms: int, + request_name: str, + ): + """Create a VerifyJsonRequest to verify all signatures on a signed JSON + object for the given server. + """ + key_ids = signature_ids(json_object, server_name) + return VerifyJsonRequest( + server_name, + lambda: json_object, + minimum_valid_until_ms, + request_name=request_name, + key_ids=key_ids, + ) + + @staticmethod + def from_event( + server_name: str, + event: EventBase, + minimum_valid_until_ms: int, + ): + """Create a VerifyJsonRequest to verify all signatures on an event + object for the given server. + """ + key_ids = list(event.signatures.get(server_name, [])) + return VerifyJsonRequest( + server_name, + # We defer creating the redacted json object, as it uses a lot more + # memory than the Event object itself. + lambda: prune_event_dict(event.room_version, event.get_pdu_json()), + minimum_valid_until_ms, + request_name=event.event_id, + key_ids=key_ids, + ) class KeyLookupError(ValueError): @@ -147,8 +189,13 @@ def verify_json_for_server( Deferred[None]: completes if the the object was correctly signed, otherwise errbacks with an error """ - req = VerifyJsonRequest(server_name, json_object, validity_time, request_name) - requests = (req,) + request = VerifyJsonRequest.from_json_object( + server_name, + json_object, + validity_time, + request_name, + ) + requests = (request,) return make_deferred_yieldable(self._verify_objects(requests)[0]) def verify_json_objects_for_server( @@ -175,10 +222,41 @@ def verify_json_objects_for_server( logcontext. """ return self._verify_objects( - VerifyJsonRequest(server_name, json_object, validity_time, request_name) + VerifyJsonRequest.from_json_object( + server_name, json_object, validity_time, request_name + ) for server_name, json_object, validity_time, request_name in server_and_json ) + def verify_events_for_server( + self, server_and_events: Iterable[Tuple[str, EventBase, int]] + ) -> List[defer.Deferred]: + """Bulk verification of signatures on events. + + Args: + server_and_events: + Iterable of `(server_name, event, validity_time)` tuples. + + `server_name` is which server we are verifying the signature for + on the event. + + `event` is the event that we'll verify the signatures of for + the given `server_name`. + + `validity_time` is a timestamp at which the signing key must be + valid. + + Returns: + List: for each input triplet, a deferred indicating success + or failure to verify each event's signature for the given + server_name. The deferreds run their callbacks in the sentinel + logcontext. 
+ """ + return self._verify_objects( + VerifyJsonRequest.from_event(server_name, event, validity_time) + for server_name, event, validity_time in server_and_events + ) + def _verify_objects( self, verify_requests: Iterable[VerifyJsonRequest] ) -> List[defer.Deferred]: @@ -892,7 +970,7 @@ async def _handle_key_deferred(verify_request: VerifyJsonRequest) -> None: with PreserveLoggingContext(): _, key_id, verify_key = await verify_request.key_ready - json_object = verify_request.json_object + json_object = verify_request.get_json_object() try: verify_signed_json(json_object, server_name, verify_key) diff --git a/synapse/federation/federation_base.py b/synapse/federation/federation_base.py index 949dcd46141f..3fe496dcd330 100644 --- a/synapse/federation/federation_base.py +++ b/synapse/federation/federation_base.py @@ -137,11 +137,7 @@ def errback(failure: Failure, pdu: EventBase): return deferreds -class PduToCheckSig( - namedtuple( - "PduToCheckSig", ["pdu", "redacted_pdu_json", "sender_domain", "deferreds"] - ) -): +class PduToCheckSig(namedtuple("PduToCheckSig", ["pdu", "sender_domain", "deferreds"])): pass @@ -184,7 +180,6 @@ def _check_sigs_on_pdus( pdus_to_check = [ PduToCheckSig( pdu=p, - redacted_pdu_json=prune_event(p).get_pdu_json(), sender_domain=get_domain_from_id(p.sender), deferreds=[], ) @@ -195,13 +190,12 @@ def _check_sigs_on_pdus( # (except if its a 3pid invite, in which case it may be sent by any server) pdus_to_check_sender = [p for p in pdus_to_check if not _is_invite_via_3pid(p.pdu)] - more_deferreds = keyring.verify_json_objects_for_server( + more_deferreds = keyring.verify_events_for_server( [ ( p.sender_domain, - p.redacted_pdu_json, + p.pdu, p.pdu.origin_server_ts if room_version.enforce_key_validity else 0, - p.pdu.event_id, ) for p in pdus_to_check_sender ] @@ -230,13 +224,12 @@ def sender_err(e, pdu_to_check): if p.sender_domain != get_domain_from_id(p.pdu.event_id) ] - more_deferreds = keyring.verify_json_objects_for_server( + more_deferreds = keyring.verify_events_for_server( [ ( get_domain_from_id(p.pdu.event_id), - p.redacted_pdu_json, + p.pdu, p.pdu.origin_server_ts if room_version.enforce_key_validity else 0, - p.pdu.event_id, ) for p in pdus_to_check_event_id ] From 7958eadcd16087e6aaf7cce240c1f82856e0bcc7 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Fri, 21 May 2021 11:20:51 +0100 Subject: [PATCH 22/52] Add a batching queue implementation. (#10017) --- changelog.d/10017.misc | 1 + synapse/util/batching_queue.py | 153 +++++++++++++++++++++++++++ tests/util/test_batching_queue.py | 169 ++++++++++++++++++++++++++++++ 3 files changed, 323 insertions(+) create mode 100644 changelog.d/10017.misc create mode 100644 synapse/util/batching_queue.py create mode 100644 tests/util/test_batching_queue.py diff --git a/changelog.d/10017.misc b/changelog.d/10017.misc new file mode 100644 index 000000000000..4777b7fb57b1 --- /dev/null +++ b/changelog.d/10017.misc @@ -0,0 +1 @@ +Add a batching queue implementation. diff --git a/synapse/util/batching_queue.py b/synapse/util/batching_queue.py new file mode 100644 index 000000000000..44bbb7b1a8ef --- /dev/null +++ b/synapse/util/batching_queue.py @@ -0,0 +1,153 @@ +# Copyright 2021 The Matrix.org Foundation C.I.C. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +import logging +from typing import ( + Awaitable, + Callable, + Dict, + Generic, + Hashable, + List, + Set, + Tuple, + TypeVar, +) + +from twisted.internet import defer + +from synapse.logging.context import PreserveLoggingContext, make_deferred_yieldable +from synapse.metrics import LaterGauge +from synapse.metrics.background_process_metrics import run_as_background_process +from synapse.util import Clock + +logger = logging.getLogger(__name__) + + +V = TypeVar("V") +R = TypeVar("R") + + +class BatchingQueue(Generic[V, R]): + """A queue that batches up work, calling the provided processing function + with all pending work (for a given key). + + The provided processing function will only be called once at a time for each + key. It will be called the next reactor tick after `add_to_queue` has been + called, and will keep being called until the queue has been drained (for the + given key). + + Note that the return value of `add_to_queue` will be the return value of the + processing function that processed the given item. This means that the + returned value will likely include data for other items that were in the + batch. + """ + + def __init__( + self, + name: str, + clock: Clock, + process_batch_callback: Callable[[List[V]], Awaitable[R]], + ): + self._name = name + self._clock = clock + + # The set of keys currently being processed. + self._processing_keys = set() # type: Set[Hashable] + + # The currently pending batch of values by key, with a Deferred to call + # with the result of the corresponding `_process_batch_callback` call. + self._next_values = {} # type: Dict[Hashable, List[Tuple[V, defer.Deferred]]] + + # The function to call with batches of values. + self._process_batch_callback = process_batch_callback + + LaterGauge( + "synapse_util_batching_queue_number_queued", + "The number of items waiting in the queue across all keys", + labels=("name",), + caller=lambda: sum(len(v) for v in self._next_values.values()), + ) + + LaterGauge( + "synapse_util_batching_queue_number_of_keys", + "The number of distinct keys that have items queued", + labels=("name",), + caller=lambda: len(self._next_values), + ) + + async def add_to_queue(self, value: V, key: Hashable = ()) -> R: + """Adds the value to the queue with the given key, returning the result + of the processing function for the batch that included the given value. + + The optional `key` argument allows sharding the queue by some key. The + queues will then be processed in parallel, i.e. the process batch + function will be called in parallel with batched values from a single + key. + """ + + # First we create a defer and add it and the value to the list of + # pending items. + d = defer.Deferred() + self._next_values.setdefault(key, []).append((value, d)) + + # If we're not currently processing the key fire off a background + # process to start processing. 
+ if key not in self._processing_keys: + run_as_background_process(self._name, self._process_queue, key) + + return await make_deferred_yieldable(d) + + async def _process_queue(self, key: Hashable) -> None: + """A background task to repeatedly pull things off the queue for the + given key and call the `self._process_batch_callback` with the values. + """ + + try: + if key in self._processing_keys: + return + + self._processing_keys.add(key) + + while True: + # We purposefully wait a reactor tick to allow us to batch + # together requests that we're about to receive. A common + # pattern is to call `add_to_queue` multiple times at once, and + # deferring to the next reactor tick allows us to batch all of + # those up. + await self._clock.sleep(0) + + next_values = self._next_values.pop(key, []) + if not next_values: + # We've exhausted the queue. + break + + try: + values = [value for value, _ in next_values] + results = await self._process_batch_callback(values) + + for _, deferred in next_values: + with PreserveLoggingContext(): + deferred.callback(results) + + except Exception as e: + for _, deferred in next_values: + if deferred.called: + continue + + with PreserveLoggingContext(): + deferred.errback(e) + + finally: + self._processing_keys.discard(key) diff --git a/tests/util/test_batching_queue.py b/tests/util/test_batching_queue.py new file mode 100644 index 000000000000..5def1e56c9bb --- /dev/null +++ b/tests/util/test_batching_queue.py @@ -0,0 +1,169 @@ +# Copyright 2021 The Matrix.org Foundation C.I.C. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +from twisted.internet import defer + +from synapse.logging.context import make_deferred_yieldable +from synapse.util.batching_queue import BatchingQueue + +from tests.server import get_clock +from tests.unittest import TestCase + + +class BatchingQueueTestCase(TestCase): + def setUp(self): + self.clock, hs_clock = get_clock() + + self._pending_calls = [] + self.queue = BatchingQueue("test_queue", hs_clock, self._process_queue) + + async def _process_queue(self, values): + d = defer.Deferred() + self._pending_calls.append((values, d)) + return await make_deferred_yieldable(d) + + def test_simple(self): + """Tests the basic case of calling `add_to_queue` once and having + `_process_queue` return. + """ + + self.assertFalse(self._pending_calls) + + queue_d = defer.ensureDeferred(self.queue.add_to_queue("foo")) + + # The queue should wait a reactor tick before calling the processing + # function. + self.assertFalse(self._pending_calls) + self.assertFalse(queue_d.called) + + # We should see a call to `_process_queue` after a reactor tick. + self.clock.pump([0]) + + self.assertEqual(len(self._pending_calls), 1) + self.assertEqual(self._pending_calls[0][0], ["foo"]) + self.assertFalse(queue_d.called) + + # Return value of the `_process_queue` should be propagated back. 
+ self._pending_calls.pop()[1].callback("bar") + + self.assertEqual(self.successResultOf(queue_d), "bar") + + def test_batching(self): + """Test that multiple calls at the same time get batched up into one + call to `_process_queue`. + """ + + self.assertFalse(self._pending_calls) + + queue_d1 = defer.ensureDeferred(self.queue.add_to_queue("foo1")) + queue_d2 = defer.ensureDeferred(self.queue.add_to_queue("foo2")) + + self.clock.pump([0]) + + # We should see only *one* call to `_process_queue` + self.assertEqual(len(self._pending_calls), 1) + self.assertEqual(self._pending_calls[0][0], ["foo1", "foo2"]) + self.assertFalse(queue_d1.called) + self.assertFalse(queue_d2.called) + + # Return value of the `_process_queue` should be propagated back to both. + self._pending_calls.pop()[1].callback("bar") + + self.assertEqual(self.successResultOf(queue_d1), "bar") + self.assertEqual(self.successResultOf(queue_d2), "bar") + + def test_queuing(self): + """Test that we queue up requests while a `_process_queue` is being + called. + """ + + self.assertFalse(self._pending_calls) + + queue_d1 = defer.ensureDeferred(self.queue.add_to_queue("foo1")) + self.clock.pump([0]) + + queue_d2 = defer.ensureDeferred(self.queue.add_to_queue("foo2")) + + # We should see only *one* call to `_process_queue` + self.assertEqual(len(self._pending_calls), 1) + self.assertEqual(self._pending_calls[0][0], ["foo1"]) + self.assertFalse(queue_d1.called) + self.assertFalse(queue_d2.called) + + # Return value of the `_process_queue` should be propagated back to the + # first. + self._pending_calls.pop()[1].callback("bar1") + + self.assertEqual(self.successResultOf(queue_d1), "bar1") + self.assertFalse(queue_d2.called) + + # We should now see a second call to `_process_queue` + self.clock.pump([0]) + self.assertEqual(len(self._pending_calls), 1) + self.assertEqual(self._pending_calls[0][0], ["foo2"]) + self.assertFalse(queue_d2.called) + + # Return value of the `_process_queue` should be propagated back to the + # second. + self._pending_calls.pop()[1].callback("bar2") + + self.assertEqual(self.successResultOf(queue_d2), "bar2") + + def test_different_keys(self): + """Test that calls to different keys get processed in parallel.""" + + self.assertFalse(self._pending_calls) + + queue_d1 = defer.ensureDeferred(self.queue.add_to_queue("foo1", key=1)) + self.clock.pump([0]) + queue_d2 = defer.ensureDeferred(self.queue.add_to_queue("foo2", key=2)) + self.clock.pump([0]) + + # We queue up another item with key=2 to check that we will keep taking + # things off the queue. + queue_d3 = defer.ensureDeferred(self.queue.add_to_queue("foo3", key=2)) + + # We should see two calls to `_process_queue` + self.assertEqual(len(self._pending_calls), 2) + self.assertEqual(self._pending_calls[0][0], ["foo1"]) + self.assertEqual(self._pending_calls[1][0], ["foo2"]) + self.assertFalse(queue_d1.called) + self.assertFalse(queue_d2.called) + self.assertFalse(queue_d3.called) + + # Return value of the `_process_queue` should be propagated back to the + # first. + self._pending_calls.pop(0)[1].callback("bar1") + + self.assertEqual(self.successResultOf(queue_d1), "bar1") + self.assertFalse(queue_d2.called) + self.assertFalse(queue_d3.called) + + # Return value of the `_process_queue` should be propagated back to the + # second. 
+        self._pending_calls.pop()[1].callback("bar2")
+
+        self.assertEqual(self.successResultOf(queue_d2), "bar2")
+        self.assertFalse(queue_d3.called)
+
+        # We should now see a call to `_process_queue` for `foo3`
+        self.clock.pump([0])
+        self.assertEqual(len(self._pending_calls), 1)
+        self.assertEqual(self._pending_calls[0][0], ["foo3"])
+        self.assertFalse(queue_d3.called)
+
+        # Return value of the `_process_queue` should be propagated back to the
+        # third deferred.
+        self._pending_calls.pop()[1].callback("bar4")
+
+        self.assertEqual(self.successResultOf(queue_d3), "bar4")
From 6a8643ff3da905568e3f2ec047182753352e39d1 Mon Sep 17 00:00:00 2001
From: Marek Matys <57749215+thermaq@users.noreply.github.com>
Date: Fri, 21 May 2021 13:02:06 +0200
Subject: [PATCH 23/52] Fixed removal of new presence stream states (#10014)

Fixes: https://github.com/matrix-org/synapse/issues/9962

This is a fix for the above problem. I fixed it by swapping the order of
insertion of new records and deletion of old ones. This ensures that we
don't delete fresh database records, as we do deletes before inserts.

Signed-off-by: Marek Matys
---
 changelog.d/10014.bugfix                   |  1 +
 synapse/storage/databases/main/presence.py | 18 +++++++++---------
 2 files changed, 10 insertions(+), 9 deletions(-)
 create mode 100644 changelog.d/10014.bugfix

diff --git a/changelog.d/10014.bugfix b/changelog.d/10014.bugfix
new file mode 100644
index 000000000000..7cf3603f94df
--- /dev/null
+++ b/changelog.d/10014.bugfix
@@ -0,0 +1 @@
+Fixed deletion of new presence stream states from database.
diff --git a/synapse/storage/databases/main/presence.py b/synapse/storage/databases/main/presence.py
index 669a2af8845a..6a2baa7841e0 100644
--- a/synapse/storage/databases/main/presence.py
+++ b/synapse/storage/databases/main/presence.py
@@ -97,6 +97,15 @@ def _update_presence_txn(self, txn, stream_orderings, presence_states):
             )
             txn.call_after(self._get_presence_for_user.invalidate, (state.user_id,))
 
+        # Delete old rows to stop database from getting really big
+        sql = "DELETE FROM presence_stream WHERE stream_id < ? AND "
+
+        for states in batch_iter(presence_states, 50):
+            clause, args = make_in_list_sql_clause(
+                self.database_engine, "user_id", [s.user_id for s in states]
+            )
+            txn.execute(sql + clause, [stream_id] + list(args))
+
         # Actually insert new rows
         self.db_pool.simple_insert_many_txn(
             txn,
@@ -117,15 +126,6 @@ def _update_presence_txn(self, txn, stream_orderings, presence_states):
             ],
         )
 
-        # Delete old rows to stop database from getting really big
-        sql = "DELETE FROM presence_stream WHERE stream_id < ? AND "
-
-        for states in batch_iter(presence_states, 50):
-            clause, args = make_in_list_sql_clause(
-                self.database_engine, "user_id", [s.user_id for s in states]
-            )
-            txn.execute(sql + clause, [stream_id] + list(args))
-
     async def get_all_presence_updates(
         self, instance_name: str, last_id: int, current_id: int, limit: int
     ) -> Tuple[List[Tuple[int, list]], int, bool]:
From c5413d0e9ef6afa9cb5140774acfb4954fe0bf37 Mon Sep 17 00:00:00 2001
From: Patrick Cloke
Date: Fri, 21 May 2021 12:02:01 -0400
Subject: [PATCH 24/52] Remove unused properties from the SpaceSummaryHandler.
(#10038) --- changelog.d/10038.feature | 1 + synapse/handlers/space_summary.py | 2 -- 2 files changed, 1 insertion(+), 2 deletions(-) create mode 100644 changelog.d/10038.feature diff --git a/changelog.d/10038.feature b/changelog.d/10038.feature new file mode 100644 index 000000000000..2c655350c04f --- /dev/null +++ b/changelog.d/10038.feature @@ -0,0 +1 @@ +Experimental support to allow a user who could join a restricted room to view it in the spaces summary. diff --git a/synapse/handlers/space_summary.py b/synapse/handlers/space_summary.py index 8d49ba81646e..abd9ddecca20 100644 --- a/synapse/handlers/space_summary.py +++ b/synapse/handlers/space_summary.py @@ -50,8 +50,6 @@ class SpaceSummaryHandler: def __init__(self, hs: "HomeServer"): self._clock = hs.get_clock() self._auth = hs.get_auth() - self._room_list_handler = hs.get_room_list_handler() - self._state_handler = hs.get_state_handler() self._event_auth_handler = hs.get_event_auth_handler() self._store = hs.get_datastore() self._event_serializer = hs.get_event_client_serializer() From 21bd230831022a69ce90f334e869edd151155067 Mon Sep 17 00:00:00 2001 From: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com> Date: Fri, 21 May 2021 17:29:14 +0100 Subject: [PATCH 25/52] Add a test for update_presence (#10033) https://github.com/matrix-org/synapse/issues/9962 uncovered that we accidentally removed all but one of the presence updates that we store in the database when persisting multiple updates. This could cause users' presence state to be stale. The bug was fixed in #10014, and this PR just adds a test that failed on the old code, and was used to initially verify the bug. The test attempts to insert some presence into the database in a batch using `PresenceStore.update_presence`, and then simply pulls it out again. --- changelog.d/10033.bugfix | 1 + tests/handlers/test_presence.py | 47 ++++++++++++++++++++++++++++++++- 2 files changed, 47 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10033.bugfix diff --git a/changelog.d/10033.bugfix b/changelog.d/10033.bugfix new file mode 100644 index 000000000000..587d839b8c9d --- /dev/null +++ b/changelog.d/10033.bugfix @@ -0,0 +1 @@ +Fixed deletion of new presence stream states from database. 
\ No newline at end of file diff --git a/tests/handlers/test_presence.py b/tests/handlers/test_presence.py index 1ffab709fcb8..d90a9fec91fa 100644 --- a/tests/handlers/test_presence.py +++ b/tests/handlers/test_presence.py @@ -32,13 +32,19 @@ handle_timeout, handle_update, ) +from synapse.rest import admin from synapse.rest.client.v1 import room from synapse.types import UserID, get_domain_from_id from tests import unittest -class PresenceUpdateTestCase(unittest.TestCase): +class PresenceUpdateTestCase(unittest.HomeserverTestCase): + servlets = [admin.register_servlets] + + def prepare(self, reactor, clock, homeserver): + self.store = homeserver.get_datastore() + def test_offline_to_online(self): wheel_timer = Mock() user_id = "@foo:bar" @@ -292,6 +298,45 @@ def test_online_to_idle(self): any_order=True, ) + def test_persisting_presence_updates(self): + """Tests that the latest presence state for each user is persisted correctly""" + # Create some test users and presence states for them + presence_states = [] + for i in range(5): + user_id = self.register_user(f"user_{i}", "password") + + presence_state = UserPresenceState( + user_id=user_id, + state="online", + last_active_ts=1, + last_federation_update_ts=1, + last_user_sync_ts=1, + status_msg="I'm online!", + currently_active=True, + ) + presence_states.append(presence_state) + + # Persist these presence updates to the database + self.get_success(self.store.update_presence(presence_states)) + + # Check that each update is present in the database + db_presence_states = self.get_success( + self.store.get_all_presence_updates( + instance_name="master", + last_id=0, + current_id=len(presence_states) + 1, + limit=len(presence_states), + ) + ) + + # Extract presence update user ID and state information into lists of tuples + db_presence_states = [(ps[0], ps[1]) for _, ps in db_presence_states[0]] + presence_states = [(ps.user_id, ps.state) for ps in presence_states] + + # Compare what we put into the storage with what we got out. + # They should be identical. + self.assertEqual(presence_states, db_presence_states) + class PresenceTimeoutTestCase(unittest.TestCase): def test_idle_timer(self): From e8ac9ac8ca18fe3456bfeba7a5883be1c991b2a6 Mon Sep 17 00:00:00 2001 From: Michael Telatynski <7t3chguy@gmail.com> Date: Fri, 21 May 2021 17:31:59 +0100 Subject: [PATCH 26/52] Fix /upload 500'ing when presented a very large image (#10029) * Fix /upload 500'ing when presented a very large image Catch DecompressionBombError and re-raise as ThumbnailErrors * Set PIL's MAX_IMAGE_PIXELS to match homeserver.yaml to get it to bomb out quicker, to load less into memory in the case of super large images * Add changelog entry for 10029 --- changelog.d/10029.bugfix | 1 + synapse/rest/media/v1/media_repository.py | 2 ++ synapse/rest/media/v1/thumbnailer.py | 9 +++++++++ 3 files changed, 12 insertions(+) create mode 100644 changelog.d/10029.bugfix diff --git a/changelog.d/10029.bugfix b/changelog.d/10029.bugfix new file mode 100644 index 000000000000..c214cbdaecc1 --- /dev/null +++ b/changelog.d/10029.bugfix @@ -0,0 +1 @@ +Fixed a bug with very high resolution image uploads throwing internal server errors. 
\ No newline at end of file
diff --git a/synapse/rest/media/v1/media_repository.py b/synapse/rest/media/v1/media_repository.py
index e8a875b90093..21c43c340c3b 100644
--- a/synapse/rest/media/v1/media_repository.py
+++ b/synapse/rest/media/v1/media_repository.py
@@ -76,6 +76,8 @@ def __init__(self, hs: "HomeServer"):
         self.max_upload_size = hs.config.max_upload_size
         self.max_image_pixels = hs.config.max_image_pixels
 
+        Thumbnailer.set_limits(self.max_image_pixels)
+
         self.primary_base_path = hs.config.media_store_path  # type: str
         self.filepaths = MediaFilePaths(self.primary_base_path)  # type: MediaFilePaths
diff --git a/synapse/rest/media/v1/thumbnailer.py b/synapse/rest/media/v1/thumbnailer.py
index 37fe582390e3..a65e9e18022e 100644
--- a/synapse/rest/media/v1/thumbnailer.py
+++ b/synapse/rest/media/v1/thumbnailer.py
@@ -40,6 +40,10 @@ class Thumbnailer:
 
     FORMATS = {"image/jpeg": "JPEG", "image/png": "PNG"}
 
+    @staticmethod
+    def set_limits(max_image_pixels: int):
+        Image.MAX_IMAGE_PIXELS = max_image_pixels
+
     def __init__(self, input_path: str):
         try:
             self.image = Image.open(input_path)
@@ -47,6 +51,11 @@ def __init__(self, input_path: str):
             # If an error occurs opening the image, a thumbnail won't be able to
             # be generated.
             raise ThumbnailError from e
+        except Image.DecompressionBombError as e:
+            # If an image decompression bomb error occurs opening the image,
+            # then the image exceeds the pixel limit and a thumbnail won't
+            # be able to be generated.
+            raise ThumbnailError from e
 
         self.width, self.height = self.image.size
         self.transpose_method = None
From 3e831f24ffc887e174f67ff7b1cfe3a429b7b5c1 Mon Sep 17 00:00:00 2001
From: Erik Johnston
Date: Fri, 21 May 2021 17:57:08 +0100
Subject: [PATCH 27/52] Don't hammer the database for destination retry timings every ~5mins (#10036)

---
 changelog.d/10036.misc                        |  1 +
 synapse/app/generic_worker.py                 |  2 -
 synapse/federation/transport/server.py        |  2 +-
 .../replication/slave/storage/transactions.py | 21 ------
 synapse/storage/databases/main/__init__.py    |  4 +-
 .../storage/databases/main/transactions.py    | 66 +++++++++--------
 synapse/util/retryutils.py                    |  8 +--
 tests/handlers/test_typing.py                 |  8 +--
 tests/storage/test_transactions.py            |  8 ++-
 tests/util/test_retryutils.py                 | 18 +++--
 10 files changed, 62 insertions(+), 76 deletions(-)
 create mode 100644 changelog.d/10036.misc
 delete mode 100644 synapse/replication/slave/storage/transactions.py

diff --git a/changelog.d/10036.misc b/changelog.d/10036.misc
new file mode 100644
index 000000000000..d2cf1e54734e
--- /dev/null
+++ b/changelog.d/10036.misc
@@ -0,0 +1 @@
+Properly invalidate caches for destination retry timings every time they change (instead of expiring entries every 5 minutes).
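The essence of this change: per-destination retry timings move from an `ExpiringCache` with a five-minute TTL to a `@cached` descriptor that the writer explicitly invalidates, so repeated reads are served from the cache indefinitely and only a write forces a refetch. A toy sketch of that read-through, invalidate-on-write pattern follows; plain dicts stand in for Synapse's cache machinery and database pool, and the class and method names are illustrative.

from typing import Dict, Optional, Tuple

Timings = Tuple[int, int, int]  # (failure_ts, retry_last_ts, retry_interval)


class RetryTimingsStore:
    def __init__(self) -> None:
        self._db: Dict[str, Timings] = {}
        self._cache: Dict[str, Optional[Timings]] = {}

    def get_destination_retry_timings(self, destination: str) -> Optional[Timings]:
        # Read-through: a cached entry stays valid indefinitely...
        if destination not in self._cache:
            self._cache[destination] = self._db.get(destination)
        return self._cache[destination]

    def set_destination_retry_timings(self, destination: str, timings: Timings) -> None:
        self._db[destination] = timings
        # ...because every write evicts the entry, so the next read refetches.
        self._cache.pop(destination, None)


store = RetryTimingsStore()
store.set_destination_retry_timings("remote.example", (1000, 50, 100))
assert store.get_destination_retry_timings("remote.example") == (1000, 50, 100)

In the real patch the eviction must also reach other workers, which is why the store now derives from `CacheInvalidationWorkerStore` and calls `_invalidate_cache_and_stream` inside the write transaction, as the diff below shows.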
diff --git a/synapse/app/generic_worker.py b/synapse/app/generic_worker.py index f730cdbd78f4..91ad326f193f 100644 --- a/synapse/app/generic_worker.py +++ b/synapse/app/generic_worker.py @@ -61,7 +61,6 @@ from synapse.replication.slave.storage.receipts import SlavedReceiptsStore from synapse.replication.slave.storage.registration import SlavedRegistrationStore from synapse.replication.slave.storage.room import RoomStore -from synapse.replication.slave.storage.transactions import SlavedTransactionStore from synapse.rest.admin import register_servlets_for_media_repo from synapse.rest.client.v1 import events, login, presence, room from synapse.rest.client.v1.initial_sync import InitialSyncRestServlet @@ -237,7 +236,6 @@ class GenericWorkerSlavedStore( DirectoryStore, SlavedApplicationServiceStore, SlavedRegistrationStore, - SlavedTransactionStore, SlavedProfileStore, SlavedClientIpStore, SlavedFilteringStore, diff --git a/synapse/federation/transport/server.py b/synapse/federation/transport/server.py index c17a085a4f93..9d50b05d0152 100644 --- a/synapse/federation/transport/server.py +++ b/synapse/federation/transport/server.py @@ -160,7 +160,7 @@ async def authenticate_request(self, request, content): # If we get a valid signed request from the other side, its probably # alive retry_timings = await self.store.get_destination_retry_timings(origin) - if retry_timings and retry_timings["retry_last_ts"]: + if retry_timings and retry_timings.retry_last_ts: run_in_background(self._reset_retry_timings, origin) return origin diff --git a/synapse/replication/slave/storage/transactions.py b/synapse/replication/slave/storage/transactions.py deleted file mode 100644 index a59e54392441..000000000000 --- a/synapse/replication/slave/storage/transactions.py +++ /dev/null @@ -1,21 +0,0 @@ -# Copyright 2015, 2016 OpenMarket Ltd -# -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. -# You may obtain a copy of the License at -# -# http://www.apache.org/licenses/LICENSE-2.0 -# -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. 
- -from synapse.storage.databases.main.transactions import TransactionStore - -from ._base import BaseSlavedStore - - -class SlavedTransactionStore(TransactionStore, BaseSlavedStore): - pass diff --git a/synapse/storage/databases/main/__init__.py b/synapse/storage/databases/main/__init__.py index 49c7606d5138..9cce62ae6c5b 100644 --- a/synapse/storage/databases/main/__init__.py +++ b/synapse/storage/databases/main/__init__.py @@ -67,7 +67,7 @@ from .stats import StatsStore from .stream import StreamStore from .tags import TagsStore -from .transactions import TransactionStore +from .transactions import TransactionWorkerStore from .ui_auth import UIAuthStore from .user_directory import UserDirectoryStore from .user_erasure_store import UserErasureStore @@ -83,7 +83,7 @@ class DataStore( StreamStore, ProfileStore, PresenceStore, - TransactionStore, + TransactionWorkerStore, DirectoryStore, KeyStore, StateStore, diff --git a/synapse/storage/databases/main/transactions.py b/synapse/storage/databases/main/transactions.py index 82335e7a9dad..d211c423b2ce 100644 --- a/synapse/storage/databases/main/transactions.py +++ b/synapse/storage/databases/main/transactions.py @@ -16,13 +16,15 @@ from collections import namedtuple from typing import Iterable, List, Optional, Tuple +import attr from canonicaljson import encode_canonical_json from synapse.metrics.background_process_metrics import wrap_as_background_process -from synapse.storage._base import SQLBaseStore, db_to_json +from synapse.storage._base import db_to_json from synapse.storage.database import DatabasePool, LoggingTransaction +from synapse.storage.databases.main.cache import CacheInvalidationWorkerStore from synapse.types import JsonDict -from synapse.util.caches.expiringcache import ExpiringCache +from synapse.util.caches.descriptors import cached db_binary_type = memoryview @@ -38,10 +40,23 @@ "_TransactionRow", ("response_code", "response_json") ) -SENTINEL = object() +@attr.s(slots=True, frozen=True, auto_attribs=True) +class DestinationRetryTimings: + """The current destination retry timing info for a remote server.""" -class TransactionWorkerStore(SQLBaseStore): + # The first time we tried and failed to reach the remote server, in ms. + failure_ts: int + + # The last time we tried and failed to reach the remote server, in ms. + retry_last_ts: int + + # How long since the last time we tried to reach the remote server before + # trying again, in ms. 
+ retry_interval: int + + +class TransactionWorkerStore(CacheInvalidationWorkerStore): def __init__(self, database: DatabasePool, db_conn, hs): super().__init__(database, db_conn, hs) @@ -60,19 +75,6 @@ def _cleanup_transactions_txn(txn): "_cleanup_transactions", _cleanup_transactions_txn ) - -class TransactionStore(TransactionWorkerStore): - """A collection of queries for handling PDUs.""" - - def __init__(self, database: DatabasePool, db_conn, hs): - super().__init__(database, db_conn, hs) - - self._destination_retry_cache = ExpiringCache( - cache_name="get_destination_retry_timings", - clock=self._clock, - expiry_ms=5 * 60 * 1000, - ) - async def get_received_txn_response( self, transaction_id: str, origin: str ) -> Optional[Tuple[int, JsonDict]]: @@ -145,7 +147,11 @@ async def set_received_txn_response( desc="set_received_txn_response", ) - async def get_destination_retry_timings(self, destination): + @cached(max_entries=10000) + async def get_destination_retry_timings( + self, + destination: str, + ) -> Optional[DestinationRetryTimings]: """Gets the current retry timings (if any) for a given destination. Args: @@ -156,34 +162,29 @@ async def get_destination_retry_timings(self, destination): Otherwise a dict for the retry scheme """ - result = self._destination_retry_cache.get(destination, SENTINEL) - if result is not SENTINEL: - return result - result = await self.db_pool.runInteraction( "get_destination_retry_timings", self._get_destination_retry_timings, destination, ) - # We don't hugely care about race conditions between getting and - # invalidating the cache, since we time out fairly quickly anyway. - self._destination_retry_cache[destination] = result return result - def _get_destination_retry_timings(self, txn, destination): + def _get_destination_retry_timings( + self, txn, destination: str + ) -> Optional[DestinationRetryTimings]: result = self.db_pool.simple_select_one_txn( txn, table="destinations", keyvalues={"destination": destination}, - retcols=("destination", "failure_ts", "retry_last_ts", "retry_interval"), + retcols=("failure_ts", "retry_last_ts", "retry_interval"), allow_none=True, ) # check we have a row and retry_last_ts is not null or zero # (retry_last_ts can't be negative) if result and result["retry_last_ts"]: - return result + return DestinationRetryTimings(**result) else: return None @@ -204,7 +205,6 @@ async def set_destination_retry_timings( retry_interval: how long until next retry in ms """ - self._destination_retry_cache.pop(destination, None) if self.database_engine.can_native_upsert: return await self.db_pool.runInteraction( "set_destination_retry_timings", @@ -252,6 +252,10 @@ def _set_destination_retry_timings_native( txn.execute(sql, (destination, failure_ts, retry_last_ts, retry_interval)) + self._invalidate_cache_and_stream( + txn, self.get_destination_retry_timings, (destination,) + ) + def _set_destination_retry_timings_emulated( self, txn, destination, failure_ts, retry_last_ts, retry_interval ): @@ -295,6 +299,10 @@ def _set_destination_retry_timings_emulated( }, ) + self._invalidate_cache_and_stream( + txn, self.get_destination_retry_timings, (destination,) + ) + async def store_destination_rooms_entries( self, destinations: Iterable[str], diff --git a/synapse/util/retryutils.py b/synapse/util/retryutils.py index f9c370a814cc..129b47cd4994 100644 --- a/synapse/util/retryutils.py +++ b/synapse/util/retryutils.py @@ -82,11 +82,9 @@ async def get_retry_limiter(destination, clock, store, ignore_backoff=False, **k retry_timings = await 
store.get_destination_retry_timings(destination) if retry_timings: - failure_ts = retry_timings["failure_ts"] - retry_last_ts, retry_interval = ( - retry_timings["retry_last_ts"], - retry_timings["retry_interval"], - ) + failure_ts = retry_timings.failure_ts + retry_last_ts = retry_timings.retry_last_ts + retry_interval = retry_timings.retry_interval now = int(clock.time_msec()) diff --git a/tests/handlers/test_typing.py b/tests/handlers/test_typing.py index 0c89487eaf34..f58afbc244a8 100644 --- a/tests/handlers/test_typing.py +++ b/tests/handlers/test_typing.py @@ -89,14 +89,8 @@ def prepare(self, reactor, clock, hs): self.event_source = hs.get_event_sources().sources["typing"] self.datastore = hs.get_datastore() - retry_timings_res = { - "destination": "", - "retry_last_ts": 0, - "retry_interval": 0, - "failure_ts": None, - } self.datastore.get_destination_retry_timings = Mock( - return_value=defer.succeed(retry_timings_res) + return_value=defer.succeed(None) ) self.datastore.get_device_updates_by_remote = Mock( diff --git a/tests/storage/test_transactions.py b/tests/storage/test_transactions.py index b7f7eae8d096..bea9091d3089 100644 --- a/tests/storage/test_transactions.py +++ b/tests/storage/test_transactions.py @@ -12,6 +12,7 @@ # See the License for the specific language governing permissions and # limitations under the License. +from synapse.storage.databases.main.transactions import DestinationRetryTimings from synapse.util.retryutils import MAX_RETRY_INTERVAL from tests.unittest import HomeserverTestCase @@ -36,8 +37,11 @@ def test_get_set_transactions(self): d = self.store.get_destination_retry_timings("example.com") r = self.get_success(d) - self.assert_dict( - {"retry_last_ts": 50, "retry_interval": 100, "failure_ts": 1000}, r + self.assertEqual( + DestinationRetryTimings( + retry_last_ts=50, retry_interval=100, failure_ts=1000 + ), + r, ) def test_initial_set_transactions(self): diff --git a/tests/util/test_retryutils.py b/tests/util/test_retryutils.py index 9b2be83a43a1..9e1bebdc8335 100644 --- a/tests/util/test_retryutils.py +++ b/tests/util/test_retryutils.py @@ -51,10 +51,12 @@ def test_limiter(self): except AssertionError: pass + self.pump() + new_timings = self.get_success(store.get_destination_retry_timings("test_dest")) - self.assertEqual(new_timings["failure_ts"], failure_ts) - self.assertEqual(new_timings["retry_last_ts"], failure_ts) - self.assertEqual(new_timings["retry_interval"], MIN_RETRY_INTERVAL) + self.assertEqual(new_timings.failure_ts, failure_ts) + self.assertEqual(new_timings.retry_last_ts, failure_ts) + self.assertEqual(new_timings.retry_interval, MIN_RETRY_INTERVAL) # now if we try again we should get a failure self.get_failure( @@ -77,14 +79,16 @@ def test_limiter(self): except AssertionError: pass + self.pump() + new_timings = self.get_success(store.get_destination_retry_timings("test_dest")) - self.assertEqual(new_timings["failure_ts"], failure_ts) - self.assertEqual(new_timings["retry_last_ts"], retry_ts) + self.assertEqual(new_timings.failure_ts, failure_ts) + self.assertEqual(new_timings.retry_last_ts, retry_ts) self.assertGreaterEqual( - new_timings["retry_interval"], MIN_RETRY_INTERVAL * RETRY_MULTIPLIER * 0.5 + new_timings.retry_interval, MIN_RETRY_INTERVAL * RETRY_MULTIPLIER * 0.5 ) self.assertLessEqual( - new_timings["retry_interval"], MIN_RETRY_INTERVAL * RETRY_MULTIPLIER * 2.0 + new_timings.retry_interval, MIN_RETRY_INTERVAL * RETRY_MULTIPLIER * 2.0 ) # From 5f1198a67ebe182db14b5f98bb4c9a19d4889918 Mon Sep 17 00:00:00 2001 From: Eric 
Eastwood Date: Mon, 24 May 2021 04:43:33 -0500 Subject: [PATCH 28/52] Fix `get_state_ids_for_event` return type typo to match what the function actually does (#10050) It looks like a typo copy/paste from `get_state_for_event` above. --- changelog.d/10050.misc | 1 + synapse/storage/state.py | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10050.misc diff --git a/changelog.d/10050.misc b/changelog.d/10050.misc new file mode 100644 index 000000000000..2cac953cca1e --- /dev/null +++ b/changelog.d/10050.misc @@ -0,0 +1 @@ +Fix typo in `get_state_ids_for_event` docstring where the return type was incorrect. diff --git a/synapse/storage/state.py b/synapse/storage/state.py index cfafba22c5e0..c9dce726cb0e 100644 --- a/synapse/storage/state.py +++ b/synapse/storage/state.py @@ -540,7 +540,7 @@ async def get_state_ids_for_event( state_filter: The state filter used to fetch state from the database. Returns: - A dict from (type, state_key) -> state_event + A dict from (type, state_key) -> state_event_id """ state_map = await self.get_state_ids_for_events( [event_id], state_filter or StateFilter.all() From 387c297489b90bd80212ac1391666eecd01ff701 Mon Sep 17 00:00:00 2001 From: Dirk Klimpel <5740567+dklimpel@users.noreply.github.com> Date: Mon, 24 May 2021 13:37:30 +0200 Subject: [PATCH 29/52] Add missing entry to the table of contents of room admin API (#10043) --- changelog.d/10043.doc | 1 + docs/admin_api/rooms.md | 1 + 2 files changed, 2 insertions(+) create mode 100644 changelog.d/10043.doc diff --git a/changelog.d/10043.doc b/changelog.d/10043.doc new file mode 100644 index 000000000000..a574ec0bf06c --- /dev/null +++ b/changelog.d/10043.doc @@ -0,0 +1 @@ +Add missing room state entry to the table of contents of room admin API. \ No newline at end of file diff --git a/docs/admin_api/rooms.md b/docs/admin_api/rooms.md index 01d3882426b3..5721210fee11 100644 --- a/docs/admin_api/rooms.md +++ b/docs/admin_api/rooms.md @@ -4,6 +4,7 @@ * [Usage](#usage) - [Room Details API](#room-details-api) - [Room Members API](#room-members-api) +- [Room State API](#room-state-api) - [Delete Room API](#delete-room-api) * [Parameters](#parameters-1) * [Response](#response) From 316f89e87f462608a5e63ba567ad549330f1456a Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Mon, 24 May 2021 08:57:14 -0400 Subject: [PATCH 30/52] Enable experimental spaces by default. (#10011) The previous spaces_enabled flag now defaults to true and is exposed in the sample config. --- changelog.d/10011.feature | 1 + docs/sample_config.yaml | 15 +++++++++++++++ synapse/config/experimental.py | 19 ++++++++++++++++++- synapse/config/homeserver.py | 2 +- 4 files changed, 35 insertions(+), 2 deletions(-) create mode 100644 changelog.d/10011.feature diff --git a/changelog.d/10011.feature b/changelog.d/10011.feature new file mode 100644 index 000000000000..409140fb1367 --- /dev/null +++ b/changelog.d/10011.feature @@ -0,0 +1 @@ +Enable experimental support for [MSC2946](https://github.com/matrix-org/matrix-doc/pull/2946) (spaces summary API) and [MSC3083](https://github.com/matrix-org/matrix-doc/pull/3083) (restricted join rules) by default. diff --git a/docs/sample_config.yaml b/docs/sample_config.yaml index 2952f2ba3201..f0f9f06a6ee3 100644 --- a/docs/sample_config.yaml +++ b/docs/sample_config.yaml @@ -2943,3 +2943,18 @@ redis: # Optional password if configured on the Redis instance # #password: + + +# Enable experimental features in Synapse. 
+# +# Experimental features might break or be removed without a deprecation +# period. +# +experimental_features: + # Support for Spaces (MSC1772), it enables the following: + # + # * The Spaces Summary API (MSC2946). + # * Restricting room membership based on space membership (MSC3083). + # + # Uncomment to disable support for Spaces. + #spaces_enabled: false diff --git a/synapse/config/experimental.py b/synapse/config/experimental.py index a693fba87779..cc67377f0fe7 100644 --- a/synapse/config/experimental.py +++ b/synapse/config/experimental.py @@ -29,9 +29,26 @@ def read_config(self, config: JsonDict, **kwargs): self.msc2858_enabled = experimental.get("msc2858_enabled", False) # type: bool # Spaces (MSC1772, MSC2946, MSC3083, etc) - self.spaces_enabled = experimental.get("spaces_enabled", False) # type: bool + self.spaces_enabled = experimental.get("spaces_enabled", True) # type: bool if self.spaces_enabled: KNOWN_ROOM_VERSIONS[RoomVersions.MSC3083.identifier] = RoomVersions.MSC3083 # MSC3026 (busy presence state) self.msc3026_enabled = experimental.get("msc3026_enabled", False) # type: bool + + def generate_config_section(self, **kwargs): + return """\ + # Enable experimental features in Synapse. + # + # Experimental features might break or be removed without a deprecation + # period. + # + experimental_features: + # Support for Spaces (MSC1772), it enables the following: + # + # * The Spaces Summary API (MSC2946). + # * Restricting room membership based on space membership (MSC3083). + # + # Uncomment to disable support for Spaces. + #spaces_enabled: false + """ diff --git a/synapse/config/homeserver.py b/synapse/config/homeserver.py index c23b66c88c76..5ae0f55bccd6 100644 --- a/synapse/config/homeserver.py +++ b/synapse/config/homeserver.py @@ -57,7 +57,6 @@ class HomeServerConfig(RootConfig): config_classes = [ ServerConfig, - ExperimentalConfig, TlsConfig, FederationConfig, CacheConfig, @@ -94,4 +93,5 @@ class HomeServerConfig(RootConfig): TracerConfig, WorkerConfig, RedisConfig, + ExperimentalConfig, ] From c0df6bae066fe818bb80d41af65503be7a07275d Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Mon, 24 May 2021 14:02:01 +0100 Subject: [PATCH 31/52] Remove `keylen` from `LruCache`. (#9993) `keylen` seems to be a thing that is frequently incorrectly set, and we don't really need it. The only time it was used was to figure out if we had removed a subtree in `del_multi`, which we can do better by changing `TreeCache.pop` to return a different type (`TreeCacheNode`). Commits should be independently reviewable. --- changelog.d/9993.misc | 1 + .../replication/slave/storage/client_ips.py | 2 +- synapse/storage/databases/main/client_ips.py | 2 +- synapse/storage/databases/main/devices.py | 2 +- .../storage/databases/main/events_worker.py | 1 - synapse/util/caches/deferred_cache.py | 2 - synapse/util/caches/descriptors.py | 1 - synapse/util/caches/lrucache.py | 10 +- synapse/util/caches/treecache.py | 104 +++++++++++------- tests/util/test_lrucache.py | 4 +- tests/util/test_treecache.py | 6 +- 11 files changed, 80 insertions(+), 55 deletions(-) create mode 100644 changelog.d/9993.misc diff --git a/changelog.d/9993.misc b/changelog.d/9993.misc new file mode 100644 index 000000000000..0dd92440717d --- /dev/null +++ b/changelog.d/9993.misc @@ -0,0 +1 @@ +Remove `keylen` param on `LruCache`. 
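An illustration of the new `TreeCache.pop` contract described in the commit message above (the implementation and tests follow in the hunks below). This is a sketch based on the tests in this patch, not code from the patch itself; the tuple keys and token values are invented for illustration:

```python
from synapse.util.caches.treecache import TreeCache, iterate_tree_cache_entry

cache = TreeCache()
# Keys must be tuples; the leading tuple elements form the tree structure.
cache[("user1", "device1")] = "token_a"
cache[("user1", "device2")] = "token_b"
cache[("user2", "device1")] = "token_c"

# Popping a partial key removes the whole subtree. With this patch it returns
# a TreeCacheNode, which callers (such as LruCache's del_multi) can walk with
# iterate_tree_cache_entry to find every evicted leaf, so no keylen is needed.
popped = cache.pop(("user1",))
print(set(iterate_tree_cache_entry(popped)))  # {'token_a', 'token_b'}
print(len(cache))  # 1, since the size bookkeeping counts leaves, not nodes
```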
diff --git a/synapse/replication/slave/storage/client_ips.py b/synapse/replication/slave/storage/client_ips.py index 8730966380de..13ed87adc4ab 100644 --- a/synapse/replication/slave/storage/client_ips.py +++ b/synapse/replication/slave/storage/client_ips.py @@ -24,7 +24,7 @@ def __init__(self, database: DatabasePool, db_conn, hs): super().__init__(database, db_conn, hs) self.client_ip_last_seen = LruCache( - cache_name="client_ip_last_seen", keylen=4, max_size=50000 + cache_name="client_ip_last_seen", max_size=50000 ) # type: LruCache[tuple, int] async def insert_client_ip(self, user_id, access_token, ip, user_agent, device_id): diff --git a/synapse/storage/databases/main/client_ips.py b/synapse/storage/databases/main/client_ips.py index d60010e942ab..074b077bef91 100644 --- a/synapse/storage/databases/main/client_ips.py +++ b/synapse/storage/databases/main/client_ips.py @@ -436,7 +436,7 @@ class ClientIpStore(ClientIpWorkerStore): def __init__(self, database: DatabasePool, db_conn, hs): self.client_ip_last_seen = LruCache( - cache_name="client_ip_last_seen", keylen=4, max_size=50000 + cache_name="client_ip_last_seen", max_size=50000 ) super().__init__(database, db_conn, hs) diff --git a/synapse/storage/databases/main/devices.py b/synapse/storage/databases/main/devices.py index a1f98b7e3854..fd87ba71abcb 100644 --- a/synapse/storage/databases/main/devices.py +++ b/synapse/storage/databases/main/devices.py @@ -1053,7 +1053,7 @@ def __init__(self, database: DatabasePool, db_conn, hs): # Map of (user_id, device_id) -> bool. If there is an entry that implies # the device exists. self.device_id_exists_cache = LruCache( - cache_name="device_id_exists", keylen=2, max_size=10000 + cache_name="device_id_exists", max_size=10000 ) async def store_device( diff --git a/synapse/storage/databases/main/events_worker.py b/synapse/storage/databases/main/events_worker.py index 2c823e09cfa6..6963bbf7f484 100644 --- a/synapse/storage/databases/main/events_worker.py +++ b/synapse/storage/databases/main/events_worker.py @@ -157,7 +157,6 @@ def __init__(self, database: DatabasePool, db_conn, hs): self._get_event_cache = LruCache( cache_name="*getEvent*", - keylen=3, max_size=hs.config.caches.event_cache_size, ) diff --git a/synapse/util/caches/deferred_cache.py b/synapse/util/caches/deferred_cache.py index 484097a48a45..371e7e4dd0de 100644 --- a/synapse/util/caches/deferred_cache.py +++ b/synapse/util/caches/deferred_cache.py @@ -70,7 +70,6 @@ def __init__( self, name: str, max_entries: int = 1000, - keylen: int = 1, tree: bool = False, iterable: bool = False, apply_cache_factor_from_config: bool = True, @@ -101,7 +100,6 @@ def metrics_cb(): # a Deferred. 
self.cache = LruCache( max_size=max_entries, - keylen=keylen, cache_name=name, cache_type=cache_type, size_callback=(lambda d: len(d) or 1) if iterable else None, diff --git a/synapse/util/caches/descriptors.py b/synapse/util/caches/descriptors.py index 3a4d027095bb..2ac24a2f2536 100644 --- a/synapse/util/caches/descriptors.py +++ b/synapse/util/caches/descriptors.py @@ -270,7 +270,6 @@ def __get__(self, obj, owner): cache = DeferredCache( name=self.orig.__name__, max_entries=self.max_entries, - keylen=self.num_args, tree=self.tree, iterable=self.iterable, ) # type: DeferredCache[CacheKey, Any] diff --git a/synapse/util/caches/lrucache.py b/synapse/util/caches/lrucache.py index 1be675e014af..54df407ff70c 100644 --- a/synapse/util/caches/lrucache.py +++ b/synapse/util/caches/lrucache.py @@ -34,7 +34,7 @@ from synapse.config import cache as cache_config from synapse.util import caches from synapse.util.caches import CacheMetric, register_cache -from synapse.util.caches.treecache import TreeCache +from synapse.util.caches.treecache import TreeCache, iterate_tree_cache_entry try: from pympler.asizeof import Asizer @@ -160,7 +160,6 @@ def __init__( self, max_size: int, cache_name: Optional[str] = None, - keylen: int = 1, cache_type: Type[Union[dict, TreeCache]] = dict, size_callback: Optional[Callable] = None, metrics_collection_callback: Optional[Callable[[], None]] = None, @@ -173,9 +172,6 @@ def __init__( cache_name: The name of this cache, for the prometheus metrics. If unset, no metrics will be reported on this cache. - keylen: The length of the tuple used as the cache key. Ignored unless - cache_type is `TreeCache`. - cache_type (type): type of underlying cache to be used. Typically one of dict or TreeCache. @@ -403,7 +399,9 @@ def cache_del_multi(key: KT) -> None: popped = cache.pop(key) if popped is None: return - for leaf in enumerate_leaves(popped, keylen - len(cast(tuple, key))): + # for each deleted node, we now need to remove it from the linked list + # and run its callbacks. + for leaf in iterate_tree_cache_entry(popped): delete_node(leaf) @synchronized diff --git a/synapse/util/caches/treecache.py b/synapse/util/caches/treecache.py index eb4d98f683cb..73502a8b061b 100644 --- a/synapse/util/caches/treecache.py +++ b/synapse/util/caches/treecache.py @@ -1,18 +1,43 @@ -from typing import Dict +# Copyright 2016-2021 The Matrix.org Foundation C.I.C. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. SENTINEL = object() +class TreeCacheNode(dict): + """The type of nodes in our tree. + + Has its own type so we can distinguish it from real dicts that are stored at the + leaves. + """ + + pass + + class TreeCache: """ Tree-based backing store for LruCache. Allows subtrees of data to be deleted efficiently. Keys must be tuples. 
+ + The data structure is a chain of TreeCacheNodes: + root = {key_1: {key_2: _value}} """ def __init__(self): self.size = 0 - self.root = {} # type: Dict + self.root = TreeCacheNode() def __setitem__(self, key, value): return self.set(key, value) @@ -21,10 +46,23 @@ def __contains__(self, key): return self.get(key, SENTINEL) is not SENTINEL def set(self, key, value): + if isinstance(value, TreeCacheNode): + # this would mean we couldn't tell where our tree ended and the value + # started. + raise ValueError("Cannot store TreeCacheNodes in a TreeCache") + node = self.root for k in key[:-1]: - node = node.setdefault(k, {}) - node[key[-1]] = _Entry(value) + next_node = node.get(k, SENTINEL) + if next_node is SENTINEL: + next_node = node[k] = TreeCacheNode() + elif not isinstance(next_node, TreeCacheNode): + # this suggests that the caller is not being consistent with its key + # length. + raise ValueError("value conflicts with an existing subtree") + node = next_node + + node[key[-1]] = value self.size += 1 def get(self, key, default=None): @@ -33,25 +71,41 @@ def get(self, key, default=None): node = node.get(k, None) if node is None: return default - return node.get(key[-1], _Entry(default)).value + return node.get(key[-1], default) def clear(self): self.size = 0 - self.root = {} + self.root = TreeCacheNode() def pop(self, key, default=None): + """Remove the given key, or subkey, from the cache + + Args: + key: key or subkey to remove. + default: value to return if key is not found + + Returns: + If the key is not found, 'default'. If the key is complete, the removed + value. If the key is partial, the TreeCacheNode corresponding to the part + of the tree that was removed. + """ + # a list of the nodes we have touched on the way down the tree nodes = [] node = self.root for k in key[:-1]: node = node.get(k, None) - nodes.append(node) # don't add the root node if node is None: return default + if not isinstance(node, TreeCacheNode): + # we've gone off the end of the tree + raise ValueError("pop() key too long") + nodes.append(node) # don't add the root node popped = node.pop(key[-1], SENTINEL) if popped is SENTINEL: return default + # working back up the tree, clear out any nodes that are now empty node_and_keys = list(zip(nodes, key)) node_and_keys.reverse() node_and_keys.append((self.root, None)) @@ -61,14 +115,15 @@ def pop(self, key, default=None): if n: break + # found an empty node: remove it from its parent, and loop. node_and_keys[i + 1][0].pop(k) - popped, cnt = _strip_and_count_entires(popped) + cnt = sum(1 for _ in iterate_tree_cache_entry(popped)) self.size -= cnt return popped def values(self): - return list(iterate_tree_cache_entry(self.root)) + return iterate_tree_cache_entry(self.root) def __len__(self): return self.size @@ -78,36 +133,9 @@ def iterate_tree_cache_entry(d): """Helper function to iterate over the leaves of a tree, i.e. a dict of that can contain dicts. """ - if isinstance(d, dict): + if isinstance(d, TreeCacheNode): for value_d in d.values(): for value in iterate_tree_cache_entry(value_d): yield value else: - if isinstance(d, _Entry): - yield d.value - else: - yield d - - -class _Entry: - __slots__ = ["value"] - - def __init__(self, value): - self.value = value - - -def _strip_and_count_entires(d): - """Takes an _Entry or dict with leaves of _Entry's, and either returns the - value or a dictionary with _Entry's replaced by their values. 
- - Also returns the count of _Entry's - """ - if isinstance(d, dict): - cnt = 0 - for key, value in d.items(): - v, n = _strip_and_count_entires(value) - d[key] = v - cnt += n - return d, cnt - else: - return d.value, 1 + yield d diff --git a/tests/util/test_lrucache.py b/tests/util/test_lrucache.py index df3e27779fd2..377904e72e6c 100644 --- a/tests/util/test_lrucache.py +++ b/tests/util/test_lrucache.py @@ -59,7 +59,7 @@ def test_pop(self): self.assertEquals(cache.pop("key"), None) def test_del_multi(self): - cache = LruCache(4, keylen=2, cache_type=TreeCache) + cache = LruCache(4, cache_type=TreeCache) cache[("animal", "cat")] = "mew" cache[("animal", "dog")] = "woof" cache[("vehicles", "car")] = "vroom" @@ -165,7 +165,7 @@ def test_del_multi(self): m2 = Mock() m3 = Mock() m4 = Mock() - cache = LruCache(4, keylen=2, cache_type=TreeCache) + cache = LruCache(4, cache_type=TreeCache) cache.set(("a", "1"), "value", callbacks=[m1]) cache.set(("a", "2"), "value", callbacks=[m2]) diff --git a/tests/util/test_treecache.py b/tests/util/test_treecache.py index 3b077af27e90..60663720536d 100644 --- a/tests/util/test_treecache.py +++ b/tests/util/test_treecache.py @@ -13,7 +13,7 @@ # limitations under the License. -from synapse.util.caches.treecache import TreeCache +from synapse.util.caches.treecache import TreeCache, iterate_tree_cache_entry from .. import unittest @@ -64,12 +64,14 @@ def test_pop_mixedlevel(self): cache[("a", "b")] = "AB" cache[("b", "a")] = "BA" self.assertEquals(cache.get(("a", "a")), "AA") - cache.pop(("a",)) + popped = cache.pop(("a",)) self.assertEquals(cache.get(("a", "a")), None) self.assertEquals(cache.get(("a", "b")), None) self.assertEquals(cache.get(("b", "a")), "BA") self.assertEquals(len(cache), 1) + self.assertEquals({"AA", "AB"}, set(iterate_tree_cache_entry(popped))) + def test_clear(self): cache = TreeCache() cache[("a",)] = "A" From daca7b2794fb86514dc01de551bb0e5db77cf914 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Mon, 24 May 2021 14:03:00 +0100 Subject: [PATCH 32/52] Fix off-by-one-error in synapse_port_db (#9991) fixes #9979 --- .buildkite/postgres-config.yaml | 6 ++---- .buildkite/scripts/test_synapse_port_db.sh | 4 ++++ .buildkite/sqlite-config.yaml | 6 ++---- changelog.d/9991.bugfix | 1 + scripts/synapse_port_db | 2 +- 5 files changed, 10 insertions(+), 9 deletions(-) create mode 100644 changelog.d/9991.bugfix diff --git a/.buildkite/postgres-config.yaml b/.buildkite/postgres-config.yaml index 2acbe66f4caf..67e17fa9d1de 100644 --- a/.buildkite/postgres-config.yaml +++ b/.buildkite/postgres-config.yaml @@ -3,7 +3,7 @@ # CI's Docker setup at the point where this file is considered. server_name: "localhost:8800" -signing_key_path: "/src/.buildkite/test.signing.key" +signing_key_path: ".buildkite/test.signing.key" report_stats: false @@ -16,6 +16,4 @@ database: database: synapse # Suppress the key server warning. 
-trusted_key_servers: - - server_name: "matrix.org" -suppress_key_server_warning: true +trusted_key_servers: [] diff --git a/.buildkite/scripts/test_synapse_port_db.sh b/.buildkite/scripts/test_synapse_port_db.sh index a7e245476937..82d7d56d4e9f 100755 --- a/.buildkite/scripts/test_synapse_port_db.sh +++ b/.buildkite/scripts/test_synapse_port_db.sh @@ -33,6 +33,10 @@ scripts-dev/update_database --database-config .buildkite/sqlite-config.yaml echo "+++ Run synapse_port_db against test database" coverage run scripts/synapse_port_db --sqlite-database .buildkite/test_db.db --postgres-config .buildkite/postgres-config.yaml +# We should be able to run twice against the same database. +echo "+++ Run synapse_port_db a second time" +coverage run scripts/synapse_port_db --sqlite-database .buildkite/test_db.db --postgres-config .buildkite/postgres-config.yaml + ##### # Now do the same again, on an empty database. diff --git a/.buildkite/sqlite-config.yaml b/.buildkite/sqlite-config.yaml index 6d9bf80d844b..d16459cfd947 100644 --- a/.buildkite/sqlite-config.yaml +++ b/.buildkite/sqlite-config.yaml @@ -3,7 +3,7 @@ # schema and run background updates on it. server_name: "localhost:8800" -signing_key_path: "/src/.buildkite/test.signing.key" +signing_key_path: ".buildkite/test.signing.key" report_stats: false @@ -13,6 +13,4 @@ database: database: ".buildkite/test_db.db" # Suppress the key server warning. -trusted_key_servers: - - server_name: "matrix.org" -suppress_key_server_warning: true +trusted_key_servers: [] diff --git a/changelog.d/9991.bugfix b/changelog.d/9991.bugfix new file mode 100644 index 000000000000..665ff04dea84 --- /dev/null +++ b/changelog.d/9991.bugfix @@ -0,0 +1 @@ +Fix a bug introduced in v1.26.0 which meant that `synapse_port_db` would not correctly initialise some postgres sequences, requiring manual updates afterwards. diff --git a/scripts/synapse_port_db b/scripts/synapse_port_db index 7c7645c05a63..86eb76cbca7a 100755 --- a/scripts/synapse_port_db +++ b/scripts/synapse_port_db @@ -959,7 +959,7 @@ class Porter(object): def r(txn): txn.execute( "ALTER SEQUENCE event_auth_chain_id RESTART WITH %s", - (curr_chain_id,), + (curr_chain_id + 1,), ) if curr_chain_id is not None: From 82eacb0e071657e796952638fe8da90cdb94f2a1 Mon Sep 17 00:00:00 2001 From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com> Date: Mon, 24 May 2021 14:03:30 +0100 Subject: [PATCH 33/52] Fix --no-daemonize for synctl with workers (#9995) --- changelog.d/9995.bugfix | 1 + synctl | 102 +++++++++++++--------------------------- 2 files changed, 33 insertions(+), 70 deletions(-) create mode 100644 changelog.d/9995.bugfix diff --git a/changelog.d/9995.bugfix b/changelog.d/9995.bugfix new file mode 100644 index 000000000000..3b63e7c42aed --- /dev/null +++ b/changelog.d/9995.bugfix @@ -0,0 +1 @@ +Fix `synctl`'s `--no-daemonize` parameter to work correctly with worker processes. diff --git a/synctl b/synctl index ccf404accb45..6ce19918d212 100755 --- a/synctl +++ b/synctl @@ -24,12 +24,13 @@ import signal import subprocess import sys import time +from typing import Iterable import yaml from synapse.config import find_config_files -SYNAPSE = [sys.executable, "-m", "synapse.app.homeserver"] +MAIN_PROCESS = "synapse.app.homeserver" GREEN = "\x1b[1;32m" YELLOW = "\x1b[1;33m" @@ -68,71 +69,37 @@ def abort(message, colour=RED, stream=sys.stderr): sys.exit(1) -def start(configfile: str, daemonize: bool = True) -> bool: - """Attempts to start synapse. 
+def start(pidfile: str, app: str, config_files: Iterable[str], daemonize: bool) -> bool: + """Attempts to start a synapse main or worker process. Args: - configfile: path to a yaml synapse config file - daemonize: whether to daemonize synapse or keep it attached to the current - session + pidfile: the pidfile we expect the process to create + app: the python module to run + config_files: config files to pass to synapse + daemonize: if True, will include a --daemonize argument to synapse Returns: - True if the process started successfully + True if the process started successfully or was already running False if there was an error starting the process - - If deamonize is False it will only return once synapse exits. """ - write("Starting ...") - args = SYNAPSE - - if daemonize: - args.extend(["--daemonize", "-c", configfile]) - else: - args.extend(["-c", configfile]) - - try: - subprocess.check_call(args) - write("started synapse.app.homeserver(%r)" % (configfile,), colour=GREEN) + if os.path.exists(pidfile) and pid_running(int(open(pidfile).read())): + print(app + " already running") return True - except subprocess.CalledProcessError as e: - write( - "error starting (exit code: %d); see above for logs" % e.returncode, - colour=RED, - ) - return False - -def start_worker(app: str, configfile: str, worker_configfile: str) -> bool: - """Attempts to start a synapse worker. - Args: - app: name of the worker's appservice - configfile: path to a yaml synapse config file - worker_configfile: path to worker specific yaml synapse file - - Returns: - True if the process started successfully - False if there was an error starting the process - """ - - args = [ - sys.executable, - "-m", - app, - "-c", - configfile, - "-c", - worker_configfile, - "--daemonize", - ] + args = [sys.executable, "-m", app] + for c in config_files: + args += ["-c", c] + if daemonize: + args.append("--daemonize") try: subprocess.check_call(args) - write("started %s(%r)" % (app, worker_configfile), colour=GREEN) + write("started %s(%s)" % (app, ",".join(config_files)), colour=GREEN) return True except subprocess.CalledProcessError as e: write( - "error starting %s(%r) (exit code: %d); see above for logs" - % (app, worker_configfile, e.returncode), + "error starting %s(%s) (exit code: %d); see above for logs" + % (app, ",".join(config_files), e.returncode), colour=RED, ) return False @@ -224,10 +191,11 @@ def main(): if not os.path.exists(configfile): write( - "No config file found\n" - "To generate a config file, run '%s -c %s --generate-config" - " --server-name= --report-stats='\n" - % (" ".join(SYNAPSE), options.configfile), + f"Config file {configfile} does not exist.\n" + f"To generate a config file, run:\n" + f" {sys.executable} -m {MAIN_PROCESS}" + f" -c {configfile} --generate-config" + f" --server-name= --report-stats=\n", stream=sys.stderr, ) sys.exit(1) @@ -323,7 +291,7 @@ def main(): has_stopped = False if start_stop_synapse: - if not stop(pidfile, "synapse.app.homeserver"): + if not stop(pidfile, MAIN_PROCESS): has_stopped = False if not has_stopped and action == "stop": sys.exit(1) @@ -346,30 +314,24 @@ def main(): if action == "start" or action == "restart": error = False if start_stop_synapse: - # Check if synapse is already running - if os.path.exists(pidfile) and pid_running(int(open(pidfile).read())): - abort("synapse.app.homeserver already running") - - if not start(configfile, bool(options.daemonize)): + if not start(pidfile, MAIN_PROCESS, (configfile,), options.daemonize): error = True for worker in 
workers: env = os.environ.copy() - # Skip starting a worker if its already running - if os.path.exists(worker.pidfile) and pid_running( - int(open(worker.pidfile).read()) - ): - print(worker.app + " already running") - continue - if worker.cache_factor: os.environ["SYNAPSE_CACHE_FACTOR"] = str(worker.cache_factor) for cache_name, factor in worker.cache_factors.items(): os.environ["SYNAPSE_CACHE_FACTOR_" + cache_name.upper()] = str(factor) - if not start_worker(worker.app, configfile, worker.configfile): + if not start( + worker.pidfile, + worker.app, + (configfile, worker.configfile), + options.daemonize, + ): error = True # Reset env back to the original From 057ce7b75406dc97be8ff2c890c47fd9357b0773 Mon Sep 17 00:00:00 2001 From: Jerin J Titus <72017981+jerinjtitus@users.noreply.github.com> Date: Mon, 24 May 2021 22:13:30 +0530 Subject: [PATCH 34/52] Remove tls_fingerprints option (#9280) Signed-off-by: Jerin J Titus <72017981+jerinjtitus@users.noreply.github.com> --- changelog.d/9280.removal | 1 + docs/sample_config.yaml | 27 ------------ scripts-dev/convert_server_keys.py | 7 --- synapse/config/tls.py | 50 ---------------------- synapse/rest/key/v2/local_key_resource.py | 8 ---- synapse/rest/key/v2/remote_key_resource.py | 3 -- 6 files changed, 1 insertion(+), 95 deletions(-) create mode 100644 changelog.d/9280.removal diff --git a/changelog.d/9280.removal b/changelog.d/9280.removal new file mode 100644 index 000000000000..c2ed3d308dda --- /dev/null +++ b/changelog.d/9280.removal @@ -0,0 +1 @@ +Removed support for the deprecated `tls_fingerprints` configuration setting. Contributed by Jerin J Titus. \ No newline at end of file diff --git a/docs/sample_config.yaml b/docs/sample_config.yaml index f0f9f06a6ee3..6576b153d0eb 100644 --- a/docs/sample_config.yaml +++ b/docs/sample_config.yaml @@ -683,33 +683,6 @@ acme: # account_key_file: DATADIR/acme_account.key -# List of allowed TLS fingerprints for this server to publish along -# with the signing keys for this server. Other matrix servers that -# make HTTPS requests to this server will check that the TLS -# certificates returned by this server match one of the fingerprints. -# -# Synapse automatically adds the fingerprint of its own certificate -# to the list. So if federation traffic is handled directly by synapse -# then no modification to the list is required. -# -# If synapse is run behind a load balancer that handles the TLS then it -# will be necessary to add the fingerprints of the certificates used by -# the loadbalancers to this list if they are different to the one -# synapse is using. -# -# Homeservers are permitted to cache the list of TLS fingerprints -# returned in the key responses up to the "valid_until_ts" returned in -# key. It may be necessary to publish the fingerprints of a new -# certificate and wait until the "valid_until_ts" of the previous key -# responses have passed before deploying it. 
-# -# You can calculate a fingerprint from a given TLS listener via: -# openssl s_client -connect $host:$port < /dev/null 2> /dev/null | -# openssl x509 -outform DER | openssl sha256 -binary | base64 | tr -d '=' -# or by checking matrix.org/federationtester/api/report?server_name=$host -# -#tls_fingerprints: [{"sha256": ""}] - ## Federation ## diff --git a/scripts-dev/convert_server_keys.py b/scripts-dev/convert_server_keys.py index 961dc59f1134..d4314a054c44 100644 --- a/scripts-dev/convert_server_keys.py +++ b/scripts-dev/convert_server_keys.py @@ -1,4 +1,3 @@ -import hashlib import json import sys import time @@ -54,15 +53,9 @@ def convert_v1_to_v2(server_name, valid_until, keys, certificate): "server_name": server_name, "verify_keys": {key_id: {"key": key} for key_id, key in keys.items()}, "valid_until_ts": valid_until, - "tls_fingerprints": [fingerprint(certificate)], } -def fingerprint(certificate): - finger = hashlib.sha256(certificate) - return {"sha256": encode_base64(finger.digest())} - - def rows_v2(server, json): valid_until = json["valid_until_ts"] key_json = encode_canonical_json(json) diff --git a/synapse/config/tls.py b/synapse/config/tls.py index 7df4e4c3e6b4..26f1150ca506 100644 --- a/synapse/config/tls.py +++ b/synapse/config/tls.py @@ -16,11 +16,8 @@ import os import warnings from datetime import datetime -from hashlib import sha256 from typing import List, Optional, Pattern -from unpaddedbase64 import encode_base64 - from OpenSSL import SSL, crypto from twisted.internet._sslverify import Certificate, trustRootFromCertificates @@ -83,13 +80,6 @@ def read_config(self, config: dict, config_dir_path: str, **kwargs): "configured." ) - self._original_tls_fingerprints = config.get("tls_fingerprints", []) - - if self._original_tls_fingerprints is None: - self._original_tls_fingerprints = [] - - self.tls_fingerprints = list(self._original_tls_fingerprints) - # Whether to verify certificates on outbound federation traffic self.federation_verify_certificates = config.get( "federation_verify_certificates", True @@ -248,19 +238,6 @@ def read_certificate_from_disk(self, require_cert_and_key: bool): e, ) - self.tls_fingerprints = list(self._original_tls_fingerprints) - - if self.tls_certificate: - # Check that our own certificate is included in the list of fingerprints - # and include it if it is not. - x509_certificate_bytes = crypto.dump_certificate( - crypto.FILETYPE_ASN1, self.tls_certificate - ) - sha256_fingerprint = encode_base64(sha256(x509_certificate_bytes).digest()) - sha256_fingerprints = {f["sha256"] for f in self.tls_fingerprints} - if sha256_fingerprint not in sha256_fingerprints: - self.tls_fingerprints.append({"sha256": sha256_fingerprint}) - def generate_config_section( self, config_dir_path, @@ -443,33 +420,6 @@ def generate_config_section( # If unspecified, we will use CONFDIR/client.key. # account_key_file: %(default_acme_account_file)s - - # List of allowed TLS fingerprints for this server to publish along - # with the signing keys for this server. Other matrix servers that - # make HTTPS requests to this server will check that the TLS - # certificates returned by this server match one of the fingerprints. - # - # Synapse automatically adds the fingerprint of its own certificate - # to the list. So if federation traffic is handled directly by synapse - # then no modification to the list is required. 
- # - # If synapse is run behind a load balancer that handles the TLS then it - # will be necessary to add the fingerprints of the certificates used by - # the loadbalancers to this list if they are different to the one - # synapse is using. - # - # Homeservers are permitted to cache the list of TLS fingerprints - # returned in the key responses up to the "valid_until_ts" returned in - # key. It may be necessary to publish the fingerprints of a new - # certificate and wait until the "valid_until_ts" of the previous key - # responses have passed before deploying it. - # - # You can calculate a fingerprint from a given TLS listener via: - # openssl s_client -connect $host:$port < /dev/null 2> /dev/null | - # openssl x509 -outform DER | openssl sha256 -binary | base64 | tr -d '=' - # or by checking matrix.org/federationtester/api/report?server_name=$host - # - #tls_fingerprints: [{"sha256": ""}] """ # Lowercase the string representation of boolean values % { diff --git a/synapse/rest/key/v2/local_key_resource.py b/synapse/rest/key/v2/local_key_resource.py index e8dbe240d81e..a5fcd15e3af4 100644 --- a/synapse/rest/key/v2/local_key_resource.py +++ b/synapse/rest/key/v2/local_key_resource.py @@ -48,11 +48,6 @@ class LocalKey(Resource): "key": # base64 encoded NACL verification key. } }, - "tls_fingerprints": [ # Fingerprints of the TLS certs this server uses. - { - "sha256": # base64 encoded sha256 fingerprint of the X509 cert - }, - ], "signatures": { "this.server.example.com": { "algorithm:version": # NACL signature for this server @@ -89,14 +84,11 @@ def response_json_object(self): "expired_ts": key.expired_ts, } - tls_fingerprints = self.config.tls_fingerprints - json_object = { "valid_until_ts": self.valid_until_ts, "server_name": self.config.server_name, "verify_keys": verify_keys, "old_verify_keys": old_verify_keys, - "tls_fingerprints": tls_fingerprints, } for key in self.config.signing_key: json_object = sign_json(json_object, self.config.server_name, key) diff --git a/synapse/rest/key/v2/remote_key_resource.py b/synapse/rest/key/v2/remote_key_resource.py index f648678b0903..aba1734a55ac 100644 --- a/synapse/rest/key/v2/remote_key_resource.py +++ b/synapse/rest/key/v2/remote_key_resource.py @@ -73,9 +73,6 @@ class RemoteKey(DirectServeJsonResource): "expired_ts": 0, # when the key stop being used. } } - "tls_fingerprints": [ - { "sha256": # fingerprint } - ] "signatures": { "remote.server.example.com": {...} "this.server.example.com": {...} From 22a8838f626834c4ffc09761f4c5d65215cfc885 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Sergio=20Migu=C3=A9ns?= Date: Mon, 24 May 2021 21:23:54 +0200 Subject: [PATCH 35/52] Fix docker image to not log at `/homeserver.log` (#10045) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Fixes #9970 Signed-off-by: Sergio Miguéns Iglesias lonyelon@lony.xyz --- changelog.d/10045.docker | 1 + docker/conf/log.config | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10045.docker diff --git a/changelog.d/10045.docker b/changelog.d/10045.docker new file mode 100644 index 000000000000..70b65b0a01a4 --- /dev/null +++ b/changelog.d/10045.docker @@ -0,0 +1 @@ +Fix bug introduced in Synapse 1.33.0 which caused a `Permission denied: '/homeserver.log'` error when starting Synapse with the generated log configuration. Contributed by Sergio Miguéns Iglesias. 
diff --git a/docker/conf/log.config b/docker/conf/log.config index 34572bc0f36a..a99462692628 100644 --- a/docker/conf/log.config +++ b/docker/conf/log.config @@ -9,10 +9,11 @@ formatters: {% endif %} handlers: +{% if LOG_FILE_PATH %} file: class: logging.handlers.TimedRotatingFileHandler formatter: precise - filename: {{ LOG_FILE_PATH or "homeserver.log" }} + filename: {{ LOG_FILE_PATH }} when: "midnight" backupCount: 6 # Does not include the current log file. encoding: utf8 @@ -29,6 +30,7 @@ handlers: # be written to disk. capacity: 10 flushLevel: 30 # Flush for WARNING logs as well +{% endif %} console: class: logging.StreamHandler From 7adcb20fc02d614b4a2b03b128b279f25633e2bd Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Mon, 24 May 2021 15:32:01 -0400 Subject: [PATCH 36/52] Add missing type hints to synapse.util (#9982) --- changelog.d/9982.misc | 1 + mypy.ini | 9 +++++++++ synapse/config/saml2.py | 8 +++++++- synapse/storage/databases/main/keys.py | 2 +- synapse/util/hash.py | 10 +++++----- synapse/util/iterutils.py | 11 ++++------- synapse/util/module_loader.py | 9 +++++---- synapse/util/msisdn.py | 10 +++++----- tests/util/test_itertools.py | 4 ++-- 9 files changed, 39 insertions(+), 25 deletions(-) create mode 100644 changelog.d/9982.misc diff --git a/changelog.d/9982.misc b/changelog.d/9982.misc new file mode 100644 index 000000000000..f3821f61a32a --- /dev/null +++ b/changelog.d/9982.misc @@ -0,0 +1 @@ +Add missing type hints to `synapse.util` module. diff --git a/mypy.ini b/mypy.ini index 1d1d1ea0f25f..062872020e4c 100644 --- a/mypy.ini +++ b/mypy.ini @@ -71,8 +71,13 @@ files = synapse/types.py, synapse/util/async_helpers.py, synapse/util/caches, + synapse/util/daemonize.py, + synapse/util/hash.py, + synapse/util/iterutils.py, synapse/util/metrics.py, synapse/util/macaroons.py, + synapse/util/module_loader.py, + synapse/util/msisdn.py, synapse/util/stringutils.py, synapse/visibility.py, tests/replication, @@ -80,6 +85,7 @@ files = tests/handlers/test_password_providers.py, tests/rest/client/v1/test_login.py, tests/rest/client/v2_alpha/test_auth.py, + tests/util/test_itertools.py, tests/util/test_stream_change_cache.py [mypy-pymacaroons.*] @@ -175,5 +181,8 @@ ignore_missing_imports = True [mypy-pympler.*] ignore_missing_imports = True +[mypy-phonenumbers.*] +ignore_missing_imports = True + [mypy-ijson.*] ignore_missing_imports = True diff --git a/synapse/config/saml2.py b/synapse/config/saml2.py index 3d1218c8d14b..05e983625dc8 100644 --- a/synapse/config/saml2.py +++ b/synapse/config/saml2.py @@ -164,7 +164,13 @@ def read_config(self, config, **kwargs): config_path = saml2_config.get("config_path", None) if config_path is not None: mod = load_python_module(config_path) - _dict_merge(merge_dict=mod.CONFIG, into_dict=saml2_config_dict) + config = getattr(mod, "CONFIG", None) + if config is None: + raise ConfigError( + "Config path specified by saml2_config.config_path does not " + "have a CONFIG property." 
+ ) + _dict_merge(merge_dict=config, into_dict=saml2_config_dict) import saml2.config diff --git a/synapse/storage/databases/main/keys.py b/synapse/storage/databases/main/keys.py index 0e8680783404..6990f3ed1d27 100644 --- a/synapse/storage/databases/main/keys.py +++ b/synapse/storage/databases/main/keys.py @@ -55,7 +55,7 @@ async def get_server_verify_keys( """ keys = {} - def _get_keys(txn: Cursor, batch: Tuple[Tuple[str, str]]) -> None: + def _get_keys(txn: Cursor, batch: Tuple[Tuple[str, str], ...]) -> None: """Processes a batch of keys to fetch, and adds the result to `keys`.""" # batch_iter always returns tuples so it's safe to do len(batch) diff --git a/synapse/util/hash.py b/synapse/util/hash.py index ba676e176240..7625ca8c2c63 100644 --- a/synapse/util/hash.py +++ b/synapse/util/hash.py @@ -17,15 +17,15 @@ import unpaddedbase64 -def sha256_and_url_safe_base64(input_text): +def sha256_and_url_safe_base64(input_text: str) -> str: """SHA256 hash an input string, encode the digest as url-safe base64, and return - :param input_text: string to hash - :type input_text: str + Args: + input_text: string to hash - :returns a sha256 hashed and url-safe base64 encoded digest - :rtype: str + returns: + A sha256 hashed and url-safe base64 encoded digest """ digest = hashlib.sha256(input_text.encode()).digest() return unpaddedbase64.encode_base64(digest, urlsafe=True) diff --git a/synapse/util/iterutils.py b/synapse/util/iterutils.py index abfdc2983261..886afa9d1931 100644 --- a/synapse/util/iterutils.py +++ b/synapse/util/iterutils.py @@ -30,12 +30,12 @@ T = TypeVar("T") -def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T]]: +def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T, ...]]: """batch an iterable up into tuples with a maximum size Args: - iterable (iterable): the iterable to slice - size (int): the maximum batch size + iterable: the iterable to slice + size: the maximum batch size Returns: an iterator over the chunks @@ -46,10 +46,7 @@ def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T]]: return iter(lambda: tuple(islice(sourceiter, size)), ()) -ISeq = TypeVar("ISeq", bound=Sequence, covariant=True) - - -def chunk_seq(iseq: ISeq, maxlen: int) -> Iterable[ISeq]: +def chunk_seq(iseq: Sequence[T], maxlen: int) -> Iterable[Sequence[T]]: """Split the given sequence into chunks of the given size The last chunk may be shorter than the given size. diff --git a/synapse/util/module_loader.py b/synapse/util/module_loader.py index 8acbe276e4bc..cbfbd097f9a1 100644 --- a/synapse/util/module_loader.py +++ b/synapse/util/module_loader.py @@ -15,6 +15,7 @@ import importlib import importlib.util import itertools +from types import ModuleType from typing import Any, Iterable, Tuple, Type import jsonschema @@ -44,8 +45,8 @@ def load_module(provider: dict, config_path: Iterable[str]) -> Tuple[Type, Any]: # We need to import the module, and then pick the class out of # that, so we split based on the last dot. - module, clz = modulename.rsplit(".", 1) - module = importlib.import_module(module) + module_name, clz = modulename.rsplit(".", 1) + module = importlib.import_module(module_name) provider_class = getattr(module, clz) # Load the module config. 
If None, pass an empty dictionary instead @@ -69,11 +70,11 @@ def load_module(provider: dict, config_path: Iterable[str]) -> Tuple[Type, Any]: return provider_class, provider_config -def load_python_module(location: str): +def load_python_module(location: str) -> ModuleType: """Load a python module, and return a reference to its global namespace Args: - location (str): path to the module + location: path to the module Returns: python module object diff --git a/synapse/util/msisdn.py b/synapse/util/msisdn.py index bbbdebf2648e..1046224f1598 100644 --- a/synapse/util/msisdn.py +++ b/synapse/util/msisdn.py @@ -17,19 +17,19 @@ from synapse.api.errors import SynapseError -def phone_number_to_msisdn(country, number): +def phone_number_to_msisdn(country: str, number: str) -> str: """ Takes an ISO-3166-1 2 letter country code and phone number and returns an msisdn representing the canonical version of that phone number. Args: - country (str): ISO-3166-1 2 letter country code - number (str): Phone number in a national or international format + country: ISO-3166-1 2 letter country code + number: Phone number in a national or international format Returns: - (str) The canonical form of the phone number, as an msisdn + The canonical form of the phone number, as an msisdn Raises: - SynapseError if the number could not be parsed. + SynapseError if the number could not be parsed. """ try: phoneNumber = phonenumbers.parse(number, country) diff --git a/tests/util/test_itertools.py b/tests/util/test_itertools.py index 1bd0b45d940a..e712eb42ea4b 100644 --- a/tests/util/test_itertools.py +++ b/tests/util/test_itertools.py @@ -11,7 +11,7 @@ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. -from typing import Dict, List +from typing import Dict, Iterable, List, Sequence from synapse.util.iterutils import chunk_seq, sorted_topologically @@ -44,7 +44,7 @@ def test_uneven_parts(self): ) def test_empty_input(self): - parts = chunk_seq([], 5) + parts = chunk_seq([], 5) # type: Iterable[Sequence] self.assertEqual( list(parts), From 7d90d6ce9b6733bdebd4c5aca0c785e28e265e13 Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Mon, 24 May 2021 15:32:45 -0400 Subject: [PATCH 37/52] Run complement with Synapse workers manually. (#10039) Adds an option to complement.sh to run Synapse in worker mode (instead of the default monolith mode). --- changelog.d/10039.misc | 1 + docker/configure_workers_and_start.py | 8 ++++---- scripts-dev/complement.sh | 25 ++++++++++++++++++++++--- 3 files changed, 27 insertions(+), 7 deletions(-) create mode 100644 changelog.d/10039.misc diff --git a/changelog.d/10039.misc b/changelog.d/10039.misc new file mode 100644 index 000000000000..8855f141d903 --- /dev/null +++ b/changelog.d/10039.misc @@ -0,0 +1 @@ +Fix running complement tests with Synapse workers. 
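A note on the `configure_workers_and_start.py` fix below: the worker entrypoint builds its nginx config with Python's `str.format`, which treats every bare `{` as the start of a placeholder, so braces that must appear literally in the output have to be doubled. A small sketch of the failure mode, with made-up endpoint and upstream values (the real templates are the `NGINX_*_CONFIG_BLOCK` strings in the hunk below):

```python
# Before the fix: the bare "{" after {endpoint} is parsed as a placeholder,
# so formatting raises a ValueError instead of rendering the block.
broken = "location ~* {endpoint} {\n    proxy_pass {upstream};\n}\n"

# After the fix: "{{" and "}}" render as literal braces in the nginx output.
fixed = "location ~* {endpoint} {{\n    proxy_pass {upstream};\n}}\n"

args = {"endpoint": "^/_matrix/client", "upstream": "http://worker1:8080"}

print(fixed.format(**args))

try:
    broken.format(**args)
except ValueError as exc:
    print("broken template:", exc)
```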
diff --git a/docker/configure_workers_and_start.py b/docker/configure_workers_and_start.py index 4be6afc65d9a..1d22a4d5719d 100755 --- a/docker/configure_workers_and_start.py +++ b/docker/configure_workers_and_start.py @@ -184,18 +184,18 @@ """ NGINX_LOCATION_CONFIG_BLOCK = """ - location ~* {endpoint} { + location ~* {endpoint} {{ proxy_pass {upstream}; proxy_set_header X-Forwarded-For $remote_addr; proxy_set_header X-Forwarded-Proto $scheme; proxy_set_header Host $host; - } + }} """ NGINX_UPSTREAM_CONFIG_BLOCK = """ -upstream {upstream_worker_type} { +upstream {upstream_worker_type} {{ {body} -} +}} """ diff --git a/scripts-dev/complement.sh b/scripts-dev/complement.sh index 1612ab522c33..004396467378 100755 --- a/scripts-dev/complement.sh +++ b/scripts-dev/complement.sh @@ -10,6 +10,9 @@ # checkout by setting the COMPLEMENT_DIR environment variable to the # filepath of a local Complement checkout. # +# By default Synapse is run in monolith mode. This can be overridden by +# setting the WORKERS environment variable. +# # A regular expression of test method names can be supplied as the first # argument to the script. Complement will then only run those tests. If # no regex is supplied, all tests are run. For example; @@ -32,10 +35,26 @@ if [[ -z "$COMPLEMENT_DIR" ]]; then echo "Checkout available at 'complement-master'" fi +# If we're using workers, modify the docker files slightly. +if [[ -n "$WORKERS" ]]; then + BASE_IMAGE=matrixdotorg/synapse-workers + BASE_DOCKERFILE=docker/Dockerfile-workers + export COMPLEMENT_BASE_IMAGE=complement-synapse-workers + COMPLEMENT_DOCKERFILE=SynapseWorkers.Dockerfile + # And provide some more configuration to complement. + export COMPLEMENT_CA=true + export COMPLEMENT_VERSION_CHECK_ITERATIONS=500 +else + BASE_IMAGE=matrixdotorg/synapse + BASE_DOCKERFILE=docker/Dockerfile + export COMPLEMENT_BASE_IMAGE=complement-synapse + COMPLEMENT_DOCKERFILE=Synapse.Dockerfile +fi + # Build the base Synapse image from the local checkout -docker build -t matrixdotorg/synapse -f docker/Dockerfile . +docker build -t $BASE_IMAGE -f "$BASE_DOCKERFILE" . # Build the Synapse monolith image from Complement, based on the above image we just built -docker build -t complement-synapse -f "$COMPLEMENT_DIR/dockerfiles/Synapse.Dockerfile" "$COMPLEMENT_DIR/dockerfiles" +docker build -t $COMPLEMENT_BASE_IMAGE -f "$COMPLEMENT_DIR/dockerfiles/$COMPLEMENT_DOCKERFILE" "$COMPLEMENT_DIR/dockerfiles" cd "$COMPLEMENT_DIR" @@ -46,4 +65,4 @@ if [[ -n "$1" ]]; then fi # Run the tests! 
-COMPLEMENT_BASE_IMAGE=complement-synapse go test -v -tags synapse_blacklist,msc2946,msc3083 -count=1 $EXTRA_COMPLEMENT_ARGS ./tests +go test -v -tags synapse_blacklist,msc2946,msc3083 -count=1 $EXTRA_COMPLEMENT_ARGS ./tests From 557635f69ab734142ae5889e215ea512f7678f21 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Tue, 25 May 2021 11:00:13 +0100 Subject: [PATCH 38/52] 1.35.0rc1 --- CHANGES.md | 64 +++++++++++++++++++++++++++++++++++++++ changelog.d/10001.misc | 1 - changelog.d/10002.bugfix | 1 - changelog.d/10007.feature | 1 - changelog.d/10011.feature | 1 - changelog.d/10014.bugfix | 1 - changelog.d/10016.doc | 1 - changelog.d/10017.misc | 1 - changelog.d/10018.misc | 1 - changelog.d/10029.bugfix | 1 - changelog.d/10033.bugfix | 1 - changelog.d/10036.misc | 1 - changelog.d/10038.feature | 1 - changelog.d/10039.misc | 1 - changelog.d/10043.doc | 1 - changelog.d/10045.docker | 1 - changelog.d/10050.misc | 1 - changelog.d/9280.removal | 1 - changelog.d/9803.doc | 1 - changelog.d/9823.misc | 1 - changelog.d/9922.feature | 1 - changelog.d/9958.feature | 1 - changelog.d/9974.misc | 1 - changelog.d/9975.misc | 1 - changelog.d/9977.misc | 1 - changelog.d/9978.feature | 1 - changelog.d/9980.doc | 1 - changelog.d/9981.misc | 1 - changelog.d/9982.misc | 1 - changelog.d/9984.misc | 1 - changelog.d/9985.misc | 1 - changelog.d/9986.misc | 1 - changelog.d/9987.misc | 1 - changelog.d/9988.doc | 1 - changelog.d/9989.doc | 1 - changelog.d/9991.bugfix | 1 - changelog.d/9993.misc | 1 - changelog.d/9995.bugfix | 1 - synapse/__init__.py | 2 +- 39 files changed, 65 insertions(+), 38 deletions(-) delete mode 100644 changelog.d/10001.misc delete mode 100644 changelog.d/10002.bugfix delete mode 100644 changelog.d/10007.feature delete mode 100644 changelog.d/10011.feature delete mode 100644 changelog.d/10014.bugfix delete mode 100644 changelog.d/10016.doc delete mode 100644 changelog.d/10017.misc delete mode 100644 changelog.d/10018.misc delete mode 100644 changelog.d/10029.bugfix delete mode 100644 changelog.d/10033.bugfix delete mode 100644 changelog.d/10036.misc delete mode 100644 changelog.d/10038.feature delete mode 100644 changelog.d/10039.misc delete mode 100644 changelog.d/10043.doc delete mode 100644 changelog.d/10045.docker delete mode 100644 changelog.d/10050.misc delete mode 100644 changelog.d/9280.removal delete mode 100644 changelog.d/9803.doc delete mode 100644 changelog.d/9823.misc delete mode 100644 changelog.d/9922.feature delete mode 100644 changelog.d/9958.feature delete mode 100644 changelog.d/9974.misc delete mode 100644 changelog.d/9975.misc delete mode 100644 changelog.d/9977.misc delete mode 100644 changelog.d/9978.feature delete mode 100644 changelog.d/9980.doc delete mode 100644 changelog.d/9981.misc delete mode 100644 changelog.d/9982.misc delete mode 100644 changelog.d/9984.misc delete mode 100644 changelog.d/9985.misc delete mode 100644 changelog.d/9986.misc delete mode 100644 changelog.d/9987.misc delete mode 100644 changelog.d/9988.doc delete mode 100644 changelog.d/9989.doc delete mode 100644 changelog.d/9991.bugfix delete mode 100644 changelog.d/9993.misc delete mode 100644 changelog.d/9995.bugfix diff --git a/CHANGES.md b/CHANGES.md index 709436da978c..0e451f983c95 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,67 @@ +Synapse 1.35.0rc1 (2021-05-25) +============================== + +Features +-------- + +- Add experimental support to allow a user who could join a restricted room to view it in the spaces summary. 
([\#9922](https://github.com/matrix-org/synapse/issues/9922), [\#10007](https://github.com/matrix-org/synapse/issues/10007), [\#10038](https://github.com/matrix-org/synapse/issues/10038))
+- Reduce memory usage when joining very large rooms over federation. ([\#9958](https://github.com/matrix-org/synapse/issues/9958))
+- Add a configuration option which allows enabling opentracing by user id. ([\#9978](https://github.com/matrix-org/synapse/issues/9978))
+- Enable experimental support for [MSC2946](https://github.com/matrix-org/matrix-doc/pull/2946) (spaces summary API) and [MSC3083](https://github.com/matrix-org/matrix-doc/pull/3083) (restricted join rules) by default. ([\#10011](https://github.com/matrix-org/synapse/issues/10011))
+
+
+Bugfixes
+--------
+
+- Fix a bug introduced in v1.26.0 which meant that `synapse_port_db` would not correctly initialise some postgres sequences, requiring manual updates afterwards. ([\#9991](https://github.com/matrix-org/synapse/issues/9991))
+- Fix `synctl`'s `--no-daemonize` parameter to work correctly with worker processes. ([\#9995](https://github.com/matrix-org/synapse/issues/9995))
+- Fix a validation bug introduced in v1.34.0 in the ordering of spaces in the space summary API. ([\#10002](https://github.com/matrix-org/synapse/issues/10002))
+- Fix deletion of new presence stream states from the database. ([\#10014](https://github.com/matrix-org/synapse/issues/10014), [\#10033](https://github.com/matrix-org/synapse/issues/10033))
+- Fix a bug where very high resolution image uploads threw internal server errors. ([\#10029](https://github.com/matrix-org/synapse/issues/10029))
+
+
+Updates to the Docker image
+---------------------------
+
+- Fix a bug introduced in Synapse 1.33.0 which caused a `Permission denied: '/homeserver.log'` error when starting Synapse with the generated log configuration. Contributed by Sergio Miguéns Iglesias. ([\#10045](https://github.com/matrix-org/synapse/issues/10045))
+
+
+Improved Documentation
+----------------------
+
+- Add hardened systemd files as proposed in [#9760](https://github.com/matrix-org/synapse/issues/9760) and add them to `contrib/`. Change the docs to reflect the presence of these files. ([\#9803](https://github.com/matrix-org/synapse/issues/9803))
+- Clarify documentation around SSO mapping providers generating unique IDs and localparts. ([\#9980](https://github.com/matrix-org/synapse/issues/9980))
+- Updates to the PostgreSQL documentation (`postgres.md`). ([\#9988](https://github.com/matrix-org/synapse/issues/9988), [\#9989](https://github.com/matrix-org/synapse/issues/9989))
+- Fix broken link in user directory documentation. Contributed by @junquera. ([\#10016](https://github.com/matrix-org/synapse/issues/10016))
+- Add missing room state entry to the table of contents of room admin API. ([\#10043](https://github.com/matrix-org/synapse/issues/10043))
+
+
+Deprecations and Removals
+-------------------------
+
+- Removed support for the deprecated `tls_fingerprints` configuration setting. Contributed by Jerin J Titus. ([\#9280](https://github.com/matrix-org/synapse/issues/9280))
+
+
+Internal Changes
+----------------
+
+- Allow sending full presence to users via workers other than the one that called `ModuleApi.send_local_online_presence_to`. ([\#9823](https://github.com/matrix-org/synapse/issues/9823))
+- Update comments in the space summary handler. ([\#9974](https://github.com/matrix-org/synapse/issues/9974))
+- Minor enhancements to the `@cachedList` descriptor. ([\#9975](https://github.com/matrix-org/synapse/issues/9975))
+- Split multipart email sending into a dedicated handler. ([\#9977](https://github.com/matrix-org/synapse/issues/9977))
+- Run `black` on files in the `scripts` directory. ([\#9981](https://github.com/matrix-org/synapse/issues/9981))
+- Add missing type hints to `synapse.util` module. ([\#9982](https://github.com/matrix-org/synapse/issues/9982))
+- Simplify a few helper functions. ([\#9984](https://github.com/matrix-org/synapse/issues/9984), [\#9985](https://github.com/matrix-org/synapse/issues/9985), [\#9986](https://github.com/matrix-org/synapse/issues/9986))
+- Remove unnecessary property from SQLBaseStore. ([\#9987](https://github.com/matrix-org/synapse/issues/9987))
+- Remove `keylen` param on `LruCache`. ([\#9993](https://github.com/matrix-org/synapse/issues/9993))
+- Update the Grafana dashboard in `contrib/`. ([\#10001](https://github.com/matrix-org/synapse/issues/10001))
+- Add a batching queue implementation. ([\#10017](https://github.com/matrix-org/synapse/issues/10017))
+- Reduce memory usage when verifying signatures on large numbers of events at once. ([\#10018](https://github.com/matrix-org/synapse/issues/10018))
+- Properly invalidate caches for destination retry timings when they change (instead of expiring entries every 5 minutes). ([\#10036](https://github.com/matrix-org/synapse/issues/10036))
+- Fix running complement tests with Synapse workers. ([\#10039](https://github.com/matrix-org/synapse/issues/10039))
+- Fix typo in `get_state_ids_for_event` docstring where the return type was incorrect. ([\#10050](https://github.com/matrix-org/synapse/issues/10050))
+
+
 Synapse 1.34.0 (2021-05-17)
 ===========================
diff --git a/changelog.d/10001.misc b/changelog.d/10001.misc
deleted file mode 100644
index 8740cc478dda..000000000000
--- a/changelog.d/10001.misc
+++ /dev/null
@@ -1 +0,0 @@
-Update the Grafana dashboard in `contrib/`.
diff --git a/changelog.d/10002.bugfix b/changelog.d/10002.bugfix
deleted file mode 100644
index 1fabdad22eee..000000000000
--- a/changelog.d/10002.bugfix
+++ /dev/null
@@ -1 +0,0 @@
-Fix a validation bug introduced in v1.34.0 in the ordering of spaces in the space summary API.
diff --git a/changelog.d/10007.feature b/changelog.d/10007.feature
deleted file mode 100644
index 2c655350c04f..000000000000
--- a/changelog.d/10007.feature
+++ /dev/null
@@ -1 +0,0 @@
-Experimental support to allow a user who could join a restricted room to view it in the spaces summary.
diff --git a/changelog.d/10011.feature b/changelog.d/10011.feature
deleted file mode 100644
index 409140fb1367..000000000000
--- a/changelog.d/10011.feature
+++ /dev/null
@@ -1 +0,0 @@
-Enable experimental support for [MSC2946](https://github.com/matrix-org/matrix-doc/pull/2946) (spaces summary API) and [MSC3083](https://github.com/matrix-org/matrix-doc/pull/3083) (restricted join rules) by default.
diff --git a/changelog.d/10014.bugfix b/changelog.d/10014.bugfix
deleted file mode 100644
index 7cf3603f94df..000000000000
--- a/changelog.d/10014.bugfix
+++ /dev/null
@@ -1 +0,0 @@
-Fixed deletion of new presence stream states from database.
diff --git a/changelog.d/10016.doc b/changelog.d/10016.doc
deleted file mode 100644
index f9b615d7d7ab..000000000000
--- a/changelog.d/10016.doc
+++ /dev/null
@@ -1 +0,0 @@
-Fix broken link in user directory documentation. Contributed by @junquera.
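The long run of `changelog.d` deletions in this release commit is the newsfragment workflow at work: each PR lands a one-line `<PR number>.<type>` file, and cutting a release collates those fragments into `CHANGES.md` and deletes them, which is why `1.35.0rc1` removes the whole directory's contents in one go. Synapse drives this with towncrier; the sketch below is only a rough approximation of the collation step for illustration, not the real tooling, with the section mapping inferred from the file extensions visible in this patch.

```python
# Illustrative sketch of towncrier-style collation; not Synapse's actual
# release tooling. Section names are taken from the CHANGES.md headings
# above, and the fragment types from the deleted filenames in this patch.
from collections import defaultdict
from pathlib import Path

SECTIONS = {
    "feature": "Features",
    "bugfix": "Bugfixes",
    "docker": "Updates to the Docker image",
    "doc": "Improved Documentation",
    "removal": "Deprecations and Removals",
    "misc": "Internal Changes",
}

def collate_newsfragments(changelog_dir: str = "changelog.d") -> str:
    """Collect per-PR newsfragments into release-note sections."""
    entries = defaultdict(list)
    for fragment in sorted(Path(changelog_dir).glob("*.*")):
        pr_number, fragment_type = fragment.stem, fragment.suffix.lstrip(".")
        if fragment_type in SECTIONS:
            entries[fragment_type].append(
                f"- {fragment.read_text().strip()} (#{pr_number})"
            )

    sections = []
    for fragment_type, heading in SECTIONS.items():
        if entries[fragment_type]:
            body = "\n".join(entries[fragment_type])
            sections.append(f"{heading}\n{'-' * len(heading)}\n\n{body}\n")
    return "\n\n".join(sections)
```

One thing the sketch leaves out: the real tooling merges identical fragment texts, which is why the three `.feature` files for #9922, #10007 and #10038 (all carrying the same sentence) surface above as a single Features entry with three issue links.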
diff --git a/changelog.d/10017.misc b/changelog.d/10017.misc deleted file mode 100644 index 4777b7fb57b1..000000000000 --- a/changelog.d/10017.misc +++ /dev/null @@ -1 +0,0 @@ -Add a batching queue implementation. diff --git a/changelog.d/10018.misc b/changelog.d/10018.misc deleted file mode 100644 index eaf9f64867a6..000000000000 --- a/changelog.d/10018.misc +++ /dev/null @@ -1 +0,0 @@ -Reduce memory usage when verifying signatures on large numbers of events at once. diff --git a/changelog.d/10029.bugfix b/changelog.d/10029.bugfix deleted file mode 100644 index c214cbdaecc1..000000000000 --- a/changelog.d/10029.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fixed a bug with very high resolution image uploads throwing internal server errors. \ No newline at end of file diff --git a/changelog.d/10033.bugfix b/changelog.d/10033.bugfix deleted file mode 100644 index 587d839b8c9d..000000000000 --- a/changelog.d/10033.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fixed deletion of new presence stream states from database. \ No newline at end of file diff --git a/changelog.d/10036.misc b/changelog.d/10036.misc deleted file mode 100644 index d2cf1e54734e..000000000000 --- a/changelog.d/10036.misc +++ /dev/null @@ -1 +0,0 @@ -Properly invalidate caches for destination retry timings every (instead of expiring entries every 5 minutes). diff --git a/changelog.d/10038.feature b/changelog.d/10038.feature deleted file mode 100644 index 2c655350c04f..000000000000 --- a/changelog.d/10038.feature +++ /dev/null @@ -1 +0,0 @@ -Experimental support to allow a user who could join a restricted room to view it in the spaces summary. diff --git a/changelog.d/10039.misc b/changelog.d/10039.misc deleted file mode 100644 index 8855f141d903..000000000000 --- a/changelog.d/10039.misc +++ /dev/null @@ -1 +0,0 @@ -Fix running complement tests with Synapse workers. diff --git a/changelog.d/10043.doc b/changelog.d/10043.doc deleted file mode 100644 index a574ec0bf06c..000000000000 --- a/changelog.d/10043.doc +++ /dev/null @@ -1 +0,0 @@ -Add missing room state entry to the table of contents of room admin API. \ No newline at end of file diff --git a/changelog.d/10045.docker b/changelog.d/10045.docker deleted file mode 100644 index 70b65b0a01a4..000000000000 --- a/changelog.d/10045.docker +++ /dev/null @@ -1 +0,0 @@ -Fix bug introduced in Synapse 1.33.0 which caused a `Permission denied: '/homeserver.log'` error when starting Synapse with the generated log configuration. Contributed by Sergio Miguéns Iglesias. diff --git a/changelog.d/10050.misc b/changelog.d/10050.misc deleted file mode 100644 index 2cac953cca1e..000000000000 --- a/changelog.d/10050.misc +++ /dev/null @@ -1 +0,0 @@ -Fix typo in `get_state_ids_for_event` docstring where the return type was incorrect. diff --git a/changelog.d/9280.removal b/changelog.d/9280.removal deleted file mode 100644 index c2ed3d308dda..000000000000 --- a/changelog.d/9280.removal +++ /dev/null @@ -1 +0,0 @@ -Removed support for the deprecated `tls_fingerprints` configuration setting. Contributed by Jerin J Titus. \ No newline at end of file diff --git a/changelog.d/9803.doc b/changelog.d/9803.doc deleted file mode 100644 index 16c7ba703394..000000000000 --- a/changelog.d/9803.doc +++ /dev/null @@ -1 +0,0 @@ -Add hardened systemd files as proposed in [#9760](https://github.com/matrix-org/synapse/issues/9760) and added them to `contrib/`. Change the docs to reflect the presence of these files. 
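Among the fragments removed just above is #10017, "Add a batching queue implementation". The idea, coalescing work that arrives while a batch is already in flight so it can be processed together, is also what makes the bulk signature verification of #10018 cheap. A minimal asyncio sketch of the pattern follows; Synapse's actual `BatchingQueue` is Twisted-based, keyed, and hands results back to individual callers, none of which is shown here.

```python
# Minimal sketch of a batching queue, assuming asyncio; Synapse's real
# BatchingQueue (#10017) is Twisted-based and more featureful.
import asyncio
from typing import Awaitable, Callable, List

class SimpleBatchingQueue:
    """Coalesce items that arrive during processing into the next batch."""

    def __init__(self, process_batch: Callable[[List[str]], Awaitable[None]]):
        self._process_batch = process_batch
        self._pending: List[str] = []
        self._processing = False

    async def add(self, item: str) -> None:
        self._pending.append(item)
        if self._processing:
            return  # the in-flight loop below will pick this item up
        self._processing = True
        try:
            while self._pending:
                batch, self._pending = self._pending, []
                await self._process_batch(batch)
        finally:
            self._processing = False

async def _demo() -> None:
    async def handle(batch: List[str]) -> None:
        print("processing batch:", batch)
        await asyncio.sleep(0)  # yield so concurrent add() calls queue up

    queue = SimpleBatchingQueue(handle)
    await asyncio.gather(*(queue.add(f"item-{i}") for i in range(5)))

if __name__ == "__main__":
    asyncio.run(_demo())  # item-0 alone, then item-1..4 as one batch
```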
diff --git a/changelog.d/9823.misc b/changelog.d/9823.misc deleted file mode 100644 index bf924ab68cc4..000000000000 --- a/changelog.d/9823.misc +++ /dev/null @@ -1 +0,0 @@ -Allow sending full presence to users via workers other than the one that called `ModuleApi.send_local_online_presence_to`. \ No newline at end of file diff --git a/changelog.d/9922.feature b/changelog.d/9922.feature deleted file mode 100644 index 2c655350c04f..000000000000 --- a/changelog.d/9922.feature +++ /dev/null @@ -1 +0,0 @@ -Experimental support to allow a user who could join a restricted room to view it in the spaces summary. diff --git a/changelog.d/9958.feature b/changelog.d/9958.feature deleted file mode 100644 index d86ba36519f4..000000000000 --- a/changelog.d/9958.feature +++ /dev/null @@ -1 +0,0 @@ -Reduce memory usage when joining very large rooms over federation. diff --git a/changelog.d/9974.misc b/changelog.d/9974.misc deleted file mode 100644 index 9ddee2618e21..000000000000 --- a/changelog.d/9974.misc +++ /dev/null @@ -1 +0,0 @@ -Update comments in the space summary handler. diff --git a/changelog.d/9975.misc b/changelog.d/9975.misc deleted file mode 100644 index 28b1e40c2b99..000000000000 --- a/changelog.d/9975.misc +++ /dev/null @@ -1 +0,0 @@ -Minor enhancements to the `@cachedList` descriptor. diff --git a/changelog.d/9977.misc b/changelog.d/9977.misc deleted file mode 100644 index 093dffc6beac..000000000000 --- a/changelog.d/9977.misc +++ /dev/null @@ -1 +0,0 @@ -Split multipart email sending into a dedicated handler. diff --git a/changelog.d/9978.feature b/changelog.d/9978.feature deleted file mode 100644 index 851adb9f6ea8..000000000000 --- a/changelog.d/9978.feature +++ /dev/null @@ -1 +0,0 @@ -Add a configuration option which allows enabling opentracing by user id. diff --git a/changelog.d/9980.doc b/changelog.d/9980.doc deleted file mode 100644 index d30ed0601da6..000000000000 --- a/changelog.d/9980.doc +++ /dev/null @@ -1 +0,0 @@ -Clarify documentation around SSO mapping providers generating unique IDs and localparts. diff --git a/changelog.d/9981.misc b/changelog.d/9981.misc deleted file mode 100644 index 677c9b4cbdc1..000000000000 --- a/changelog.d/9981.misc +++ /dev/null @@ -1 +0,0 @@ -Run `black` on files in the `scripts` directory. diff --git a/changelog.d/9982.misc b/changelog.d/9982.misc deleted file mode 100644 index f3821f61a32a..000000000000 --- a/changelog.d/9982.misc +++ /dev/null @@ -1 +0,0 @@ -Add missing type hints to `synapse.util` module. diff --git a/changelog.d/9984.misc b/changelog.d/9984.misc deleted file mode 100644 index 97bd747f2653..000000000000 --- a/changelog.d/9984.misc +++ /dev/null @@ -1 +0,0 @@ -Simplify a few helper functions. diff --git a/changelog.d/9985.misc b/changelog.d/9985.misc deleted file mode 100644 index 97bd747f2653..000000000000 --- a/changelog.d/9985.misc +++ /dev/null @@ -1 +0,0 @@ -Simplify a few helper functions. diff --git a/changelog.d/9986.misc b/changelog.d/9986.misc deleted file mode 100644 index 97bd747f2653..000000000000 --- a/changelog.d/9986.misc +++ /dev/null @@ -1 +0,0 @@ -Simplify a few helper functions. diff --git a/changelog.d/9987.misc b/changelog.d/9987.misc deleted file mode 100644 index 02c088e3e6c2..000000000000 --- a/changelog.d/9987.misc +++ /dev/null @@ -1 +0,0 @@ -Remove unnecessary property from SQLBaseStore. 
diff --git a/changelog.d/9988.doc b/changelog.d/9988.doc deleted file mode 100644 index 25338c44c312..000000000000 --- a/changelog.d/9988.doc +++ /dev/null @@ -1 +0,0 @@ -Updates to the PostgreSQL documentation (`postgres.md`). diff --git a/changelog.d/9989.doc b/changelog.d/9989.doc deleted file mode 100644 index 25338c44c312..000000000000 --- a/changelog.d/9989.doc +++ /dev/null @@ -1 +0,0 @@ -Updates to the PostgreSQL documentation (`postgres.md`). diff --git a/changelog.d/9991.bugfix b/changelog.d/9991.bugfix deleted file mode 100644 index 665ff04dea84..000000000000 --- a/changelog.d/9991.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fix a bug introduced in v1.26.0 which meant that `synapse_port_db` would not correctly initialise some postgres sequences, requiring manual updates afterwards. diff --git a/changelog.d/9993.misc b/changelog.d/9993.misc deleted file mode 100644 index 0dd92440717d..000000000000 --- a/changelog.d/9993.misc +++ /dev/null @@ -1 +0,0 @@ -Remove `keylen` param on `LruCache`. diff --git a/changelog.d/9995.bugfix b/changelog.d/9995.bugfix deleted file mode 100644 index 3b63e7c42aed..000000000000 --- a/changelog.d/9995.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fix `synctl`'s `--no-daemonize` parameter to work correctly with worker processes. diff --git a/synapse/__init__.py b/synapse/__init__.py index 7498a6016f0b..e60e9db71ef8 100644 --- a/synapse/__init__.py +++ b/synapse/__init__.py @@ -47,7 +47,7 @@ except ImportError: pass -__version__ = "1.34.0" +__version__ = "1.35.0rc1" if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)): # We import here so that we don't have to install a bunch of deps when From 8e15c92c2f9e581e59ff68495fa99e998849bb4d Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Thu, 27 May 2021 08:52:28 -0400 Subject: [PATCH 39/52] Pass the origin when calculating the spaces summary over GET. (#10079) Fixes a bug due to conflicting PRs which were merged. (One added a new caller to a method, the other added a new parameter to the same method.) --- changelog.d/10079.bugfix | 1 + synapse/federation/transport/server.py | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10079.bugfix diff --git a/changelog.d/10079.bugfix b/changelog.d/10079.bugfix new file mode 100644 index 000000000000..2b93c4534abc --- /dev/null +++ b/changelog.d/10079.bugfix @@ -0,0 +1 @@ +Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. diff --git a/synapse/federation/transport/server.py b/synapse/federation/transport/server.py index 9d50b05d0152..40eab4554926 100644 --- a/synapse/federation/transport/server.py +++ b/synapse/federation/transport/server.py @@ -1398,7 +1398,7 @@ async def on_GET( ) return 200, await self.handler.federation_space_summary( - room_id, suggested_only, max_rooms_per_space, exclude_rooms + origin, room_id, suggested_only, max_rooms_per_space, exclude_rooms ) # TODO When switching to the stable endpoint, remove the POST handler. 
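The failure mode this commit describes, two PRs that each pass CI in isolation but break once both are merged, deserves spelling out, because Python's positional arguments with defaults can make it silent. The following is a hypothetical reconstruction for illustration; the names echo the diff above but the bodies and values are invented.

```python
# Branch A inserts `origin` as the new first parameter:
def space_summary(origin, room_id, suggested_only=False):
    return origin, room_id, suggested_only

# Branch B was written against the old, origin-less signature and calls
# positionally. After the merge, the default on suggested_only lets the
# two-argument call still "fit", so room_id silently lands in `origin`:
result = space_summary("!room:example.org", True)
assert result == ("!room:example.org", True, False)  # no error, wrong data

# Keyword-only parameters turn the same skew into an immediate TypeError:
def space_summary_kw(*, origin, room_id, suggested_only=False):
    return origin, room_id, suggested_only

try:
    space_summary_kw("!room:example.org", True)  # type: ignore[call-arg]
except TypeError as exc:
    print(f"caught: {exc}")
```

In the real fix the caller simply gains the missing `origin` argument; keyword-only parameters, or a type-checker run on the merged tree rather than on each branch, are the usual guards against this class of merge skew.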
From b1bc26a909f4f9af137e72ab27952a8d2dfc1cb3 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Thu, 27 May 2021 14:46:24 +0100 Subject: [PATCH 40/52] 1.35.0rc2 --- CHANGES.md | 9 +++++++++ changelog.d/10079.bugfix | 1 - synapse/__init__.py | 2 +- 3 files changed, 10 insertions(+), 2 deletions(-) delete mode 100644 changelog.d/10079.bugfix diff --git a/CHANGES.md b/CHANGES.md index 0e451f983c95..1fac16580d13 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,12 @@ +Synapse 1.35.0rc2 (2021-05-27) +============================== + +Bugfixes +-------- + +- Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. ([\#10079](https://github.com/matrix-org/synapse/issues/10079)) + + Synapse 1.35.0rc1 (2021-05-25) ============================== diff --git a/changelog.d/10079.bugfix b/changelog.d/10079.bugfix deleted file mode 100644 index 2b93c4534abc..000000000000 --- a/changelog.d/10079.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fix a bug introduced in v1.35.0rc1 when calling the spaces summary API via a GET request. diff --git a/synapse/__init__.py b/synapse/__init__.py index e60e9db71ef8..6faf31dbbce5 100644 --- a/synapse/__init__.py +++ b/synapse/__init__.py @@ -47,7 +47,7 @@ except ImportError: pass -__version__ = "1.35.0rc1" +__version__ = "1.35.0rc2" if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)): # We import here so that we don't have to install a bunch of deps when From 84cf3e47a0318aba51d9f830d5e724182c5d93c4 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Fri, 28 May 2021 16:28:01 +0100 Subject: [PATCH 41/52] Allow response of `/send_join` to be larger. (#10093) Fixes #10087. --- changelog.d/10093.bugfix | 1 + synapse/federation/transport/client.py | 7 +++++++ synapse/http/matrixfederationclient.py | 14 +++++++++++++- 3 files changed, 21 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10093.bugfix diff --git a/changelog.d/10093.bugfix b/changelog.d/10093.bugfix new file mode 100644 index 000000000000..e50de4b2ea2c --- /dev/null +++ b/changelog.d/10093.bugfix @@ -0,0 +1 @@ +Fix HTTP response size limit to allow joining very large rooms over federation. diff --git a/synapse/federation/transport/client.py b/synapse/federation/transport/client.py index e93ab83f7ff4..5b4f5d17f774 100644 --- a/synapse/federation/transport/client.py +++ b/synapse/federation/transport/client.py @@ -35,6 +35,11 @@ logger = logging.getLogger(__name__) +# Send join responses can be huge, so we set a separate limit here. The response +# is parsed in a streaming manner, which helps alleviate the issue of memory +# usage a bit. 
+MAX_RESPONSE_SIZE_SEND_JOIN = 500 * 1024 * 1024 + class TransportLayerClient: """Sends federation HTTP requests to other servers""" @@ -261,6 +266,7 @@ async def send_join_v1( path=path, data=content, parser=SendJoinParser(room_version, v1_api=True), + max_response_size=MAX_RESPONSE_SIZE_SEND_JOIN, ) return response @@ -276,6 +282,7 @@ async def send_join_v2( path=path, data=content, parser=SendJoinParser(room_version, v1_api=False), + max_response_size=MAX_RESPONSE_SIZE_SEND_JOIN, ) return response diff --git a/synapse/http/matrixfederationclient.py b/synapse/http/matrixfederationclient.py index f5503b394b37..1998990a144e 100644 --- a/synapse/http/matrixfederationclient.py +++ b/synapse/http/matrixfederationclient.py @@ -205,6 +205,7 @@ async def _handle_response( response: IResponse, start_ms: int, parser: ByteParser[T], + max_response_size: Optional[int] = None, ) -> T: """ Reads the body of a response with a timeout and sends it to a parser @@ -216,15 +217,20 @@ async def _handle_response( response: response to the request start_ms: Timestamp when request was made parser: The parser for the response + max_response_size: The maximum size to read from the response, if None + uses the default. Returns: The parsed response """ + if max_response_size is None: + max_response_size = MAX_RESPONSE_SIZE + try: check_content_type_is(response.headers, parser.CONTENT_TYPE) - d = read_body_with_max_size(response, parser, MAX_RESPONSE_SIZE) + d = read_body_with_max_size(response, parser, max_response_size) d = timeout_deferred(d, timeout=timeout_sec, reactor=reactor) length = await make_deferred_yieldable(d) @@ -735,6 +741,7 @@ async def put_json( backoff_on_404: bool = False, try_trailing_slash_on_400: bool = False, parser: Literal[None] = None, + max_response_size: Optional[int] = None, ) -> Union[JsonDict, list]: ... @@ -752,6 +759,7 @@ async def put_json( backoff_on_404: bool = False, try_trailing_slash_on_400: bool = False, parser: Optional[ByteParser[T]] = None, + max_response_size: Optional[int] = None, ) -> T: ... @@ -768,6 +776,7 @@ async def put_json( backoff_on_404: bool = False, try_trailing_slash_on_400: bool = False, parser: Optional[ByteParser] = None, + max_response_size: Optional[int] = None, ): """Sends the specified json data using PUT @@ -803,6 +812,8 @@ async def put_json( enabled. parser: The parser to use to decode the response. Defaults to parsing as JSON. + max_response_size: The maximum size to read from the response, if None + uses the default. Returns: Succeeds when we get a 2xx HTTP response. The @@ -853,6 +864,7 @@ async def put_json( response, start_ms, parser=parser, + max_response_size=max_response_size, ) return body From 1641c5c707fe9cac5f68589863082409c8979da6 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Fri, 28 May 2021 15:57:53 +0100 Subject: [PATCH 42/52] Log method and path when dropping request due to size limit (#10091) --- changelog.d/10091.misc | 1 + synapse/http/site.py | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) create mode 100644 changelog.d/10091.misc diff --git a/changelog.d/10091.misc b/changelog.d/10091.misc new file mode 100644 index 000000000000..dbe310fd17a1 --- /dev/null +++ b/changelog.d/10091.misc @@ -0,0 +1 @@ +Log method and path when dropping request due to size limit. 
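The 500 MiB carve-out for `/send_join` in the previous commit, and the request-size abort in the `site.py` hunk just below, are two instances of the same bounded-read pattern: count bytes as they stream in and bail out the moment a cap is crossed, rather than buffering the full body and checking afterwards. Here is a stripped-down sketch over a plain iterator of chunks; the real code paths are `read_body_with_max_size` on the response side and `handleContentChunk` on the request side.

```python
# Stripped-down sketch of bounded reading, assuming byte chunks arrive
# incrementally; Synapse's real implementations are read_body_with_max_size
# (responses) and SynapseRequest.handleContentChunk (requests).
from typing import Iterable

class BodyExceededMaxSize(Exception):
    """Raised as soon as the accumulated body crosses the cap."""

def read_with_max_size(chunks: Iterable[bytes], max_size: int) -> bytes:
    received = 0
    parts = []
    for chunk in chunks:
        received += len(chunk)
        if received > max_size:
            # Abort mid-stream instead of buffering the remainder; this is
            # what keeps a hostile or merely huge peer from exhausting
            # memory even with a generous cap.
            raise BodyExceededMaxSize()
        parts.append(chunk)
    return b"".join(parts)

print(len(read_with_max_size([b"x" * 1024] * 8, max_size=10_000)))  # 8192
```

Raising the cap only for `/send_join`, rather than globally, keeps the default limit protective for every other federation endpoint while still letting very large rooms be joined.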
diff --git a/synapse/http/site.py b/synapse/http/site.py index 671fd3fbcc29..40754b7bea7e 100644 --- a/synapse/http/site.py +++ b/synapse/http/site.py @@ -105,8 +105,10 @@ def handleContentChunk(self, data): assert self.content, "handleContentChunk() called before gotLength()" if self.content.tell() + len(data) > self._max_request_body_size: logger.warning( - "Aborting connection from %s because the request exceeds maximum size", + "Aborting connection from %s because the request exceeds maximum size: %s %s", self.client, + self.get_method(), + self.get_redacted_uri(), ) self.transport.abortConnection() return From 9408b86f5c3616e8cfaa2c183e787780a3a64f95 Mon Sep 17 00:00:00 2001 From: Brendan Abolivier Date: Thu, 27 May 2021 18:10:58 +0200 Subject: [PATCH 43/52] Limit the number of events sent over replication when persisting events. (#10082) --- changelog.d/10082.bugfix | 1 + synapse/handlers/federation.py | 17 ++++++++++------- 2 files changed, 11 insertions(+), 7 deletions(-) create mode 100644 changelog.d/10082.bugfix diff --git a/changelog.d/10082.bugfix b/changelog.d/10082.bugfix new file mode 100644 index 000000000000..b4f8bcc4fa3c --- /dev/null +++ b/changelog.d/10082.bugfix @@ -0,0 +1 @@ +Fixed a bug causing replication requests to fail when receiving a lot of events via federation. diff --git a/synapse/handlers/federation.py b/synapse/handlers/federation.py index 678f6b770797..bf113152517f 100644 --- a/synapse/handlers/federation.py +++ b/synapse/handlers/federation.py @@ -91,6 +91,7 @@ get_domain_from_id, ) from synapse.util.async_helpers import Linearizer, concurrently_execute +from synapse.util.iterutils import batch_iter from synapse.util.retryutils import NotRetryingDestination from synapse.util.stringutils import shortstr from synapse.visibility import filter_events_for_server @@ -3053,13 +3054,15 @@ async def persist_events_and_notify( """ instance = self.config.worker.events_shard_config.get_instance(room_id) if instance != self._instance_name: - result = await self._send_events( - instance_name=instance, - store=self.store, - room_id=room_id, - event_and_contexts=event_and_contexts, - backfilled=backfilled, - ) + # Limit the number of events sent over federation. + for batch in batch_iter(event_and_contexts, 1000): + result = await self._send_events( + instance_name=instance, + store=self.store, + room_id=room_id, + event_and_contexts=batch, + backfilled=backfilled, + ) return result["max_stream_id"] else: assert self.storage.persistence From 258a9a9e8bea851493fc2275ac6b81639c997afb Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Fri, 28 May 2021 17:06:05 +0100 Subject: [PATCH 44/52] 1.35.0rc3 --- CHANGES.md | 16 ++++++++++++++++ changelog.d/10082.bugfix | 1 - changelog.d/10091.misc | 1 - changelog.d/10093.bugfix | 1 - synapse/__init__.py | 2 +- 5 files changed, 17 insertions(+), 4 deletions(-) delete mode 100644 changelog.d/10082.bugfix delete mode 100644 changelog.d/10091.misc delete mode 100644 changelog.d/10093.bugfix diff --git a/CHANGES.md b/CHANGES.md index 1fac16580d13..8bd05c318dcf 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,19 @@ +Synapse 1.35.0rc3 (2021-05-28) +============================== + +Bugfixes +-------- + +- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. ([\#10082](https://github.com/matrix-org/synapse/issues/10082)) +- Fix HTTP response size limit to allow joining very large rooms over federation. 
([\#10093](https://github.com/matrix-org/synapse/issues/10093)) + + +Internal Changes +---------------- + +- Log method and path when dropping request due to size limit. ([\#10091](https://github.com/matrix-org/synapse/issues/10091)) + + Synapse 1.35.0rc2 (2021-05-27) ============================== diff --git a/changelog.d/10082.bugfix b/changelog.d/10082.bugfix deleted file mode 100644 index b4f8bcc4fa3c..000000000000 --- a/changelog.d/10082.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fixed a bug causing replication requests to fail when receiving a lot of events via federation. diff --git a/changelog.d/10091.misc b/changelog.d/10091.misc deleted file mode 100644 index dbe310fd17a1..000000000000 --- a/changelog.d/10091.misc +++ /dev/null @@ -1 +0,0 @@ -Log method and path when dropping request due to size limit. diff --git a/changelog.d/10093.bugfix b/changelog.d/10093.bugfix deleted file mode 100644 index e50de4b2ea2c..000000000000 --- a/changelog.d/10093.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fix HTTP response size limit to allow joining very large rooms over federation. diff --git a/synapse/__init__.py b/synapse/__init__.py index 6faf31dbbce5..4591246bd187 100644 --- a/synapse/__init__.py +++ b/synapse/__init__.py @@ -47,7 +47,7 @@ except ImportError: pass -__version__ = "1.35.0rc2" +__version__ = "1.35.0rc3" if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)): # We import here so that we don't have to install a bunch of deps when From 4f41b711d8b37da3403ce67c88d62133f732a459 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Fri, 28 May 2021 17:13:57 +0100 Subject: [PATCH 45/52] CHANGELOG --- CHANGES.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/CHANGES.md b/CHANGES.md index 8bd05c318dcf..7e6f478d425e 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -4,8 +4,8 @@ Synapse 1.35.0rc3 (2021-05-28) Bugfixes -------- -- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. ([\#10082](https://github.com/matrix-org/synapse/issues/10082)) -- Fix HTTP response size limit to allow joining very large rooms over federation. ([\#10093](https://github.com/matrix-org/synapse/issues/10093)) +- Fixed a bug causing replication requests to fail when receiving a lot of events via federation. Introduced in v1.33.0. ([\#10082](https://github.com/matrix-org/synapse/issues/10082)) +- Fix HTTP response size limit to allow joining very large rooms over federation. Introduced in v1.33.0. ([\#10093](https://github.com/matrix-org/synapse/issues/10093)) Internal Changes From 408ecf8ece397bcf08564031379b461d5c9b0de5 Mon Sep 17 00:00:00 2001 From: Erik Johnston Date: Tue, 1 Jun 2021 13:19:50 +0100 Subject: [PATCH 46/52] Announce deprecation of experimental `msc2858_enabled` option. (#10101) c.f. https://github.com/matrix-org/synapse/pull/9617 and https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md Fixes #9627. --- changelog.d/10101.removal | 1 + 1 file changed, 1 insertion(+) create mode 100644 changelog.d/10101.removal diff --git a/changelog.d/10101.removal b/changelog.d/10101.removal new file mode 100644 index 000000000000..f2020e9ddf38 --- /dev/null +++ b/changelog.d/10101.removal @@ -0,0 +1 @@ +The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. 
Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. From a8372ad591e07fa76e194a22732a5301d9e55b6f Mon Sep 17 00:00:00 2001 From: Andrew Morgan Date: Tue, 1 Jun 2021 13:23:55 +0100 Subject: [PATCH 47/52] 1.35.0 --- CHANGES.md | 9 +++++++++ changelog.d/10101.removal | 1 - debian/changelog | 6 ++++++ synapse/__init__.py | 2 +- 4 files changed, 16 insertions(+), 2 deletions(-) delete mode 100644 changelog.d/10101.removal diff --git a/CHANGES.md b/CHANGES.md index 7e6f478d425e..09f0be8e176e 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,12 @@ +Synapse 1.35.0 (2021-06-01) +=========================== + +Deprecations and Removals +------------------------- + +- The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. ([\#10101](https://github.com/matrix-org/synapse/issues/10101)) + + Synapse 1.35.0rc3 (2021-05-28) ============================== diff --git a/changelog.d/10101.removal b/changelog.d/10101.removal deleted file mode 100644 index f2020e9ddf38..000000000000 --- a/changelog.d/10101.removal +++ /dev/null @@ -1 +0,0 @@ -The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. diff --git a/debian/changelog b/debian/changelog index bf99ae772c9e..d5efb8ccba63 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +matrix-synapse-py3 (1.35.0) stable; urgency=medium + + * New synapse release 1.35.0. + + -- Synapse Packaging team Tue, 01 Jun 2021 13:23:35 +0100 + matrix-synapse-py3 (1.34.0) stable; urgency=medium * New synapse release 1.34.0. diff --git a/synapse/__init__.py b/synapse/__init__.py index 4591246bd187..d9843a170860 100644 --- a/synapse/__init__.py +++ b/synapse/__init__.py @@ -47,7 +47,7 @@ except ImportError: pass -__version__ = "1.35.0rc3" +__version__ = "1.35.0" if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)): # We import here so that we don't have to install a bunch of deps when From 08e54345b1332889cd9e88a778bc13caca7b556f Mon Sep 17 00:00:00 2001 From: Andrew Morgan Date: Tue, 1 Jun 2021 13:25:18 +0100 Subject: [PATCH 48/52] Indicate that there were no functional changes since v1.35.0rc3 --- CHANGES.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/CHANGES.md b/CHANGES.md index 09f0be8e176e..c969fe1ebddc 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,6 +1,8 @@ Synapse 1.35.0 (2021-06-01) =========================== +No changes since v1.35.0rc3. 
+ Deprecations and Removals ------------------------- From 3fdaf4df55f52ccf283cf6b0ca73a3f98cd5e8f0 Mon Sep 17 00:00:00 2001 From: Andrew Morgan Date: Tue, 1 Jun 2021 13:40:46 +0100 Subject: [PATCH 49/52] Merge v1.35.0rc3 into v1.35.0 due to incorrect tagging --- CHANGES.md | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-) diff --git a/CHANGES.md b/CHANGES.md index c969fe1ebddc..f03a53affc1a 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,17 +1,13 @@ Synapse 1.35.0 (2021-06-01) =========================== -No changes since v1.35.0rc3. +Note that [the tag](https://github.com/matrix-org/synapse/releases/tag/v1.35.0rc3) and [docker images](https://hub.docker.com/layers/matrixdotorg/synapse/v1.35.0rc3/images/sha256-34ccc87bd99a17e2cbc0902e678b5937d16bdc1991ead097eee6096481ecf2c4?context=explore) for `v1.35.0rc3` were incorrectly built. If you are experiencing issues with either, it is recommended to upgrade to the equivalent tag or docker image for the `v1.35.0` release. Deprecations and Removals ------------------------- - The core Synapse development team plan to drop support for the [unstable API of MSC2858](https://github.com/matrix-org/matrix-doc/blob/master/proposals/2858-Multiple-SSO-Identity-Providers.md#unstable-prefix), including the undocumented `experimental.msc2858_enabled` config option, in August 2021. Client authors should ensure that their clients are updated to use the stable API (which has been supported since Synapse 1.30) well before that time, to give their users time to upgrade. ([\#10101](https://github.com/matrix-org/synapse/issues/10101)) - -Synapse 1.35.0rc3 (2021-05-28) -============================== - Bugfixes -------- From 36a7ff0c867e6df969517c58d3eb2520b2ab39d9 Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Wed, 2 Jun 2021 11:31:41 -0400 Subject: [PATCH 50/52] Do not show invite-only rooms in spaces summary (unless joined/invited). (#10109) --- changelog.d/10109.bugfix | 1 + synapse/handlers/space_summary.py | 19 +++++++++---------- 2 files changed, 10 insertions(+), 10 deletions(-) create mode 100644 changelog.d/10109.bugfix diff --git a/changelog.d/10109.bugfix b/changelog.d/10109.bugfix new file mode 100644 index 000000000000..bc41bf9e5e3c --- /dev/null +++ b/changelog.d/10109.bugfix @@ -0,0 +1 @@ +Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to users in a space who were not invited. diff --git a/synapse/handlers/space_summary.py b/synapse/handlers/space_summary.py index abd9ddecca20..046dba6fd87a 100644 --- a/synapse/handlers/space_summary.py +++ b/synapse/handlers/space_summary.py @@ -26,7 +26,6 @@ HistoryVisibility, Membership, ) -from synapse.api.errors import AuthError from synapse.events import EventBase from synapse.events.utils import format_event_for_client_v2 from synapse.types import JsonDict @@ -456,16 +455,16 @@ async def _is_room_accessible( return True # Otherwise, check if they should be allowed access via membership in a space. - try: - await self._event_auth_handler.check_restricted_join_rules( - state_ids, room_version, requester, member_event + if self._event_auth_handler.has_restricted_join_rules( + state_ids, room_version + ): + allowed_spaces = ( + await self._event_auth_handler.get_spaces_that_allow_join(state_ids) ) - except AuthError: - # The user doesn't have access due to spaces, but might have access - # another way. Keep trying. 
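The subtlety behind this fix, as far as the diff shows: the old code granted access whenever `check_restricted_join_rules` did not raise, and an invite-only room (which has no restricted join rule for that check to enforce) gave it nothing to object to, so space membership was never genuinely required. The new code grants access only after confirming the room actually has restricted join rules and the requester is in one of the allowed spaces. Below is a simplified, synchronous model of the corrected decision; the real helpers named in the diff are async and read room state.

```python
# Simplified model of the corrected spaces-summary access check. The real
# helpers (has_restricted_join_rules, get_spaces_that_allow_join,
# is_user_in_rooms) are async methods on the event auth handler.
from typing import Set

def accessible_via_spaces(
    has_restricted_join_rule: bool,
    allowed_spaces: Set[str],
    requester_spaces: Set[str],
) -> bool:
    if not has_restricted_join_rule:
        # An invite-only room takes this branch: this path grants nothing,
        # and the caller falls through to its remaining checks instead of
        # unconditionally returning True as before the fix.
        return False
    return bool(allowed_spaces & requester_spaces)

# A user in "!space:a" can see a restricted room that allows that space:
assert accessible_via_spaces(True, {"!space:a"}, {"!space:a", "!space:b"})
# ...but an invite-only room no longer leaks through this path:
assert not accessible_via_spaces(False, set(), {"!space:a"})
```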
- pass - else: - return True + if await self._event_auth_handler.is_user_in_rooms( + allowed_spaces, requester + ): + return True # If this is a request over federation, check if the host is in the room or # is in one of the spaces specified via the join rules. From 57c01dca297b7e14eb7be2b40d80f7577002754f Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Thu, 3 Jun 2021 08:18:22 -0400 Subject: [PATCH 51/52] 1.35.1 --- CHANGES.md | 9 +++++++++ changelog.d/10109.bugfix | 1 - debian/changelog | 6 ++++++ synapse/__init__.py | 2 +- 4 files changed, 16 insertions(+), 2 deletions(-) delete mode 100644 changelog.d/10109.bugfix diff --git a/CHANGES.md b/CHANGES.md index f03a53affc1a..5794f2bffd30 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -1,3 +1,12 @@ +Synapse 1.35.1 (2021-06-03) +=========================== + +Bugfixes +-------- + +- Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to users in a space who were not invited. ([\#10109](https://github.com/matrix-org/synapse/issues/10109)) + + Synapse 1.35.0 (2021-06-01) =========================== diff --git a/changelog.d/10109.bugfix b/changelog.d/10109.bugfix deleted file mode 100644 index bc41bf9e5e3c..000000000000 --- a/changelog.d/10109.bugfix +++ /dev/null @@ -1 +0,0 @@ -Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to users in a space who were not invited. diff --git a/debian/changelog b/debian/changelog index d5efb8ccba63..084e878def4f 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +matrix-synapse-py3 (1.35.1) stable; urgency=medium + + * New synapse release 1.35.1. + + -- Synapse Packaging team Thu, 03 Jun 2021 08:11:29 -0400 + matrix-synapse-py3 (1.35.0) stable; urgency=medium * New synapse release 1.35.0. diff --git a/synapse/__init__.py b/synapse/__init__.py index d9843a170860..445e8a5cad37 100644 --- a/synapse/__init__.py +++ b/synapse/__init__.py @@ -47,7 +47,7 @@ except ImportError: pass -__version__ = "1.35.0" +__version__ = "1.35.1" if bool(os.environ.get("SYNAPSE_TEST_PATCH_LOG_CONTEXTS", False)): # We import here so that we don't have to install a bunch of deps when From 56667733419ebf070f1a7f7c9a04070f1b944572 Mon Sep 17 00:00:00 2001 From: Patrick Cloke Date: Thu, 3 Jun 2021 08:19:38 -0400 Subject: [PATCH 52/52] Clarify changelog. --- CHANGES.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGES.md b/CHANGES.md index 5794f2bffd30..04d260f8e5f7 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -4,7 +4,7 @@ Synapse 1.35.1 (2021-06-03) Bugfixes -------- -- Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to users in a space who were not invited. ([\#10109](https://github.com/matrix-org/synapse/issues/10109)) +- Fix a bug introduced in v1.35.0 where invite-only rooms would be shown to all users in a space, regardless of if the user had access to it. ([\#10109](https://github.com/matrix-org/synapse/issues/10109)) Synapse 1.35.0 (2021-06-01)