Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Allow configuration of Synapse's cache without using synctl or environment variables #6391

Merged
merged 66 commits into from
May 11, 2020
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
Show all changes
66 commits
Select commit Hold shift + click to select a range
99ab65a
fix up logging to use rapidjson
hawkowl Oct 24, 2019
d684ec8
benchmarks
hawkowl Oct 24, 2019
135fdaa
update linting
hawkowl Oct 31, 2019
f36f3ab
Merge remote-tracking branch 'origin/develop' into hawkowl/structured…
hawkowl Oct 31, 2019
73cfdeb
fix
hawkowl Oct 31, 2019
babf4ea
Merge branch 'develop' into hawkowl/structured-logging-perf
hawkowl Nov 1, 2019
e55591d
Merge remote-tracking branch 'origin/develop' into hawkowl/structured…
hawkowl Nov 4, 2019
c580eb3
revert this
hawkowl Nov 4, 2019
f888515
Merge remote-tracking branch 'origin/develop' into hawkowl/structured…
hawkowl Nov 4, 2019
a14831d
Move cache configuration into a homeserver config option, instead of …
hawkowl Nov 20, 2019
50fcb4a
Re-sync every so often, in case caches appear
hawkowl Nov 21, 2019
8439ce5
Merge remote-tracking branch 'origin/develop' into hawkowl/structured…
hawkowl Nov 25, 2019
92d4d13
fixes
hawkowl Dec 2, 2019
d70be18
Merge remote-tracking branch 'origin/develop' into hawkowl/structured…
hawkowl Dec 2, 2019
c11c8ad
more fixes
hawkowl Dec 2, 2019
0e368ee
more fixes
hawkowl Dec 2, 2019
f7ec526
add some LruCache benchmarks
hawkowl Dec 2, 2019
e174b2d
fix style
hawkowl Dec 2, 2019
946650c
Merge remote-tracking branch 'origin/develop' into hawkowl/benchmark-…
hawkowl Dec 3, 2019
9735a08
newsfile
hawkowl Dec 3, 2019
c76a412
Merge remote-tracking branch 'origin/develop' into hawkowl/cache-conf…
hawkowl Jan 16, 2020
0a02b2a
cleanup
hawkowl Jan 16, 2020
a21702f
cleanup so it can refer to late-config
hawkowl Jan 16, 2020
125c5a0
Merge branch 'hawkowl/benchmark-lrucache' into hawkowl/cache-config-w…
hawkowl Jan 16, 2020
0b069b7
make expiring caches resizeable
hawkowl Jan 16, 2020
a96b5d9
Load cache factors from the environment, and add a test.
hawkowl Jan 21, 2020
2f4dbfa
document as well as refactor so that CacheMetric is not nested
hawkowl Jan 21, 2020
ac020de
fix style
hawkowl Jan 21, 2020
18c1dbf
don't need this comment
hawkowl Jan 21, 2020
4aeb6fb
Merge remote-tracking branch 'origin/develop' into hawkowl/cache-conf…
hawkowl Feb 17, 2020
5f508e7
add tests for individual cache sizing, and fix up the individual cach…
hawkowl Feb 17, 2020
2619891
fix
hawkowl Feb 17, 2020
0fc0a7d
fix
hawkowl Feb 27, 2020
965e259
fixes
hawkowl Feb 27, 2020
f004cee
Merge remote-tracking branch 'origin/develop' into hawkowl/cache-conf…
hawkowl Feb 27, 2020
546c23e
Merge branch 'develop' of github.com:matrix-org/synapse into hawkowl/…
anoadragon453 Apr 3, 2020
7117b89
Update synapse/config/cache.py
anoadragon453 Apr 3, 2020
4ed7aa1
Move separate CacheConfig global vars into a single global dict
anoadragon453 Apr 3, 2020
be0737a
Move _DEFAULT_CONFIG to class method
anoadragon453 Apr 3, 2020
14fbf27
Merge branch 'hawkowl/cache-config-without-synctl' of github.com:matr…
anoadragon453 Apr 3, 2020
cc1dd98
lint
anoadragon453 Apr 3, 2020
de25659
Fix typing
anoadragon453 Apr 3, 2020
6e979a1
sample config
anoadragon453 Apr 3, 2020
fec13be
Fix lint
anoadragon453 Apr 6, 2020
905c833
Add some mypy ignores
anoadragon453 Apr 6, 2020
f300c08
Merge branch 'develop' of github.com:matrix-org/synapse into hawkowl/…
anoadragon453 Apr 24, 2020
28f7f59
Re-add default_size_factor multipler. Revert _max_size modification
anoadragon453 Apr 24, 2020
5f3eddc
Refactor globals in CacheConfig
anoadragon453 Apr 24, 2020
fcc9c3a
dict() -> {}
anoadragon453 Apr 24, 2020
5acca12
Some variable cleanups in cache.py
anoadragon453 Apr 27, 2020
fdef3ad
Make add_resizable_cache args clearer in cache tests
anoadragon453 Apr 27, 2020
ca98241
lint
anoadragon453 Apr 27, 2020
48709a9
Pull global cache prefix from environment by default
anoadragon453 Apr 30, 2020
9e3a9ba
Make config follow the guidelines
anoadragon453 May 1, 2020
fa56e16
sample config
anoadragon453 May 1, 2020
83cf583
Allow creating an LruCache that's not affected by cache factor
anoadragon453 May 1, 2020
424215a
Fix cache: config block parsing
anoadragon453 May 1, 2020
c7f3bf6
Environment takes precedence over config values
anoadragon453 May 1, 2020
485f177
fix tests
anoadragon453 May 1, 2020
fe89050
Fix and clarify config comments
anoadragon453 May 1, 2020
ae6070b
Move event_cache_size to CacheConfig
anoadragon453 May 1, 2020
4448633
Follow config file conventions. Move cache config location
anoadragon453 May 4, 2020
f8ed432
Attempt to explain the default global cache factor
anoadragon453 May 4, 2020
c1b339b
Merge branch 'develop' of github.com:matrix-org/synapse into hawkowl/…
anoadragon453 May 4, 2020
06fbfc2
Merge branch 'develop' of github.com:matrix-org/synapse into hawkowl/…
anoadragon453 May 5, 2020
edb04df
Merge branch 'develop' of github.com:matrix-org/synapse into hawkowl/…
anoadragon453 May 11, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/6391.feature
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Synapse's cache factor can now be configured in `homeserver.yaml` by the `caches.global_factor` setting. Additionally, `caches.per_cache_factors` controls the cache factors for individual caches.
4 changes: 2 additions & 2 deletions synapse/api/auth.py
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@
)
from synapse.config.server import is_threepid_reserved
from synapse.types import UserID
from synapse.util.caches import CACHE_SIZE_FACTOR, register_cache
from synapse.util.caches import register_cache
from synapse.util.caches.lrucache import LruCache
from synapse.util.metrics import Measure

Expand Down Expand Up @@ -77,7 +77,7 @@ def __init__(self, hs):
self.store = hs.get_datastore()
self.state = hs.get_state_handler()

self.token_cache = LruCache(CACHE_SIZE_FACTOR * 10000)
self.token_cache = LruCache(10000)
register_cache("cache", "token_cache", self.token_cache)

self._account_validity = hs.config.account_validity
Expand Down
3 changes: 1 addition & 2 deletions synapse/app/homeserver.py
Original file line number Diff line number Diff line change
Expand Up @@ -71,7 +71,6 @@
from synapse.storage import DataStore, are_all_users_on_domain
from synapse.storage.engines import IncorrectDatabaseSetup, create_engine
from synapse.storage.prepare_database import UpgradeDatabaseException, prepare_database
from synapse.util.caches import CACHE_SIZE_FACTOR
from synapse.util.httpresourcetree import create_resource_tree
from synapse.util.manhole import manhole
from synapse.util.module_loader import load_module
Expand Down Expand Up @@ -516,7 +515,7 @@ def phone_stats_home(hs, stats, stats_process=_stats_process):

daily_sent_messages = yield hs.get_datastore().count_daily_sent_messages()
stats["daily_sent_messages"] = daily_sent_messages
stats["cache_factor"] = CACHE_SIZE_FACTOR
stats["cache_factor"] = hs.config.caches.global_factor
stats["event_cache_size"] = hs.config.event_cache_size

#
Expand Down
100 changes: 100 additions & 0 deletions synapse/config/cache.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,100 @@
# -*- coding: utf-8 -*-
# Copyright 2019 Matrix.org Foundation C.I.C.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
from collections import defaultdict
from typing import DefaultDict

from twisted.logger import Logger

from ._base import Config, ConfigError

log = Logger()
hawkowl marked this conversation as resolved.
Show resolved Hide resolved

_CACHES = {}
DEFAULT_CACHE_SIZE_FACTOR = float(os.environ.get("SYNAPSE_CACHE_FACTOR", 0.5))


_DEFAULT_CONFIG = """\
anoadragon453 marked this conversation as resolved.
Show resolved Hide resolved
# Cache configuration
#
# 'global_factor' controls the global cache factor. This overrides the
# "SYNAPSE_CACHE_FACTOR" environment variable.
#
# 'per_cache_factors' is a dictionary of cache name to cache factor for that
# individual cache.
#
#caches:
# global_factor: 0.5
# per_cache_factors:
# get_users_who_share_room_with_user: 2
#
"""


def add_resizable_cache(cache_name, cache_resize_callback):
_CACHES[cache_name.lower()] = cache_resize_callback


class CacheConfig(Config):
section = "caches"

def read_config(self, config, **kwargs):
global DEFAULT_CACHE_SIZE_FACTOR

cache_config = config.get("caches", {})

self.global_factor = cache_config.get(
"global_factor", DEFAULT_CACHE_SIZE_FACTOR
)
if not isinstance(self.global_factor, (int, float)):
raise ConfigError("caches.global_factor must be a number.")

# Set the global one so that it's reflected in new caches
DEFAULT_CACHE_SIZE_FACTOR = self.global_factor

individual_factors = cache_config.get("per_cache_factors", {}) or {}
if not isinstance(individual_factors, dict):
raise ConfigError("caches.per_cache_factors must be a dictionary")

self.cache_factors = defaultdict(
lambda: self.global_factor
) # type: DefaultDict[str, float]

for cache, factor in individual_factors.items():
if not isinstance(factor, (int, float)):
raise ConfigError(
"caches.per_cache_factors.%s must be a number" % (cache.lower(),)
)
self.cache_factors[cache.lower()] = factor

def resize_caches(self):
hawkowl marked this conversation as resolved.
Show resolved Hide resolved
for cache_name, cache_resize_callback in _CACHES.items():
cache_factor = self.cache_factors[cache_name]
log.debug(
"Setting cache factor for {cache_name} to {new_cache_factor}",
cache_name=cache_name,
new_cache_factor=cache_factor,
)
changed = cache_resize_callback(cache_factor)
if changed:
log.info(
"Cache factor for {cache_name} set to {new_cache_factor}",
cache_name=cache_name,
new_cache_factor=cache_factor,
)

def get_factor_for(self, cache_name):
hawkowl marked this conversation as resolved.
Show resolved Hide resolved
return self.cache_factors[cache_name.lower()]
2 changes: 2 additions & 0 deletions synapse/config/homeserver.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@
from ._base import RootConfig
from .api import ApiConfig
from .appservice import AppServiceConfig
from .cache import CacheConfig
from .captcha import CaptchaConfig
from .cas import CasConfig
from .consent_config import ConsentConfig
Expand Down Expand Up @@ -80,4 +81,5 @@ class HomeServerConfig(RootConfig):
RoomDirectoryConfig,
ThirdPartyRulesConfig,
TracerConfig,
CacheConfig,
]
4 changes: 2 additions & 2 deletions synapse/http/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,6 @@
from synapse.logging.context import make_deferred_yieldable
from synapse.logging.opentracing import set_tag, start_active_span, tags
from synapse.util.async_helpers import timeout_deferred
from synapse.util.caches import CACHE_SIZE_FACTOR

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -241,7 +240,8 @@ def __getattr__(_self, attr):
# tends to do so in batches, so we need to allow the pool to keep
# lots of idle connections around.
pool = HTTPConnectionPool(self.reactor)
pool.maxPersistentPerHost = max((100 * CACHE_SIZE_FACTOR, 5))
# XXX: Why does this use the cache factor????
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it was just a handy way of making it scale for larger instances. The logic went that if you're running a larger instance, you'll bump your cache factor, and will probably want a bigger connection pool too. it's a bit of a hack, but otoh I didn't want to have millions of options that you have to tune.

pool.maxPersistentPerHost = max((100 * hs.config.caches.global_factor, 5))
pool.cachedConnectionTimeout = 2 * 60

# The default context factory in Twisted 14.0.0 (which we require) is
Expand Down
12 changes: 8 additions & 4 deletions synapse/metrics/_exposition.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@

from twisted.web.resource import Resource

from synapse.util import caches

try:
from prometheus_client.samples import Sample
except ImportError:
Expand Down Expand Up @@ -103,13 +105,15 @@ def nameify_sample(sample):


def generate_latest(registry, emit_help=False):
output = []

for metric in registry.collect():
# Trigger the cache metrics to be rescraped, which updates the common
# metrics but do not produce metrics themselves
for collector in caches.collectors_by_name.values():
collector.collect()

if metric.name.startswith("__unused"):
continue
output = []

for metric in registry.collect():
if not metric.samples:
# No samples, don't bother.
continue
Expand Down
4 changes: 3 additions & 1 deletion synapse/push/bulk_push_rule_evaluator.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,7 @@
"cache",
"push_rules_delta_state_cache_metric",
cache=[], # Meaningless size, as this isn't a cache that stores values
resizable=False,
)


Expand All @@ -67,7 +68,8 @@ def __init__(self, hs):
self.room_push_rule_cache_metrics = register_cache(
"cache",
"room_push_rule_cache",
cache=[], # Meaningless size, as this isn't a cache that stores values
cache=[], # Meaningless size, as this isn't a cache that stores values,
resizable=False,
)

@defer.inlineCallbacks
Expand Down
4 changes: 2 additions & 2 deletions synapse/push/push_rule_evaluator.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@
from six import string_types

from synapse.types import UserID
from synapse.util.caches import CACHE_SIZE_FACTOR, register_cache
from synapse.util.caches import register_cache
from synapse.util.caches.lrucache import LruCache

logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -149,7 +149,7 @@ def _get_value(self, dotted_key):


# Caches (glob, word_boundary) -> regex for push. See _glob_matches
regex_cache = LruCache(50000 * CACHE_SIZE_FACTOR)
regex_cache = LruCache(50000)
register_cache("cache", "regex_push_cache", regex_cache)


Expand Down
3 changes: 1 addition & 2 deletions synapse/replication/slave/storage/client_ips.py
Original file line number Diff line number Diff line change
Expand Up @@ -14,7 +14,6 @@
# limitations under the License.

from synapse.storage.data_stores.main.client_ips import LAST_SEEN_GRANULARITY
from synapse.util.caches import CACHE_SIZE_FACTOR
from synapse.util.caches.descriptors import Cache

from ._base import BaseSlavedStore
Expand All @@ -25,7 +24,7 @@ def __init__(self, db_conn, hs):
super(SlavedClientIpStore, self).__init__(db_conn, hs)

self.client_ip_last_seen = Cache(
name="client_ip_last_seen", keylen=4, max_entries=50000 * CACHE_SIZE_FACTOR
name="client_ip_last_seen", keylen=4, max_entries=50000
)

def insert_client_ip(self, user_id, access_token, ip, user_agent, device_id):
Expand Down
1 change: 1 addition & 0 deletions synapse/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -242,6 +242,7 @@ def setup(self):
self.datastores = DataStores(datastore, conn, self)
conn.commit()
self.start_time = int(self.get_clock().time())
self.get_clock().looping_call(self.config.caches.resize_caches, 10000)
hawkowl marked this conversation as resolved.
Show resolved Hide resolved
logger.info("Finished setting up.")

def setup_master(self):
Expand Down
5 changes: 2 additions & 3 deletions synapse/state/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,6 @@
from synapse.logging.utils import log_function
from synapse.state import v1, v2
from synapse.util.async_helpers import Linearizer
from synapse.util.caches import get_cache_factor_for
from synapse.util.caches.expiringcache import ExpiringCache
from synapse.util.metrics import Measure, measure_func

Expand All @@ -51,7 +50,6 @@
KeyStateTuple = namedtuple("KeyStateTuple", ("context", "type", "state_key"))


SIZE_OF_CACHE = 100000 * get_cache_factor_for("state_cache")
EVICTION_TIMEOUT_SECONDS = 60 * 60


Expand Down Expand Up @@ -441,10 +439,11 @@ def __init__(self, hs):
self._state_cache = None
self.resolve_linearizer = Linearizer(name="state_resolve_lock")

cache_size = 100000 * hs.config.caches.get_factor_for("state_cache")
self._state_cache = ExpiringCache(
cache_name="state_cache",
clock=self.clock,
max_len=SIZE_OF_CACHE,
max_len=cache_size,
expiry_ms=EVICTION_TIMEOUT_SECONDS * 1000,
iterable=True,
reset_expiry_on_get=True,
Expand Down
3 changes: 1 addition & 2 deletions synapse/storage/data_stores/main/client_ips.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@
from synapse.metrics.background_process_metrics import wrap_as_background_process
from synapse.storage import background_updates
from synapse.storage._base import Cache
from synapse.util.caches import CACHE_SIZE_FACTOR

logger = logging.getLogger(__name__)

Expand Down Expand Up @@ -366,7 +365,7 @@ class ClientIpStore(ClientIpBackgroundUpdateStore):
def __init__(self, db_conn, hs):

self.client_ip_last_seen = Cache(
name="client_ip_last_seen", keylen=4, max_entries=50000 * CACHE_SIZE_FACTOR
name="client_ip_last_seen", keylen=4, max_entries=50000
)

super(ClientIpStore, self).__init__(db_conn, hs)
Expand Down
12 changes: 4 additions & 8 deletions synapse/storage/data_stores/main/state.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,7 @@
from synapse.storage.data_stores.main.events_worker import EventsWorkerStore
from synapse.storage.engines import PostgresEngine
from synapse.storage.state import StateFilter
from synapse.util.caches import get_cache_factor_for, intern_string
from synapse.util.caches import intern_string
from synapse.util.caches.descriptors import cached, cachedList
from synapse.util.caches.dictionary_cache import DictionaryCache
from synapse.util.stringutils import to_ascii
Expand Down Expand Up @@ -246,14 +246,10 @@ def __init__(self, db_conn, hs):
# We size the non-members cache to be smaller than the members cache as the
# vast majority of state in Matrix (today) is member events.

self._state_group_cache = DictionaryCache(
"*stateGroupCache*",
# TODO: this hasn't been tuned yet
50000 * get_cache_factor_for("stateGroupCache"),
)
# TODO: this hasn't been tuned yet
self._state_group_cache = DictionaryCache("*stateGroupCache*", 50000)
self._state_group_members_cache = DictionaryCache(
"*stateGroupMembersCache*",
500000 * get_cache_factor_for("stateGroupMembersCache"),
"*stateGroupMembersCache*", 500000
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this no longer have a cache factor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because now DictionaryCache does the cache factor math, not the things that call it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not following where its doing that? AFAICT get_cache_factor_for is only called for a single cache?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, but now you can configure any of the cache sizes, so it's done in the caches itself

)

@defer.inlineCallbacks
Expand Down
Loading