Implement the lazy_load_members room state filter parameter #2970

ara4n · 2018-03-12T01:46:26Z

This is a first cut at filtering out room_members from sync responses unless they're actually needed to render the timeline (as proposed at https://docs.google.com/document/d/11yn-mAkYll10RJpN0mkYEVqraTbU3U4eQx9MNrzqX1U/edit#)

My hope is to get this merged so that client developers can experiment with lazy_loading and see how much it speeds up their clients, and to check how badly clients handle members trickling in on demand.
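For client developers who want to experiment, here is a minimal sketch of how a /sync request might opt in to lazy loading. Placing lazy_load_members under room.state follows the PR title; the r0 endpoint path, the helper name build_sync_url and the example homeserver URL are assumptions for illustration and are not taken from this PR.

```python
import json
from urllib.parse import urlencode


def build_sync_url(base_url, lazy_load=True):
    """Build a /sync URL whose inline filter asks for lazy-loaded members.

    Hypothetical helper: the endpoint path and filter placement are assumptions.
    """
    sync_filter = {"room": {"state": {"lazy_load_members": lazy_load}}}
    # The filter is passed as a JSON string in the `filter` query parameter.
    query = urlencode({"filter": json.dumps(sync_filter, separators=(",", ":"))})
    return f"{base_url}/_matrix/client/r0/sync?{query}"


if __name__ == "__main__":
    print(build_sync_url("https://example.org"))
```

Comparing the response size and sync time with lazy_load set to True and False would give the kind of before/after stat asked for later in this thread.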

Must-have todo:

Lower priority:

being smarter about which members are needed for a given block of timeline (atm we just pull in those for the senders of the events in the timeline, but need to consider invite targets etc too)
supporting /context
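The current behaviour described above (pulling in members only for the senders of timeline events) can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function name is hypothetical and events are modelled as plain dicts with a "sender" key.

```python
def member_state_keys_for_timeline(timeline_events):
    """Return the (event type, state_key) pairs for the members that sent
    events in this block of timeline.

    Hypothetical sketch: invite targets and similar cases are not handled,
    which is the gap called out in the todo list.
    """
    return {
        ("m.room.member", event["sender"])
        for event in timeline_events
        if "sender" in event
    }


if __name__ == "__main__":
    timeline = [
        {"type": "m.room.message", "sender": "@alice:example.org"},
        {"type": "m.room.message", "sender": "@bob:example.org"},
        {"type": "m.room.message", "sender": "@alice:example.org"},
    ]
    print(sorted(member_state_keys_for_timeline(timeline)))
```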

turt2live · 2018-03-12T02:03:39Z

Another thing that may need supporting is the context API if I'm not mistaken. Does this work as-is with Riot or should I hold off on deploying this to my test environment?

ara4n · 2018-03-12T08:26:25Z

i haven’t tested it against riot yet, but in theory it should work. good call on /context. i would be very interested in a “this sped up initial sync from X to Y, and reduced the json from P to Q” stat.

richvdh

looks plausible to me

ara4n · 2018-03-13T22:40:06Z

now updated to try to always add in the necessary member events for a chunk of timeline (although sytest is reporting 500s so clearly needs more work)

ara4n · 2018-03-14T18:28:09Z

So I just put this live with a custom sync worker against my @matthew2:matrix.org account. matthew2's vital stats are:

total state events: 57227
total member events: 56085
total timeline events: 2479
total rooms: 135
possible speedup factor: 32.859658778205834x

Before (matrix-org-hotfixes branch)

cold initial sync: 36.7s
warm initial sync: 8.5s
sync size: 7197kB
v8 heap in riot/web: 252MB

After (matthew/filter_members branch)

cold initial sync: 12.0s
warm initial sync: 5.2s
sync size: 1746kB
v8 heap in riot/web: 110MB

So, it looks like (with this impl at least) we're seeing a roughly 2-3x improvement on various metrics.

(more datapoints from @turt2live at https://gist.github.com/turt2live/a689cdf3cb0f2ddf3c93aa20f2440c16)
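For reference, the before/after figures quoted above can be turned into per-metric ratios with a few lines of Python. The numbers are copied from this comment; the dictionary keys and the function name are only illustrative.

```python
before = {
    "cold initial sync (s)": 36.7,
    "warm initial sync (s)": 8.5,
    "sync size (kB)": 7197,
    "v8 heap (MB)": 252,
}
after = {
    "cold initial sync (s)": 12.0,
    "warm initial sync (s)": 5.2,
    "sync size (kB)": 1746,
    "v8 heap (MB)": 110,
}


def improvement_factors(before, after):
    """Ratio of each 'before' figure to its 'after' figure, rounded to 2 dp."""
    return {name: round(before[name] / after[name], 2) for name in before}


if __name__ == "__main__":
    for name, factor in improvement_factors(before, after).items():
        print(f"{name}: {factor}x")
```

On these figures the ratios come out at about 3.1x (cold sync), 1.6x (warm sync), 4.1x (sync size) and 2.3x (heap).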

turt2live · 2018-03-15T00:33:12Z

It looks like this isn't sending the senders of events on incremental syncs. It's not the end of the world yet (as it doesn't actually break anything as far as I can tell), it just looks bad in Riot.

(I also realize this isn't near complete yet - just lodging the bug now for consideration)

ara4n · 2018-03-15T11:06:57Z

yeah, incremental syncs are borked. my next step here is to write some sytests to try to get some level of confidence it actually works properly.

ara4n · 2018-07-24T19:40:21Z

@richvdh ptal for a final time, i hope

richvdh

As we are proving with 15 rounds of back-and-forth here, this stuff is finicky and hard to get right by inspection. Please add some tests to check the holes I am identifying.

richvdh · 2018-07-25T06:26:37Z

synapse/storage/state.py

+
+            if (
+                state_key is None or
+                filtered_types is not None and typ not in filtered_types


lucky for you, and has higher precedence than or. I had to go and look it up though. Parens please.
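To illustrate the precedence point: a small sketch comparing the condition as written in the diff with the explicitly parenthesised form and with the other possible grouping. The function names and the sample values are hypothetical; only the boolean expression is taken from the diff.

```python
def as_written(state_key, filtered_types, typ):
    # Python parses `a or b and c` as `a or (b and c)`.
    return state_key is None or filtered_types is not None and typ not in filtered_types


def with_parens(state_key, filtered_types, typ):
    # Same meaning, but the grouping is visible to the reader.
    return state_key is None or (
        filtered_types is not None and typ not in filtered_types
    )


def other_grouping(state_key, filtered_types, typ):
    # What the expression would mean if `or` bound tighter than `and`.
    # Only safe to call with a non-None filtered_types.
    return (state_key is None or filtered_types is not None) and (
        typ not in filtered_types
    )


if __name__ == "__main__":
    args = (None, ["m.room.member"], "m.room.member")
    print(as_written(*args), with_parens(*args), other_grouping(*args))
```

The first two functions always agree; the third differs when state_key is None and typ is in filtered_types, which is why the parentheses are worth spelling out.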

richvdh · 2018-07-25T06:31:10Z

synapse/storage/state.py

            if valid_state_keys is None:
                return True
            if state_key in valid_state_keys:
                return True
            return False

-        got_all = is_all or not missing_types


can't we just write is_all or (not missing_types and filtered_types is not None) rather than special-casing over a particular bug elsewhere in the algorithm?

i find is_all or (not missing_types and filtered_types is not None) incredibly cryptic to reason about, to the extent i'm failing to convince myself it's even right.

For instance, if this is called with types=[] and filtered_types=[whatever], then is_all could well be false (if the cache is empty), and missing_types would be falsey, and filtered_types would not be None... so got_all would be true, which is the wrong answer.

Which is why I very deliberately spelt out the special case we're handling here where types=[], so we can't trust missing_types, which feels a lot easier to reason about.

ok, but the fact that you are special-casing the empty list makes me suspect that there are bugs elsewhere.

and sorry, it should have been is_all or (not missing_types and filtered_types is None). Maybe it should be:

    got_all = is_all
    if not got_all:
        # the cache is incomplete. We may still have got all the results we need, if
        # we don't have any wildcards in the match list.
        if not missing_types and filtered_types is None:
            got_all = True
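The disagreement in this thread can be checked with a small standalone sketch. It isolates just the got_all expression from the three variants discussed (the removed line, the first suggestion, and the corrected suggestion) and evaluates them for the types=[] case described above. The function names are hypothetical, and missing_types is modelled as a plain set.

```python
def got_all_removed_line(is_all, missing_types, filtered_types):
    # The line removed in the diff: ignores filtered_types entirely.
    return is_all or not missing_types


def got_all_first_suggestion(is_all, missing_types, filtered_types):
    # The first suggestion, later acknowledged to have the test inverted.
    return is_all or (not missing_types and filtered_types is not None)


def got_all_corrected(is_all, missing_types, filtered_types):
    got_all = is_all
    if not got_all:
        # The cache is incomplete. We may still have everything we need,
        # but only if there are no wildcards in the match list.
        if not missing_types and filtered_types is None:
            got_all = True
    return got_all


if __name__ == "__main__":
    # types=[] with a filtered_types list and an empty cache:
    # is_all is False and missing_types is empty.
    case = (False, set(), ["m.room.member"])
    print(got_all_removed_line(*case))      # claims completeness
    print(got_all_first_suggestion(*case))  # claims completeness
    print(got_all_corrected(*case))         # does not
```

Only the corrected form reports an incomplete result for that case, which matches the objection raised above.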

richvdh · 2018-07-25T06:33:09Z

synapse/storage/state.py

+            # filtered_types list.  missing_types will always be empty, so we ignore it.
+            got_all = is_all
+        else:
+            got_all = is_all or not missing_types

        return {


worth noting that missing_types isn't actually used in the result. Suggest removing it; I think it might open some clearer options in how to implement this function

it feels like a useful thing for the function to be returning tbh, even if it isn't used, but given the filtered_types stuff means that we can't easily enumerate the types which are missing from the result, i've removed it.

richvdh · 2018-07-25T06:33:41Z

synapse/storage/state.py

@@ -460,7 +511,7 @@ def _get_state_group_for_events(self, event_ids):

        defer.returnValue({row["event_id"]: row["state_group"] for row in rows})

-    def _get_some_state_from_cache(self, group, types):
+    def _get_some_state_from_cache(self, group, types, filtered_types=None):


I still don't think this is right; I don't think e22700c actually fixed the case I mentioned.

Prove me wrong with a test!

i strongly suspect i'd have missed it in a UT too.

Well, can you add one now, please.

ara4n · 2018-07-25T16:28:04Z

@richvdh ptal.

richvdh

57 commits later... \o/

for the love of god please don't forget to squash when merging.

richvdh · 2018-07-25T21:38:59Z

synapse/storage/state.py

+        if not got_all:
+            # the cache is incomplete. We may still have got all the results we need, if
+            # we don't have any wildcards in the match list.
+            if not missing_types and filtered_types is None:


and now that we aren't returning missing_types, I wonder why we are bothering to build a set rather than just using a boolean. But let's just land the damn thing.

ara4n · 2018-07-25T23:07:04Z

As per #synapse-dev:

* Matthew sighs
ironically GH didn't give me a squash merge button then
so i clicked merge anyway assuming it would prompt in a second phase
but it didn't
* Matthew wonders how to unpick that
given i assume a revert will make even more of a mess of the history
and force-pushing to develop will be even more antisocial.

ara4n added 4 commits March 11, 2018 20:01

WIP experiment in lazyloading room members

9b334b3

typos

8713365

fix sqlite where clause

97c0496

correctly handle None state_keys

fdedcd1

and fix include_other_types thinko

ara4n requested a review from richvdh March 12, 2018 01:46

ara4n assigned richvdh Mar 12, 2018

richvdh reviewed Mar 12, 2018

richvdh assigned ara4n and unassigned richvdh Mar 12, 2018

ara4n added 9 commits March 13, 2018 17:52

fix bug #2926

1b1c137

PR feedbackz

52f7e23

build where_clause sanely

b2aba9e

disable optimisation for searching for state groups

865377a

when type filter includes wildcards on state_key

typoe

afbf4d3

ensure we always include the members for a given timeline block

14a9d2f

merge proper fix to bug 2969

12350e3

remove comment now #2969 is fixed

f0f9a06

make it work

ccca028

ara4n and others added 4 commits March 13, 2018 23:46

oops

c9d72e4

add copyright to nudge CI

4d0cfef

pep8

9f77001

Merge branch 'develop' into matthew/filter_members

056a6df

turt2live added a commit to turt2live/evelium that referenced this pull request Mar 15, 2018

Don't explode if we don't have membership events

679d546

To counteract the behaviour currently being demonstrated in matrix-org/synapse#2970

make incr syncs work

3bc5bd2

Merge branch 'develop' into matthew/filter_members

454f59b

ara4n assigned richvdh and unassigned ara4n Jul 24, 2018

richvdh assigned ara4n and unassigned richvdh Jul 24, 2018

handle the edge case for _get_some_state_from_cache where types is []

cb5c37a

ara4n assigned richvdh and unassigned ara4n Jul 24, 2018

richvdh suggested changes Jul 25, 2018

richvdh assigned ara4n and unassigned richvdh Jul 25, 2018

ara4n added 4 commits July 25, 2018 16:33

incorporate more review.

7d9fb88

add tests for _get_some_state_from_cache

0a7ee0a

flake8

0620d27

Merge branch 'develop' into matthew/filter_members

2565804

ara4n assigned richvdh and unassigned ara4n Jul 25, 2018

richvdh approved these changes Jul 25, 2018

richvdh assigned ara4n and unassigned richvdh Jul 25, 2018

switch missing_types to be a bool

bc7944e

ara4n merged commit 1bcd049 into develop Jul 25, 2018

ara4n mentioned this pull request Sep 4, 2018

Initial /sync isn't as fast as it should be when LL is enabled #3720

Closed

hawkowl deleted the matthew/filter_members branch September 20, 2018 14:01

turt2live mentioned this pull request May 28, 2019

Spec lazy-loading room members matrix-org/matrix-spec-proposals#2035

Merged

This was referenced Apr 27, 2020

We should reap kicked users from the membership list (SYN-9) #1203

Closed

Prune 'left' members of rooms in CS API from initial /sync? (SYN-487) #1379

Closed
