Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Complement Existing members see new members' presence is flaky #13199

Closed
reivilibre opened this issue Jul 6, 2022 · 20 comments · Fixed by matrix-org/complement#563
Closed

Complement Existing members see new members' presence is flaky #13199

reivilibre opened this issue Jul 6, 2022 · 20 comments · Fixed by matrix-org/complement#563
Assignees
Labels
A-Testing Issues related to testing in complement, synapse, etc T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact Z-Flake Tests that give intermittent failures

Comments

@reivilibre
Copy link
Contributor

It took me 363 runs (~3 hours of CI) to reproduce this, so maybe it's not the highest priority flake around.

The issue seems to be that Bob's presence information is not coming down sync.

I believe the reason we expect Bob's presence to come down sync is because /join involves sending an event and that should update his presence.

I can't see the code that should trigger this in Synapse, but since this succeeds most of the time it must be there.
In the test logs for this test, I didn't spot any requests to /_synapse/replication/presence_set_state/ for Bob, which I would have expected to see. (I do see some for Alice.)

For this reason, I hesitate to brush it off as 'slow replication'. I also note that the requests in the log don't seem to be that slow around the time of the failure, so I don't suspect CPU contention.

@reivilibre reivilibre added the T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. label Jul 6, 2022
@reivilibre reivilibre self-assigned this Jul 6, 2022
@reivilibre reivilibre added the Z-Flake Tests that give intermittent failures label Jul 6, 2022
@DMRobertson
Copy link
Contributor

Seen again at https://github.com/matrix-org/synapse/runs/7541314250?check_suite_focus=true#step:4:3959 (on a monolith sqlite deployment?)

@DMRobertson
Copy link
Contributor

@richvdh richvdh changed the title TestMembersLocal » 'Existing members see new members' presence' Complement test is flaky when testing against workers Complement Existing members see new members' presence is flaky Aug 11, 2022
@richvdh
Copy link
Member

richvdh commented Aug 11, 2022

maybe this is happening more frequently now? Also seen at https://github.com/matrix-org/synapse/runs/7783161292?check_suite_focus=true (on a monolith)

@richvdh
Copy link
Member

richvdh commented Aug 11, 2022

keyword: TestMembersLocal/Parallel/Existing_members_see_new_members'_presence

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@clokep
Copy link
Member

clokep commented Sep 19, 2022

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@DMRobertson
Copy link
Contributor

@squahtx
Copy link
Contributor

squahtx commented Nov 28, 2022

On SQLite in monolith mode, the test fails reliably if modified so that Bob's join appears in an initial sync. Bob's presence is offline(!) and it's only sent during incremental syncs, so this appears to be an issue with the test setup, or Synapse not setting Bob's presence to online upon a join.

@squahtx
Copy link
Contributor

squahtx commented Nov 28, 2022

We appear to only bump presence when Bob sends a regular message.

if event.type == EventTypes.Message:
# We don't want to block sending messages on any presence code. This
# matters as sometimes presence code can take a while.
run_in_background(self._bump_active_time, requester.user)

That condition has been around since f70e622 from 2014.

However, b31ec21 modified the bump to only transition from unavailable to online and changes nothing when a user is offline. So it seems that users intentionally do not go from offline to online when sending messages.

if prev_state.state == PresenceState.UNAVAILABLE:
new_fields["state"] = PresenceState.ONLINE

@squahtx squahtx added Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact A-Testing Issues related to testing in complement, synapse, etc labels Nov 28, 2022
squahtx added a commit to matrix-org/complement that referenced this issue Nov 28, 2022
Depending on timings, the "Existing members see new members' presence"
test in `TestMembersLocal` may see the expected join and presence in an
initial or incremental sync. Both types of sync elicit different
behaviour in homeserver implementations, such as Synapse. Split the test
into initial and incremental sync variants to eliminate the race
condition.

In addition, fix the initial sync case by explicitly setting Bob's
presence to online. The spec does not guarantee that Bob will have a
non-offline presence after merely joining a room, nor does the spec
guarantee that offline presences will show up in initial syncs.

Fixes matrix-org/synapse#13199.
@squahtx squahtx self-assigned this Nov 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Testing Issues related to testing in complement, synapse, etc T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact Z-Flake Tests that give intermittent failures
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants