Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry /publicRooms a few times to allow for propagation delay before a new room appears in the room directory #415

Merged
merged 2 commits into from
Jul 28, 2022

Conversation

reivilibre
Copy link
Contributor

@reivilibre reivilibre commented Jul 21, 2022

This is needed in Synapse because the room directory is updated in the background. Part of matrix-org/synapse#13161.
It's particularly worse when using workers, because the room directory is updated in a different process and CPU contention increases the likelihood of hitting the race.

@reivilibre reivilibre force-pushed the rei/publicrooms_propagation branch 2 times, most recently from 535257b to 816cfa3 Compare July 21, 2022 11:42
@reivilibre reivilibre marked this pull request as ready for review July 21, 2022 12:15
@reivilibre reivilibre requested review from a team July 21, 2022 12:15
Copy link
Contributor

@DMRobertson DMRobertson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by thoughts. No idea what a Waiter is (https://pkg.go.dev/github.com/aws/aws-sdk-go/private/waiter#Waiter?) but if it's more idiomatic then that might be the way to go.

internal/client/client.go Outdated Show resolved Hide resolved
internal/client/client.go Outdated Show resolved Hide resolved
@DMRobertson DMRobertson requested a review from a team July 21, 2022 13:35
Copy link
Member

@kegsay kegsay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want is something which will repeat the action until some condition is satisfied, at which point it will return the response. This response can then have assertions applied to it. Currently, this PR conflates the two: the presence of the room ID is both the condition to satisfy and the assertion. This doesn't compose well because:

  • You cannot use assertion helpers (like matchers) or anything which will fail the test as part of your code to repeat until the condition is satisfied, as the first failure will fail the test.
  • You cannot say things like "do this request until you see this field, then give me the response so I can inspect the fields more deeply".

internal/client/client.go Outdated Show resolved Hide resolved
internal/client/client.go Outdated Show resolved Hide resolved
authedClient.MustDoFuncUntil(t, "GET", []string{"_matrix", "client", "v3", "publicRooms"}, func(res *http.Response) bool {
foundRoom := false

must.MatchResponse(t, res, match.HTTPResponse{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't use MatchResponse in this way as it will fail the test if the response does not pass the given matchers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(NB Diff is outdated but basic structure remains the same, so will reply to comment)

I believe this is what we want, though (or at least I'm used to it from SyTest); if the response is ever malformed then we want to fail the test immediately without waiting for it to be correct.

Put another way: The retries are waiting for the expected data to appear in the response, but that doesn't mean that any responses before that are allowed to omit required fields.
As a result, you end up with 3 outcomes: success, temporary fail (i.e. retry) and permanent fail.

As for how these currently seem to map to code:
Success = return true
Temporary fail = return false
Permanent fail = t.FailNow/t.Fatalf

kegsay added a commit that referenced this pull request Jul 21, 2022
This will repeat a request until the condition is satisfied.

Intended use: #415
kegsay added a commit that referenced this pull request Jul 21, 2022
…417)

This will repeat a request until the condition is satisfied.

Intended use: #415
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems legit to me. Will defer to @kegsay that this is the Right Way

foundRoom = true
return nil
}
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a slight misuse of matchers, given you always return nil it will always succeed even if the room isn't found, and then you side-effect foundRoom to determine retry-ness. It's not something that the matchers were designed for, but in practice it seems to read pretty well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, though I will say 'don't blame me, it was there before!' :-P

I suspect this sort of pattern will crop up again, so if you get any better ideas then it may be worth putting them in

Copy link
Member

@kegsay kegsay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@reivilibre reivilibre merged commit 73de567 into main Jul 28, 2022
@reivilibre reivilibre deleted the rei/publicrooms_propagation branch July 28, 2022 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants