[Draft] investigate #3750

Closed
wants to merge 10 commits into from

Conversation

oncilla
Contributor

@oncilla oncilla commented May 7, 2020

This change is Reviewable

@oncilla oncilla changed the title from "Pub investigate" to "[Draft] investigate" May 7, 2020
@oncilla oncilla force-pushed the pub-investigate branch 2 times, most recently from 0ece947 to dea5391 Compare May 7, 2020 11:46
@oncilla oncilla force-pushed the pub-investigate branch from 4c0bc2a to 190cd5e Compare May 8, 2020 12:02
@oncilla oncilla mentioned this pull request May 8, 2020
@oncilla
Contributor Author

oncilla commented May 8, 2020

Closing in favor of #3758.

@oncilla oncilla closed this May 8, 2020
oncilla added a commit to oncilla/scion that referenced this pull request May 8, 2020
Add RPC retry with exponential back-off in case of a QUIC `server busy`
error.
Reduce the synchronous part of the accept loop.

With the bump to quic-go v0.15.5 (scionproto#3732), the QUIC library now
enforces a max accept queue length. Because every RPC opens a new QUIC
connection and some work is done synchronously in the accept loop,
this bites heavily on CI.

By backing off exponentially and spawning a goroutine immediately, we
can reduce the failure rate significantly.

But this should not be the final solution. Applications need to be
able to deal with failing requests. Currently, we only do one-shot
testing, which might be too strict.

For investigation results, see the buildkite builds associated with
the draft PR scionproto#3750.
oncilla added a commit that referenced this pull request May 11, 2020
@oncilla oncilla deleted the pub-investigate branch August 7, 2020 12:07