-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement dynamic HTTP/2 window scaling #54755
Conversation
Note regarding the This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, to please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change. |
Tagging subscribers to this area: @dotnet/ncl Issue DetailsFixes #43086 by introducing automatic scaling of the HTTP/2 stream receive window based on measuring RTT of PING frames. The algorithm has been tested and benchmarked against various bandwidth-RTT combos. The HTTP/2 download speed still does not match HTTP/1, but this is also true for a large static window, we need to do a separate investigation on this. The PR introduces 3 tuning options as environment variables/AppContext switches:
Resolves #53372. I'm still looking at a failing test ( /cc @geoffkizer @ManickaP @halter73 @JamesNK @brporter
|
# Conflicts: # src/libraries/System.Net.Http/src/System/Net/Http/DiagnosticsHandler.cs
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Connection.cs
Outdated
Show resolved
Hide resolved
I hesitate a little bit to expose DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_FLOWCONTROL_MAXSTREAMWINDOWSIZE and DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_FLOWCONTROL_STREAMWINDOWSCALETHRESHOLDMULTIPLIER because once we expose them, someone will depend on them and it will be nearly impossible to remove these. How important do we think these really are? |
In case they will become obsolete, they will be just NOP-s, so I don't think this is a problem.
I can't judge if there are customers who are really interested tweaking them, although in our internal discussions we assumed that there will be some. TBH, if someone really wants to tweak this stuff for faster download, they should just bump the initial window size. The question is if there's anyone interested overriding them for lower memory footprint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Posting the feedback I have so far, more to come later.
src/libraries/Common/src/System/Net/Http/HttpHandlerDefaults.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/tests/System/Net/Http/Http2LoopbackConnection.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/tests/System/Net/Http/Http2LoopbackConnection.cs
Outdated
Show resolved
Hide resolved
src/libraries/Common/tests/System/Net/Http/HttpClientHandlerTest.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2Connection.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/HttpConnectionSettings.cs
Outdated
Show resolved
Hide resolved
// First check for the AppContext switch, giving it priority over the environment variable. | ||
if (AppContext.TryGetSwitch(Http3DraftSupportAppCtxSettingName, out bool allowHttp3)) | ||
// Disallow negative values: | ||
if (value < 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 is valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no practical functional difference between 0
and double.Epsilon
, these are mathematically valid values which will practically disable the scaling. Should we bother setting an arbitrary minimum?
(The question only matters if we want to keep the knob.)
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/SocketsHttpHandlerTest.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/SocketsHttpHandlerTest.cs
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/SocketsHttpHandler.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/SocketsHttpHandler.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Outdated
Show resolved
Hide resolved
internal bool _disableDynamicHttp2WindowSizing = DisableDynamicHttp2WindowSizing; | ||
internal int _maxHttp2StreamWindowSize = MaxHttp2StreamWindowSize; | ||
internal double _http2StreamWindowScaleThresholdMultiplier = Http2StreamWindowScaleThresholdMultiplier; | ||
internal int _initialHttp2StreamWindowSize = Http2Connection.DefaultInitialWindowSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't Http2Connection.DefaultInitialWindowSize
be in HttpHandlerDefaults
then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should, yeah.
I also wonder whether 64K is still the right value for this. Should we make this larger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it wasn't optimal in any of my runs. Data from 0ms RTT / 5Gbps
(loopback communication) benchmarks suggests it is worth to set it to 256K, however the window will scale faster to larger than optimal maximum sizes in all cases then.
I'm a bit hesitant to change the defaults without having more data. Maybe we should invest a couple of days to find the right defaults for the August preview?
|
||
private int _deliveredBytes; | ||
private int _streamWindowSize; | ||
private long _lastWindowUpdate; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could potentially use ValueStopwatch
here. It remembers _startTimestamp
and has public TimeSpan GetElapsedTime()
. I haven't used it extensively but @MihaZupan did for telemetry/counters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stole the the unit conversion logic from ValueStopWatch
because it seemed to be too heavy 😆 We don't need _startTimestamp
here, we only need differences compared to the last timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the _startTimestamp
would be your _lastWindowUpdate
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit hesitant to change & rebenchmark at this point, may try on Monday.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that:
TimeSpan dt = _stopwatch.GetElapsedTime();
_stopwatch = ValueStopwatch.StartNew();
will call Stopwatch.GetTimestamp()
(OS call) twice. Although this wouldn't be a noticeable perf issue, I was actually afraid that someone will complain about this in the code review out of perfectionism, so went with manual timestamping instead of ValueStopwatch
😄 Don't want to change this now, want to fix my test and the remaining nits, and get this merged ASAP.
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Outdated
Show resolved
Hide resolved
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Net.Http/tests/FunctionalTests/HttpClientHandlerTest.Http2.cs
Show resolved
Hide resolved
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Outdated
Show resolved
Hide resolved
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Show resolved
Hide resolved
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Show resolved
Hide resolved
...libraries/System.Net.Http/src/System/Net/Http/SocketsHttpHandler/Http2StreamWindowManager.cs
Show resolved
Hide resolved
Co-authored-by: Geoff Kizer <geoffrek@microsoft.com>
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
||
private int _deliveredBytes; | ||
private int _streamWindowSize; | ||
private long _lastWindowUpdate; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the _startTimestamp
would be your _lastWindowUpdate
...
src/libraries/Common/src/System/Net/Http/HttpHandlerDefaults.cs
Outdated
Show resolved
Hide resolved
One of my new tests failed on the CI because of timing issues. I will try to harden it on Monday, if it's not possible I will delete it. Unfortunately there is no way to test this feature without extensive use of |
src/libraries/Common/tests/System/Net/Http/Http2LoopbackConnection.cs
Outdated
Show resolved
Hide resolved
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
All CI failures are unrelated. I had to remove the test @geoffkizer @ManickaP if there's nothing critical left, I'll merge this within ~36 hours. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for all the research and benchmarking to do this 🚀
Nice! Is there a before and after benchmark of this? I'd like to talk about it as a performance improvement when using gRPC and it would be good to have numbers to back that up. e.g. before it took X seconds to download 100 megabytes, now it takes Y seconds. |
@JamesNK it's really a function of network delay and bandwidth. If the peers communicate over localhost, the difference is negligible. If there is a delay of several 10s of milliseconds with a relatively high bandwidth, it can be an order of magnitude or even more. I will share my spreadsheet with you tomorrow. Might make sense to make a GH-friendly report out of it, but I just started my vacation, may do it when I'm back :) |
I get that it is variable. One or two examples to give people an idea of the kind of performance improvement it can get is what I'm looking for. Devs love this kind of stuff. No rush. |
Completely redesign tests for HTTP/2 KeepAlive PING, so they: - Work well with RTT pings introduced in Implement dynamic HTTP/2 window scaling #54755 - Run sequentially, reducing the chance of failing because of timing issues caused by parallel workloads - Are better organized: multiple test cases for different scenarios, instead of one theory with complex branches on parameters Instead of reading / reacting to frames inline, there is a separate Task for processing incoming frames, responding to PING immediately and pushing other frames to a Channel<Frame>. Fixes #41929
Fixes #43086 by introducing automatic scaling of the HTTP/2 stream receive window based on measuring RTT of PING frames.
The algorithm has been tested and benchmarked against various bandwidth-RTT combos. The HTTP/2 download speed still does not match HTTP/1, but this is also true for a large static window, we need to do a separate investigation on this.
The PR introduces 3 tuning options as environment variables/AppContext switches:
DOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_HTTP2FLOWCONTROL_DISABLEDYNAMICWINDOWSIZING
to disable the dynamic window alogithm and the RTT PINGDOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_FLOWCONTROL_MAXSTREAMWINDOWSIZE
to control the maximum stream window sizeDOTNET_SYSTEM_NET_HTTP_SOCKETSHTTPHANDLER_FLOWCONTROL_STREAMWINDOWSCALETHRESHOLDMULTIPLIER
to enable fine-tuning the balance between optimal receive window VS download speedResolves #53372.
I'm still looking at a failing test (
ResponseStreamFrames_DataAfterHeadersAndContinuationWithoutEndHeaders_ConnectionError
) and I want to re-validate in an environment with actual physical (non-simulated) delay, but generally the PR is ready for review.@geoffkizer I addressed the naming concerns you raised in #52862, hope it looks better now.
/cc @ManickaP @halter73 @JamesNK @brporter