Kestrel always outputs data by chunks of 4096 (HTTP 1.1) and 65536 (HTTP 2) bytes maximum #30545
Kestrel internally uses a 4K buffer size, but that doesn't necessarily mean you'll see 4K buffers on the client side. There's also no buffering by default, so data is written to the output as fast as it can be. My guess is that this combination is why you're seeing 4K chunks on the client side in this particular case.
Which ones? And did you run the same test where the client and server were in the same process? If so, can I see the test?
You may be able to see bigger chunks in your scenario by using the BodyWriter.

PS: I can reproduce it with your sample; I'm going to do some digging and get back to you.
Note that for HTTP/2, the client can limit the max frame size with the SETTINGS_MAX_FRAME_SIZE connection setting. The default is 16 KB.
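For reference, the SETTINGS_MAX_FRAME_SIZE value mentioned above travels in an HTTP/2 SETTINGS frame on stream 0. Here is a minimal sketch of that wire format (illustrative Python based on RFC 7540, not code from Kestrel or any client in this thread):

```python
import struct

# SETTINGS_MAX_FRAME_SIZE identifier from RFC 7540, section 6.5.2
SETTINGS_MAX_FRAME_SIZE = 0x5

def build_settings_frame(max_frame_size: int) -> bytes:
    """Build a minimal HTTP/2 SETTINGS frame advertising a max frame size.

    RFC 7540 requires the value to be between 2^14 (16384, the default)
    and 2^24 - 1.
    """
    if not (2**14 <= max_frame_size <= 2**24 - 1):
        raise ValueError("SETTINGS_MAX_FRAME_SIZE out of range")
    # One setting: 16-bit identifier + 32-bit value
    payload = struct.pack(">HI", SETTINGS_MAX_FRAME_SIZE, max_frame_size)
    # Frame header: 24-bit length, 8-bit type (0x4 = SETTINGS),
    # 8-bit flags, 31-bit stream id (0 for connection-level settings)
    length = len(payload)
    header = struct.pack(">BHBBI",
                         (length >> 16) & 0xFF, length & 0xFFFF,
                         0x4, 0x0, 0)
    return header + payload

frame = build_settings_frame(16384)
print(len(frame))  # 9-byte header + 6-byte setting = 15
```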
Hi, thanks for your quick answer. For the other servers, my real code is more complicated and is out-of-process (I built the in-proc version only for the reproducing code). I actually spent quite some time searching for the reason. Other tests I've done:
Beyond the fact that I would be interested to get good throughput performance without any specific code, I've tried to use a custom feature deriving from IHttpResponseBodyFeature, but it seemed to me the Writer property (PipeWriter) wasn't set and is null (the Stream property isn't). I created a custom PipeWriter and returned it in my IHttpResponseBodyFeature, but it's never used/called (contrary to the Stream), so I'm unsure how I'm supposed to use a custom writer. The docs are a bit sparse on all this, IMHO. I was wondering if this could be related to dotnet/runtime#43480. Note I've also tested the "old" WebClient, just in case, and it does the same.
@smourier when I use the built-in FileStreamResult on HTTP/1.1 I get this:
Can you share your custom FileStreamResultExecutor implementation? When using the BodyWriter, we wrap each 4K buffer chunk with the chunked encoding prefix and suffix (size and new line). That might be what you're seeing.

```csharp
// stream it out
await Response.StartAsync();
var stream = System.IO.File.OpenRead(path);
await stream.CopyToAsync(Response.BodyWriter);
```

That ends up looking like this:
To solve this, set the content length (if you have it):

```csharp
var stream = System.IO.File.OpenRead(path);
Response.ContentLength = stream.Length;
await Response.StartAsync();
await stream.CopyToAsync(Response.BodyWriter);
```

This will turn off chunking and work as expected. You can also write a more controlled copy if you want to affect the buffer size, but that's up to your implementation.
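To make the chunked-encoding overhead concrete, here is an illustrative sketch of the prefix/suffix framing described above (plain Python, not Kestrel's actual implementation; the 4096 flush size mirrors the buffer size discussed in this thread):

```python
def chunk_encode(payload: bytes, chunk_size: int = 4096) -> bytes:
    """Frame a payload as HTTP/1.1 chunked transfer encoding, flushing
    every chunk_size bytes (roughly what happens when 4K buffers are
    flushed without a Content-Length set)."""
    out = bytearray()
    for i in range(0, len(payload), chunk_size):
        chunk = payload[i:i + chunk_size]
        # each chunk: hex size, CRLF, data, CRLF
        out += f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return bytes(out)

body = bytes(10000)
framed = chunk_encode(body)
# two full 4096-byte chunks ("1000\r\n...\r\n") plus a 1808-byte chunk
# ("710\r\n...\r\n") plus the 5-byte terminator
print(len(framed))  # 10028
```

Setting Content-Length removes this per-chunk size/newline framing entirely, which is why the fix above changes what you see on the wire.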
If I use your code w/o content length I get this:
If I use your code with content length I get this:
I never saw anything like the result at the top of your last answer; that kind of result would be fine for me. I didn't include the custom FileStreamResultExecutor code because it doesn't change anything (HTTP 1.1, HTTP 2); I use the code as I posted it with the built-in FileStreamResult (I mean the one from the .NET 5 Microsoft.AspNetCore.Mvc.Core.dll). But I thought you said you could reproduce.
@smourier I was wrong. I tweaked the code slightly and it doesn't reproduce unless you use the
This is what I see with your sample and no code changes:
OK, turns out I'm not telling the truth again. Your code does reproduce the problem. I'm looking at what I changed...
AHHH I removed the HTTPS redirect!
Yes, that does it! The docs say it's recommended to use it: https://docs.microsoft.com/en-us/aspnet/core/security/enforcing-ssl If you have some spare cycles, share your thoughts on this; otherwise it's good for me :-)
This isn't good 😢. I'm still debugging.
Thanks for contacting us.
I think I've narrowed it down to the client side. Checking to see if I can reproduce it with SslStream directly.
I have a simple repro just writing a 64K byte[] to the client side. There's good news and there's bad news... 😄 This is what it looks like in Wireshark in one of the runs:
The TLS frames are 4120 bytes in size. There are two inefficiencies here:
That's why you're seeing 4K reads on the client side when HTTPS is turned on. The good news is that the data is all buffered on the client side; the bad news is you end up going through more reads to get the same amount of data. TLS records are 16K max, so it doesn't make sense for the server to try to write with a single write bigger than that (though the implementation will chunk those writes for you). On the client side today, the best you can do with SslStream connections is read 16K. I'm not sure if that helps your scenario though. This is what Wireshark looks like when you do 16K writes from the server side:
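The 4120-byte records in the capture imply roughly 24 bytes of per-record framing overhead (4096-byte payload plus header, nonce, and authentication tag; the exact figure depends on the TLS version and cipher suite). A quick back-of-the-envelope comparison of 4K vs. 16K records, assuming that 24-byte overhead:

```python
def record_count_and_overhead(total_bytes: int, record_payload: int,
                              per_record_overhead: int = 24):
    """Estimate how many TLS records (and how many framing-overhead bytes)
    it takes to move total_bytes when the sender flushes record_payload
    bytes at a time. The 24-byte default matches the 4120-byte records
    observed in the capture (4096 payload + framing); real overhead
    varies by cipher suite."""
    records = -(-total_bytes // record_payload)  # ceiling division
    return records, records * per_record_overhead

# 64K of data flushed in 4K records vs. the 16K TLS record maximum
print(record_count_and_overhead(65536, 4096))   # (16, 384)
print(record_count_and_overhead(65536, 16384))  # (4, 96)
```

Fewer, fuller records means fewer reads on the client and less framing on the wire, which is the improvement being discussed here.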
@wfurt we should be bulk reading TLS records so we can consume as much as possible to fill the provided buffer. I’ll open an issue. |
The issue is here dotnet/runtime#49000 |
We may. I did some of it for the handshake, but the encrypt/decrypt path is still open for improvement. I'm not sure if we would be able to process partial frames or really process more than one frame at a time, but we may be able to loop internally and process as much data as we can. For reading, the trade-off is memory: we can read more at once, but that needs more memory per connection. That is probably cheap enough for the client side, but the server may see a bigger aggregate impact. Of course, Kestrel can also read as much as it can from the kernel, and SslStream could consume it without context switches.
SslStream can reduce memory usage by mimicking what WebSockets do here: only keep enough internal buffer to represent the message header, then use the user-provided buffer to hold the payload and decrypt in place. That way you'd only need enough buffer to do the handshake and to read TLS record headers.
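A toy model of that buffering strategy (hypothetical Python with a stand-in XOR "cipher"; SslStream's real implementation and record parsing differ):

```python
import io

def read_record_into(user_buf: bytearray, src: io.BufferedIOBase,
                     key: int = 0x5A) -> int:
    """Toy model: only the 5-byte TLS record header goes through a small
    internal buffer; the payload is read straight into the caller's
    buffer and 'decrypted' in place (XOR stands in for the real cipher),
    avoiding an intermediate copy."""
    header = src.read(5)  # small fixed internal buffer
    if len(header) < 5:
        return 0
    length = int.from_bytes(header[3:5], "big")  # record payload length
    n = src.readinto(memoryview(user_buf)[:length])
    for i in range(n):  # decrypt in place, no intermediate copy
        user_buf[i] ^= key
    return n

# fake record: 3-byte type/version header + 2-byte length + XOR'd payload
plaintext = b"hello world"
record = (b"\x17\x03\x03" + len(plaintext).to_bytes(2, "big")
          + bytes(b ^ 0x5A for b in plaintext))
buf = bytearray(16384)
n = read_record_into(buf, io.BytesIO(record))
print(bytes(buf[:n]))  # b'hello world'
```

The point of the design is the memory bound: the internal buffer only ever needs to hold handshake data and record headers, not full payloads.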
SslStream does in-place encrypt/decrypt, but only for one frame. This is certainly something I can look at. Do you have a good repro from your experiment @davidfowl?
SslStream currently encrypts into its own internal buffer and then copies to the user-provided buffer. You might be able to skip that altogether and just use the user buffer directly.
Yep, the issue here, dotnet/runtime#49000, has a self-contained runnable sample.
I think there are different things to solve. You can do a single write to the underlying transport, but it would mean adding a new virtual method to Stream to allow passing an array of buffers (see the other linked issue). We need to solve both issues because they affect both server and client performance (upload vs. download).
I think one of the issues for SslStream about decrypting in place, when I originally looked at it, was "breaking" the Stream contract. The reason? No other stream that is "read" destroys the data you pass into it; e.g. after FileStream.Read(byteBuffer, 0, byteBuffer.Length), byteBuffer is not what you passed in. I will take a look at my notes.
@Drawaes I don't know if that's true here since the purpose of the caller passing in a buffer is to have it written into.
The concern was around encrypting writes in place and destroying data in the write buffer. Mutating the read buffer should be fine.
Right, that isn't the problem here.
@smourier Do you have any measurements? We understand the fix here but don't have a good benchmark to verify if it solves the problem.
We're going to use your sample as a starting point.
BTW, loopback is somewhat weird. I'm not sure about Windows (as there is no real interface), but on Linux the loopback MTU is 64K, so either side can write that much in a single step. The question is how that really impacts throughput, especially when latency and round-trips are not in play like they would be on a real network. I guess doing this with a large file (10 GB+) and measuring download time would be a good approximation, right @davidfowl?
Yep! Let me know if you need help setting up this scenario. Also, YARP had a really interesting scenario we were looking at this morning:

Client <-> YARP <-> gRPC streaming

In the above scenario, the gRPC streams were 100 B in size, so we were writing 100 B over the network in potentially bigger-than-100 B chunks (Kestrel does write-behind buffering in order to batch writes). As a result, YARP tries to copy 65K response chunks to the client, but because of this issue it ends up reading 100-byte TLS frames and writing those to the client instead. The cost here is that when the async read yields, we do more work (write to the client) in much smaller frames than we could have if SslStream tried to fill our buffer.
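The cost of those small frames is easy to quantify: the number of read/write round trips the proxy makes grows with the inverse of the bytes returned per read. A rough sketch, assuming one 100-byte TLS frame per read as in the scenario above:

```python
def reads_required(total_bytes: int, bytes_per_read: int) -> int:
    """How many read/write round trips a proxy makes to relay
    total_bytes when each read yields only bytes_per_read
    (ceiling division)."""
    return -(-total_bytes // bytes_per_read)

# one 64K response chunk relayed by the proxy
print(reads_required(65536, 100))    # 656 trips, one 100 B TLS frame per read
print(reads_required(65536, 65536))  # 1 trip, if SslStream filled the buffer
```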
Closing this because the root cause is dotnet/runtime#49000 |
I'm using .NET 5 (5.0.3) on Windows 10 20H2, build 19042.844.
Whatever I try, my ASP.NET Core Kestrel server always writes packets to the wire with a maximum size of 4096 bytes (HTTP 1.1) or 65536 bytes (HTTP 2).
The issue is that the overall "download" performance is bad (for a set of machines and a network that would support a bigger chunk size), especially with HTTP 1.1. Sometimes the packet size is only a few bytes.
Here is the reproducing code:
This is my launchSettings.json:
Things I've tried:
I think this comes from the internal ConcurrentPipeWriter but I'm unsure why.
Is this expected? Or is this a configuration problem? Is it an OS problem? How can I get bigger chunks?