Win32 FileStream turns async reads into sync reads #16341

Closed
stephentoub opened this issue Feb 10, 2016 · 14 comments · Fixed by #48813
Labels: area-System.IO; enhancement (Product code improvement that does NOT require public API changes/additions); os-windows; tenet-performance (Performance related issue); wishlist (Issue we would like to prioritize, but we can't commit we will get to it yet)

Comments

@stephentoub
Member

When filling the internal buffer, Win32FileStream does this as part of ReadAsync:

Task<int> readTask = ReadInternalCoreAsync(_buffer, 0, _bufferSize, 0, cancellationToken);
_readLen = readTask.GetAwaiter().GetResult();

Ugh!

There's a large comment about how this is done to avoid concurrent use of the buffer when concurrent read operations are issued, but we should be able to work around that using something similar to what I previously did for WriteAsync and FlushAsync, with the HasActiveBufferOperation mechanism (dotnet/corefx#2929).

As it currently stands, it appears that when async reads are performed against a Win32FileStream and those reads are smaller than the file stream's buffer, all such reads will be made synchronous, either because they're pulling from the buffer or because they're blocking waiting for the buffer to be filled.
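To make the difference concrete, here is a minimal sketch contrasting the blocking pattern quoted above with a variant that awaits the fill. It is not the FileStream source: `FillBufferAsync`, `ReadBlockingAsync`, and `ReadAwaitingAsync` are hypothetical names, and the native read is simulated with `Task.Delay`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical stand-in for the native overlapped read; it completes after a short delay.
    private static async Task<int> FillBufferAsync(byte[] buffer, CancellationToken ct)
    {
        await Task.Delay(50, ct).ConfigureAwait(false);
        return buffer.Length;
    }

    // Same shape as the snippet in the issue: the method returns a Task,
    // but the calling thread is blocked until the fill has finished.
    public static Task<int> ReadBlockingAsync(byte[] buffer, CancellationToken ct = default)
    {
        int read = FillBufferAsync(buffer, ct).GetAwaiter().GetResult();
        return Task.FromResult(read);
    }

    // Asynchronous variant: control returns to the caller while the fill is pending.
    public static async Task<int> ReadAwaitingAsync(byte[] buffer, CancellationToken ct = default)
    {
        return await FillBufferAsync(buffer, ct).ConfigureAwait(false);
    }

    public static async Task Main()
    {
        var buffer = new byte[4096];

        Task<int> blocking = ReadBlockingAsync(buffer);
        // True: the Task was already finished by the time the method returned.
        Console.WriteLine($"blocking variant completed on return: {blocking.IsCompleted}");

        Task<int> awaiting = ReadAwaitingAsync(buffer);
        Console.WriteLine($"awaiting variant completed on return: {awaiting.IsCompleted}");
        Console.WriteLine($"bytes read: {await awaiting}");
    }
}
```

In the blocking variant the caller's thread is occupied for the full duration of the I/O, which is the cost this issue is about.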

cc: @ericstj, @ianhays, @JeremyKuhne

@stephentoub
Member Author

I spent a day on this and found it's going to be more complicated than I'd initially thought. It's going to need a large overhaul of the code base and will likely require compromising on some of the behavioral details of the type. In particular, there are assumptions throughout the code that reads using the buffer are always going to be synchronous, such that synchronous methods like SetLength and ReadByte can rely on the current values of buffer-related state, like _readPos. I think the basic approach will be:

  • Use _activeBufferOperation to track whether there's currently any operation using the buffer and its related state
  • Update async reads and writes: if there isn't an active operation, invoke the call immediately and set _activeBufferOperation to the resulting task (for the whole operation, or for whichever part of it actually uses the buffer and its state); if there is an active operation, make the new call a continuation of whatever _activeBufferOperation tracks
  • Update all synchronous operations to appropriately delegate to async counterparts if there is an active buffered operation
  • Determine how we want to handle the fact that today you can rely on the FileStream's position (both tracked by FileStream and in the handle) after these async operations return to the synchronous caller, and how that'll be impacted by multiple outstanding async operations that may be continuations and thus may not have actually updated the state yet.
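The first two bullets can be sketched roughly as follows. This is an illustration only, not the code from dotnet/corefx#2929 or from the eventual fix: the field name `_activeBufferOperation` comes from the comment above, while `BufferOperationQueue`, `Enqueue`, and `RunAfterAsync` are hypothetical. The sketch does not address the position-tracking question raised in the last bullet.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class BufferOperationQueue
{
    private readonly object _gate = new object();

    // Tracks the most recent operation that uses the shared buffer and its state.
    private Task _activeBufferOperation = Task.CompletedTask;

    // Hypothetical helper: runs the operation immediately when the buffer is idle,
    // otherwise schedules it as a continuation of the operation currently tracked.
    public Task<T> Enqueue<T>(Func<Task<T>> operation)
    {
        lock (_gate)
        {
            Task<T> next = _activeBufferOperation.IsCompleted
                ? operation()
                : RunAfterAsync(_activeBufferOperation, operation);
            _activeBufferOperation = next;
            return next;
        }
    }

    private static async Task<T> RunAfterAsync<T>(Task previous, Func<Task<T>> operation)
    {
        try
        {
            await previous.ConfigureAwait(false);
        }
        catch
        {
            // A failed predecessor is reported to its own caller; it must not block the queue.
        }
        return await operation().ConfigureAwait(false);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var queue = new BufferOperationQueue();
        var order = new List<int>();

        // The second operation has the shorter delay but still runs after the first.
        Task<int> first = queue.Enqueue(async () => { await Task.Delay(30); order.Add(1); return 1; });
        Task<int> second = queue.Enqueue(async () => { await Task.Delay(10); order.Add(2); return 2; });

        await Task.WhenAll(first, second);
        Console.WriteLine(string.Join(",", order)); // prints "1,2"
    }
}
```

Because each new call is chained onto the tracked task, no two operations touch the buffer concurrently, yet no caller's thread is blocked while waiting its turn.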

@stephentoub
Member Author

stephentoub commented Feb 5, 2018

@danmosemsft, @JeremyKuhne, @pjanotti, any chance we can get this fixed for 2.1? It's pretty nasty.

@danmoseley
Member

So far as I can tell this goes back roughly to when async support was added, in 2010. To help prioritize, do we have data on the impact, and any customer reports of it?

@stephentoub
Member Author

The impact is async isn't async. I don't have concrete data other than "ugh" :)

@danmoseley
Member

OK, let's consider it when @JeremyKuhne completes the file enumeration work. We know that won't be the last IO win to make.

@benaadams
Member

> To help prioritize, do we have data on the impact, and any customer reports of it?

Kestrel is much faster than IIS at serving application content, but much slower at serving file content (which uses the .NET file APIs); it would be good to redress this balance. This is likely to be one of the many links in the chain.

@danmoseley
Member

@benaadams thanks. Do you see indications in profiles of your own server that the issue might be here? I guess it would look like high inclusive time in ReadAsyncInternal.

@benaadams
Member

An async function calling an async function then performing a sync wait on it is just trollin', surely?

I'll see if I can get some data :)

@danmoseley
Member

Yeah, it's obviously not right

@danmoseley
Member

Did you happen to end up getting data, @benaadams? Totally fine if you didn't.

As it is, this isn't likely to happen for 2.1.

@benaadams
Member

No, sorry, haven't had time.

@danmoseley
Member

OK !

@benaadams
Member

O_o

Read and Write share a buffer, so if there is anything in the buffer for Write it needs to flush it first. To do so it queues a Flush (which it also does in other places):

// Handle buffering.
if (_writePos > 0) FlushWriteBuffer();
if (_readPos == _readLength)
{
    if (destination.Length < _bufferLength)
    {
        Task<int> readTask = ReadNativeAsync(new Memory<byte>(GetBuffer()), 0, cancellationToken);
        _readLength = readTask.GetAwaiter().GetResult();

FlushWriteBuffer(), to ensure the flush has completed, also blocks synchronously:

Task writeTask = FlushWriteAsync(CancellationToken.None);
if (!calledFromFinalizer)
{
    writeTask.GetAwaiter().GetResult();
}

@JeremyKuhne changed the title from "Win32FileStream turns async reads into sync reads" to "Win32 FileStream turns async reads into sync reads" on Jan 15, 2020
@msftgits transferred this issue from dotnet/corefx on Jan 31, 2020
@msftgits added this to the 5.0 milestone on Jan 31, 2020
@jasonmalinowski
Member

jasonmalinowski commented Feb 10, 2020

I ran across our large comment in Roslyn that points to this bug and was curious to see what ever came of this. Roslyn still has a pretty fun workaround for this:

https://github.com/dotnet/roslyn/blob/0dc28dc46c494189d1aca48224fa51eca548e8bd/src/Workspaces/Core/Portable/Workspace/FileTextLoader.cs#L106-L171

> The impact is async isn't async. I don't have concrete data other than "ugh" :)

When we originally discovered this, it was because a customer had Visual Studio hanging for minutes at a time while the thread pool got starved. We had to ship a private hotfix for that customer with that workaround. So the impact was worse than just "ugh". We're past it and haven't been bitten since, so I'm not pushing for this to be prioritized, but just wanted to illustrate a fun case where the impact was pretty bad. 😄
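For readers hitting the same problem, one way to sidestep the buffered path described in this thread is to open the stream with buffering disabled and read in large chunks. The sketch below is not the Roslyn code linked above. It assumes the documented FileStream behavior that a `bufferSize` of 1 disables buffering, and `UnbufferedFileReader` is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class UnbufferedFileReader
{
    // Reads a whole file with FileStream's internal buffer disabled, so the
    // buffer-fill path discussed in this issue should not be involved.
    // Assumes the file fits in a single array.
    public static async Task<byte[]> ReadAllBytesAsync(string path, CancellationToken ct = default)
    {
        using var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.Read,
            bufferSize: 1,      // 1 disables FileStream buffering
            useAsync: true);    // request overlapped (asynchronous) I/O

        var result = new byte[stream.Length];
        int total = 0;
        while (total < result.Length)
        {
            int read = await stream.ReadAsync(result, total, result.Length - total, ct)
                                   .ConfigureAwait(false);
            if (read == 0) break; // the file shrank while it was being read
            total += read;
        }

        if (total != result.Length) Array.Resize(ref result, total);
        return result;
    }

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        try
        {
            await File.WriteAllBytesAsync(path, new byte[] { 1, 2, 3, 4 });
            byte[] data = await ReadAllBytesAsync(path);
            Console.WriteLine(data.Length); // prints 4
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The trade-off is that the caller now owns the buffering strategy, so many small reads against such a stream each become a separate I/O request.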

@adamsitnik self-assigned this on Mar 1, 2021
@adamsitnik modified the milestones: Future, 6.0.0 on Mar 1, 2021
@ghost added the in-pr label (There is an active PR which will close this issue when it is merged) on Mar 1, 2021
@ghost removed the in-pr label (There is an active PR which will close this issue when it is merged) on Mar 17, 2021
@ghost locked as resolved and limited conversation to collaborators on Apr 16, 2021