Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Breaking change] FileStream does not synchronize file offset with OS anymore #50860

Closed
adamsitnik opened this issue Apr 7, 2021 · 5 comments
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease.
Milestone

Comments

@adamsitnik
Copy link
Member

adamsitnik commented Apr 7, 2021

Before .NET 6, FileStream was using expensive syscall called SetFilePointer to synchronize it's private offset with Windows OS.

A blog post from Windows Performance Team from 2008 calls it an anachronism:

The old DOS SetFilePointer API is an anachronism. One should specify the file offset in the overlapped structure even for synchronous I/O. It should never be necessary to resort to the hack of having private file handles for each thread.

In order to improve the performance of async reads and writes and solve the following issues:

FileStream is no longer doing that (#49975) and the offset is just kept in memory. This has allowed for up to two times faster ReadAsync and up to five time faster WriteAsync! See #49975 for full details.

FileStream.Position always returns the current offset, but if the user obtains the file handle via call to FileStream.SafeFileHandle and asks the OS for the current file offset of the given handle by performing a syscall, the value will return 0.

[DllImport("kernel32.dll")]
private static extern bool SetFilePointerEx(SafeFileHandle hFile, long liDistanceToMove, out long lpNewFilePointer, uint dwMoveMethod);

byte[] bytes = new byte[10_000];
string path = Path.Combine(Path.GetTempPath(), Path.GetTempFileName());

using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None, bufferSize: 4096, useAsync: true))
{
    SafeFileHandle handle = fs.SafeFileHandle;

    await fs.WriteAsync(bytes, 0, bytes.Length);
    Console.WriteLine(fs.Position);

    if (SetFilePointerEx(handle, 0, out long currentOffset, 1 /* get current offset */))
    {
        Console.WriteLine(currentOffset);
    }
}

Pre-change behavior:

10000
10000

Post-change behavior:

10000
0

It works in the other direction as well: if the user obtains filehandle and performs a syscall that moves the file offset, the offset returned by FileStream.Position won't be changed.

To enable the .NET 5 behavior, users can specify an AppContext switch or an environment variable:

{
    "configProperties": {
        "System.IO.UseNet5CompatFileStream": true
    }
}
set DOTNET_SYSTEM_IO_USENET5COMPATFILESTREAM=1

@dotnet/compat @stephentoub @jozkee @carlossanlop @jeffhandley

FWIW I've ensured that this change is not breaking ASP.NET (dotnet/aspnetcore#31441), WinForms (dotnet/winforms#4756) and the SDK (dotnet/sdk#16684)

@adamsitnik adamsitnik added area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease. labels Apr 7, 2021
@adamsitnik adamsitnik added this to the 6.0.0 milestone Apr 7, 2021
@ghost
Copy link

ghost commented Apr 7, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Issue Details

Before .NET 6, FileStream was using expensive syscall called SetFilePointer to synchronize it's private offset with Windows OS.

A blog post from Windows Performance Team from 2008 calls it an anachronism:

The old DOS SetFilePointer API is an anachronism. One should specify the file offset in the overlapped structure even for synchronous I/O. It should never be necessary to resort to the hack of having private file handles for each thread.

In order to improve the performance of async reads and writes and solve the following issues:

FileStream is no longer doing that (#49975) and the offset is just kept in memory. This has allowed for up to two times faster ReadAsync and up to five time faster WriteAsync! See #49975 for full details.

FileStream.Position always returns the current offset, but if the user obtains the file handle via call to FileStream.SafeFileHandle and asks the OS for the current file offset of the given handle by performing a syscall, the value will return 0.

[DllImport("kernel32.dll")]
private static extern bool SetFilePointerEx(SafeFileHandle hFile, long liDistanceToMove, out long lpNewFilePointer, uint dwMoveMethod);

byte[] bytes = new byte[10_000];
string path = Path.Combine(Path.GetTempPath(), Path.GetTempFileName());

using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None, bufferSize: 4096, useAsync: true))
{
    await fs.WriteAsync(bytes, 0, bytes.Length);
    Console.WriteLine(fs.Position);

    if (SetFilePointerEx(fs.SafeFileHandle, 0, out long currentOffset, 1 /* get current offset */))
    {
        Console.WriteLine(currentOffset);
    }
}

Pre-change behavior:

10000
10000

Post-change behavior:

10000
0

It works in the other direction as well: if the user obtains filehandle and performs a syscall that moves the file offset, the offset returned by FileStream.Position won't be changed.

To enable the .NET 5 behavior, users can specify an AppContext switch or an environment variable:

{
    "configProperties": {
        "System.IO.UseNet5CompatFileStream": true
    }
}
set DOTNET_SYSTEM_IO_USENET5COMPATFILESTREAM=1

@dotnet/compat @stephentoub @jozkee @carlossanlop @jeffhandley

FWIW I've ensured that this change is not breaking ASP.NET (dotnet/aspnetcore#31441), WinForms (dotnet/winforms#4756) and the SDK (dotnet/sdk#16684)

Author: adamsitnik
Assignees: -
Labels:

area-System.IO, breaking-change

Milestone: 6.0.0

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged New issue has not been triaged by the area owner label Apr 7, 2021
@adamsitnik adamsitnik removed the untriaged New issue has not been triaged by the area owner label Apr 7, 2021
@carlossanlop
Copy link
Member

carlossanlop commented Apr 20, 2021

@adamsitnik since both #16354 and #25905 got fixed, is there anything else we should address before we close this issue?

Since this is a breaking change that needs to be documented, I opened this issue in dotnet/docs: dotnet/docs#23857

@jozkee
Copy link
Member

jozkee commented May 5, 2021

The current code doesn't actually shows that the Position is not being synchronized with the OS because you are accessing the SafeFileHandle which issues a Seek before returning the handle and thus, the OS knows that you changed the position.

We should expose the handle before writing:

using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None, bufferSize: 4096, useAsync: true))
{
+    SafeFileHandle handle = fs.SafeFileHandle;

    await fs.WriteAsync(bytes, 0, bytes.Length);
    Console.WriteLine(fs.Position);

+    if (SetFilePointerEx(handle, 0, out long currentOffset, 1 /* get current offset */))
-    if (SetFilePointerEx(fs.SafeFileHandle, 0, out long currentOffset, 1 /* get current offset */))
    {
        Console.WriteLine(currentOffset);
        Console.WriteLine(fs.Position);
    }
}

@adamsitnik
Copy link
Member Author

@jozkee great point! It just shows how hard it is to hit the breaking change scenario ;)

@adamsitnik
Copy link
Member Author

Closing (dotnet/docs#24060)

@ghost ghost locked as resolved and limited conversation to collaborators Sep 10, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO breaking-change Issue or PR that represents a breaking API or functional change over a prerelease.
Projects
None yet
Development

No branches or pull requests

3 participants