Check advisory locks in CopyFile #42458

filipnavara · 2019-11-07T22:44:04Z

Fixes #42455, regression introduced in #37583.

… EINTR to make sure locks get broken

filipnavara · 2019-11-08T01:19:23Z

The macOS test failure is relevant:

   System.IO.Tests.FileInfo_CopyTo_str_b.CopyFileWithData_MemberData(data: ['6', 0x0080, 'Ã‡', 'Ã˜', 0x0018, ...], readOnly: False) [FAIL]
      System.IO.IOException : The process cannot access the file '/var/folders/d9/xhjwf2ls6bd3nkq5gzp900g40000gy/T/FileInfo_CopyTo_str_b_jgu0izd5.gdp/CopyFileWithData_MemberData_115_0c3599b5' because it is being used by another process.
      Stack Trace:
        /_/src/System.Private.CoreLib/shared/System/IO/FileStream.Unix.cs(120,0): at System.IO.FileStream.Init(FileMode mode, FileShare share, String originalPath)
        /_/src/System.Private.CoreLib/shared/System/IO/FileStream.cs(252,0): at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
        /_/src/System.Private.CoreLib/shared/System/IO/FileStream.cs(175,0): at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
        /_/src/System.IO.FileSystem/src/System/IO/File.cs(259,0): at System.IO.File.OpenRead(String path)
        /_/src/System.IO.FileSystem/tests/File/Copy.cs(136,0): at System.IO.Tests.File_Copy_str_str.CopyFileWithData_MemberData(Char[] data, Boolean readOnly)

filipnavara · 2019-11-08T12:47:05Z

The Windows build failure is unrelated.

stephentoub · 2019-11-08T16:27:05Z

@filipnavara, thanks for working to fix this. We're duplicating more and more code now, though.

What would it look like if we reverted the previous change and still opened the FileStream in managed, getting all of these various checks "for free". If we didn't use clonefile, everything else would continue to operate the same. If we did use clonefile, then we would just end up overwriting the FileStream we opened in managed, which already did all of the appropriate access checks and the like, and on the clonefile path we'd just clobber it. It would of course mean an extra file getting opened/closed, but if that approach works it feels much cleaner.

Thoughts?

filipnavara · 2019-11-08T16:42:17Z

I am also unhappy about the code duplication but the solution is not easy.

It boils down to the fact that clonefile expects the destination file not to exist, while the other code path expects it to exists (or be created by the open call). We cannot blindly open the file using the FileStream logic and then fallback to clonefile because at that point the destination file would always exist and we would not even be able to delete it (short of closing the handle and creating a hacky code path on the managed side to account for that).

Any solution I came up with resulted in pretty bad trade offs in terms of file access API calls and/or number of P/Invokes.

stephentoub · 2019-11-08T16:43:44Z

we would not even be able to delete it

Why not? Having an open file descriptor to the file shouldn't prevent us from unlinking it.

My suggestion was basically to unlink and then clonefile. The only part of the previous change to managed code we'd need to keep would be changing the DllImport to also pass the destination path (in addition to the destination file descriptor).

filipnavara · 2019-11-08T16:47:08Z

Why not? Having an open file descriptor to the file shouldn't prevent us from unlinking it.

I can try it when I get home but I was quite convinced the clonefile would fail if we try to do that. Maybe I just come from Windows world and my expectation is incorrect.

filipnavara · 2019-11-08T16:48:47Z

Also, if we unlink it and clonefile fails then there's no way to fallback to the old code path.

stephentoub · 2019-11-08T16:53:12Z

Also, if we unlink it and clonefile fails then there's no way to fallback to the old code path.

That's... true. We could do something like:

only attempt the clonefile if a non-null destination path is specified
if the clonefile fails, return false back to managed indicating failure
if the managed code sees a false return, it reopens the destination and passes null for a second attempt

Or something like that. Presumably clonefile failing will be relatively rare and the extra retry cost thus not very impactful.

I'm just trying to keep the code as simple as possible and avoid duplicating all of FileStream's logic in native. Let's explore a few options and see what turns out the best.

filipnavara · 2019-11-08T17:27:55Z

Presumably clonefile failing will be relatively rare and the extra retry cost thus not very impactful.

I think that's a wrong assumption. Depending on machine configuration and use case it could fail quite often, or not at all. It only works within the same APFS volume. Machines configured with HFS+ will fail every time. Copying on NTFS/FAT will fail every time, and so will copying between different volumes.

I know this sounds bleak and you may even ask whether clonefile is worth it at all? It helps couple of important scenarios. For me, specifically, it's MSBuild copying files stored on my main system volume. This happens really often during development.

I'm just trying to keep the code as simple as possible and avoid duplicating all of FileStream's logic in native. Let's explore a few options and see what turns out the best.

The return false to managed code and retry doesn't sound like an improvement to me. At that point you can structure it like this:

using (var src = new FileStream(sourceFullPath, FileMode.Open, FileAccess.Read, FileShare.Read, DefaultBufferSize, FileOptions.None))
{
    if (!CloneFile(src.SafeFileHandle, sourceFullPath, destFullPath, overwrite))
    {
        using (var dst = new FileStream(destFullPath, overwrite ? FileMode.Create : FileMode.CreateNew, FileAccess.ReadWrite, FileShare.None, DefaultBufferSize, FileOptions.None))
        {
            Interop.CheckIo(Interop.Sys.CopyFile(src.SafeFileHandle, dst.SafeFileHandle));

The downside is more P/Invokes. The CloneFile would have to do FStat to check the existence of destination file, then do access check and unlink, and finally, call Interop.Sys.CloneFile. For non-macOS it could skip all that and just return false.

lpereira · 2019-11-14T00:58:14Z

src/Native/Unix/System.Native/pal_io.c

@@ -1315,7 +1353,8 @@ int32_t SystemNative_CopyFile(intptr_t sourceFd, const char* srcPath, const char
    if (!copied && CopyFile_ReadWrite(inFd, outFd) != 0)
    {
        tmpErrno = errno;
-        close(outFd);
+        while ((ret = flock(outFd, LOCK_UN)) < 0 && errno == EINTR);
+        while ((ret = close(outFd)) < 0 && errno == EINTR);


Retrying close(2) in Linux is a race condition (please see the notes section in the man page for details).

Thanks for pointing that out. In the last iteration it may not be necessary.

lpereira · 2019-11-14T01:01:33Z

src/Native/Unix/System.Native/pal_io.c

@@ -1265,6 +1285,23 @@ int32_t SystemNative_CopyFile(intptr_t sourceFd, const char* srcPath, const char
    fcntl(outFd, F_SETFD, FD_CLOEXEC);
 #endif

+    while ((ret = flock(outFd, LOCK_EX | LOCK_NB)) < 0 && errno == EINTR);


For Linux, it might be a better idea to use O_TMPFILE to create a temporary file while the copy is being performed, and when it's done, hardlink it with the new name. Then there's no need to use advisory file locking, which isn't really that useful.

The advisory locking is used by the managed FileStream so whatever is used it has to be done on both sides and match.

I would still like us to try hard to find ways around duplicating all this logic.

stephentoub · 2019-11-14T12:29:09Z

I know this sounds bleak and you may even ask whether clonefile is worth it at all?

I am asking that ;)

Can you please share the impact of the change on such a scenario? We had been using fcopyfile. What is the perf impact of using clonefile instead?

filipnavara · 2019-11-14T12:35:32Z

Can you please share the impact of the change on such a scenario? We had been using fcopyfile. What is the perf impact of using clonefile instead?

fcopyfile has no perf improvement over the fallback code whatsoever. I'll try to produce some perf numbers for clonefile. The anecdotal evidence is the following:

Original Mono PR by Miguel: "On my Mac with APFS, copying a 5 gig file (Xcode) goes from taking
7.2 seconds to copy on an SSD to 0.5 seconds."
My MSBuild times on Mono were noticeably better. That may not directly apply to "dotnet build" because deps.json cuts down the number of file copies significantly.

filipnavara · 2019-11-14T12:46:00Z

The big difference is that clonefile (or reflink on Linux) basically just adds a reference so under ideal conditions it avoids copying the data altogether. The new file becomes a copy-on-write reference.

https://dev.to/robogeek/reflinks-vs-symlinks-vs-hard-links-and-how-they-can-help-machine-learning-projects-1cj4

stephentoub · 2019-11-14T12:49:41Z

basically just adds a reference

Why would it take half a second then to copy 5gb? Shouldn't it be almost instantaneous "copying" any file, regardless of size?

filipnavara · 2019-11-14T12:53:06Z

Shouldn't it be almost instantaneous "copying" any file, regardless of size?

Not my number so I cannot comment on it. It should be near instantaneous. I will do the test and post back numbers, no need to speculate :)

baulig · 2019-11-22T12:29:08Z

The return false to managed code and retry doesn't sound like an improvement to me. At that point you can structure it like this:

using (var src = new FileStream(sourceFullPath, FileMode.Open, FileAccess.Read, FileShare.Read, DefaultBufferSize, FileOptions.None))
{
    if (!CloneFile(src.SafeFileHandle, sourceFullPath, destFullPath, overwrite))
    {
        using (var dst = new FileStream(destFullPath, overwrite ? FileMode.Create : FileMode.CreateNew, FileAccess.ReadWrite, FileShare.None, DefaultBufferSize, FileOptions.None))
        {
            Interop.CheckIo(Interop.Sys.CopyFile(src.SafeFileHandle, dst.SafeFileHandle));

I really dislike the two FileStream approach because you lose the ability of using the

int open(const char *pathname, int flags, mode_t mode);

variant to explicitly specify the file mode of the new file. The advantage of using this system call is that it does the right thing when you don't have permission to actually set the permissions to the ones of the source file. Whereas the fchmod() approach puts you into that awkward spot as to what to do when it fails due to insufficient permissions.

For instance, fchmod() will fail on Android as well in certain scenarios on all Unix-based systems, such as writing to a network share, setgid directories that you don't own etc.

maryamariyan · 2019-12-02T19:43:01Z

Thank you for your contribution. As announced in #27549 the dotnet/runtime repository will be used going forward for changes to this code base. Closing this PR as no more changes will be accepted into master for this repository. If you’d like to continue working on this change please move it to dotnet/runtime.

Check flock on CopyFile

987d822

Dotnet-GitSync-Bot added the area-System.IO label Nov 7, 2019

Add test case

c6fd217

filipnavara force-pushed the flock branch from af21769 to c6fd217 Compare November 7, 2019 23:13

filipnavara closed this Nov 8, 2019

Fix lock checks, truncate after checking the lock, retry closes after…

16e6eff

… EINTR to make sure locks get broken

filipnavara reopened this Nov 8, 2019

One more EINTR fix

b40dd9a

filipnavara marked this pull request as ready for review November 8, 2019 09:12

filipnavara added 3 commits November 8, 2019 10:48

Bail out early if open fails

73cb482

Add explicit unlocks

42849d1

Add one more explicit unlock

ba0d450

lpereira reviewed Nov 14, 2019

View reviewed changes

ViktorHofer added the * NO MERGE * The PR is not ready for merge yet (see discussion for detailed reasons) label Nov 14, 2019

filipnavara mentioned this pull request Nov 22, 2019

[DRAFT]: Alternative SystemNative_FileCopy() implementation for Unix. mono/mono#17870

Closed

maryamariyan closed this Dec 2, 2019

karelz added this to the 5.0 milestone Dec 19, 2019

This was referenced Mar 3, 2020

File.Copy no longer respects FileShare.None on Linux/Mac in 5.0 dotnet/runtime#31429

Closed

Use copy_file_range(2) to implement System.Native.CopyFile() dotnet/runtime#1495

Closed

MichaIng mentioned this pull request Oct 21, 2020

exFAT import permissions failure on Linux Sonarr/Sonarr#3364

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Check advisory locks in CopyFile #42458

Check advisory locks in CopyFile #42458

filipnavara commented Nov 7, 2019 •

edited

Loading

filipnavara commented Nov 8, 2019

filipnavara commented Nov 8, 2019

stephentoub commented Nov 8, 2019

filipnavara commented Nov 8, 2019 •

edited

Loading

stephentoub commented Nov 8, 2019 •

edited

Loading

filipnavara commented Nov 8, 2019

filipnavara commented Nov 8, 2019

stephentoub commented Nov 8, 2019 •

edited

Loading

filipnavara commented Nov 8, 2019 •

edited

Loading

lpereira Nov 14, 2019

filipnavara Nov 14, 2019

lpereira Nov 14, 2019

filipnavara Nov 14, 2019

stephentoub Nov 14, 2019

stephentoub commented Nov 14, 2019

filipnavara commented Nov 14, 2019

filipnavara commented Nov 14, 2019 •

edited

Loading

stephentoub commented Nov 14, 2019

filipnavara commented Nov 14, 2019

baulig commented Nov 22, 2019

maryamariyan commented Dec 2, 2019

Check advisory locks in CopyFile #42458

Check advisory locks in CopyFile #42458

Conversation

filipnavara commented Nov 7, 2019 • edited Loading

filipnavara commented Nov 8, 2019

filipnavara commented Nov 8, 2019

stephentoub commented Nov 8, 2019

filipnavara commented Nov 8, 2019 • edited Loading

stephentoub commented Nov 8, 2019 • edited Loading

filipnavara commented Nov 8, 2019

filipnavara commented Nov 8, 2019

stephentoub commented Nov 8, 2019 • edited Loading

filipnavara commented Nov 8, 2019 • edited Loading

lpereira Nov 14, 2019

Choose a reason for hiding this comment

filipnavara Nov 14, 2019

Choose a reason for hiding this comment

lpereira Nov 14, 2019

Choose a reason for hiding this comment

filipnavara Nov 14, 2019

Choose a reason for hiding this comment

stephentoub Nov 14, 2019

Choose a reason for hiding this comment

stephentoub commented Nov 14, 2019

filipnavara commented Nov 14, 2019

filipnavara commented Nov 14, 2019 • edited Loading

stephentoub commented Nov 14, 2019

filipnavara commented Nov 14, 2019

baulig commented Nov 22, 2019

maryamariyan commented Dec 2, 2019

filipnavara commented Nov 7, 2019 •

edited

Loading

filipnavara commented Nov 8, 2019 •

edited

Loading

stephentoub commented Nov 8, 2019 •

edited

Loading

stephentoub commented Nov 8, 2019 •

edited

Loading

filipnavara commented Nov 8, 2019 •

edited

Loading

filipnavara commented Nov 14, 2019 •

edited

Loading