Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileStream.Unix: improve DeleteOnClose when FileShare.None. #55327

Merged
merged 9 commits into from
Oct 5, 2021

Conversation

tmds
Copy link
Member

@tmds tmds commented Jul 8, 2021

Use FileShare.None lock to detect when another Handle that has
DeleteOnClose deleted the file, and then re-open it.

@adamsitnik @stephentoub ptal.

Use FileShare.None lock to detect when another Handle that has
DeleteOnClose deleted the file, and then re-open it.
@ghost
Copy link

ghost commented Jul 8, 2021

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Issue Details

Use FileShare.None lock to detect when another Handle that has
DeleteOnClose deleted the file, and then re-open it.

@adamsitnik @stephentoub ptal.

Author: tmds
Assignees: -
Labels:

area-System.IO

Milestone: -

@tmds
Copy link
Member Author

tmds commented Jul 8, 2021

On Linux, NuGet doesn't clean up its lock files because DeleteOnClose has concurrency issues (NuGet/NuGet.Client#431).

I was looking introducing some specific cleanup logic in NuGet/NuGet.Client#4123.
This PR should fix it in the framework instead and allow DeleteOnClose to be used for cleanup of NuGet lock files.

cc @zivkan

@stephentoub
Copy link
Member

stephentoub commented Jul 8, 2021

On Linux, NuGet doesn't clean up its lock files because DeleteOnClose has concurrency issues

What are the issues?

EDIT: nevermind, I see your description in the nuget pr.

@marcin-krystianc
Copy link
Contributor

FWIW, I think it makes a lot of sense to fix it in the runtime. The intent of:

using (new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None, 4096, FileOptions.DeleteOnClose))
{
}

is very clear and there is no doubt what it wants to achieve. So it is quite dangerous that currently this works on Windows but it is broken on Linux.

@tmds
Copy link
Member Author

tmds commented Jul 13, 2021

The test I've added is failing. I need to take a closer look.

@zivkan
Copy link
Member

zivkan commented Jul 15, 2021

FWIW, the nuget.org server team coincidentally contacted me asking if it's possible for client to make duplicate HTTP requests to the same URL. This got me thinking about this PR.

Is it possible that two processes/threads opening a file with FileShare.None can BOTH successfully open the file? The CDN logs show it's happening on Linux.

edit: 90%+ of instances are on Linux, but it's also happening on Windows. NuGet guards simultaneous downloads of the same package via a lock file, hence why I thought of this PR.

@tmds
Copy link
Member Author

tmds commented Jul 16, 2021

CI failures seem unrelated now.

Is it possible that two processes/threads opening a file with FileShare.None can BOTH successfully open the file? The CDN logs show it's happening on Linux.

No. One will fail the lock and throw:

// The only error we care about is EWOULDBLOCK, which indicates that the file is currently locked by someone
// else and we would block trying to access it. Other errors, such as ENOTSUP (locking isn't supported) or
// EACCES (the file system doesn't allow us to lock), will only hamper FileStream's usage without providing value,
// given again that this is only advisory / best-effort.
Interop.ErrorInfo errorInfo = Interop.Sys.GetLastErrorInfo();
if (errorInfo.Error == Interop.Error.EWOULDBLOCK)
{
throw Interop.GetExceptionForIoErrno(errorInfo, path, isDirectory: false);
}

@terrajobst terrajobst added the community-contribution Indicates that the PR has been added by a community member label Jul 19, 2021
@jeffhandley
Copy link
Member

@adamsitnik This PR is assigned to you for follow-up/decision before the RC1 snap.

Copy link
Member

@adamsitnik adamsitnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of calling unlink before unlocking the file. We could also let close() unlock it for us and call flock(LOCK_UN) only when close() fails.

But with the proposed solution I am not sure if we are able to guarantee that FileShare.None combined with DeleteOnClose can be used as a safe global lock.

In case of NuGet, they should rather use FileMode.CreateNew instead of FileMode.OpenOrCreate

https://github.com/NuGet/NuGet.Client/blob/0ae80f5495c8d6fad2ec6e2c708727675ff08fb6/src/NuGet.Core/NuGet.Common/ConcurrencyUtilities.cs#L217

it's mapped to O_EXCL:

case FileMode.CreateNew:
flags |= (Interop.Sys.OpenFlags.O_CREAT | Interop.Sys.OpenFlags.O_EXCL);

From https://man7.org/linux/man-pages/man2/open.2.html:

Ensure that this call creates the file: if this flag is
specified in conjunction with O_CREAT, and pathname
already exists, then open() fails with the error EEXIST.

Which combined with DeleteOnClose should serve as pretty good global lock. Please correct me if I am wrong.

@adamsitnik adamsitnik modified the milestones: 6.0.0, 7.0.0 Aug 10, 2021
@tmds
Copy link
Member Author

tmds commented Aug 17, 2021

Which combined with DeleteOnClose should serve as pretty good global lock.

The trouble is when the file already exists. There is no way to open it and be sure you and the other process are referring to the same file entry.

The change does this by performing Stat and comparing it with FStat after obtaining the lock.

@tmds
Copy link
Member Author

tmds commented Aug 19, 2021

@adamsitnik can you take another look?

@tmds
Copy link
Member Author

tmds commented Sep 2, 2021

ping @adamsitnik

@carlossanlop carlossanlop self-assigned this Sep 23, 2021
Copy link
Member

@carlossanlop carlossanlop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patience, @tmds . I left a few comments.

Co-authored-by: Carlos Sanchez <1175054+carlossanlop@users.noreply.github.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants