API Proposal: tell if a filesystem entry is any type of link #53577

Open
Tracked by #57205 ...
carlossanlop opened this issue Jun 2, 2021 · 9 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.IO

@carlossanlop
Member

carlossanlop commented Jun 2, 2021

Background and Motivation

We recently got some APIs approved to provide the ability to create a symbolic link, and to return the target of a symbolic link.

The discussion continued after the approval to request an additional API that would tell if a file is a link or not.

There were a few particular things that were emphasized during the discussion:

  • The API can be opaque: it does not need to tell the type of link, only if it is any link.
  • There are 3 types of links that we should support: symbolic link, junction (NTFS) and AppExecLink (NTFS).
  • We can later decide if we want to expose all the reparse tags, for a more fine-grained control of reparse points. We have an issue open to discuss the design of such APIs.

Also, there are some platform-specific properties of these link types that we need to keep in mind:

  • Windows differentiates between a symbolic link to a file or to a directory. Unix does not.
  • NTFS supports Junctions and AppExecLinks. It's OS independent, so if a Unix machine has the proper NTFS driver installed, these link types should be detected there too.
  • Junctions only apply to directories.
  • AppExecLinks only apply to files.
  • When a FileInfo wraps a link that points to a directory, there is no exception thrown, but Exists returns false. Same if a DirectoryInfo wraps a link to a file. Our API should have a similar behavior.

Proposed API

namespace System.IO
{
    public abstract class FileSystemInfo
    {
+       public bool IsLink { get; }
    }
}

Usage Examples

FileInfo file = new FileInfo("/path/to/file-link");
Console.WriteLine(file.IsLink); // Prints true if file is a symlink to a file

DirectoryInfo directory = new DirectoryInfo("/path/to/dir-link");
Console.WriteLine(directory.IsLink); // Prints true if directory is a symlink to a directory

FileInfo wrongfile = new FileInfo("/path/to/dir-link");
Console.WriteLine(wrongfile.IsLink); // Prints false because FileInfo is wrapping a directory link

DirectoryInfo wrongdir = new DirectoryInfo("/path/to/file-link");
Console.WriteLine(wrongdir.IsLink); // Prints false because DirectoryInfo is wrapping a file link

Optional additional designs

If desired during the review, we can also add static methods so users don't have to rely on FileSystemInfo instances:

namespace System.IO
{
    public static class File
    {
+       public static bool IsLink(string path);
    }
}
namespace System.IO
{
    public static class Directory
    {
+       public static bool IsLink(string path);
    }
}
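
Until an API like the one proposed exists, the closest approximation with already-shipped APIs is probably a check of the ReparsePoint attribute. The following is a minimal sketch under that assumption; the LinkProbe class and its IsReparsePointOrSymlink method are hypothetical names, not part of the proposal. Note that this check over-reports on NTFS, because it is true for every reparse point, not only for links.

```csharp
using System;
using System.IO;

static class LinkProbe
{
    // Hypothetical helper, not the proposed IsLink API.
    // True for symbolic links on Unix and for ANY reparse point on Windows
    // (links, but also e.g. placeholders of on-demand storage providers).
    public static bool IsReparsePointOrSymlink(string path)
    {
        // Throws if the path does not exist.
        FileAttributes attributes = File.GetAttributes(path);
        return (attributes & FileAttributes.ReparsePoint) != 0;
    }
}

static class LinkProbeDemo
{
    static void Main()
    {
        // An ordinary temporary file is not expected to be reported as a link.
        string regular = Path.GetTempFileName();
        try
        {
            Console.WriteLine(LinkProbe.IsReparsePointOrSymlink(regular));
        }
        finally
        {
            File.Delete(regular);
        }
    }
}
```

Unlike the proposed instance property, this static form does not care whether the caller thinks of the path as a file or as a directory.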

cc @jozkee @mklement0 @tmds @iSazonov

@carlossanlop carlossanlop added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Jun 2, 2021
@carlossanlop carlossanlop added this to the 6.0.0 milestone Jun 2, 2021
@dotnet-issue-labeler dotnet-issue-labeler bot added area-System.IO untriaged New issue has not been triaged by the area owner labels Jun 2, 2021
@ghost

ghost commented Jun 2, 2021

Tagging subscribers to this area: @carlossanlop
See info in area-owners.md if you want to be subscribed.

Author: carlossanlop
Assignees: carlossanlop, Jozkee
Labels:

api-suggestion, area-System.IO, untriaged

Milestone: 6.0.0

@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Jun 2, 2021
@hamarb123
Contributor

Just checking that IsLink would essentially just return info.ReparseTag != ReparseTag.None && ((info.ReparseTag & 0x20000000) != 0) (but with only 1 call), wouldn't it?
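
To make the bit test concrete, here is a small sketch that works on raw tag values rather than on the ReparseTag enum, which is only proposed in #1908. The ReparseTags class and its member names are assumptions made for illustration; the numeric values are the ones defined in the Windows SDK (winnt.h). In C#, != binds more tightly than &, so the mask has to be parenthesized before comparing with 0.

```csharp
using System;

static class ReparseTags
{
    // Values as defined in winnt.h.
    public const uint MountPoint = 0xA0000003; // IO_REPARSE_TAG_MOUNT_POINT (junctions and volume mount points)
    public const uint SymLink    = 0xA000000C; // IO_REPARSE_TAG_SYMLINK
    public const uint Dedup      = 0x80000013; // IO_REPARSE_TAG_DEDUP (a reparse point that is not a link)

    // Bit 29 of the tag is the name-surrogate flag.
    public const uint NameSurrogateBit = 0x20000000;

    // Hypothetical helper mirroring the check discussed above.
    public static bool IsNameSurrogate(uint tag) => (tag & NameSurrogateBit) != 0;
}

static class ReparseTagsDemo
{
    static void Main()
    {
        Console.WriteLine(ReparseTags.IsNameSurrogate(ReparseTags.SymLink));    // True
        Console.WriteLine(ReparseTags.IsNameSurrogate(ReparseTags.MountPoint)); // True
        Console.WriteLine(ReparseTags.IsNameSurrogate(ReparseTags.Dedup));      // False
        Console.WriteLine(ReparseTags.IsNameSurrogate(0));                      // False (no reparse tag)
    }
}
```

Because a tag value of 0 never has the bit set, the separate comparison against "no tag" is redundant in this form.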

@mklement0

mklement0 commented Jun 2, 2021

@hamarb123, assuming that .ReparseTag is implemented as proposed in #1908 - which includes artificially mapping a symlink on Unix to ReparseTag.SymLink - that seems like a reasonable implementation.

However, we need to get clarity on what .IsLink returning true is meant to signal:

  • (a) Does it indicate whether the file-system itself considers something a link, as signaled on NTFS via the name-surrogate bit in the reparse-tag value?

  • (b) Or does it indicate that .NET not only knows that it is a link, but also knows how to explicitly resolve the link to its target? That is, can members such as .LinkTarget or .ResolveLinkTarget() predictably be called?

Given how NTFS works, (b) is invariably a subset of (a), given that any application can come along and define a new link type (as signaled by the name-surrogate flag) using application-specific non-standardized data to store the link information (plus a matching file-system filter).

As argued before, the .NET APIs cannot and should not be expected to keep up with such custom link types.

Perhaps the distinction will not matter much in practice: while the introduction of new reparse points in general seems likely, such as for third-party rehydrate-on-demand remote storage solutions, introducing new reparse points that are true links seems much less likely (Microsoft's own AppX reparse points are the only example that comes to mind).

@tmds
Member

tmds commented Jun 2, 2021

@mklement0 I don't know much about Windows reparse points. On Unix, besides links, there are also mounts. I wouldn't expect IsLink to be true for those. A mount is when another filesystem is put at some place in the file hierarchy. Maybe some of these reparse points are more similar to mounts than they are to links.

@mklement0

Agreed, @tmds: Unix mount points appear as regular directories, not as (symbolic) links.

NTFS volume mount points aka mounted folders are similar, though I presume they do present as IO_REPARSE_TAG_MOUNT_POINT, which, somewhat confusingly, is also reported for junctions, which unequivocally are links.

  • Junctions (limited to targeting directories on local volumes), as you would expect from a link, require definition in terms of a user-visible file-system path.

  • Volume mount points, by contrast, require a (local) volume's GUID path as the target (\\?\Volume{GUID}\), and such paths are not directly user-visible, from what I can tell, and may or may not refer to a volume that has a drive letter associated with it.

In other words: Conceptually, it also makes sense not to treat NTFS volume mount points, i.e. IO_REPARSE_TAG_MOUNT_POINT reparse points whose target is a volume GUID path, as links, given that their targets aren't other user-visible file-system paths.
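
The distinction drawn above could be sketched as a plain string test on the target of an IO_REPARSE_TAG_MOUNT_POINT reparse point. This is only an illustration: the MountPointTargets class and IsVolumeGuidPath method are hypothetical, and the sketch assumes the caller has already obtained the target string by some other means. It accepts both the \\?\ form mentioned above and the \??\ form in which such targets are stored in the reparse data.

```csharp
using System;

static class MountPointTargets
{
    // Hypothetical helper: true when a mount-point target names a volume by GUID
    // (a volume mount point), false when it names an ordinary path (a junction).
    public static bool IsVolumeGuidPath(string target)
    {
        const string Win32Prefix = @"\\?\Volume{";
        const string NtPrefix = @"\??\Volume{";

        return target.StartsWith(Win32Prefix, StringComparison.OrdinalIgnoreCase)
            || target.StartsWith(NtPrefix, StringComparison.OrdinalIgnoreCase);
    }
}

static class MountPointTargetsDemo
{
    static void Main()
    {
        // The GUID below is a made-up placeholder.
        Console.WriteLine(MountPointTargets.IsVolumeGuidPath(@"\\?\Volume{01234567-89ab-cdef-0123-456789abcdef}\")); // True
        Console.WriteLine(MountPointTargets.IsVolumeGuidPath(@"C:\Users\me\target"));                                // False
    }
}
```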

@hamarb123
Contributor

You are correct @mklement0, both junctions and mount points use IO_REPARSE_TAG_MOUNT_POINT; see https://en.wikipedia.org/wiki/NTFS_reparse_point#Volume_mount_points
Personally, I think that on Unix we should not consider mount points links, since the mount path is often the "normal" / "correct" path to the volume. On Windows, I think the point of forcing the GUID path is that the label and the drive letter can both exist, neither exist, or only one exist (and can also change at any point), so the GUID path makes sure that the mount point actually refers to the correct volume (also, see the purpose of \\?\ here). I think that the code should not treat mount points any differently from junctions on Windows, since they seem to be essentially the same thing (with different allowable target paths). I think that they should show as links since there is a more "normal" version of the path, but maybe not, since the path could be some unnamed, unlettered drive.
As for the meaning of .IsLink, I think that it should probably be whether the entry represents another file on the filesystem. I would also be fine with (b), but in that case I think that the API should instead be called .IsSupportedLink. Maybe we could have both?

@mklement0

Thanks for confirming the dual use of IO_REPARSE_TAG_MOUNT_POINT, @hamarb123.

I think that the code should not treat mount points any differently from junctions on Windows since they seem to be essentially the same thing

They are essentially the same thing from an implementation detail perspective, which, unfortunately, is at odds with the abstract link API proposed in #24271 (comment):

  • The API is based on the assumption that a link points to another user-visible file-system path, which doesn't apply to volume GUID paths.

There are two conceivable resolutions:

  • Consider volume mount points links too, and report the volume GUID path via .LinkTarget, but - of necessity - return null from .ResolveLinkTarget()

  • Do not consider volume mount points links, and let users detect and inspect them via the reparse-point API being discussed in Expose reparse point tags #1908 (comment)

but I would also be fine with (b), but if it is (b), I think that the API should instead be called .IsSupportedLink. Maybe we could have both?

I'm now leaning toward (a):

  • Let .IsLink report everything that the file system thinks is a link, even if .NET doesn't understand the particular NTFS link type and cannot query its target,
  • combined with making .LinkTarget and .ResolveLinkTarget() simply return null for unknown link types; again, this NTFS-specific scenario could then be handled via the reparse-point API (assuming the caller understands the link format).
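
From a caller's point of view, option (a) might look roughly like the sketch below. It is written under several assumptions: IsLink is still only a proposal, so the ReparsePoint attribute stands in for it here (which is broader than "link" on NTFS); LinkTarget is the property from the approved symbolic-link APIs (.NET 6 and later); and the LinkDescriber class and Describe method are hypothetical names.

```csharp
using System;
using System.IO;

static class LinkDescriber
{
    // Hypothetical helper showing the "link, but target unknown" case.
    public static string Describe(FileSystemInfo info)
    {
        FileAttributes attributes = info.Attributes;

        // A missing entry is reported as (FileAttributes)(-1), which has every bit set,
        // so it must be ruled out before testing individual flags.
        if (attributes == (FileAttributes)(-1))
            return "missing";

        // Stand-in for the proposed IsLink.
        if ((attributes & FileAttributes.ReparsePoint) == 0)
            return "not a link";

        // Null when the entry is not a link type whose target can be read.
        var target = info.LinkTarget;
        return target is null
            ? "link-like entry whose target cannot be read"
            : "link -> " + target;
    }
}

static class LinkDescriberDemo
{
    static void Main()
    {
        string regular = Path.GetTempFileName();
        try
        {
            Console.WriteLine(LinkDescriber.Describe(new FileInfo(regular)));
        }
        finally
        {
            File.Delete(regular);
        }
    }
}
```

The null branch is where a caller that understands a custom link format would fall back to a reparse-point API such as the one discussed in #1908.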

@adamsitnik
Member

@jozkee @carlossanlop @jeffhandley moving to 7.0

@tmds
Member

tmds commented Aug 17, 2021

I'm now leaning toward (a):
Let .IsLink report everything that the file system thinks is a link, even if .NET doesn't understand the particular NTFS link type and cannot query its target,
combined with making .LinkTarget and .ResolveLinkTarget() simply return null for unknown link types; again, this NTFS-specific scenario could then be handled via the reparse-point API (assuming the caller understands the link format).

+1

I think it's proper for FileInfo properties to return information from the underlying struct WIN32_FILE_ATTRIBUTE_DATA/struct stat, and to avoid retrieving information beyond that.
