Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path.is_dir returns false for folders in OneDrive using "Files On Demand" #46484

Closed
roblourens opened this issue Dec 4, 2017 · 11 comments · Fixed by #47956
Closed

Path.is_dir returns false for folders in OneDrive using "Files On Demand" #46484

roblourens opened this issue Dec 4, 2017 · 11 comments · Fixed by #47956
Labels
C-bug Category: This is a bug. O-windows Operating system: Windows

Comments

@roblourens
Copy link

Files On Demand is a new OneDrive feature that only downloads files to the local disk when needed.

  • Install the latest Windows and OneDrive updates
  • Open OneDrive settings, enable Files On Demand
  • Run this code with the path to your OneDrive folder:
    let metadata = Path::new("C:\\Users\\roblourens\\OneDrive").metadata().expect("failed");

    println!("{:?}", metadata.file_type());
    println!("{:?}", metadata.is_dir());
    println!("{:?}", metadata.is_file());
    println!("{:?}", metadata.len());

Output:

FileType(ReparsePoint)
false
false
4096

Here's a PR in libuv that fixed an issue with Files On Demand, in case there's a similar issue in Rust: https://github.com/libuv/libuv/pull/1522/files

Ref: BurntSushi/ripgrep#705 and microsoft/vscode#35433

@jonas-schievink
Copy link
Contributor

Are you sure this is a bug? Seems like those folders are indeed reparse points and only look like folders to the user.

@roblourens
Copy link
Author

I guess I'm not sure what the expected result is. Node says it's a directory:

require('fs').lstatSync('C:\\Users\\roblourens\\OneDrive').isDirectory(); // => true

@retep998
Copy link
Member

retep998 commented Dec 4, 2017

When given a reparse point metadata is supposed to return information on the target of the reparse point, not the reparse point itself. To quote the documentation "This function will traverse symbolic links to query information about the destination file." Therefore the fact that this doesn't follow the reparse point and simply says that it is a reparse point is a bug.

@TimNN TimNN added C-bug Category: This is a bug. O-windows Operating system: Windows labels Dec 5, 2017
@roblourens
Copy link
Author

Is there anything I can help with on this bug? It's the root cause of a nasty bug that prevents vscode from searching inside some OneDrive folders. I could attempt a PR, if someone can give me some pointers.

@retep998
Copy link
Member

@roblourens You could poke around with calling GetFileInformationByHandle directly on that directory and figure out how to determine that it is indeed a directory.

@roblourens
Copy link
Author

roblourens commented Dec 16, 2017

Don't know if this helps:

let metadata = Path::new(onedrive_file).metadata().expect("failed");
println!("{:?}", metadata.file_attributes()); // 1056

let metadata2 = Path::new(onedrive_dir).metadata().expect("failed");
println!("{:?}", metadata2.file_attributes()); // 1040

The file attributes for a file and a directory in onedrive - both have FILE_ATTRIBUTE_REPARSE_POINT. The file has FILE_ATTRIBUTE_ARCHIVE (?). The directory has FILE_ATTRIBUTE_DIRECTORY which seems to make sense.

https://msdn.microsoft.com/en-us/library/windows/desktop/gg258117.aspx

@parkovski
Copy link

The implementation of FileType actually has a lot of problems. Compare the Unix implementation to the Windows implementation. Let's ignore the fact that there are a lot more factors on the Windows side for now. The particular issue in this thread is caused by line 523, where to be considered a directory, we expect the reparse point flag to be false and the dir flag to be true. That's, um, bad. But pretty much every function in that impl is wrong - it should work the same way the Unix version does, where it stores the native mode and each is_* checks for the existence of the appropriate flag. Maybe someone more familiar with Unix can say whether a symlink to a file should be treated as SYMLINK | FILE or just SYMLINK?

Now, is_symlink_dir is a little harder to solve. Actually, if you say that mount points are symlinks which this does, you're kind of in messy territory here because reparse points are totally up to the FS driver. So one way or another, symlink detection is broken here because you are basically saying a symlink is a reparse point that resolves to another FS location, but you can't make that judgment just from the IO_REPARSE_TAG_* flags. All those tell you is that it's a particular thing that Microsoft implemented, but there's no reason you can't have other implementations that act like symlinks. To actually figure this stuff out correctly you need to do what libuv is doing and try to get the FS driver to treat the reparse point as what you're looking for.

For the purpose of getting ripgrep to work, I would just refactor the methods to check for native flags like I described in the first paragraph. There should not be a SYMLINK_DIR mode - you should check for the existence of a symlink flag and a dir flag, but I'd make a doc comment above that function noting that it's broken and actually only tells you if the thing is a Win32 symlink or a mount point. How it should work may need some discussion, I'm not sure.

@roblourens
Copy link
Author

Thanks for the notes. I may try to work around this downstream just for fixing search in VS Code, but I think that coming up with the right fix for Rust is outside my area of expertise. But I'm hoping that someone is able to look at this because I think it's quite a bad bug.

@BurntSushi
Copy link
Member

@pitdicker Do you happen to have any context on what led to this change with respect to this bug?

BurntSushi added a commit to BurntSushi/walkdir that referenced this issue Feb 2, 2018
This commit fixes a bug on Windows where walkdir refused to traverse
directories that resided on OneDrive via its "file on demand" strategy.
The specific bug is that Rust's standard library treats a reparse point
(which is what OneDrive uses) as distinct from a file or directory, which
wreaks havoc on any code that uses FileType::{is_file, is_dir}. We fix
this by checking the directory status of a file by looking only at whether
its directory bit is set.

This bug was originally reported in ripgrep:
BurntSushi/ripgrep#705

It has also been filed upstream:
rust-lang/rust#46484

And has a pending fix:
rust-lang/rust#47956
BurntSushi added a commit to BurntSushi/ripgrep that referenced this issue Feb 2, 2018
This commit fixes a bug on Windows where directory traversals were
completely broken when attempting to scan OneDrive directories that use
the "file on demand" strategy.

The specific problem was that Rust's standard library treats OneDrive
directories as reparse points instead of directories, which causes
methods like `FileType::is_file` and `FileType::is_dir` to always return
false, even when retrieved via methods like `metadata` that purport to
follow symbolic links.

We fix this by peppering our code with checks on the underlying file
attributes exposed by Windows. We consider an entry a directory if and
only if the directory bit is set on the attributes. We are careful to
make sure that the code remains the same on non-Windows platforms.

Note that we also bump the dependency on `walkdir`, which contains a
similar fix for its traversals.

This bug is recorded upstream:
rust-lang/rust#46484

Upstream also has a pending PR:
rust-lang/rust#47956

Fixes #705
@pitdicker
Copy link
Contributor

pitdicker commented Feb 2, 2018

I sort-of remember writing that code, but forgot the details. It was an effort to get symlinks on Windows in a mostly working state. Would have to dive into the details again to be of much use to be honest.

As @retep998 commented file types on Windows behave pretty different from Unix. I would not call the logic 'wrong' as @parkovski does, but more a best effort to a cross-platform interface. I think the easiest solution is to change (_, true, _) => FileType::ReparsePoint, to something like (not tested):

(false, true, _) => FileType::File,
(true, true, _) => FileType::Dir,

So just act as if it really is a file or directory if we don't recognize the reparse point as a symlink or directory junction.

bors added a commit that referenced this issue Feb 17, 2018
This is the ideal FileType on Windows. You may not like it, but this is what peak performance looks like.

Theoretically this would fix #46484

The current iteration of this PR should not cause existing code to break, but instead merely improves handling around reparse points. Specifically...

* Reparse points are considered to be symbolic links if they have the name surrogate bit set. Name surrogates are reparse points that effectively act like symbolic links, redirecting you to a different directory/file. By checking for this bit instead of specific tags, we become much more general in our handling of reparse points, including those added by third parties.
* If something is a reparse point but does not have the name surrogate bit set, then we ignore the fact that it is a reparse point because it is actually a file or directory directly there, despite having additional handling by drivers due to the reparse point.
* For everything which is not a symbolic link (including non-surrogate reparse points) we report whether it is a directory or a file based on the presence of the directory attribute bit.
* Notably this still preserves invariant that when `is_symlink` returns `true`, both `is_dir` and `is_file` will return `false`. The potential for breakage was far too high.
* Adds an unstable `FileTypeExt` to allow users to determine whether a symbolic link is a directory or a file, since `FileType` by design is incapable of reporting this information.
@ssokolow
Copy link

ssokolow commented Jul 19, 2018

@roblourens For the record, FILE_ATTRIBUTE_ARCHIVE is an old DOS-era piece of metadata meant for backup tools.

The FAT driver sets the Archive flag whenever you modify a file and a DOS-era backup tool operating in incremental backup mode can then go through, save copies of any files marked "Archive", and then clear the bit to indicate to its future self that they haven't changed since the previous incremental backup.

(In other words, the "archive" part is short for "archival needed" or "please archive")

BurntSushi added a commit to BurntSushi/ripgrep that referenced this issue Aug 22, 2018
This commit undoes a work-around for a bug in Rust's standard library
that prevented correct file type detection on Windows in OneDrive
directories. We remove the work-around because we are moving to a
latest-stable Rust version policy, which has included this fix for a while
now.

ref #705, rust-lang/rust#46484
BurntSushi added a commit to BurntSushi/ripgrep that referenced this issue Aug 22, 2018
This commit undoes a work-around for a bug in Rust's standard library
that prevented correct file type detection on Windows in OneDrive
directories. We remove the work-around because we are moving to a
latest-stable Rust version policy, which has included this fix for a while
now.

ref #705, rust-lang/rust#46484
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: This is a bug. O-windows Operating system: Windows
Projects
None yet
8 participants