inconsistent realpath on Windows #30588

Closed
stevengj opened this issue Jan 4, 2019 · 12 comments
Labels
filesystem (Underlying file system and functions that use it), help wanted (Indicates that a maintainer wants help on an issue or pull request), system:windows (Affects only Windows)

Comments

Member

stevengj commented Jan 4, 2019

On macOS and Linux, realpath(p) throws an error if p does not exist, but on Windows there is no error.

Furthermore, on case-insensitive, case-preserving filesystems (macOS and Windows), arguably realpath should canonicalize the case of the path. It does so on macOS, but not on Windows.

(In Python, there is a separate normcase function for normalizing case. Its realpath function does not normalize case on macOS, but it also does not throw an exception for nonexistent files. My preference would just be to make our Windows realpath act like it does on other operating systems.)

(A related problem is determining whether two paths refer to the same file in a portable way. Python has a samefile function for this. See also #9436.)
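
As a rough illustration of the behavior asked for above, here is a minimal user-level sketch in Julia. The names strict_realpath and same_target are hypothetical and not part of Base. The sketch assumes that checking ispath before calling realpath is an acceptable way to make a nonexistent path an error on every platform, and it relies on Base.Filesystem.samefile to compare paths by file identity rather than by spelling. It does not attempt case canonicalization on Windows.

```julia
# Hypothetical helper (not part of Base): treat a missing path as an error on
# every platform, matching what realpath already does on macOS and Linux.
function strict_realpath(path::AbstractString)
    ispath(path) || throw(ArgumentError("no such file or directory: $path"))
    return realpath(path)
end

# Hypothetical helper: compare two paths by the file they refer to,
# not by how the path strings are spelled.
same_target(a::AbstractString, b::AbstractString) = Base.Filesystem.samefile(a, b)
```

Note that the existence check and the realpath call are two separate steps, so the file could disappear in between; a fix inside realpath itself would not have that gap.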

stevengj added the system:windows and filesystem labels Jan 4, 2019
Member Author

stevengj commented Jan 4, 2019

Not sure what the right Windows API function is here. GetLongPathNameW?

Right now we are calling GetFullPathNameW in realpath, for which the documentation says "This function does not verify that the resulting path and file name are valid." The documentation also says "Multithreaded applications and shared library code should not use the GetFullPathName function" because it uses global state, which is another reason to avoid it.

StefanKarpinski added the help wanted label Jan 4, 2019
Member Author

stevengj commented Jan 4, 2019

We have an undocumented Base.Filesystem.longpath function that calls GetLongPathNameW on Windows, but it doesn't seem to do what we want:

julia> Base.Filesystem.longpath(uppercase(pwd()))
"C:\\Users\\STEVEN G. JOHNSON\\Desktop\\JULIA-1.0.0"


julia> Base.Filesystem.longpath(lowercase(pwd()))
"c:\\Users\\steven g. johnson\\Desktop\\julia-1.0.0"

Weirdly, it seems to canonicalize the case of some portions of the path but not others! (Thank you, Microsoft… 😝)

Member Author

stevengj commented Jan 4, 2019

Python's normcase is no help here: they just convert the path to lowercase.

See also this question on StackOverflow and also this one, which suggests various solutions, including apparently incorrect suggestions to call GetLongPathName or GetFullPathName.

Member Author

stevengj commented Jan 4, 2019

Note also that #13542 by @malmaud relies on longpath returning the case-preserved filename, which now seems incorrect, so this may lead to bugs in the case-sensitivity of package loading. 😢

Member Author

stevengj commented Jan 4, 2019

See also this stackoverflow thread, which says you need to call GetLongPathNameW on the result of GetShortPathNameW for reasons that I don't comprehend.

Member Author

stevengj commented Jan 4, 2019

Success! I verified that calling longpath(shortpath(path)) seems to give the desired case regardless of the input case!

Well, with one exception — it doesn't change the case of the drive letter. 😠 I suppose we can capitalize that manually?

(shortpath here is identical to Base.Filesystem.longpath except that it calls GetShortPathNameW.)
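
The manual drive-letter fix could be as small as the following sketch. The name capitalize_drive is hypothetical, and the sketch assumes the input is a plain drive-letter path such as c:\Users\...; UNC paths and \\?\-prefixed paths are passed through unchanged.

```julia
# Hypothetical helper: uppercase a leading ASCII drive letter ("c:" -> "C:").
# The regex only matches a lowercase letter followed by ':' at the very start
# of the string, so any other path is returned unchanged.
capitalize_drive(path::AbstractString) = replace(path, r"^[a-z]:" => uppercase)
```

It would be applied to the result of longpath(shortpath(path)), which leaves the drive letter in whatever case the caller supplied.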

Contributor

malmaud commented Jan 4, 2019

There is a comment on SO that seems to indicate this approach can still fail, if I'm understanding things correctly: https://stackoverflow.com/questions/2113822/python-getting-filename-case-as-stored-in-windows/2114975#comment57405497_2114975

Sponsor Member

vtjnash commented Jan 4, 2019

Perhaps use GetFinalPathNameByHandleW and ask for the FILE_NAME_NORMALIZED?

Member Author

stevengj commented Jan 4, 2019

@vtjnash, to get a file handle, does the file need to have read permission?

Member Author

stevengj commented Jan 5, 2019

Ah, you can get a handle to a file that you don't have read access to: the CreateFile documentation says that if you pass 0 for the dwDesiredAccess parameter, "the application can query certain metadata such as file, directory, or device attributes without accessing that file or device, even if GENERIC_READ access would have been denied."

So getting a handle and calling GetFinalPathNameByHandleW might be the way to go.

Member Author

stevengj commented Jan 5, 2019

The following seems to do the trick:

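function myrealpath(path::AbstractString)
    # Open the path without requesting any access rights (dwDesiredAccess = 0).
    # 0x07 = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE, 3 = OPEN_EXISTING,
    # 0x02000000 = FILE_FLAG_BACKUP_SEMANTICS (required to open a directory).
    h = ccall(:CreateFileW, stdcall, Int, (Cwstring, UInt32, UInt32, Ptr{Cvoid}, UInt32, UInt32, Int),
              path, 0, 0x07, C_NULL, 3, 0x02000000, 0)
    h == -1 && error(Libc.FormatMessage()) # -1 is INVALID_HANDLE_VALUE
    try
        # First call: with a NULL buffer, returns the required size including the NUL terminator.
        # Flags 0x0 = FILE_NAME_NORMALIZED | VOLUME_NAME_DOS.
        len = ccall(:GetFinalPathNameByHandleW, stdcall, UInt32, (Int, Ptr{UInt16}, UInt32, UInt32),
                        h, C_NULL, 0, 0x0)
        iszero(len) && error(Libc.FormatMessage())
        buf = Array{UInt16}(undef, len)
        # Second call: fills buf and returns the length excluding the NUL terminator.
        len = ccall(:GetFinalPathNameByHandleW, stdcall, UInt32, (Int, Ptr{UInt16}, UInt32, UInt32),
                        h, buf, len, 0x0)
        iszero(len) && error(Libc.FormatMessage())
        resize!(buf, len) # strip NUL terminator
        if 4 < len < 264 && 0x005c == buf[1] == buf[2] == buf[4] && 0x003f == buf[3]
            deleteat!(buf, 1:4) # omit \\?\ prefix for paths < MAX_PATH in length
        end
        return transcode(String, buf)
    finally
        ccall(:CloseHandle, stdcall, Cint, (Int,), h)
    end
end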

@StefanKarpinski
Sponsor Member

Bravo 👏🏼
