Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using dunce for paths on Windows (not just for display) #5440

Closed
paveldikov opened this issue Jul 25, 2024 · 19 comments · Fixed by #5446
Closed

Consider using dunce for paths on Windows (not just for display) #5440

paveldikov opened this issue Jul 25, 2024 · 19 comments · Fixed by #5446
Assignees
Labels
bug Something isn't working

Comments

@paveldikov
Copy link
Contributor

As originally mentioned in #1491 (comment)

Consider the scenario where the Python interpreter is distributed via a network share path. e.g. \\some-host\some-share\...\python.exe. Obviously this is not very common but the same sort of path notation can be used for local disk drives, e.g. \\localhost\c$\ -- which is what I use for this reproducer:

$ uv venv --python "//localhost/c$/Program Files/Python310/python.exe" test310
Using Python 3.10.11 interpreter at: \\localhost\c$\Program Files\Python310\python.exe
Creating virtualenv at: test310
Activate with: source test310\Scripts\activate
$ cat test310/pyvenv.cfg
home = \\?\UNC\localhost\c$\Program Files\Python310
implementation = CPython
uv = 0.2.29
version_info = 3.10.11
include-system-site-packages = false
$ test310/Scripts/python.exe
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "\\?\UNC\localhost\c$\Program Files\Python310\lib\sqlite3\__init__.py", line 57, in <module>
    from sqlite3.dbapi2 import *
  File "\\?\UNC\localhost\c$\Program Files\Python310\lib\sqlite3\dbapi2.py", line 27, in <module>
    from _sqlite3 import *
ImportError: DLL load failed while importing _sqlite3: The parameter is incorrect.

dunce'ing the path by hand in pyvenv.cfg fixes it for me:

$ cat test310/pyvenv.cfg
home = \\localhost\c$\Program Files\Python310
implementation = CPython
uv = 0.2.29
version_info = 3.10.11
include-system-site-packages = false
$ test310/Scripts/python.exe
Python 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sqlite3
>>> sqlite3
<module 'sqlite3' from '\\\\localhost\\c$\\Program Files\\Python310\\lib\\sqlite3\\__init__.py'>

This issue is reproducible with 3.10.11, but not 3.11. (presumably this bad handling of UNC paths got fixed in CPython down the line?)

@charliermarsh charliermarsh self-assigned this Jul 25, 2024
@charliermarsh charliermarsh added the bug Something isn't working label Jul 25, 2024
@charliermarsh
Copy link
Member

Hmm, the values in pyvenv already go through dunce:

python_home.simplified_display().to_string(),

@charliermarsh
Copy link
Member

Can you test a debug build if I push a branch?

@charliermarsh
Copy link
Member

My only guess is that we need to avoid calling the non-safe canonicalize on the excecutable in the first place (e.g., change canonicalize_executable to use path.simple_canonicalize() instead of fs_err::canonicalize(path).

@paveldikov
Copy link
Contributor Author

Can you test a debug build if I push a branch?

Happy to test.

My only guess is that we need to avoid calling the non-safe canonicalize on the excecutable in the first place (e.g., change canonicalize_executable to use path.simple_canonicalize() instead of fs_err::canonicalize(path).

Yeah, and I wonder if this will actually make the code simpler? (assuming that this is going to be a near-total revert of #1086. If so, win-win!)

@charliermarsh
Copy link
Member

Can we try #5446 to start, see if the venv is any different?

@paveldikov
Copy link
Contributor Author

No change.

$ cargo run -- venv --python "//localhost/c$/Program Files/Python310/python.exe" test310
Using Python 3.10.11 interpreter at: \\localhost\c$\Program Files\Python310\python.exe
Creating virtualenv at: test310
Activate with: source test310\Scripts\activate
$ cat test310/pyvenv.cfg
home = \\?\UNC\localhost\c$\Program Files\Python310
implementation = CPython
uv = 0.2.29
version_info = 3.10.11
include-system-site-packages = false

@charliermarsh
Copy link
Member

Can you try with the update I just pushed? Trying to understand where the prefix is coming from, confused.

@paveldikov
Copy link
Contributor Author

$ cargo run -- venv --python "//localhost/c$/Program Files/Python310/python.exe" test310
     Running `target\debug\uv.exe venv --python '//localhost/c$/Program Files/Python310/python.exe' test310`
Using Python 3.10.11 interpreter at: \\localhost\c$\Program Files\Python310\python.exe
Creating virtualenv at: test310
base_python: "\\\\?\\UNC\\localhost\\c$\\Program Files\\Python310\\python.exe"
sys_executable: "\\\\localhost\\c$\\Program Files\\Python310\\python.exe"
sys_base_executable: Some("\\\\localhost\\c$\\Program Files\\Python310\\python.exe")
Activate with: source test310\Scripts\activate

pyvenv.cfg unchanged

$ cat test310/pyvenv.cfg
home = \\?\UNC\localhost\c$\Program Files\Python310
implementation = CPython
uv = 0.2.29
version_info = 3.10.11
include-system-site-packages = false

@charliermarsh
Copy link
Member

My guess is that this path is actually not safe to strip though I'm not intimately familiar with the rules:

fn is_safe_to_strip_unc(path: &Path) -> bool {
    let mut components = path.components();
    match components.next() {
        Some(Component::Prefix(p)) => match p.kind() {
            Prefix::VerbatimDisk(..) => {},
            _ => return false, // Other kinds of UNC paths
        },
        _ => return false, // relative or empty
    }

    for component in components {
        match component {
            Component::RootDir => {},
            Component::Normal(file_name) => {
                // it doesn't allocate in most cases,
                // and checks are interested only in the ASCII subset, so lossy is fine
                if !is_valid_filename(file_name) || is_reserved(file_name) {
                    return false;
                }
            }
            _ => return false, // UNC paths take things like ".." literally
        };
    }

    if windows_char_len(path.as_os_str()) > 260 { // However, if the path is going to be used as a directory it's 248
        return false;
    }
    true
}

@charliermarsh
Copy link
Member

Will look deeper when I can, but need to prioritize a few other things first today. Sorry!

@paveldikov
Copy link
Contributor Author

No worries!

Judging from the log lines it must be safe to strip in at least some contexts:

sys_executable: "\\\\localhost\\c$\\Program Files\\Python310\\python.exe"
sys_base_executable: Some("\\\\localhost\\c$\\Program Files\\Python310\\python.exe")

@charliermarsh
Copy link
Member

I think that's the original, unmodified path we get from sys.executable in Python. Then we pass it to uv_fs::canonicalize_executable(interpreter.sys_executable()) to get base_python... which adds the prefix, which dunce then tries to strip if it can (but seemingly can't).

@paveldikov
Copy link
Contributor Author

paveldikov commented Jul 25, 2024

Naive question: do we really need to canonicalise on Windows? (we aren't following symlinks as we are on Linux)

Comment in code says:

We canonicalize the path to ensure that it's real and consistent, though we don't expect any symlinks on Windows.

We generally know that it's real beforehand -- though perhaps not 'real' in the POSIX sense. See the Using Python .... interpreter at: printout (although this happens in a very different place in the control flow)

Also perhaps it's something to do with the dollar sign? I am in the predicament of trying to replicate this issue in my home environment and I don't really happen to have network shares laying around :-/ so I thought I'd fake this by using this notation on my C: drive...

@charliermarsh
Copy link
Member

Hmm. We probably don't.

@paveldikov
Copy link
Contributor Author

P.S. it's not the dollar sign. Turns out I can get an arbitrary \\ path using a shared folder. No special characters involved, but still gets UNC'd.

home = \\?\UNC\DESKTOP\Users\Pavel\share\Python310

@charliermarsh
Copy link
Member

Yeah. I assume dunce is correct that it's not unequivocally safe to strip that (maybe leads to an ambiguity) but again mercifully I'm not an expert in Windows paths. We can experiment with not canonicalizing paths on Windows at all.

@charliermarsh
Copy link
Member

I pushed a new version to that same branch / PR that avoids canonicalizing on Windows, if you want to try it.

@paveldikov
Copy link
Contributor Author

paveldikov commented Jul 25, 2024

Third time's the charm! 👍

home = \\localhost\c$\Program Files\Python310

Looking at the dunce code you quoted -- this all tracks. It seems to only consider as 'safe' paths starting with a drive letter e.g. C:, else gives up completely:

    match components.next() {
        Some(Component::Prefix(p)) => match p.kind() {
            Prefix::VerbatimDisk(..) => {},
            _ => return false, // Other kinds of UNC paths
        },
        _ => return false, // relative or empty
    }

IMHO this is overly conservative, but I am likewise not a Windows expert -- just someone that has to live with its quirks! Also found this conversation @ someone's dunce bug report. I guess the problem here is that once (mis-)canonicalisation adds the prefix, it is impossible to safely to strip it back away.

So... a dunce issue that is effectively WONTFIX?

charliermarsh added a commit that referenced this issue Jul 26, 2024
## Summary

If you have an executable path on a network share path (like
`\\some-host\some-share\...\python.exe`), canonicalizing it adds the
`\\?` prefix, but dunce cannot safely strip it.

This PR changes the Windows logic to avoid canonicalizing altogether. We
don't really expect symlinks on Windows, so it seems unimportant to
resolve them.

Closes: #5440.
@paveldikov
Copy link
Contributor Author

Thanks a million for the quick resolution 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants