Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improper parsing of a file:// uri on Windows #325

Closed
gordonwatts opened this issue Apr 5, 2021 · 0 comments · Fixed by #328
Closed

Improper parsing of a file:// uri on Windows #325

gordonwatts opened this issue Apr 5, 2021 · 0 comments · Fixed by #328
Labels
bug The problem described is something that must be fixed

Comments

@gordonwatts
Copy link

Python includes a number of URL parsing utilities. One benefit is that the format for URLs in python is standard across libraries. It currently looks like the URL parsing code for the file scheme in uproot4 is not conformant on Windows (but works fine on Linux).

An example of a valid windows URL: file:///g:/mydir/file.root. The above code will parse this, on windows, to /g:/mydir/file.root which cannot be opened by python's open function. However, g:/mydir/file.root works just fine.

There is another possible issue - in that URLs aren't unquoted during processing by the code (though I did not test this).

To work around this problem I've used some discussion on stack overflow to help me with the following code which works in python 3+, and is cross-platform:

from urllib.parse import urlparse, unquote
from urllib.request import url2pathname

p = urlparse(file_url)
file_path = url2pathname(unquote(p.path))

Note that even that discussion is a bit confused: it is important to read the answer comments to interpret the answers. The above example was pulled from the last answer, and then the python2 code was stripped out.

If I might be so bold, one way to do testing on this is using pathlib. For example:

from pathlib import Path

test_file = Path('./test_data/myfile.root')   # Should work on all platforms.
test_file_url = test_file.absolute().as_uri()

data = uproot4.open(test_file_url)

Or similar. But this will make sure your file URL parsing works on all platforms and uses python "standards" to make sure it all works.

@gordonwatts gordonwatts added the bug (unverified) The problem described would be a bug, but needs to be triaged label Apr 5, 2021
@jpivarski jpivarski added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels Apr 8, 2021
@jpivarski jpivarski linked a pull request Apr 8, 2021 that will close this issue
jpivarski added a commit that referenced this issue Apr 8, 2021
* Fixes #325, file:// on Windows, and handles %-encoded URIs.

* Avoid Unicode check on Python 2.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants