feat: add pyodide support for jupyter-lite for files opened via HTTP #868
src/uproot/source/http.py
Outdated
try:
    shell = get_ipython().__class__.__name__
    # True if running in a jupyter lite notebook
    # False if running in a jupyter notebook ('ZMQInteractiveShell') or IPython ('TerminalInteractiveShell')
    return shell == "Interpreter"
except NameError:
    return False  # Python interpreter
I wonder whether we should use
- try:
-     shell = get_ipython().__class__.__name__
-     # True if running in a jupyter lite notebook
-     # False if running in a jupyter noteboook ('ZMQInteractiveShell') or IPython ('TerminalInteractiveShell')
-     return shell == "Interpreter"
- except NameError:
-     return False  # Python interpreter
+ return sys.platform == 'emscripten'
This would not detect IPython (or notebooks), but I think we specifically care about the emscripten part more than IPython?
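For reference, a minimal sketch of what the suggested check could look like as a standalone helper. The function name `running_on_emscripten` is only illustrative and is not taken from the PR.

```python
import sys


def running_on_emscripten() -> bool:
    """Return True when the interpreter was built for Emscripten (e.g. Pyodide)."""
    # Pyodide reports sys.platform == "emscripten"; desktop CPython reports
    # values such as "linux", "darwin", or "win32".
    return sys.platform == "emscripten"
```

Note that this says nothing about IPython or notebooks; it only identifies the platform the interpreter was compiled for.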
It would have the same result: the function would only return True if it is running in a jupyter lite environment.
Thank you for the suggestion. I replaced the function body with just this check, since it is less code.
This is a fairly minimal change, and looks like a clean switch based on sys.platform == "emscripten". (I don't know how general that is, though you've already talked about it with @agoose77.)
This only modifies the HTTPSource, and it does so internally, in a way that users can't inspect. However, the choice it's making is between a single background thread (CPython) and a non-thread (Pyodide), which are roughly equivalent, just a concession for Pyodide not supporting threads.
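To make the thread versus non-thread choice concrete, here is a rough sketch of that kind of switch. It is not the HTTPSource code; `run_worker` and its `use_thread` parameter are hypothetical names used only for illustration.

```python
import sys
import threading


def run_worker(task, *, use_thread=None):
    """Run task() on one background thread, or inline where threads are unavailable."""
    if use_thread is None:
        # Assumption: only the Emscripten build (Pyodide) lacks thread support.
        use_thread = sys.platform != "emscripten"
    if use_thread:
        worker = threading.Thread(target=task, daemon=True)
        worker.start()
        return worker
    # Non-thread path: do the work immediately in the calling thread.
    task()
    return None


results = []
worker = run_worker(lambda: results.append("chunk"), use_thread=True)
worker.join()  # wait for the background thread to finish
run_worker(lambda: results.append("chunk"), use_thread=False)
```

Both paths end up doing the same work; the difference is only whether the caller has to wait for it.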
What about when the HTTP server doesn't support multi-part GET requests? In that case, HTTPSource falls back to MultithreadedHTTPSource (the _fallback attribute of HTTPSource). That launches threads, which Pyodide doesn't support. Perhaps async could be used, but that's a more radical change to the MultithreadedSource, and should be a separate PR, if at all.
This happens automatically when an HTTP server doesn't support multi-part GET, but it can also be forced manually by setting the http_handler in uproot.open.
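As an illustration of forcing the fallback by hand, something like the following should select the multithreaded source. The helper name `open_with_multithreaded_http` is made up for this sketch, and the `http_handler` option is the one named above; other Uproot versions may spell it differently, so check the installed release.

```python
def open_with_multithreaded_http(url):
    """Open a remote file, forcing the multithreaded HTTP source (sketch only)."""
    # Imported lazily so the helper can be defined without Uproot installed.
    import uproot

    # http_handler is the option named in this review; treat the exact
    # spelling as an assumption that depends on the Uproot version.
    return uproot.open(url, http_handler=uproot.MultithreadedHTTPSource)
```

Calling this under Pyodide would be one way to reproduce the threading problem described here.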
We're also not addressing local files, but I saw that you tried and had some troubles with this. As I understand from @agoose77, that part of Pyodide is in flux and maybe we should wait for it to settle down. I highly doubt that memory-mapping will work (the default), and the alternative is MultithreadedFileSource. The MultithreadedHTTPSource and MultithreadedFileSource (and MultithreadedXRootDSource) share a lot of code through a superclass, so converting it to use async or writing an alternative that uses async could happen in one place.
An async alternative to the multithreaded sources would be welcome in CPython as well. What the multithreaded sources are doing is essentially building async functions by hand (so it would work in Python 2). @nsmith- built an Uproot subset based entirely on Python 3 async, and if I remember right, it worked better in high-latency network environments. Oh, I just noticed that the Python asyncio package is not available for Pyodide. That's odd; that's how I would have expected a JavaScript-like environment to work. (JavaScript is very asyncy!)
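For a sense of what an async alternative might look like on CPython, here is a toy sketch that requests several byte ranges concurrently. Nothing in it comes from Uproot: `DATA`, `fetch_range`, and `fetch_chunks` are hypothetical, and the "request" just slices an in-memory buffer.

```python
import asyncio

DATA = bytes(range(256))  # stand-in for the contents of a remote file


async def fetch_range(start, stop):
    """Pretend to request one byte range; a real source would do network I/O here."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return DATA[start:stop]


async def fetch_chunks(ranges):
    """Request all ranges concurrently and return them in the order given."""
    return await asyncio.gather(*(fetch_range(a, b) for a, b in ranges))


chunks = asyncio.run(fetch_chunks([(0, 4), (10, 12), (250, 256)]))
```

The event loop takes over the bookkeeping that the multithreaded sources currently do by hand with worker threads and queues.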
None of these things are holding back this PR. I'm just pointing out that if people are using this on Jupyter-Lite, there's a chance we'll see these other issues eventually.