Make outputs go to correct cell when generated in threads/asyncio #1186
Changes from 2 commits
6d97970
9e9c40e
5956899
e1258de
ebf9f28
60436aa
@@ -2,6 +2,7 @@

import asyncio
import builtins
import gc
import getpass
import os
import signal
@@ -14,6 +15,7 @@
import comm
from IPython.core import release
from IPython.utils.tokenutil import line_at_cursor, token_at_cursor
from jupyter_client.session import extract_header
from traitlets import Any, Bool, HasTraits, Instance, List, Type, observe, observe_compat
from zmq.eventloop.zmqstream import ZMQStream

@@ -22,6 +24,7 @@
from .compiler import XCachingCompiler
from .debugger import Debugger, _is_debugpy_available
from .eventloops import _use_appnope
from .iostream import OutStream
from .kernelbase import Kernel as KernelBase
from .kernelbase import _accepts_parameters
from .zmqshell import ZMQInteractiveShell
@@ -66,6 +69,10 @@ def _get_comm_manager(*args, **kwargs):
comm.create_comm = _create_comm
comm.get_comm_manager = _get_comm_manager

import threading

threading_start = threading.Thread.start


class IPythonKernel(KernelBase):
    """The IPython Kernel class."""
@@ -151,6 +158,11 @@ def __init__(self, **kwargs):

            appnope.nope()

        if hasattr(gc, "callbacks"):
            # while `gc.callbacks` exists since Python 3.3, pypy does not
            # implement it even as of 3.9.
            gc.callbacks.append(self._clean_thread_parent_frames)

    help_links = List(
        [
            {
@@ -341,6 +353,12 @@ def set_sigint_result():
            # restore the previous sigint handler
            signal.signal(signal.SIGINT, save_sigint)

    async def execute_request(self, stream, ident, parent):
        """Override for cell output - cell reconciliation."""
        parent_header = extract_header(parent)
        self._associate_identity_of_new_threads_with(parent_header)
        await super().execute_request(stream, ident, parent)

    async def do_execute(
        self,
        code,
@@ -706,6 +724,58 @@ def do_clear(self):
        self.shell.reset(False)
        return dict(status="ok")

    def _associate_identity_of_new_threads_with(self, parent_header):
        """Intercept the identity of any thread started after this method finished,

        and associate the thread's output with the parent header frame, which allows
        to direct the outputs to the cell which started the thread.

        This is a no-op if the `self._stdout` and `self._stderr` are not
        sub-classes of `OutStream`.
        """
        stdout = self._stdout
        stderr = self._stderr

        def start_closure(self: threading.Thread):
            """Wrap the `threading.Thread.start` to intercept thread identity.

            This is needed because there is no "start" hook yet, but there
            might be one in the future: https://bugs.python.org/issue14073
            """

            threading_start(self)
            for stream in [stdout, stderr]:
                if isinstance(stream, OutStream):
                    stream._thread_parents[self.ident] = parent_header

        threading.Thread.start = start_closure  # type:ignore[method-assign]

I was wondering if there could be a way to not monkey-patch

My understanding as per https://bugs.python.org/issue14073 is that there is no other way, but we could mention our use case as another situation motivating the introduction of start/exit hooks/callbacks to threads (which is something Python committers have considered in the past, but I presume it was not a sufficiently high priority until now).

We crossed 'comments', see my other thread on this exact line for (what I think is) a better solution.

I don't think it is thread-safe. If a thread starts a new thread, that new thread will start outputting in the last executed cell. I think this approach might work better.

To summarize the strategy, the ctor (

Correct (there is no such problem with the asyncio side of things).
Thank you for the link, I will take a look!

Taking a closer look at solara,

    from threading import Thread
    from time import sleep

    def child_target():
        for i in range(iterations):
            print(i, end='', flush=True)
            sleep(interval)

    def parent_target():
        thread = Thread(target=child_target)
        sleep(interval)
        thread.start()

    Thread(target=parent_target).start()

but still not with:

    def parent_target():
        sleep(interval)
        Thread(target=child_target).start()

    Thread(target=parent_target).start()

do I see this right?

In the end my implementation converged to overriding the same methods as yours after all in e1258de :)
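
The nested-thread concern raised in this thread can be illustrated with a small, self-contained sketch. It is not the code from this PR or from solara: `thread_parents`, `InheritingThread`, and `demo` are hypothetical names, and a plain dict stands in for `OutStream._thread_parents`. The idea, assuming the parent header should follow whichever thread created the new one, is to look up the creator's entry in the constructor (which runs in the creating thread) and to register the new thread in `run` (which runs in the new thread, before the target can print anything).

```python
import threading

# Hypothetical registry: thread ident -> parent header. This is only a
# stand-in for the `_thread_parents` mapping discussed above.
thread_parents = {}


class InheritingThread(threading.Thread):
    """Thread that inherits the parent header of the thread that created it."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # The constructor runs in the creating thread, so get_ident()
        # returns the creator's identity here.
        self._inherited_parent = thread_parents.get(threading.get_ident())

    def run(self):
        # run() executes in the new thread, before the target is called.
        if self._inherited_parent is not None:
            thread_parents[threading.get_ident()] = self._inherited_parent
        super().run()


def demo():
    seen = []

    def child_target():
        seen.append(thread_parents.get(threading.get_ident()))

    def parent_target():
        # The child is created and started from inside another thread.
        child = InheritingThread(target=child_target)
        child.start()
        child.join()

    # Pretend the current thread is executing the cell with this header.
    thread_parents[threading.get_ident()] = {"msg_id": "cell-1"}
    parent = InheritingThread(target=parent_target)
    parent.start()
    parent.join()
    return seen
```

In this sketch the child resolves its parent through the thread that constructed it rather than through whichever cell ran last, which is the property the reviewer was asking about. A subclass is used here only to keep the example free of global monkey-patching.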

    def _clean_thread_parent_frames(
        self, phase: t.Literal["start", "stop"], info: t.Dict[str, t.Any]
    ):
        """Clean parent frames of threads which are no longer running.
        This is meant to be invoked by garbage collector callback hook.

        The implementation enumerates the threads because there is no "exit" hook yet,
        but there might be one in the future: https://bugs.python.org/issue14073

        This is a no-op if the `self._stdout` and `self._stderr` are not
        sub-classes of `OutStream`.
        """
        # Only run before the garbage collector starts
        if phase != "start":
            return
        active_threads = {thread.ident for thread in threading.enumerate()}
        for stream in [self._stdout, self._stderr]:
            if isinstance(stream, OutStream):
                thread_parents = stream._thread_parents
                for identity in list(thread_parents.keys()):
                    if identity not in active_threads:
                        try:
                            del thread_parents[identity]
                        except KeyError:
                            pass


# This exists only for backwards compatibility - use IPythonKernel instead

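
The cleanup hook in the hunk above relies on `gc.callbacks`, whose entries are called with a phase (`"start"` or `"stop"`) and an info dict. The following standalone sketch shows the same pruning idea outside the kernel. The names `thread_parents`, `prune_dead_threads`, and `demo` are hypothetical, and a module-level dict is assumed in place of the stream attribute.

```python
import gc
import threading

# Hypothetical registry: thread ident -> parent header.
thread_parents = {}


def prune_dead_threads(phase, info):
    """gc callback: drop entries whose thread is no longer running."""
    if phase != "start":
        return
    alive = {thread.ident for thread in threading.enumerate()}
    for ident in list(thread_parents):
        if ident not in alive:
            thread_parents.pop(ident, None)


def demo():
    # A worker that has already finished: its ident is no longer active.
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()
    thread_parents[worker.ident] = {"msg_id": "cell-1"}
    thread_parents[threading.get_ident()] = {"msg_id": "cell-2"}

    if hasattr(gc, "callbacks"):
        gc.callbacks.append(prune_dead_threads)
        try:
            gc.collect()  # triggers the callback with phase "start"
        finally:
            gc.callbacks.remove(prune_dead_threads)
    else:
        # Fallback for interpreters without gc.callbacks.
        prune_dead_threads("start", {})
    return dict(thread_parents)
```

Only the entry for the still-running calling thread should survive. Tying the cleanup to garbage collection means stale entries can linger until the next collection, which seems acceptable when the goal is only to bound the size of the mapping.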

Curious why the ContextVar and the _thread_parents dict? Is a threading.local not more standard?

Because ContextVar works well for asyncio edge cases. There is some more explanation in PEP 567, but the gist is:
That said, I will take another look at using threading.local instead of _thread_parents.

I understand that we need ContextVar now, because also when we create new tasks we want to output to the right output cell, didn't think of that!
I think if we use ContextVar, we do not need the thread local storage, it's a superset of threading.local. In combination with overriding the Thread ctor, we can drop _thread_parents.
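
The asyncio point made in these comments can be seen in a short sketch. It is illustrative only and does not reproduce the kernel's code: `parent_var`, `parent_local`, and the `"cell-1"`/`"cell-2"` values are made-up names. It assumes the documented behaviour that `asyncio.create_task` runs the coroutine in a copy of the context that is current when the task is created, whereas a `threading.local` value is shared by every task running on the same thread.

```python
import asyncio
import contextvars
import threading

# Hypothetical names, for illustration only.
parent_var = contextvars.ContextVar("parent_header", default=None)
parent_local = threading.local()


async def report():
    # Yield once so the task body resumes after the "next cell" has started.
    await asyncio.sleep(0)
    return parent_var.get(), getattr(parent_local, "value", None)


async def main():
    # "Cell 1" is executing and schedules a background task.
    parent_var.set("cell-1")
    parent_local.value = "cell-1"
    task = asyncio.create_task(report())  # context is copied here

    # "Cell 2" starts on the same thread before the task finishes.
    parent_var.set("cell-2")
    parent_local.value = "cell-2"
    return await task


def demo():
    return asyncio.run(main())
```

Here the task still sees `"cell-1"` through the `ContextVar`, but `"cell-2"` through the `threading.local`, which is why a thread-local alone would misroute output from tasks.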