Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python debugging support #3075

Merged
merged 12 commits into from
Feb 3, 2023
Merged

Conversation

niloc132
Copy link
Member

@niloc132 niloc132 commented Nov 7, 2022

This commit adds a configuration option to customize engine-created
threads, and provides a default implementation that will register those
threads for debugging with pydevd if python is enabled.

As of this commit, pydevd debugging seems to work correctly, but
VSCode's debugging doesn't work for all threads yet.

Partial #2997

Comment on lines +14 to +15
/* private */ String[] CONFIGURED_INITIALIZATION_TYPES =
Configuration.getInstance().getStringArrayFromProperty("thread.initialization");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work if the property doesn't exist or is empty?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fails if it does not exist, but succeeds if empty (with an empty array).

Due to how defaults work (or rather, don't work), I was under the impression we wanted to generally avoid using them, so dh-defaults.prop now contains this property.

py/server/deephaven/__init__.py Outdated Show resolved Hide resolved
Comment on lines +27 to +30
// First call in to create a custom function that has the same name as the Java thread (plus a prefix)
PyObject runnableResult = py_deephaven.create_thread_entry(Thread.currentThread().getName());
// runnable.run();
// Invoke that function directly from Java, so that we have only this one initial frame
runnableResult.call("__call__", runnable);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could do this with a single method. The naming would probably change:

py_deephaven.run_for_java_thread(Thread.currentThread().getName(), runnable) or similar

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, and that's how the first implementation worked. However, if we wanted the top-most frame to have a specific name (so that you can tell what Java thread something was running on by looking at the py stack trace, which doesn't come with thread info), we would need to create a custom def to return in the call - the debugger at least loses the __name__ and __qualname__ properties that are assigned to rename defs. As such, we may need to return a pyobject so that it can be called again.

And, if we just use the runnable.run() above (presently commented out), then we don't need to pass runnable at all, so the signature stays the same. In that case however it appears to the debugger that any pydevd.settrace() call in the web IDE actually pauses in the create_change_list function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this provide the same benefits?

def run_for_java_thread(name, runnable):
    ...
    def JavaThread():
        runnable.run()
    JavaThread()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Changing the function's name (via declaring JavaThread not by name, but through eval) so the thread appears: technically yes, but by adding two frames instead of one, so the topmost frame isn't named in this way.
  • Adding one py frame earlier in the stack than the current code: technically yes, but again, because it adds two frames instead of one, we might be making it harder to understand instead of easier

Remember, these run once per thread, when the thread is created, so on the order of 10ish times on startup and once when the first python command is executed from the web console, so paying the extra jni/py round trip is pretty cheap.

Comment on lines +27 to +30
// First call in to create a custom function that has the same name as the Java thread (plus a prefix)
PyObject runnableResult = py_deephaven.create_thread_entry(Thread.currentThread().getName());
// runnable.run();
// Invoke that function directly from Java, so that we have only this one initial frame
runnableResult.call("__call__", runnable);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This technique of ensuring the java frame includes python in the Runnable stack does rely on the fact that python drops the GIL when calling into java. jpy-consortium/jpy#48 nothing actionable, but good to note it wouldn't work otherwise.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, but the fact that DHaaL works at all relies on that fact as well, the entire library would deadlock on the gil and ugp if we didn't make this assumption.

public interface ThreadInitializationFactory {
/* private */ String[] CONFIGURED_INITIALIZATION_TYPES =
Configuration.getInstance().getStringArrayFromProperty("thread.initialization");
/* private */ List<ThreadInitializationFactory> INITIALIZERS = Arrays.stream(CONFIGURED_INITIALIZATION_TYPES)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering, do we think there are occasions for more than 1 ThreadInitializationFactory?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have assumed for now that each factory should be created once, but it if wants per-thread state, it should do so by putting that state in the createInitializer method itself, or in the returned Runnable. By only instantiating these once, we grant control to each implementation what kind of scope it wants to have.

For example, the current DebuggingInitializer never closes the deephaven module (in its current form, it should), but could be modified to instead avoid paying the cost of re-creating that proxy for each thread (not just UGP threads, but also the DeephavenApiServerModule scheduler threads too).

Comment on lines +35 to +42
static Runnable wrapRunnable(Runnable runnable) {
Runnable acc = runnable;
for (ThreadInitializationFactory INITIALIZER : INITIALIZERS) {
acc = INITIALIZER.createInitializer(acc);
}
return acc;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be sure I understand, the reason for this pattern is to allow for initializer-specific cleanup on exit? Otherwise, a pattern of iterative runnable invocations would seem less error-prone. I suppose either version allows arbitrary hijacking of the intended run method; the current pattern could skip calling the wrapped runnable, but the iterative approach could just never return or throw an exception on termination.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't that we need to do cleanup on the way out per se, but that we want to insert a stack frame in python. If we end up dropping this frame (which has some weird side effects when you "step out" of the apparent top-level frame), then we could just run one after the other, no wrapping.

The current impl has its option of how it wants to do it - I played with an API that would offer a DSL for either wrapping or prefixing, but didn't really love it, I can try bringing that back if you'd prefer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, in light of the arm64 issues, a wrapper is necessary. The question then will be how we name that, if JavaThread is clear and simple enough, or if we want to use it to get some more info to the user.

// python not enabled, don't accidentally start it
return runnable;
}
DeephavenModule py_deephaven = (DeephavenModule) PyModule.importModule("deephaven")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note from @rcaudy : I haven't reviewed the bottom of this file past this point, or py/server/deephaven/__init__.py yet.

@niloc132 niloc132 changed the title Py debugging in ugp Python debugging support Feb 3, 2023
Copy link
Member

@rcaudy rcaudy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for Python server codeowners, no interface code here.

@niloc132 niloc132 merged commit 46ef898 into deephaven:main Feb 3, 2023
@github-actions github-actions bot locked and limited conversation to collaborators Feb 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants