Are servers culled if there are busy kernels? #10
Thanks for opening this issue, @stevenstetzler! I would like to have this functionality here too. Right now, the culler makes API requests only to JupyterHub, not to the individual notebooks. We could potentially change this and have it make requests to each notebook, which gives us more flexibility to do things like this. However, as of now, we'll have to find a way to get this info into the JupyterHub 'last activity' API reported by jupyterhub.singleuser for the culler to know about it. You could do some of this with the notebook config: once the notebook process dies, the pod can be garbage collected. However, I'm not sure that's a good long-term solution. @minrk would know more.
Short answer: correct, it's not available now, and a qualified "yes" on feasibility, depending on your experience. To do this, you would need to write a new culler that retrieves activity data directly from the single-user servers instead of considering only the information in the Hub API. This is doable, but requires:
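A minimal sketch of what such a culler could look like, assuming a Hub admin API token and default API paths (the function names and URL layout here are illustrative, not part of any existing culler):

```python
import json
import urllib.request


def has_busy_kernel(kernels):
    """Return True if any kernel listed by /api/kernels reports itself busy."""
    return any(k.get("execution_state") == "busy" for k in kernels)


def fetch_json(url, token):
    # Works for both the Hub API and single-user server endpoints,
    # since both accept token auth via the Authorization header.
    req = urllib.request.Request(url, headers={"Authorization": f"token {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def cullable_users(hub_url, token):
    """Yield names of users whose running servers have no busy kernels."""
    for user in fetch_json(f"{hub_url}/hub/api/users", token):
        server_path = user.get("server")  # e.g. "/user/alice/"
        if not server_path:
            continue  # no running server, nothing to cull
        kernels = fetch_json(f"{hub_url}{server_path}api/kernels", token)
        if not has_busy_kernel(kernels):
            yield user["name"]
```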
Long answer: for fine-grained culling, I do think the notebook server itself has the best control, since it can do things like cull idle kernels, consider active connections and execution_state as activity sources or not, etc. I actually don't think using the notebook config for culling is a bad solution. When working in concert with the hub culler, it's best if the notebook's internal culler is strictly more aggressive than the Hub activity culler, since the Hub has only a single timestamp to consider, while the notebook's internal logic has more fine-grained parameters. The single-user server does publish the notebook's own last_activity, collected here, so this is an input to the Hub culler. I don't believe there is currently a mechanism to treat long-running 'busy' kernels as activity that propagates, though. I opened jupyterhub/jupyterhub#3101 because I think we can update singleuser's activity-posting to publish more generic metrics, which would enable this feature at the hub-culler level in a more flexible way, but that's a longer-term project, I think.
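For reference, the notebook server's internal culler mentioned above is driven by configuration traits along these lines (a sketch for `jupyter_notebook_config.py`; the timeout values are illustrative):

```python
# jupyter_notebook_config.py -- internal culling knobs (notebook >= 5.1)

# Shut down kernels that have been idle for more than an hour.
c.MappingKernelManager.cull_idle_timeout = 3600
# Check for idle kernels every five minutes.
c.MappingKernelManager.cull_interval = 300
# Do NOT cull kernels that are busy executing code.
c.MappingKernelManager.cull_busy = False
# Cull idle kernels even if a browser is still connected.
c.MappingKernelManager.cull_connected = True

# Shut down the whole server after an hour with no activity at all
# (no kernels, no terminals, no API traffic). notebook >= 5.4.
c.NotebookApp.shutdown_no_activity_timeout = 3600
```

With `cull_busy = False` (the default), a kernel running a long computation is left alone by the internal culler, which is the behavior asked about in this issue; the Hub-level culler, however, only sees the single last_activity timestamp.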
This issue has been mentioned on the Jupyter Community Forum. There might be relevant details there:
I did use the notebook's internal culler configs as suggested here, because I wanted culling functionality similar to @stevenstetzler's.
When I try using an image built on top of docker-stacks' notebooks, the TerminalManager no longer gets initialized, and I cannot see any logs for TerminalManager. This is the modified image I'm using.
Am I doing something wrong that causes the TerminalManager to not get initialized in the modified image, and therefore leads to terminals (and subsequently, pods) not being terminated?
@dipen-epi I think the functionality was added in notebook version 6.1 (jupyter/notebook#5372), and I see you're building from 6.0.3, so it's quite logical that it would not work.
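For anyone hitting the same problem: once on notebook >= 6.1, terminal culling is configured with the TerminalManager traits added in jupyter/notebook#5372 (a sketch; the values are illustrative):

```python
# jupyter_notebook_config.py -- terminal culling (requires notebook >= 6.1)

# Close terminals that have been inactive for 20 minutes.
c.TerminalManager.cull_inactive_timeout = 1200
# Check for inactive terminals every five minutes.
c.TerminalManager.cull_interval = 300
```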
I'm also interested in not dropping containers if there is activity in them. Colleagues claim to be able to leave ML processes running in the background, but those are actually being dropped.
The topic of this issue is a question, but I think there are related action points. The actual action points are already represented by concrete issues of their own, though.
With these, one can implement custom metrics and take actions based on them, or opt to configure the notebook server's internal culler mechanisms, etc. I'd like to close this issue, as it has no concrete action point of its own as I see it. If it does have one that I missed, I suggest opening a new issue focused on that.
When using JupyterHub, users will sometimes start a long-running computation in a Jupyter notebook and then leave the notebook server otherwise inactive (closing their laptop or the JupyterHub web page). I am hoping to have servers culled only if there isn't a busy kernel running. Does the jupyterhub idle culler take into account that there may be no server activity from the user, but there may still be a running kernel?
I've traced how the latest activity from the server is computed: the server activity is sent to JupyterHub (jupyterhub.singleuser), computed as the max of the latest activity from the server API, kernel activity, and terminal activity (notebook.notebookapp), and the kernel activity is updated only when there is kernel communication (notebook.services.kernels.kernelmanager). Based on this, it doesn't seem that the culler takes into account whether the kernel is still active when deciding to cull a server (instead deciding not to cull only if the kernel has recently been interacted with).
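As a rough sketch of the tracing above: the last_activity reported to the Hub is effectively a maximum over per-source timestamps, so a kernel that is busy but hasn't emitted a message in a while still looks idle (function and argument names here are illustrative, not the actual notebook internals):

```python
from datetime import datetime, timedelta


def server_last_activity(api_activity, kernel_activities, terminal_activities):
    """Combine activity sources the way the notebook server reports them:
    the most recent timestamp across the API, kernels, and terminals."""
    return max([api_activity, *kernel_activities, *terminal_activities])


# A kernel deep in a long computation stopped producing messages 2h ago:
now = datetime(2020, 7, 1, 12, 0)
activity = server_last_activity(
    api_activity=now - timedelta(hours=3),
    kernel_activities=[now - timedelta(hours=2)],  # busy, but no new messages
    terminal_activities=[now - timedelta(hours=4)],
)
# `activity` is two hours stale, so a culler that only compares this
# timestamp against a timeout would cull the server despite the busy kernel.
```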
Could anyone confirm that this feature isn't available in the idle culler? If it isn't, would it be feasible to implement?
Kernel status is available through the notebook REST API, which includes an `execution_state` key. Additionally, it looks like the user object returned from the JupyterHub REST API as used in the idle culler has a `server` key that can generate the above `<server-url>`, so I can see how it might be implemented. If there's interest, is this the right path to go towards implementing this behavior for a pull request?
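A small sketch of that path, assuming the culler already has a user model from the Hub API in hand (the helper names are hypothetical):

```python
def kernels_endpoint(hub_public_url, server_path):
    """Build the single-user server's kernels API URL from the `server`
    key of a user model returned by the JupyterHub REST API."""
    # server_path looks like "/user/alice/"; the kernels API lives under it.
    return hub_public_url.rstrip("/") + server_path.rstrip("/") + "/api/kernels"


def should_cull(kernels):
    """Only cull when no kernel reports execution_state == "busy"."""
    return all(k.get("execution_state") != "busy" for k in kernels)
```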