Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Adding taint during image pre-pulling to prevent user pod being scheduled #3011

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

xcompass
Copy link
Contributor

@xcompass xcompass commented Feb 3, 2023

This is to solve the issue in #3010.

This PR is to add two additional init-containers (first one and last one) for image puller. The first init-container is used to add a taint to prevent user pods being scheduled while the single user image is being pulled. The last init-container removes the taint added by the first one so the user pods can be scheduled on the node. Both init-containers are using the same binary called taintmanager with different parameter for adding and removing the taints.

A new service account is also added to grant permission to call K8s API to manage taint on the node. The taintmanager is using in-cluster service account token to access K8s API. The taint can be customized through helm value.

@welcome
Copy link

welcome bot commented Feb 3, 2023

Thanks for submitting your first pull request! You are awesome! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please make sure you followed the pull request template, as this will help us review your contribution more quickly.
welcome
You can meet the other Jovyans by joining our Discourse forum. There is also a intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉

@consideRatio
Copy link
Member

consideRatio commented Feb 3, 2023

Thank you for your work on this @xcompass!! ❤️ 🌻

Can you what you wish to accomplish, and how how it is accomplished in the PR description?

Also, the docker image - where is the Dockerfile and its bundled source code defined?

This will be a quite advanced PR to review, so we will need all the help we can get to consider this.

@xcompass
Copy link
Contributor Author

xcompass commented Feb 7, 2023

@consideRatio Please checkout the PR description. The docker image is linked in the description as well. Let me know if there is anything else you would like me to add.

@xcompass
Copy link
Contributor Author

xcompass commented Feb 7, 2023

Tagging @yuvipanda @donotpush @Lastcysa for visibility and possible feedbacks

@xcompass
Copy link
Contributor Author

xcompass commented Feb 7, 2023

I'll work on failed tests next a few days.

@yuvipanda
Copy link
Collaborator

Ok, so I absolutely love the idea! I think if we can implement this cleanly, it's a major win to cold-start problems that exist. I too can't seem to find the source for the images, etc though - they should probably live in this repo as well, and be built by chartpress.

The two things to test are:

  1. How does the autoscaler react when the taint is present? Does it go on an infinite loop of adding new nodes because the user nodes can't be scheduled on the tainted nodes? I think figuring this out is going to be important.
  2. What is the failure mode? What happens when the code crashes, and leaves the taint in after the pull too? If the situation resolves itself soon (probably by autoscaler claiming the failed node), I think this is fine.
  3. Is there a latency penalty where the user pod is scheduled onto the node before the tainting can happen?

Anyway, thank you so much for working on this!

@consideRatio
Copy link
Member

Thank you @xcompass, now I understand the strategy and think your PR description helps others understand the strategy.

I'd like review effort to remain at a very high level still. I'll switch back to #3010 to help narrow down exactly what you see as the problem that you want to address with this. I don't think the issue description captures that at the moment.

@yuvipanda
Copy link
Collaborator

I would say that importing https://github.com/ubc/taintmanager into https://github.com/jupyterhub/zero-to-jupyterhub-k8s/tree/main/images, and shaping it to be as similar to https://github.com/jupyterhub/zero-to-jupyterhub-k8s/tree/main/images/image-awaiter as possible would also help a lot with getting this reviewed. We have limited go expertise in the jupyterhub community, and making this new go thing be as close to the existing go thing we have will help a lot!

@yuvipanda
Copy link
Collaborator

The trick that I'm happy to see here and I hadn't considered before is that because initContainers execute sequentially, we can use it to:

  1. Taint nodes,
  2. Pull user image
  3. untaint node

Which massively simplifies the process vs a full controller as I was thinking of before.

@yuvipanda
Copy link
Collaborator

sorry, I didn't see your comments @consideRatio! I'll let @xcompass provide more info now, as I'd assume they have a more up-to-date understanding of the problem right now than I do :)

Copy link
Collaborator

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding those here! I am going to be off for a few days, but left a couple preliminary comments to simplify this further, keeping in mind our ability to maintain this long term with limited go expertise.

I think by falling back to kubeconfig auth if we aren't in cluster, we can get rid of the tiltfile as well as all the extra YAML bits needed for testing and do quick tests locally. We will need to add some tests that exercise a z2jh config with this tainting though.

Excited to see progress on this!

@@ -0,0 +1,25 @@
# syntax=docker/dockerfile:1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Atleast to start with and to keep the amount of required go knowledge to a minimum, what do you think about doing this build the same way as https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/main/images/image-awaiter/Dockerfile? I think primarily that means using scratch as the final target rather than distroless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason I'm using distroless is to run taintmanager as nonroot user to be more secure. If it is not a concern, I can change it to scratch.

}

// creates the in-cluster config
config, err := rest.InClusterConfig()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you fallback to kubeconfig here if inclusterconfig is not available? That would allow us to test this locally much more easily, and get rid of all the extra YAML bits and tiltfile required to simply test it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. out-cluster kubeconfig and node name parameter are added.

Copy link
Collaborator

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! My next question is how do you think we can write a unit / integration test for this here?

@@ -0,0 +1,8 @@
apiVersion: rbac.authorization.k8s.io/v1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given we can test this by running the taintmanager locally, are these test files still needed?

@@ -559,6 +559,9 @@ scheduling:
operator: Equal
value: core
effect: NoSchedule
- key: hub.jupyter.org/imagepulling
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image-pull-in-progress maybe?

required: [enabled]
description: |
Manage the taints during the single user image prepulling. It will add
sepcified taint before image pulling and remove it after it's done.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"sepcified" -> "specified"

@mnp
Copy link

mnp commented Mar 20, 2024

Did this get dropped? We probably would benefit.

@xcompass
Copy link
Contributor Author

I'll bring it up-to-date people are still interested in it.

@yuvipanda
Copy link
Collaborator

Yeah i still think this would be pretty useful if we can get it to work correctly!

The taint manager is a go binary to add/remove taint for a node.
This is to add support for out-cluster development and testing.
Also removed Titlefile and deployment manifest as this can be tested
outside the cluster
@xcompass xcompass changed the title Adding taint during image pre-pulling to prevent user pod being scheduled [WIP] Adding taint during image pre-pulling to prevent user pod being scheduled Apr 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants