
Investigate creating a Custom Scheduler to schedule TaskRun pods #3052

Closed
jlpettersson opened this issue Aug 3, 2020 · 11 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jlpettersson
Member

Feature request

Investigate whether it would be feasible to create a Custom Scheduler for scheduling TaskRun pods, e.g. co-scheduling pods that share a workspace PVC volume.

Use case

When the affinity assistant was introduced, it solved problems with concurrent access to workspace volumes and deadlocks when pods were scheduled to different availability zones.

Using pod-affinity to achieve node affinity for TaskRun pods was the least complex of the solutions that were evaluated.

The current solution works for common cases, but it is not perfect. For example, there may be problems when TaskRuns require different amounts of resources and the nodes need to be autoscaled up, as in #3049.

Adding a custom scheduler would probably introduce more complexity and code, but it would likely solve the problem in a more generic way than the Affinity Assistant does.
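For context, the pod-affinity that the Affinity Assistant relies on boils down to a term like the following on each TaskRun pod. This is a minimal sketch using the upstream corev1 Go types; the label key and value are illustrative, not the exact labels Tekton sets:

```go
package affinity

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// affinityToAssistant builds a required pod-affinity term that pulls a
// TaskRun pod onto the same node as the placeholder "affinity assistant"
// pod for its workspace. The label key/value here are illustrative only.
func affinityToAssistant(assistantName string) *corev1.Affinity {
	return &corev1.Affinity{
		PodAffinity: &corev1.PodAffinity{
			RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{{
				LabelSelector: &metav1.LabelSelector{
					MatchLabels: map[string]string{
						"app.kubernetes.io/instance": assistantName, // hypothetical label
					},
				},
				TopologyKey: "kubernetes.io/hostname", // co-locate on one node
			}},
		},
	}
}
```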

jlpettersson added the kind/feature label Aug 3, 2020
@dlorenc
Contributor

dlorenc commented Aug 7, 2020

It seems like writing a custom scheduler is pretty straightforward: https://github.com/kelseyhightower/scheduler

But dealing with edge cases would probably be a lot of effort. I wonder if it would be possible to write a best-effort scheduler that runs first but bails out to the real one in complex situations.
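For the record, the skeleton of such a standalone scheduler is small: list (or watch) pods that request a dedicated `schedulerName` and have no node yet, pick a node, and create a Binding. A hedged sketch with client-go, doing a single pass for brevity; the scheduler name is made up, and the node choice is deliberately naive, since resources, volumes, taints and the other edge cases are exactly the hard part:

```go
package main

import (
	"context"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

const schedulerName = "tekton-scheduler" // hypothetical name

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	ctx := context.Background()

	// Pods that asked for this scheduler and have not been assigned a node.
	pods, err := client.CoreV1().Pods("").List(ctx, metav1.ListOptions{
		FieldSelector: "spec.schedulerName=" + schedulerName + ",spec.nodeName=",
	})
	if err != nil {
		log.Fatal(err)
	}

	nodes, err := client.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	if len(nodes.Items) == 0 {
		log.Fatal("no nodes available")
	}

	for _, pod := range pods.Items {
		// Naive placement: always the first node. A real scheduler would
		// score nodes and respect resources, volumes, taints, etc.
		target := nodes.Items[0].Name
		binding := &corev1.Binding{
			ObjectMeta: metav1.ObjectMeta{Name: pod.Name, Namespace: pod.Namespace},
			Target:     corev1.ObjectReference{Kind: "Node", Name: target},
		}
		if err := client.CoreV1().Pods(pod.Namespace).Bind(ctx, binding, metav1.CreateOptions{}); err != nil {
			log.Printf("binding %s/%s: %v", pod.Namespace, pod.Name, err)
		}
	}
}
```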

@denkensk

I suggest we evaluate whether the Affinity Assistant can be implemented with the Scheduling Framework.
FYI: https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/20180409-scheduling-framework.md

And maybe we can enhance the coscheduling plugin to support this requirement: https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling
@jlpettersson @dlorenc
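To make the comparison concrete, here is a hedged sketch of what an Affinity-Assistant-like rule could look like as a scheduling-framework Filter plugin: reject a node when a pod of the same PipelineRun (identified by the `tekton.dev/pipelineRun` label that Tekton puts on TaskRun pods) is already placed on a different node. Package paths and plugin interfaces have shifted between Kubernetes releases, so treat this as a sketch against a recent `k8s.io/kubernetes/pkg/scheduler/framework`:

```go
package samenode

import (
	"context"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/kubernetes/pkg/scheduler/framework"
)

const (
	// Name is how the plugin is referenced in the scheduler configuration.
	Name = "TaskRunSameNode"
	// pipelineRunLabel is the label Tekton sets on TaskRun pods.
	pipelineRunLabel = "tekton.dev/pipelineRun"
)

// SameNode keeps all TaskRun pods of one PipelineRun on a single node,
// mimicking what the Affinity Assistant achieves with pod-affinity.
type SameNode struct {
	handle framework.Handle
}

var _ framework.FilterPlugin = &SameNode{}

// New is the factory registered with the scheduler when the plugin is
// enabled in its configuration.
func New(_ runtime.Object, h framework.Handle) (framework.Plugin, error) {
	return &SameNode{handle: h}, nil
}

func (s *SameNode) Name() string { return Name }

// Filter rejects a node if a pod of the same PipelineRun is already placed
// on a different node; otherwise the node is acceptable.
func (s *SameNode) Filter(ctx context.Context, _ *framework.CycleState, pod *v1.Pod, nodeInfo *framework.NodeInfo) *framework.Status {
	pr, ok := pod.Labels[pipelineRunLabel]
	if !ok {
		return nil // not part of a PipelineRun, nothing to enforce
	}
	allNodes, err := s.handle.SnapshotSharedLister().NodeInfos().List()
	if err != nil {
		return framework.NewStatus(framework.Error, err.Error())
	}
	for _, n := range allNodes {
		for _, pi := range n.Pods {
			if pi.Pod.Labels[pipelineRunLabel] != pr {
				continue
			}
			// A sibling TaskRun pod is already placed; only the node it
			// runs on may accept further pods of this PipelineRun.
			if n.Node().Name != nodeInfo.Node().Name {
				return framework.NewStatus(framework.Unschedulable,
					"pods of a PipelineRun must share a node")
			}
		}
	}
	return nil // no sibling placed elsewhere, this node is fine
}
```

Registering the plugin and wiring it into a scheduler profile would presumably follow the same pattern as the coscheduling plugin linked above.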

@vincent-pli
Member

@denkensk
That's very interesting; I'd like to give it a try.

@imjasonh
Member

I think a good concrete next step here would be for someone to experiment/prototype with the scheduler framework and report back to the community with any findings/demos, helping us concretely understand what the code would look like to, for instance, replace the Affinity Assistant with custom scheduling.

Based on those findings we could start a design doc to more concretely outline requirements and next steps, or maybe determine that delving into scheduling really isn't worth the effort and shouldn't be pursued at this time.

@vincent-pli is that something you'd be interested in exploring and driving?

@vincent-pli
Member

@imjasonh
Yes, as @dlorenc mentioned, a custom scheduler is straightforward, and I can copy/paste from https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling as @denkensk suggested.

Anyway, I will make a demo and report back here.

@vincent-pli
Member

vincent-pli commented Oct 3, 2020

@imjasonh @denkensk
I have made a very rough draft implementation here: https://github.com/vincent-pli/coscheduler-same-node

Please take a look.

@jlpettersson
Member Author

This looks cool @vincent-pli
What is the next step? Can I help?

We should probably imitate the logic of the Affinity Assistant in a scheduler and add it to the experimental repository, so that we can eventually replace the Affinity Assistant with the scheduler.

@vincent-pli
Member

@jlpettersson @imjasonh
I think we can add it to the experimental repository first, then have further discussion about the implementation.
I'm glad to create a PR for it if needed.

@imjasonh
Member

> @jlpettersson @imjasonh
> I think we can add it to the experimental repository first, then have further discussion about the implementation.
> I'm glad to create a PR for it if needed.

That would be great; I'd be happy to do reviews and approve PRs to add it to experimental. If we decide to try to move it into Tekton core we'd need a TEP, but it sounds like it should be usable without that in the near term at least.

Thanks!

imjasonh reopened this Oct 30, 2020
@vincent-pli
Member

Great, let's add it to the experimental repo first.

vincent-pli added a commit to vincent-pli/experimental that referenced this issue Oct 30, 2020
…same volume but may run on different nodes.

This is a draft version that tries to introduce the `Scheduler framework` to handle the issue; for now we adopt the `affinity assistant`, but it has trouble measuring total resource requirements.

For details, please check issue tektoncd/pipeline#3052.

I think we will enhance it soon based on further discussion, thanks.
vincent-pli added a commit to vincent-pli/experimental that referenced this issue Oct 31, 2020
vincent-pli added a commit to vincent-pli/experimental that referenced this issue Nov 2, 2020
tekton-robot pushed a commit to tektoncd/experimental that referenced this issue Nov 6, 2020
@jlpettersson
Member Author

After analyzing this a bit deeper with the design doc "Task parallelism when using workspace" and the subsequent discussions in the API WG in December, I don't see a custom scheduler helping us much with these problems, while it adds more code and complexity and perhaps introduces new problems.

I think we can close this. I think the alternative described in #3638 might help us more.

@vincent-pli let me know if you have a different standpoint after having contributed to this.

Closing this for now.

chandanikumari pushed a commit to chandanikumari/experimental that referenced this issue Jan 27, 2021