Introduce topology into the runtimeClass API #75744

Merged on May 31, 2019 (9 commits)

yastij
Member

@yastij yastij commented Mar 26, 2019

What type of PR is this?

/kind api-change
/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes: Fixes #72413

Special notes for your reviewer:

/assign @tallclair

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 26, 2019
@yastij
Member Author

yastij commented Mar 26, 2019

/sig scheduling
/sig node
/priority important-soon

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/node Categorizes an issue or PR as relevant to SIG Node. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 26, 2019
@yastij yastij force-pushed the runtimeclass-scheduling-api branch 2 times, most recently from 212e4e5 to 352d9d5 Compare March 26, 2019 21:43
@yastij yastij changed the title from "introduce topology into the runtimeClass API" to "WIP: introduce topology into the runtimeClass API" Mar 26, 2019
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2019
@yastij yastij force-pushed the runtimeclass-scheduling-api branch from 352d9d5 to d253246 Compare March 26, 2019 23:41
@k82cn
Member

k82cn commented Mar 27, 2019

any KEP or discussion about that?

@yastij
Member Author

yastij commented Mar 27, 2019

Ref kubernetes/enhancements#909

This is mostly a placeholder/WIP, as this is still in review.

@tallclair
Member

Please update this to match the KEP (NodeSelectorTerm -> NodeSelector)

@yastij yastij force-pushed the runtimeclass-scheduling-api branch from d253246 to 8f5e8fd Compare April 4, 2019 20:16
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 4, 2019
@yastij yastij force-pushed the runtimeclass-scheduling-api branch 3 times, most recently from 84b7afc to bd2b7aa Compare April 4, 2019 22:23
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Apr 19, 2019
@yastij yastij force-pushed the runtimeclass-scheduling-api branch from e00cf99 to 02288bc Compare April 19, 2019 23:52
@yastij yastij force-pushed the runtimeclass-scheduling-api branch from b6685a9 to 3285db8 Compare April 20, 2019 00:41
@yastij
Member Author

yastij commented Apr 20, 2019

/retest

@yastij yastij mentioned this pull request May 3, 2019
Member

@saad-ali saad-ali left a comment

Meeting Notes From the API review meeting today with @thockin

  • Why not something more generic? Maybe like a node capabilities struct (maybe on the node resource), where a pod requests a node capability of "runtime windows"?
    • Not considered in this context. Considered SchedulingPolicy context.
    • Don't want everything complicated (bespoke features) getting dumped on scheduler. In this case Scheduler will have new predicate for RuntimeClass.
  • RuntimeClass comes with list of tolerations. Why is that not sufficient for this scheduling?
    • Taint all Windows nodes as windows? Because you could have overlap. Could have nodes that run kata, gvisor, and some that overlap with runC. Need tolerations for the union.
  • Why is Node Selector not good enough?
    • If every pod had RuntimeClass and you have labels set up you don't need taints and tolerations. But you could make same argument for general taints and tolerations (if labels are set up on everything).
    • Taints and tolerations let you set default instead of just match or not match.
  • Why is it insufficient to label pod with node selector that says OS==Windows?
    • Original motivator in case of gvisor nodes -- don't want non-gvisor pods (non sandboxed) -- taint helps repel default pods.
  • If we had defaulting could get rid of tolerations?
    • Yes, all *Class types should have a default. Worth looking into how StorageClass does it.
  • Talking about adding scheduler logic to add selector to intersect with node selector?
    • Yes. Let us have a nice error message -- if you can't schedule due to what's on pod vs what's on runtime class.
  • Back to the original question: why not make this more generic? Node capabilities, and make RuntimeClass into a node capability. Worth thinking about.
    • Concept of error message is nice, but if we make it general, we don't need to do this again.
  • What is the timeline?
    • Hoping to get it into 1.15 as beta. But we could call it alpha and put it behind a feature gate.
    • General solution may not be that much more work. Let's set up a follow-up discussion with SIG-Node @dchen1107.
  • Why label selector on nodes -- instead of first class thing on nodes?
    • Using labels means smaller change. Requires you to put selector in RuntimeClass so you have to agree on what label is on node.
  • How does label get on node?
    • Part of node setup or config (whoever is responsible for configuring node does it).
    • Another reason, regular gvisor vs debug gvisor runtime. Add new runtime class, but don't want to update labels, want to reuse the same labels. "runc overhead" is the more interesting example.
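
To make the selector-intersection and toleration-union points in the notes above concrete, here is a minimal sketch in Python. It is not the scheduler or admission code from this PR. The function names, the `SelectorConflict` exception, the tuple form of a toleration, and the label keys in the usage example are all hypothetical, chosen only to illustrate the idea: combining two label selectors narrows the set of eligible nodes, and a key that both sides set to different values can be reported with a clear error instead of leaving the pod silently unschedulable.

```python
from typing import Dict, List, Tuple

# Hypothetical simplified toleration: (key, operator, value, effect).
Toleration = Tuple[str, str, str, str]


class SelectorConflict(Exception):
    """The pod and the RuntimeClass require different values for one label."""


def merge_node_selectors(
    pod_selector: Dict[str, str], runtime_selector: Dict[str, str]
) -> Dict[str, str]:
    """Combine both selectors so a node must satisfy every requirement.

    A node matches the result only if it matches both inputs, i.e. the
    eligible nodes are the intersection of the two node sets.
    """
    merged = dict(pod_selector)
    for key, value in runtime_selector.items():
        if key in merged and merged[key] != value:
            # No node can carry two values for one label, so fail early
            # with a message that names both sides of the conflict.
            raise SelectorConflict(
                f"pod nodeSelector {key}={merged[key]!r} conflicts with "
                f"RuntimeClass nodeSelector {key}={value!r}"
            )
        merged[key] = value
    return merged


def union_tolerations(
    pod_tolerations: List[Toleration], runtime_tolerations: List[Toleration]
) -> List[Toleration]:
    """Return the tolerations from both lists, in order, without duplicates."""
    seen = set()
    result: List[Toleration] = []
    for toleration in pod_tolerations + runtime_tolerations:
        if toleration not in seen:
            seen.add(toleration)
            result.append(toleration)
    return result
```

For example, `merge_node_selectors({"zone": "a"}, {"sandbox": "gvisor"})` yields a selector with both keys, while passing `{"os": "linux"}` together with `{"os": "windows"}` raises `SelectorConflict`. Failing at merge time is one way to get the "nice error message" discussed in the notes.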

Open questions:

  1. Generalize or not? Capability or not?
  2. Why use word topology -- it has lots of baggage?
    • Not tied to it, copied from StorageClass. Stay away from that word unless you mean physical structure of things, which this isn't representing.

@tallclair
Member

Updated based on API review feedback, according to kubernetes/enhancements#1069

@tallclair tallclair force-pushed the runtimeclass-scheduling-api branch from 4d5a0cc to 2e38485 Compare May 22, 2019 15:49
Member

@thockin thockin left a comment

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin, yastij

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 31, 2019
@k8s-ci-robot k8s-ci-robot merged commit 6273a7a into kubernetes:master May 31, 2019
@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Jun 11, 2019
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
Status: API review completed, 1.16
Development

Successfully merging this pull request may close these issues.

runtime-aware scheduling
8 participants