Add user docs for pod priority and preemption #5328
Conversation
Deploy preview ready! Built with commit 812db65 https://deploy-preview-5328--kubernetes-io-master-staging.netlify.com
Force-pushed from bcee54e to 8c5ebe3.
__alpha__ feature. It can be enabled by a command-line flag:

```
--feature-gates=PodPriority=true
```
Can you please also specify for which component I shall specify this parameter? I think we should enable it for both scheduler and apiserver.
isn't it the default that feature-gates are shared across all master components?
No matter what it is, I agree it should be mentioned explicitly.
+1 for specifying which components it needs to be specified on. Since the flag is used in API server and scheduler, I assume those are the components, as @gyliu513 says.
Done.
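To make the enablement concrete, here is a minimal sketch of passing the gate to both components discussed above (flag placement is an assumption; adapt it to how your control plane binaries are launched):

```
kube-apiserver ... --feature-gates=PodPriority=true
kube-scheduler ... --feature-gates=PodPriority=true
```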
The following sections provide more information about these steps.

## Enable Priority and Preemption
Can you please also mention that we should enable the admission controller plugin here?
Correct, there has to be a `--admission-control=...,Priority` command-line option.
Yes, but we should always have `ResourceQuota` as the last one.
The admission controller is already enabled, but checks the feature gate. Enabling the feature gate should be enough to activate the admission controller as well.
Also need `--runtime-config=scheduling.k8s.io/v1alpha1=true`, right? This could have worked because we are enabling this alpha API by default in the current code base. However, the convention was to have all alpha APIs disabled by default, IIRC.
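Pulling the flags from this thread together, a hedged sketch of the kube-apiserver invocation under discussion (the `...` stands for whatever other admission plugins the cluster already lists, with `ResourceQuota` kept last per the comment above):

```
kube-apiserver ... \
  --feature-gates=PodPriority=true \
  --runtime-config=scheduling.k8s.io/v1alpha1=true \
  --admission-control=...,Priority,ResourceQuota
```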
@bsalamat can you please show me where we added it to the list of default admission controllers? I did not find the code where it was added; am I missing anything?
In this PR: https://github.com/kubernetes/kubernetes/pull/49322/files
I am not sure if it was the right thing to do given the downgrade issue we are observing.
Ah, I see, thanks @bsalamat. I think that on downgrade, the downgrade script should have some logic to delete this admission control plugin.
That's what I would expect as well, but I am not sure if the downgrade process actually removes them, otherwise #52226 shouldn't have happened.
objects can have any 32-bit integer value smaller than or equal to 1 billion. Larger
numbers are reserved for system use.

PriorityClass also has two optional fields: `globaleDefault` and `description`.
globaleDefault -> globalDefault
For symmetry, I would suggest saying the names of the required fields (`name` and `value`).
Name is under metadata. I added "value" to the previous paragraph.
PriorityClass also has two optional fields: `globaleDefault` and `description`.
`globalDefault` indicates that the value of this PriorityClass should be used for
pods without a `PriorityClassName`. Only one PriorityClass with `globalDefault`
set to true can exists in the system. If there is no PriorityClass with `globalDefault`
exists -> exist
Done
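For reference, a minimal PriorityClass manifest sketch consistent with the fields discussed here (the apiVersion is assumed to be the alpha scheduling group mentioned earlier in this thread, and the name is a placeholder):

```
apiVersion: scheduling.k8s.io/v1alpha1
kind: PriorityClass
metadata:
  name: high-priority           # the priority class name lives under metadata
value: 1000000                  # required; must be smaller than or equal to 1 billion
globalDefault: false            # optional; applies to pods without a PriorityClassName
description: "For important pods only."   # optional
```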
all the specified requirements of the pod, the pod is determined infeasible. At this
point preemption logic is triggered for the pending pod. Let's call the pending pod P.
Preemption logic tries to find a node where removal of pods with lower priority than
P helps schedule P. If such node is found, one or more lower priority pods will
such node -> such a node
s/helps schedule P/would enable P to schedule on that node/
both done.
the node?" | ||
|
||
If the answer is no, that node will not be considered for preemption. If the pending | ||
pod has inter-pod affinity on one or more of those lower priority pods on the node, the |
on one or more -> with one or more?
changed "on" to "to".
pods and scheduler will find the pending pod infeasible on the node. As a result,
it will not try to preempt any pods on that node.
Scheduler will try to find other nodes for preemption and could possibly find another
one, but there is no guarantee that such a node will be found.
Better move lines 153-154 to around line 108? This may not be a problem specific to this affinity scenario.
I don't think these lines blend well with the text at line 108.
equal or higher priority pods.

#### Cross Node Preemption
When considering a node N for preemption in order to schedule pending pod P,
schedule pending -> schedule the pending
title: Pod Priority and Preemption
---

[Pods](/docs/user-guide/pods) in Kubernetes 1.8 and later can have priority. Priority
Actually, wasn't priority itself added in 1.7?
I think no -- kubernetes/kubernetes#48377 was merged Jul 19.
David is right. We didn't have pod priority in 1.7.
indicates the importance of a pod. When a pod cannot be scheduled, scheduler tries
to preempt lower priority pods in order to make scheduling of the pending pod possible.

* TOC
When I clicked view, something is not generating correctly for it.
Yes, this file is not displayed correctly by clicking github's "view" button. You should follow the link that k8sio-netlify-preview-bot leaves on the PR to see the generated page.
They have so much time to finish their work and exit. If they don't, they will be
killed. This graceful termination period creates a time gap between the point that
scheduler preempts pods until the pending pod (P) can be scheduled on the node (N).
When there are multiple victims on node N, they may exit or get terminated at
Actually, I don't think this problem is reserved to situations where there are many victims; it's just more likely there.
But in general, even if there is only one victim, once it is finished you have no guarantee that the first pod you will be processing will be the preemptor. So someone else may be scheduled in that place.
Yeah I agree. I think actually you can just delete everything between "When there are multiple..." and "pod P won't fit on node N anymore" and replace it with "As victims exit, smaller (than P) pods in the pending queue may schedule onto N as space is freed up, making P not possible to schedule onto N. Even a pod the same size as P may schedule onto N, once all of the victims exit, if the scheduler reaches that pod in the pending queue before it re-examines P."
That's correct. I fixed the text.
get scheduled for a while. This scenario can cause problems in various clusters, but
is particularly problematic in clusters where many new pods are created all the time.

We intend to address this problem in beta version of pod preemption. The solution
Let's be very explicit here: "Fixing this is a Beta blocker".
@davidopp - do you agree?
Yes I agree; that's a bit stronger than "intend to address"
I agree that it is a beta blocker, but should we put it in a user doc?
I am going to write a separate design doc on it. I will put the milestones there.
Race condition in fixing and commenting. I changed it to "we will address in beta".
are preempted. Current preemption algorithm does not perform preemption of pods
on nodes other than N, when considering N for preemption.

We may consider adding cross node preemption in future versions if we find an
Let's be explicit: "Fixing this will NOT be Beta or GA blocker".
Again, I am a bit hesitant to expose users to our release planning in this kind of docs. I added a sentence to say that we cannot promise anything.
later by other pods, which removes the benefits of having the complex logic of
respecting inter-pod affinity to lower priority pods.

Our recommended solution for this problem is to create inter-pod affinity towards
Let's be explicit: "Fixing this will NOT be Beta or GA blocker".
I would delete the "Our recommended solution..." sentence; getting into solutions is too much detail for user guide. But put in what @wojtek-t said.
Added a piece of text to say that we cannot promise that it will be fixed in Beta or GA.
Here are my comments up through the "limitations" section -- will review that now.
[Pods](/docs/user-guide/pods) in Kubernetes 1.8 and later can have priority. Priority
indicates the importance of a pod. When a pod cannot be scheduled, scheduler tries
to preempt lower priority pods in order to make scheduling of the pending pod possible.
s/preempt/preempt (evict)/
Say something about PodDisruptionBudget not being respected.
(BTW for 1.9, please add consideration of PDB when deciding preemption node and victims. Preemption should only violate PDB as a last resort if that is the only option. I think basically scheduler just needs to watch all the PDBs, keep a map, and then use it.)
I would add a sentence here saying "In the future, priority will also affect out-of-resource eviction ordering on the node."
First and third are done.
Regarding adding PDB, I already have a section on it in the limitations of preemption section. Do you think we should add information about PDB here in the overview?
Oh, I see you mention PDB later in the limitations section. But I think people might not read that section. So I think you should also mention it here. Add something like "Note that preemption does not respect PodDisruptionBudget; see limitations section for more details" here.
---

[Pods](/docs/user-guide/pods) in Kubernetes 1.8 and later can have priority. Priority
indicates the importance of a pod. When a pod cannot be scheduled, scheduler tries
I would say "importance of a pod relative to other pods" but either way is fine.
Done
approvers:
- davidopp
- wojtek-t
title: Pod Priority and Preemption
add "(Alpha)" at the end of the title
Done
```
--feature-gates=PodPriority=true
```

Once enabled you can add PriorityClasses and create pods with `PriorityClassName` set.
link PriorityClassName to the next section, so people will know it is defined somewhere
Done
of those PriorityClass names in their spec.
The following YAML is an example of a pod configuration that uses the PriorityClass
created above. Priority admission controller checks the spec and resolves the
priority of the pod to 1,000,000.
do "1000000" instead of "1,000,000" because in a lot of places outside the US "," is the same as a US "." in numbers.
Done
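A sketch of such a pod spec (the container image and names are placeholders; `high-priority` is assumed to match the PriorityClass example above):

```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  priorityClassName: high-priority   # resolved to 1000000 by the Priority admission controller
  containers:
  - name: nginx
    image: nginx
```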
**Note 2:** Addition of a PriorityClass with `globalDefault` set to true does not
change priority of existing pods. The value of such PriorityClass will be used only
for pods created after the PriorityClass is added.
Add something to this section about what happens if you delete a PriorityClass object. Also you should mention that if you submit a pod that has a PriorityClassName that doesn't have a corresponding PriorityClass, the pod will be rejected by the admission controller.
Good points. Added the first one here and the second one to the "Pod Priority" section.
## Preemption
When pods are created, they go to a queue and wait to be scheduled. Scheduler picks a pod
from the queue and tries to schedule it on a node. If no node is found that satisfies
all the specified requirements of the pod, the pod is determined infeasible. At this
s/requirements/requirements (predicates)/
also, "determined infeasible" isn't really necessary, I would join the two sentences like
"all the specified requirements (predicates) of the pod, preemption logic is triggered for the pending pod."
Done
from the queue and tries to schedule it on a node. If no node is found that satisfies
all the specified requirements of the pod, the pod is determined infeasible. At this
point preemption logic is triggered for the pending pod. Let's call the pending pod P.
Preemption logic tries to find a node where removal of pods with lower priority than
s/pods/one or more pods/
(just so people understand that we don't necessarily evict all lower-priority pods)
The next sentence says "one or more pods", but I added it here as well.
the node?" | ||
|
||
If the answer is no, that node will not be considered for preemption. If the pending | ||
pod has inter-pod affinity on one or more of those lower priority pods on the node, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed "on" to "to".
#### Starvation of Preempting Pod
When pods are preempted, the victims get their
[graceful termination period](https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods).
They have so much time to finish their work and exit. If they don't, they will be
s/so much/that much/
Done
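For context, the graceful termination period discussed here is the pod's `terminationGracePeriodSeconds`; a minimal sketch, with an assumed value:

```
apiVersion: v1
kind: Pod
metadata:
  name: victim
spec:
  terminationGracePeriodSeconds: 30   # time a preempted victim gets to exit before being killed
  containers:
  - name: app
    image: nginx
```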
The current implementation of preemption considers a node for preemption only when
the answer to this question is positive: "If all the pods with lower priority than
the pending pod are removed from the node, can the pending pod be scheduled on
the node?"
I would add this sentence:
(Note that preemption does not always remove all lower-priority pods, e.g. if the pending pod can be scheduled by removing fewer than all lower-priority pods, but this test must always pass for preemption to be considered on a node.)
Done
pod has inter-pod affinity on one or more of those lower priority pods on the node, the
inter-pod affinity rule cannot be satisfied in the absence of the lower priority
pods and scheduler will find the pending pod infeasible on the node. As a result,
it will not try to preempt any pods on that node.
I would say "it will not try to schedule the pod onto the node."
Isn't the current sentence more accurate? We are talking about preemption here and scheduling will be the next step which may or may not happen on this node.
#### Cross Node Preemption
When considering a node N for preemption in order to schedule pending pod P,
P may become feasible on N only when pods on other nodes are preempted. For
s/when/if/
(basically same meaning, but slightly stronger implication that it won't happen)
#### Cross Node Preemption
When considering a node N for preemption in order to schedule pending pod P,
P may become feasible on N only when pods on other nodes are preempted. For
example, if there is anti-affinity from existing lower priority pods in a zone
since anti-affinity is symmetric, "towards" is just confusing. I would rewrite from this sentence to the end of the paragraph to be a little clearer and provide a concrete example. Something like: "For example, P may have zone anti-affinity with some currently-running, lower-priority pod Q. P may not be schedulable on Q's node even if it preempts Q, for example if P is larger than Q so preempting Q does not free up enough space on Q's node and P is not high-priority enough to preempt other pods on Q's node. But P might theoretically be able to schedule on some other node by preempting Q and some pod(s) on this other node (preempting Q removes the anti-affinity violation, and preempting pod(s) on this other node frees up space for P to schedule there). The current preemption algorithm does not detect and execute such preemptions; that is, when determining whether P can schedule onto N, it only considers preempting pods on N."
Done
I actually think this is very important for users to know. It shows them how they can have effective affinity rules in a cluster where preemption is enabled.
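To make the zone anti-affinity scenario concrete, a hypothetical snippet of the kind of rule pod P might carry (the label selector and the 1.8-era zone topology key are assumptions for illustration):

```
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: pod-q          # hypothetical label on the lower-priority pod Q
      topologyKey: failure-domain.beta.kubernetes.io/zone
```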
I guess I addressed all the comments.
[Pods](/docs/user-guide/pods) in Kubernetes 1.8 and later can have priority. Priority
indicates importance of a pod relative to other pods. When a pod cannot be scheduled, scheduler tries
to preempt (evict) lower priority pods in order to make scheduling of the pending pod possible.
Soon, priority will also affect out-of-resource eviction ordering on the node.
"Soon" might be confusing. Instead maybe say "In a future Kubernetes release"
I changed it to "In a future Kubernetes release", but I am not so sure if @dashpole hasn't added it already.
The design doc seems to have been merged just two weeks ago, so I don't think it's in 1.8.
@dashpole told me that this would be a tiny change that he could do in a day. That's why I am not so sure.
In order to use priority and preemption in Kubernetes 1.8, you should follow these
steps:

1. Enable Priority and Preemption.
Maybe "Enable the feature" so people understand it's a single feature.
Done
1. Enable Priority and Preemption.
1. Add one or more PriorityClasses.
1. Create pods with `PriorityClassName` set to one of the added PriorityClasses.
I would add to the end of this something like "(Of course you do not need to create the pods directly; normally you would add `PriorityClassName` to the pod template of whatever set object is managing your pods, for example a Deployment.)"
Done
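As an illustration of that suggestion, a hypothetical Deployment whose pod template carries the priority class (all names are placeholders; apps/v1beta2 is assumed as the 1.8-era Deployment API):

```
apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      priorityClassName: high-priority   # assumed PriorityClass from the earlier examples
      containers:
      - name: nginx
        image: nginx
```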
be backward compatible.

## PriorityClass
PriorityClass is a non-namespaced object that defines a mapping from a PriorityClassName to the integer
IIUC there is no such thing as a PriorityClassName -- it's just the "name" in the metadata of the PriorityClass object. So I would write this as "PriorityClass is a non-namespaced object that defines a mapping from a priority class name (represented in the "name" field of the PriorityClass object's metadata) to the integer value of the priority."
Yes. Good point. Done.
specified in `value` field which is required. PriorityClass
objects can have any 32-bit integer value smaller than or equal to 1 billion. Larger
numbers are reserved for critical system pods that should not normally be preempted or
evicted.
I would add to the end of this "A cluster admin should create one PriorityClass object for each such mapping that they want."
Done
preempt other pods on node N or another node to let P schedule. This scenario may
be repeated again for the second and subsequent rounds of preemption and P may not
get scheduled for a while. This scenario can cause problems in various clusters, but
is particularly problematic in clusters where many new pods are created all the time.
Maybe say "in clusters with a high preemption rate" rather than "in clusters where many new pods are created all the time" as I think it's more related to the former?
High creation rate is more of a problem than preemption rate. I changed it to "in clusters with a high creation rate". When creation rate is high, even a single preemption may face this issue.
one, but there is no guarantee that such a node will be found.

We may address this issue in future versions, but we don't have a clear plan and cannot
promise that it will be fixed in Beta or GA. Part
I think @wojtek-t would feel better if after "GA" you say "(i.e. we will not consider it a blocker for Beta or GA)."
Sure. Done :)
We may address this issue in future versions, but we don't have a clear plan and cannot
promise that it will be fixed in Beta or GA. Part
of the reason is that finding the set of lower priority pods that satisfy all
inter-pod affinity/anti-affinity rules is computationally expensive and adds
should you remove "anti-affinity" here?
Sure. Done
later by other pods, which removes the benefits of having the complex logic of
respecting inter-pod affinity to lower priority pods.

Our recommended solution for this problem is to create inter-pod affinity towards
s/towards/only towards/
Done
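A hypothetical sketch of that recommendation, expressing affinity only toward pods of equal or higher priority (the label convention is an assumption, since priority itself is not exposed as a label):

```
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          priority-tier: high   # hypothetical label applied to equal-or-higher priority pods
      topologyKey: kubernetes.io/hostname
```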
pods on N.

We may consider adding cross node preemption in future versions if we find an
algorithm with reasonable performance, but we cannot promise anything at this point.
I think @wojtek-t would feel better if you add at the end something like "(it will not be considered a blocker for Beta and GA)."
Done.
Thanks again! PTAL.
@bsalamat, The base branch for this PR should be release-1.8. Do you have any objections to changing the base branch? Thanks.
LGTM. BTW @bsalamat, if you haven't done it before, changing the base branch to release-1.8 should just require selecting release-1.8 from the drop-down at the top of this Github page where it says "bsalamat wants to merge 2 commits into ..." (the "..." is where the drop-down should be).
Related to kubernetes/kubernetes#52226
@steveperry-53 Just changed the base branch. Thanks for the reminder.
**Note:** Alpha features should not be used in production systems! Alpha
features are more likely to have bugs and future changes to them are not guaranteed to
be backward compatible.
"not guaranteed to be backward compatible." Do we have a doc on the scope of alpha, beta, and GA?
em... maybe you are referring to this? https://kubernetes.io/docs/reference/deprecation-policy/
@k82cn This doc tries to state the features/improvements planned for Beta, but we don't have any other doc at this point.
to the integer value of the priority. The higher the value, the higher the
priority. The value is
specified in `value` field which is required. PriorityClass
objects can have any 32-bit integer value smaller than or equal to 1 billion. Larger
If bigger than 1 billion, will it be rejected or ignored?
Priority admission controller rejects them.
@bsalamat, I have some suggestions for the text. Could you check the Allow edits by maintainers box in the right column? Thanks.
@steveperry-53 Done
Deploy preview ready! Built with commit 812db65 https://deploy-preview-5328--kubernetes-io-vnext-staging.netlify.com
@bsalamat, See 12bcac7 for suggested edits. At one point in the doc, I have a TODO that asks this question: TODO: Revise this next example. I don't understand the example with Node M. I took a stab at it below, but I don't think I've gotten it right. I don't see why if we start by considering N, we need a third Node M.
@steveperry-53 Scheduler doesn't do cross node preemption in this version. So, when scheduler considers pod P to run on node N and pod P has anti-affinity to pod Q and pod Q is running on a different node, pod P will be deemed unschedulable on node N no matter how many pods are preempted on N.
@bsalamat, Are you ready for me to merge this?
@steveperry-53 Yes, I think this is ready now.
@steveperry-53 Now that the PR is merged, where is its permanent location that I can link to in our release notes?
@kubernetes/sig-scheduling-pr-reviews @wojtek-t @davidopp
ref kubernetes/kubernetes#47604