DaemonSets should support MaxSurge to improve workload availability #1591

Closed
smarterclayton opened this issue Mar 2, 2020 · 54 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. stage/stable Denotes an issue tracking an enhancement targeted for Stable/GA status tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team

Comments

@smarterclayton
Contributor

smarterclayton commented Mar 2, 2020

Enhancement Description

  • One-line enhancement description: Many infrastructure components (CNI, CSI) require DaemonSets to place pods on each node, but the current update strategies prevent end users from minimizing disruption during updates. It should be possible to surge DaemonSets when a user requests it, allowing handoff from one running pod to another, as Deployments do.
  • Kubernetes Enhancement Proposal: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/1591-daemonset-surge
  • Primary contact (assignee): @smarterclayton
  • Responsible SIGs: @kubernetes/sig-apps-feature-requests @kubernetes/sig-node-feature-requests
  • Enhancement target (which target equals to which milestone):
    • Alpha release target 1.20
    • Beta release target 1.21
    • Stable release target 1.25
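
As a rough illustration of the surge behavior requested above, here is a minimal Python sketch that builds a DaemonSet manifest using the `maxSurge` field under `spec.updateStrategy.rollingUpdate`. The helper name `surge_daemonset`, the `log-agent` name, and the image reference are hypothetical and not taken from this issue. The sketch assumes that exactly one of `maxSurge` and `maxUnavailable` may be non-zero, so it pins `maxUnavailable` to 0; check the KEP and API reference before relying on that.

```python
import json


def surge_daemonset(name, image, max_surge=1):
    """Build a DaemonSet manifest (as a dict) that surges during rolling updates.

    Hypothetical helper for illustration only; not part of any Kubernetes client.
    """
    # Assumption: with maxUnavailable fixed at 0, maxSurge has to be non-zero,
    # otherwise the rollout could never make progress.
    if max_surge in (0, "0%"):
        raise ValueError("maxSurge must be non-zero when maxUnavailable is 0")
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": labels},
            "updateStrategy": {
                "type": "RollingUpdate",
                # Start the new pod on a node before the old one is removed.
                "rollingUpdate": {"maxSurge": max_surge, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Placeholder name and image, chosen only for this example.
manifest = surge_daemonset("log-agent", "example.com/log-agent:v2")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest. With this strategy, the old and new pods briefly overlap on a node, so the workload must tolerate two instances running side by side during the handoff.
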
@k8s-ci-robot k8s-ci-robot added sig/apps Categorizes an issue or PR as relevant to SIG Apps. kind/feature Categorizes issue or PR as related to a new feature. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 2, 2020
@leoskyrocker

We need it for fluentd so that logs generated by our applications are still picked up while an update is happening.

@kikisdeliveryservice
Member

Hi @smarterclayton !

1.19 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating in 1.19?

In order to have this part of the release:

The KEP PR must be merged in an implementable state
The KEP must have test plans
The KEP must have graduation criteria.

The current release schedule is:

Monday, April 13: Week 1 - Release cycle begins
Tuesday, May 19: Week 6 - Enhancements Freeze
Thursday, June 25: Week 11 - Code Freeze
Thursday, July 9: Week 14 - Docs must be completed and reviewed
Tuesday, August 4: Week 17 - Kubernetes v1.19.0 released

Please let me know and I'll add it to the 1.19 tracking sheet (http://bit.ly/k8s-1-19-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

Thanks!

@kikisdeliveryservice
Member

As a reminder, enhancements freeze is tomorrow, May 19th, EOD PST. In order to be included in 1.19, all KEPs must be implementable with graduation criteria and a test plan.

Thanks.

@kikisdeliveryservice
Member

Unfortunately the deadline for the 1.19 Enhancement freeze has passed. For now this is being removed from the milestone and 1.19 tracking sheet. If there is a need to get this in, please file an enhancement exception.

@kikisdeliveryservice kikisdeliveryservice added the tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team label May 20, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2020
@palnabarun
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 1, 2020
@kikisdeliveryservice
Member

Hi @smarterclayton

Enhancements Lead here. Any plans for this in 1.20?

Thanks!
Kirsten

@kikisdeliveryservice
Member

kikisdeliveryservice commented Sep 25, 2020

Hi @smarterclayton

Following up - any plans to start on this in 1.20? Enhancements Freeze is October 6th and the KEP is currently unmerged and provisional (it must be implementable).

Best,
Kirsten

@smarterclayton
Contributor Author

Yes, the KEP is ready for review as implementable; the target would be alpha in 1.20, as noted in the KEP.

@kikisdeliveryservice kikisdeliveryservice added the tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team label Sep 30, 2020
@kikisdeliveryservice kikisdeliveryservice added this to the v1.20 milestone Sep 30, 2020
@kikisdeliveryservice kikisdeliveryservice added stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status and removed tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team labels Sep 30, 2020
@mikejoh

mikejoh commented Oct 12, 2020

Hi @smarterclayton ,

Since your Enhancement is scheduled to be in 1.20, please keep in mind the important upcoming dates:

As a reminder, please link all of your k/k PRs as well as docs PRs to this issue so we can track them.

Regards,
Mikael

@somtochiama
Member

somtochiama commented Oct 21, 2020

Hello @smarterclayton , 1.20 Docs shadow here 👋🏽.
Does this enhancement work planned for 1.20 require any new docs or modification to existing docs?

If so, please follow the steps here to open a PR against the dev-1.20 branch in the k/website repo. This PR can be just a placeholder at this time and must be created before Nov 6th.

Also take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.
Thank you!

@somtochiama
Member

Hi @smarterclayton
The docs placeholder deadline is almost here. Please make sure to create a placeholder PR against the dev-1.20 branch in the k/website repo before the deadline.

Also, please keep in mind the important upcoming dates:

Thank you!

@smarterclayton
Contributor Author

Updated with docs placeholder.

@kikisdeliveryservice
Member

Hey @smarterclayton

You're still intending on getting kubernetes/kubernetes#96375 in by tomorrow? Seems to just need a lgtm.

Thanks!
Kirsten

@smarterclayton
Contributor Author

Given the timing, I'm going to delay this to 1.21, since the implementation needs more tests and I don't want to rush it.

@kikisdeliveryservice
Member

@smarterclayton thanks for the update!

@marosset
Contributor

Hello @ravisantoshgudimetla, @smarterclayton 👋, 1.25 Enhancements team here.

Just checking in as we approach enhancements freeze at 18:00 PST on Thursday, June 23, 2022.

Note: this enhancement is targeting stage stable for 1.25 (correct me if otherwise).

Here’s where this enhancement currently stands:

  • Updated KEP file using the latest template has been merged into the k/enhancements repo.
  • KEP status is marked as implementable for this release
  • KEP has an updated, detailed test plan section filled out
  • KEP has up to date graduation criteria.
  • KEP has a production readiness review that has been completed and merged into k/enhancements.

With all the KEP requirements in place & merged into k/enhancements, this enhancement is all good for the upcoming enhancements freeze. 🚀

Note that the status of this enhancement is marked as tracked. Please keep the issue description up-to-date with appropriate stages as well. Thank you!

@marosset
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jun 13, 2022
@kcmartin

kcmartin commented Jul 6, 2022

Hello @ravisantoshgudimetla 👋, 1.25 Release Docs Lead here.
This enhancement is marked as ‘Needs Docs’ for 1.25 release.

Please follow the steps detailed in the documentation to open a PR against the dev-1.25 branch in the k/website repo. This PR can be just a placeholder at this time, and must be created by August 4.
Also, take a look at Documenting for a release to familiarize yourself with the docs requirements for the release.

Thank you!

@sftim
Contributor

sftim commented Jul 20, 2022

Please update the DaemonSet docs to mention the maxSurge field. The doc at https://kubernetes.io/docs/tasks/manage-daemon/update-daemon-set/ also needs updating; for a GA feature, we should at least mention that you can control surge scale.

@rhockenbury

👋 Hey @smarterclayton,

Enhancements team checking in as we approach 1.25 code freeze at 01:00 UTC on Wednesday, 3rd August 2022.

Please ensure the following items are completed by code freeze:
[ ] All PRs to the Kubernetes repo that are related to your enhancement are linked in the above issue description (for tracking purposes).
[ ] All PRs are fully merged by the code freeze deadline.

Looks like there is one PR in k/k for the graduation of this enhancement to stable. Let me know if I missed any other PRs that need to be tracked.

As always, we are here to help should questions come up. Thanks!!

@soltysh
Contributor

soltysh commented Jul 26, 2022

The k/k PR merged recently: kubernetes/kubernetes#111194. Docs should follow.

@atiratree
Member

Opened the docs update: kubernetes/website#35538

@rhockenbury rhockenbury added tracked/no Denotes an enhancement issue is NOT actively being tracked by the Release Team and removed tracked/yes Denotes an enhancement issue is actively being tracked by the Release Team labels Sep 11, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 10, 2022
@thockin thockin moved this to pod lifecycle in KEPs I am tracking Dec 20, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 9, 2023
@soltysh
Contributor

soltysh commented Jan 12, 2023

This is all fully ready and the feature gate was dropped in kubernetes/kubernetes#114410, so I'm closing this as completed.
/close

@k8s-ci-robot
Contributor

@soltysh: Closing this issue.

In response to this:

This is all fully ready and the feature gate was dropped in kubernetes/kubernetes#114410, so I'm closing this as completed.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-project-automation github-project-automation bot moved this from pod lifecycle to Done in KEPs I am tracking Jan 12, 2023
@marosset marosset removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 12, 2023
RomanBednar pushed a commit to RomanBednar/enhancements that referenced this issue Apr 5, 2024