Deployment scale down has unexpected behavior #271

Closed
howardjohn opened this issue Feb 10, 2023 · 3 comments · Fixed by #275

@howardjohn

Steps:

  • Run kwok locally, connected to a kind cluster
  • Deploy 2 fake kwok nodes with a high pods-per-node capacity
  • Deploy fake-pod from the docs
  • Scale it to 999 replicas and back down to 1

I expect this to be very fast. In practice, each direction takes close to a minute.
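
The scale step can be scripted so the timing is reproducible. This is a minimal sketch, assuming `kubectl` is on the PATH, the deployment is named `fake-pod`, and it lives in the `default` namespace (both taken from the output in this report). The helper names (`scale_command`, `ready_command`, `timed_scale`) are hypothetical and not part of kwok.

```python
import subprocess
import time


def scale_command(replicas, deployment="fake-pod", namespace="default"):
    """Build the kubectl argv that sets the deployment's replica count."""
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}", "--namespace", namespace,
    ]


def ready_command(deployment="fake-pod", namespace="default"):
    """Build the kubectl argv that prints the current ready replica count."""
    return [
        "kubectl", "get", "deployment", deployment, "--namespace", namespace,
        "-o", "jsonpath={.status.readyReplicas}",
    ]


def timed_scale(replicas, run=subprocess.run, poll_seconds=0.5, timeout_seconds=300):
    """Scale, poll until ready == replicas, and return elapsed seconds."""
    start = time.monotonic()
    run(scale_command(replicas), check=True)
    while True:
        result = run(ready_command(), check=True, capture_output=True, text=True)
        # The field is omitted when no replica is ready, so treat "" as 0.
        if int(result.stdout.strip() or 0) == replicas:
            return time.monotonic() - start
        if time.monotonic() - start > timeout_seconds:
            raise TimeoutError(f"not at {replicas} ready replicas after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

Calling `timed_scale(999)` followed by `timed_scale(1)` against the cluster would give one number per direction. The `run` parameter exists only so the helper can be exercised without a cluster.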

On scale up, I see:

2023-02-10T08:59:01.608120 NAME       READY     UP-TO-DATE   AVAILABLE   AGE
2023-02-10T08:59:01.608332 fake-pod   1/1       1            1           12m
2023-02-10T08:59:02.247084 fake-pod   1/999     1            1           12m
2023-02-10T08:59:02.255063 fake-pod   1/999     1            1           12m
2023-02-10T08:59:02.260787 fake-pod   1/999     1            1           12m
2023-02-10T08:59:50.757661 fake-pod   501/999   501          501         13m
2023-02-10T08:59:50.809376 fake-pod   999/999   999          999         13m

So scale-up takes almost 1 min (about 49 s from the first 1/999 sample to 999/999).
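
For reference, the elapsed time can be computed directly from the timestamped lines rather than estimated by eye. A minimal sketch, assuming each line starts with an ISO timestamp followed by the usual deployment columns; `parse` and `seconds_until_ready` are hypothetical helper names.

```python
from datetime import datetime

# Timestamped lines copied from the scale-up output in this report.
SCALE_UP = """\
2023-02-10T08:59:02.247084 fake-pod   1/999   1            1           12m
2023-02-10T08:59:50.757661 fake-pod   501/999   501          501         13m
2023-02-10T08:59:50.809376 fake-pod   999/999   999          999         13m
"""


def parse(line):
    """Return (timestamp, ready, desired) for one timestamped watch line."""
    stamp, _name, ready, *_rest = line.split()
    current, desired = (int(n) for n in ready.split("/"))
    return datetime.fromisoformat(stamp), current, desired


def seconds_until_ready(text):
    """Seconds from the first sample to the first sample with ready == desired."""
    rows = [parse(line) for line in text.splitlines() if line.strip()]
    start = rows[0][0]
    done = next(ts for ts, current, desired in rows if current == desired)
    return (done - start).total_seconds()


print(f"{seconds_until_ready(SCALE_UP):.3f}")  # prints 48.562
```

The same function works on the scale-down output, since the READY column there also ends with ready equal to desired (1/1).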

On scale down, logs are spammed with:

Delete pod pod="default/fake-pod-6cf6574478-t6ntb" node="kwok-node-1"
WARN Delete pod pod="default/fake-pod-6cf6574478-t6ntb" node="kwok-node-1" !BADKEY="pods \"fake-pod-6cf6574478-t6ntb\" not found"
ERROR Failed to finalizers pod="default/fake-pod-6cf6574478-sqrh6" node="kwok-node-1" err="the server rejected our request due to an error in our request"
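
Grouping these lines by pod makes the pattern easier to see: the same pod (`t6ntb`) shows a plain delete followed by a delete that fails with "not found". The sketch below is only a triage aid for logs in this `key="value"` shape. It assumes that a line without a level prefix is info-level, which is a guess, and the names `parse` and `events_by_pod` are hypothetical.

```python
import re
from collections import defaultdict

# The three log lines from this report, with runs of spaces collapsed.
LOG = r'''
Delete pod pod="default/fake-pod-6cf6574478-t6ntb" node="kwok-node-1"
WARN Delete pod pod="default/fake-pod-6cf6574478-t6ntb" node="kwok-node-1" !BADKEY="pods \"fake-pod-6cf6574478-t6ntb\" not found"
ERROR Failed to finalizers pod="default/fake-pod-6cf6574478-sqrh6" node="kwok-node-1" err="the server rejected our request due to an error in our request"
'''

# key="value" pairs; the value may contain backslash-escaped quotes.
FIELD = re.compile(r'([^\s=]+)="((?:[^"\\]|\\.)*)"')
LEVELS = ("DEBUG", "INFO", "WARN", "ERROR")


def parse(line):
    """Split one log line into (level, message, fields)."""
    fields = dict(FIELD.findall(line))
    first = FIELD.search(line)
    head = (line[:first.start()] if first else line).strip()
    level, _, message = head.partition(" ")
    if level not in LEVELS:
        # Assumption: a line with no level prefix is info-level.
        level, message = "INFO", head
    return level, message.strip(), fields


def events_by_pod(text):
    """Map each pod name to the ordered (level, message) events logged for it."""
    grouped = defaultdict(list)
    for line in text.strip().splitlines():
        level, message, fields = parse(line)
        grouped[fields.get("pod", "?")].append((level, message))
    return dict(grouped)


for pod, events in events_by_pod(LOG).items():
    print(pod, events)
```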

Scale down also takes about 1 min:

2023-02-10T09:00:42.833048 fake-pod   999/1     999          999         14m
2023-02-10T09:00:42.843271 fake-pod   999/1     999          999         14m
2023-02-10T09:01:31.405940 fake-pod   499/1     499          499         15m
2023-02-10T09:01:31.455725 fake-pod   1/1       1            1           15m

The slowness could reasonably come from k8s itself, but the error logs on scale down look suspicious at the very least.

This was referenced Feb 11, 2023
@wzshiming
Member

/assign
/kind bug

@k8s-ci-robot added the kind/bug label Feb 11, 2023
@github-project-automation moved this to 🆕 New in KWOK Tracking Feb 11, 2023
@wzshiming added this to the v0.2 milestone Feb 11, 2023
@wzshiming linked a pull request Feb 14, 2023 that will close this issue
@wzshiming
Member

Fixed by #275
/close

@k8s-ci-robot
Contributor

@wzshiming: Closing this issue.

In response to this:

Fixed by #275
/close


@github-project-automation moved this from 🆕 New to ✅ Done in KWOK Tracking Feb 14, 2023