PodAutoscaler Active Condition should not affect Reachability #14309

Merged: 2 commits, Sep 13, 2023

dprotaso (Member) commented Aug 30, 2023

Fixes #14115

PodAutoscaler.Spec.Reachability was added a long time ago. Since then we've added a ScaleTargetInitialized condition to make it easier to know whether we've scaled up from zero successfully, which simplified various condition checks in the autoscaler and elsewhere.

As @SaschaSchwarze0 pointed out, the PodAutoscaler's Active condition currently toggles Reachability.

I believe we don't need this circular logic. Reachability should be set from the revision's routing state, with one exception: failing to progress the deployment or start a user's container.
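
To make the proposal concrete, here is a minimal Go sketch of reachability derived only from routing state and deployment failure, with no dependence on the PodAutoscaler's Active condition. The types and names (`Revision`, `RoutingState`, `DeploymentFailed`, `reachability`) are simplified stand-ins invented for illustration, and the mapping of the reserve and unset states is an assumption. This is not the code changed in this PR.

```go
package main

import "fmt"

// RoutingState is a hypothetical stand-in for a revision's routing state.
type RoutingState string

const (
	RoutingStateUnset   RoutingState = ""
	RoutingStateActive  RoutingState = "active"
	RoutingStateReserve RoutingState = "reserve"
)

// Reachability is a hypothetical stand-in for PodAutoscaler.Spec.Reachability.
type Reachability string

const (
	ReachabilityUnknown     Reachability = ""
	ReachabilityReachable   Reachability = "Reachable"
	ReachabilityUnreachable Reachability = "Unreachable"
)

// Revision carries only the two inputs the proposal relies on.
type Revision struct {
	RoutingState RoutingState
	// DeploymentFailed models "failed to progress the deployment or
	// start a user's container" (an assumed, simplified flag).
	DeploymentFailed bool
}

// reachability ignores the PodAutoscaler's Active condition entirely,
// so there is no feedback loop between Active and Reachability.
func reachability(rev Revision) Reachability {
	// The one exception: a known deployment failure makes the revision unreachable.
	if rev.DeploymentFailed {
		return ReachabilityUnreachable
	}
	switch rev.RoutingState {
	case RoutingStateActive:
		return ReachabilityReachable
	case RoutingStateReserve:
		return ReachabilityUnreachable
	default:
		// Assumption: anything else is reported as unknown.
		return ReachabilityUnknown
	}
}

func main() {
	fmt.Println(reachability(Revision{RoutingState: RoutingStateActive}))
	fmt.Println(reachability(Revision{RoutingState: RoutingStateActive, DeploymentFailed: true}))
}
```

The point of the sketch is that the function has no PodAutoscaler status among its inputs, which is what removes the circular dependency described above.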

@knative-prow bot added labels do-not-merge/work-in-progress, area/API, size/M (Aug 30, 2023)
@knative-prow bot requested review from KauzClay and krsna-m (Aug 30, 2023 01:58)
@knative-prow bot added label approved (Aug 30, 2023)
@knative-prow bot added label size/L and removed label size/M (Aug 30, 2023)
dprotaso (Member, Author) commented Aug 31, 2023

Besides the feedback loop with Active, reachability is also set to unreachable when the revision's deployment is known to be in a failure state (e.g. progress deadline hit or ImagePullBackOff).

I'm not sure that's necessary, because reachability seems to affect only the scale bounds and, indirectly, the KPA condition status.
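
As a rough illustration of what "reachability affects scale bounds" could mean, the Go sketch below assumes that an unreachable revision has its lower bound dropped to zero so that it is allowed to scale down. Both that behavior and the names used (`scaleBounds`, `Reachability`) are assumptions made for illustration, not the autoscaler's actual code.

```go
package main

import "fmt"

// Reachability is a hypothetical stand-in for PodAutoscaler.Spec.Reachability.
type Reachability string

const (
	ReachabilityUnknown     Reachability = ""
	ReachabilityReachable   Reachability = "Reachable"
	ReachabilityUnreachable Reachability = "Unreachable"
)

// scaleBounds returns the effective (min, max) replica bounds.
// Assumption: an unreachable revision ignores its configured minimum,
// so it may scale to zero even when minScale > 0.
func scaleBounds(r Reachability, minScale, maxScale int32) (int32, int32) {
	if r == ReachabilityUnreachable {
		return 0, maxScale
	}
	return minScale, maxScale
}

func main() {
	lo, hi := scaleBounds(ReachabilityUnreachable, 2, 10)
	fmt.Println(lo, hi)
}
```

Under this assumption, wrongly marking a revision unreachable only lowers its floor, while wrongly marking it reachable keeps its minimum number of pods alive.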

codecov bot commented Sep 1, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.04% ⚠️

Comparison is base (ad5455e) 86.16% compared to head (6e94600) 86.12%.
Report is 42 commits behind head on main.

❗ Current head 6e94600 differs from pull request most recent head 1d9f1f5. Consider uploading reports for the commit 1d9f1f5 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14309      +/-   ##
==========================================
- Coverage   86.16%   86.12%   -0.04%     
==========================================
  Files         195      196       +1     
  Lines       14702    14787      +85     
==========================================
+ Hits        12668    12736      +68     
- Misses       1729     1745      +16     
- Partials      305      306       +1     
Files Changed Coverage Δ
pkg/reconciler/revision/resources/pa.go 100.00% <100.00%> (ø)

... and 7 files with indirect coverage changes


SaschaSchwarze0 (Contributor) commented
@dprotaso I took your PR and tested the problematic scenario, as well as simple scale-up and scale from/to zero. Everything works. I can close my PR if there is consensus to go your route. Thanks for checking.

@dprotaso changed the title from "[wip] test reachability tweaks" to "PodAutoscaler Active Condition should not affect Reachability" (Sep 11, 2023)
@knative-prow bot removed label do-not-merge/work-in-progress (Sep 11, 2023)
dprotaso (Member, Author) commented
/assign @nak3 @ReToCode @skonto

cc @jsanin-vmw, who's tackling a similar issue where pods don't scale down when we don't have metrics

nak3 (Contributor) left a comment
One minor comment; otherwise LGTM.

pkg/reconciler/revision/resources/pa.go (resolved)

nak3 (Contributor) commented Sep 13, 2023

/lgtm
/approve

Thank you!

@knative-prow bot added label lgtm (Sep 13, 2023)

knative-prow bot commented Sep 13, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, nak3

@knative-prow knative-prow bot merged commit 9ffab17 into knative:main Sep 13, 2023
@dprotaso dprotaso deleted the reachability-fix branch September 14, 2023 20:08
ReToCode pushed a commit to ReToCode/serving that referenced this pull request Nov 6, 2023
…e#14309)

* Don't change reachability if the PA is not active

* drop unused helper method

(cherry picked from commit 9ffab17)
Issues this pull request may close: Old unreachable revision is causing new pods to get created when it should scale down