ServerlessService's initial operation mode is now Proxy #12842
Conversation
Codecov Report
@@ Coverage Diff @@
## main #12842 +/- ##
==========================================
- Coverage 87.03% 87.01% -0.02%
==========================================
Files 197 197
Lines 14426 14429 +3
==========================================
Hits 12556 12556
- Misses 1576 1578 +2
- Partials 294 295 +1
Continue to review full report at Codecov.
/test istio-latest-no-mesh_serving_main
woot 300 runs with no flakes.
/test istio-latest-no-mesh_serving_main
Previously, the initial operation mode was Serve. This created a narrow window where the revision's pods directly receive requests from the ingress. The size of this window depends on when the Revision's Pod becomes ready and the time it takes the autoscaler to make an initial decision on scale and excess burst capacity. If there is continuous traffic, this could overwhelm Revisions with a low container concurrency, causing the queue-proxy to return 503s since its queue is full.
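To make the failure mode concrete, here is a toy sketch of the queue-proxy's admission behavior described above. The `breaker` type and its method are illustrative only (not the real queue-proxy API): with a small capacity (container concurrency plus a short queue), any extra concurrent request is rejected, which is the 503 this PR avoids by keeping the activator in the path initially.

```go
package main

import "fmt"

// breaker is a toy model of the queue-proxy's request queue: capacity
// covers the in-flight slots plus the queue depth. Names here are
// hypothetical, not the actual queue-proxy implementation.
type breaker struct {
	capacity int
	inFlight int
}

// tryAccept reports whether a new request can be admitted. When the
// queue is full the real queue-proxy responds with a 503.
func (b *breaker) tryAccept() bool {
	if b.inFlight >= b.capacity {
		return false
	}
	b.inFlight++
	return true
}

func main() {
	// e.g. containerConcurrency=1 plus one queued slot
	b := &breaker{capacity: 2}
	for i := 0; i < 3; i++ {
		fmt.Println(b.tryAccept())
	}
	// The third concurrent request is rejected; with continuous ingress
	// traffic during the Serve-mode window, this happens immediately.
}
```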
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing
```go
// We put activator in the serving path in the following cases:
// 1. The revision is scaled to 0:
//    a. want == 0
//    b. want == -1 && PA is inactive (Autoscaler has no previous knowledge of
//       this revision, e.g. after a restart) but PA status is inactive (it was
//       already scaled to 0).
// 2. The excess burst capacity is negative.
```
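The cases in this comment block can be sketched as a small decision function. This is illustrative only — the function name, signature, and string modes are assumptions, not the actual Knative reconciler code — but it shows how the three cases map to Proxy versus Serve mode.

```go
package main

import "fmt"

// decideMode sketches the decision in the comment block above.
// want is the desired scale (-1 means the autoscaler has no knowledge yet),
// paInactive reports whether the PodAutoscaler status is inactive, and
// ebc is the excess burst capacity. All names are hypothetical.
func decideMode(want int, paInactive bool, ebc int) string {
	switch {
	case want == 0:
		return "Proxy" // case 1a: scaled to zero, activator must buffer
	case want == -1 && paInactive:
		return "Proxy" // case 1b: no autoscaler knowledge, PA already inactive
	case ebc < 0:
		return "Proxy" // case 2: not enough spare capacity to absorb a burst
	default:
		return "Serve" // pods take traffic directly from the ingress
	}
}

func main() {
	fmt.Println(decideMode(0, false, 10)) // Proxy
	fmt.Println(decideMode(5, false, -1)) // Proxy
	fmt.Println(decideMode(5, false, 10)) // Serve
}
```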
nit: I wonder if we should keep this comment block (maybe move it to just before the switch statement)? It's a nice statement on when we want the activator in the path (though I guess since we're defaulting to Proxy mode, it's maybe not the right way to frame it anymore...)
yeah I split this out into the three comments for the three cases - let me know if you think they need more clarity
Nope, no issues there, I think they're very clear.
/lgtm
looks good
Fixes: #12887 #12712