Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding GEP 3171: Percentage-based Request Mirroring #3178

Merged
merged 12 commits into from
Jul 29, 2024

Conversation

jakebennert
Copy link
Contributor

What type of PR is this?
/kind gep

What this PR does / why we need it:
This adds a new GEP proposing a new feature Percentage-based Request Mirroring.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/gep PRs related to Gateway Enhancement Proposal(GEP) labels Jun 30, 2024
Copy link

linux-foundation-easycla bot commented Jun 30, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 30, 2024
@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Jun 30, 2024
@k8s-ci-robot
Copy link
Contributor

Welcome @jakebennert!

It looks like this is your first PR to kubernetes-sigs/gateway-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 30, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @jakebennert. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@kflynn
Copy link
Contributor

kflynn commented Jul 3, 2024

Love the idea! I personally would vote for allowing two forms:

MirrorFraction Fraction `json:"mirrorFraction,omitempty"`
MirrorPercent int `json:"mirrorPercent,omitempty"`

with validation allowing only one to be set (and we can define that if they are somehow both set, mirrorFraction takes precedence because it's more precise).

This is slightly harder on implementers, but I expect that the vast majority of users will only need an integral percentage, and forcing all of them to write

mirrorFraction:
  numerator: 5

instead of

mirrorPercent: 5

feels like putting the difficulty on the wrong users. It's definitely important to support the case where you have enough traffic that you need to mirror 1/10000th of it, but I don't think that's the common case, and I'd rather the common case be simple. 🙂

@candita
Copy link
Contributor

candita commented Jul 8, 2024

I'd like to point out that we already have a concept of Weight that works on percentages, and it is a shared type.
Can it be reused here?

// Weight specifies the proportion of requests forwarded to the referenced
// backend. This is computed as weight/(sum of all weights in this
// BackendRefs list). For non-zero values, there may be some epsilon from
// the exact proportion defined here depending on the precision an
// implementation supports. Weight is not a percentage and the sum of
// weights does not need to equal 100.
//
// If only one backend is specified and it has a weight greater than 0, 100%
// of the traffic is forwarded to that backend. If weight is set to 0, no
// traffic should be forwarded for this entry. If unspecified, weight
// defaults to 1.
//
// Support for this field varies based on the context where used.
//
// +optional
// +kubebuilder:default=1
// +kubebuilder:validation:Minimum=0
// +kubebuilder:validation:Maximum=1000000
Weight *int32 `json:"weight,omitempty"`

@kflynn
Copy link
Contributor

kflynn commented Jul 9, 2024

@candita, I think BackendRef's weight definition won't well for mirroring, sadly. BackendRef's weight scheme is almost the inverse of what we want: it relies on you to put large weights on the refs that should get a lot of traffic, and the only way to describe a small amount of traffic is to put large weights on all the other refs.

This is problematic with mirroring, where we'd really like a really easy way to set up mirroring of a very small fraction of traffic without having to edit all the resources describing the critical production paths. Overall, I think we really are going to need an arbitrary-fraction type.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 9, 2024
@robscott
Copy link
Member

robscott commented Jul 9, 2024

Thanks @jakebennert!

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 9, 2024
@candita
Copy link
Contributor

candita commented Jul 12, 2024

@kflynn

it relies on you to put large weights on the refs that should get a lot of traffic, and the only way to describe a small amount of traffic is to put large weights on all the other refs.

Did you mean "the only way to describe a small amount of traffic is to put small weights on all the other refs"? It looks like the default Weight value is 1, so as long as you put something higher than 1 on the refs that should be used more often, that would work.

e.g. if we have four backendRefs, and one of them we want to have a majority of all traffic. From the godoc, it follows that you can leave 3 Weights unspecified (defaults to 1), and on the fourth, set Weight to 5. The fourth one will get 5/8 of the traffic, and the other 3 will get 1/8 of the traffic. Am I misinterpreting the godoc?

@kflynn
Copy link
Contributor

kflynn commented Jul 12, 2024

@candita, in my experience, what I would often see happen is that the app devs will have a running workload, then decide that they want to enable shadowing for some reason, which means that they're in the situation where they want to add something that gets a small fraction of traffic after they've already set up the normal backends. It's much more user-friendly to add a single resource for "here's a backend that I want to shadow a tiny fraction of traffic" than to have to edit all your existing backends to add a large weight to them, then add a new resource that asks for a small fraction of traffic.

Also, suppose you currently have three backends with weights of 50%, 25%, and 25%, you get 50K RPS, and you want to shadow no more than 50RPS. Quick, what weights do you need to use to specify that? Should you re-edit the weights back to 50/25/25 when you turn off shadowing a week later? How about if you decide the shadow needs to drop to 25RPS? Do we start telling people that the best practice is to deploy with a weight of 10000 for all their backends in case they need to mess with stuff like this? 😐

Let's just sidestep all those questions by having a way to say "here's a backend, give it 1/1000th of the traffic to this route, please." 🙂

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 22, 2024
@jakebennert
Copy link
Contributor Author

Love the idea! I personally would vote for allowing two forms:

MirrorFraction Fraction `json:"mirrorFraction,omitempty"`
MirrorPercent int `json:"mirrorPercent,omitempty"`

with validation allowing only one to be set (and we can define that if they are somehow both set, mirrorFraction takes precedence because it's more precise).

This is slightly harder on implementers, but I expect that the vast majority of users will only need an integral percentage, and forcing all of them to write

mirrorFraction:
  numerator: 5

instead of

mirrorPercent: 5

feels like putting the difficulty on the wrong users. It's definitely important to support the case where you have enough traffic that you need to mirror 1/10000th of it, but I don't think that's the common case, and I'd rather the common case be simple. 🙂

@kflynn I think this is a good idea. I've updated my proposal - how does this look to you?

@kflynn
Copy link
Contributor

kflynn commented Jul 22, 2024

@jakebennert Looks great to me, thank you!! 🙂

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2024
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2024
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakebennert! One tiny nit about status, but otherwise LGTM. Would appreciate another LGTM from someone else.

/cc @arkodg @candita @kflynn @mlavacca

geps/gep-3171/index.md Outdated Show resolved Hide resolved
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
@jakebennert
Copy link
Contributor Author

Thanks @jakebennert! One tiny nit about status, but otherwise LGTM. Would appreciate another LGTM from someone else.

/cc @arkodg @candita @kflynn @mlavacca

@kflynn I know you LGTM'd in a previous comment, but I think because I made some commits after the fact, it's out of date. If it all looks good to you, would you be able to LGTM again? :)

@robscott
Copy link
Member

Thanks @jakebennert! Will defer to @kflynn (or anyone else) for LGTM

/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 29, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jakebennert, robscott

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 29, 2024
@kflynn
Copy link
Contributor

kflynn commented Jul 29, 2024

/lgtm

again 🙂

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2024
@k8s-ci-robot k8s-ci-robot merged commit 267f994 into kubernetes-sigs:main Jul 29, 2024
8 checks passed
xtineskim pushed a commit to xtineskim/gateway-api that referenced this pull request Aug 8, 2024
…3178)

* Create index.md

* Create metadata.yaml

* Update index.md

* Update metadata.yaml

* Remove 'omitempty' from Fraction fields

* Split MirrorPercent into two fields: MirrorPercent and MirrorFraction.

* Update index.md

* Add config examples.

* Add Existing Support Table

* Update field names to avoid 'stuttering' in names.

* Update geps/gep-3171/index.md

Co-authored-by: Rob Scott <rob.scott87@gmail.com>

---------

Co-authored-by: Rob Scott <rob.scott87@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants