Add GEP-3155: Complete Backend mTLS Configuration #3180

Conversation

mkosieradzki
Contributor

What type of PR is this?
/kind gep

What this PR does / why we need it:
Proposal for backend mTLS configuration

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/gep PRs related to Gateway Enhancement Proposal(GEP) labels Jul 2, 2024
@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 2, 2024

linux-foundation-easycla bot commented Jul 2, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Jul 2, 2024
@k8s-ci-robot
Contributor

Welcome @mkosieradzki!

It looks like this is your first PR to kubernetes-sigs/gateway-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 2, 2024
@k8s-ci-robot
Contributor

Hi @mkosieradzki. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Jul 2, 2024
@robscott robscott added this to the v1.2.0 milestone Jul 8, 2024
@shaneutt
Member

shaneutt commented Jul 8, 2024

/cc @shaneutt

Contributor

@youngnick youngnick left a comment

A few pieces of feedback here, but I think that the number one thing that's needed is use cases that show how some of these proposals work, or justify their existence.

In particular, I'm reluctant to add free-text map[string]string fields to the API without a very good explanation and example use cases.

Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com>
@robscott robscott marked this pull request as ready for review July 16, 2024 15:17
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 16, 2024
@LiorLieberman
Member

/cc

@jaishals jaishals Jul 30, 2024

Hello @mkosieradzki, if we specify the client cert on the Gateway, can you please share some insight into how services that do not need MTLS, but only need end-to-end TLS, would coexist in the same Gateway -> Route configuration?

There could be a route targeting service (music) which needs MTLS and another route targeting service (shopping) which only needs End to End TLS.

These scenarios would not work on the same Gateway if we use MTLS for all the services.

Member

I think the distinction here is that we're providing a client cert at the Gateway level that should be presented if it's requested by the backend. @mkosieradzki had previously included a per-Service override for this config in BackendTLSPolicy. We ran into issues there where the personas attached to BackendTLSPolicy weren't particularly clear (see #3226 to chime in on that discussion), so this iteration of the GEP will leave per-Service overrides out, but I think that's still very much a longer term goal.

Contributor Author

There could be a route targeting service (music) which needs MTLS and another route targeting service (shopping) which only needs End to End TLS.

To the best of my understanding, whether the Gateway sends the certificate to the backend depends on the backend server configuration, i.e. whether the server sends a CertificateRequest message during the handshake, as per https://datatracker.ietf.org/doc/html/rfc5246#section-7.3.

Therefore, if the backend is not configured to request a client certificate, the Gateway will not send one, even if a certificate is configured. My original rationale for adding per-service overrides was about corner cases:

  • where the server application expects a certificate issued by a different CA
  • where the server application can work in two modes, with or without a client certificate, and the lack of a certificate is perfectly fine for the server (it causes a fallback to a different authentication method).

I think we should revisit this scenario as soon as we figure out #3226.
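
The handshake behavior described above can be observed with Go's standard `crypto/tls` package. The following is a minimal sketch, not Gateway API code: `backendRequestsClientCert` is a hypothetical helper that starts a local test backend and records whether the client-side `GetClientCertificate` callback fires. Go calls that callback only when the server requests a client certificate, so a backend that never sends a CertificateRequest never sees the configured certificate.

```go
package main

import (
	"crypto/tls"
	"errors"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// backendRequestsClientCert (hypothetical helper) starts a local TLS backend
// and reports whether the connecting client, standing in for the Gateway,
// was asked for a client certificate during the handshake.
func backendRequestsClientCert(serverAsks bool) (bool, error) {
	mode := tls.NoClientCert
	if serverAsks {
		// The backend requests a client certificate but does not require one.
		mode = tls.RequestClientCert
	}

	srv := httptest.NewUnstartedServer(http.HandlerFunc(
		func(w http.ResponseWriter, r *http.Request) {}))
	srv.TLS = &tls.Config{ClientAuth: mode}
	srv.StartTLS()
	defer srv.Close()

	client := srv.Client() // already trusts the test server's certificate
	tr, ok := client.Transport.(*http.Transport)
	if !ok {
		return false, errors.New("unexpected transport type")
	}

	var asked atomic.Bool
	// crypto/tls invokes this callback only when the server requests a
	// client certificate.
	tr.TLSClientConfig.GetClientCertificate = func(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
		asked.Store(true)
		// An empty certificate means "send no certificate".
		return &tls.Certificate{}, nil
	}

	resp, err := client.Get(srv.URL)
	if err != nil {
		return false, err
	}
	resp.Body.Close()
	return asked.Load(), nil
}
```

Calling `backendRequestsClientCert(true)` models the "music" backend that asks for a certificate, and `backendRequestsClientCert(false)` models the "shopping" backend that only needs end-to-end TLS.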

Comment on lines 41 to 43
#### Gateway-level (Core support)
Specifying credentials at the gateway level is the default operation mode, where all
backends will be presented with a single gateway certificate.
Contributor

Re all backends - should the client cert be presented even when no BackendTLSPolicy targeting the Service backend is present?

Member

In my longer term vision, I don't think we should limit it to this scope. Ideally Gateway owners could provide a set of CAs they trust for certs from backends at the Gateway level and the validation in BackendTLSPolicy actually becomes more of a per-Service override.

Although I don't think it's strictly necessary, I think it could be acceptable to start with this restriction. Of course I certainly don't want to stick with that limitation long term.

Contributor

@mikemorris mikemorris Jul 31, 2024

Ideally Gateway owners could provide a set of CAs they trust for certs from backends at the Gateway level

I think this makes sense and could potentially address the non-mutual end-to-end TLS functionality @jaishals is describing in #3180 (comment), where the presence of BackendTLSPolicy additional validation is specifically to enforce mutual TLS, restricting the set of acceptable client certificates? Would any additional signal like mode: mutual be needed somewhere to clarify this?


### SANs on BackendTLSPolicy

This change enables the certificate to have a different identity than the SNI
Contributor

I think this is intended to be applicable to the "server"/destination cert, but can you please specify more precisely which cert is affected here?

Member

Let's use "Backend" here to line up with the terminology in https://gateway-api.sigs.k8s.io/geps/gep-2907/#frontend-and-backend.

Comment on lines 64 to 65
// When specified, at least one of certificate's Subject Alternate Names MUST
// match at least one of the specified SubjectAltNames.
Contributor

What happens if it doesn't? Should we suggest a status condition on the policy indicating that it's invalid?

Member

Apparently we haven't defined how a request should be rejected here:

// 1. Hostname MUST be used as the SNI to connect to the backend (RFC 6066).
// 2. Hostname MUST be used for authentication and MUST match the certificate
// served by the matching backend.

At a minimum, I think we can agree that the request should be rejected. Ideally we can be more specific here for the sake of writing conformance tests and overall portability.

As far as populating status, I think that most controllers are separated from the dataplane and would not be able to translate runtime TLS validation errors to config time (k8s) status.
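
The "at least one SAN MUST match at least one specified SubjectAltName" rule quoted in this thread can be sketched in Go as follows. This is an illustration under stated assumptions, not the GEP's implementation: the `SubjectAltName` struct and its `Type` values are hypothetical stand-ins, hostnames are compared case-insensitively without wildcard handling, and URIs are compared as exact strings.

```go
package main

import "strings"

// SubjectAltName is a hypothetical stand-in for the type proposed in the GEP.
type SubjectAltName struct {
	Type     string // assumed values: "Hostname" or "URI"
	Hostname string
	URI      string
}

// matchesAnySAN reports whether at least one SAN presented by the backend
// certificate (dnsNames, uris) matches at least one configured entry.
// Simplification: no wildcard matching and no URI normalization.
func matchesAnySAN(dnsNames, uris []string, allowed []SubjectAltName) bool {
	for _, a := range allowed {
		switch a.Type {
		case "Hostname":
			for _, name := range dnsNames {
				if strings.EqualFold(name, a.Hostname) {
					return true
				}
			}
		case "URI":
			for _, uri := range uris {
				if uri == a.URI {
					return true
				}
			}
		}
	}
	return false
}
```

When `matchesAnySAN` returns false, the connection to the backend would be rejected by the data plane. As noted above, how that rejection surfaces to the client is not yet defined.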

backend. This adds that configuration to both Gateway and Service (via
BackendTLSPolicy).

#### Gateway-level (Core support)
Contributor

This is missing an API example - seems to be suggesting the addition of a new field under the Gateway spec stanza https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io%2fv1.Gateway?

Contributor

Yes, I think this should have a quick sketch of what we need to add to Gateway.
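
As a rough sketch of what such an addition to Gateway might look like, the Go types below model a Gateway-level client certificate reference. Every name here (`BackendTLS`, `GatewayBackendTLS`, `ClientCertificateRef`, the simplified `SecretObjectReference`, and the example values) is an assumption for illustration. This thread does not confirm the final field names or shape.

```go
package main

import "encoding/json"

// SecretObjectReference is a simplified, hypothetical reference to a Secret
// holding the client certificate and key.
type SecretObjectReference struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace,omitempty"`
}

// GatewayBackendTLS (assumed name) groups Gateway-level settings for
// connections from the Gateway to backends.
type GatewayBackendTLS struct {
	// ClientCertificateRef (assumed name) points at the certificate the
	// Gateway presents when a backend requests one.
	ClientCertificateRef *SecretObjectReference `json:"clientCertificateRef,omitempty"`
}

// GatewaySpec shows only the fields relevant to this sketch.
type GatewaySpec struct {
	GatewayClassName string             `json:"gatewayClassName"`
	BackendTLS       *GatewayBackendTLS `json:"backendTLS,omitempty"`
}

// exampleGatewaySpecJSON renders an example spec with a Gateway-level
// client certificate configured.
func exampleGatewaySpecJSON() (string, error) {
	spec := GatewaySpec{
		GatewayClassName: "example-class",
		BackendTLS: &GatewayBackendTLS{
			ClientCertificateRef: &SecretObjectReference{Name: "gateway-client-cert"},
		},
	}
	b, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}
```

Keeping the reference optional (a pointer with `omitempty`) means a Gateway without the field behaves as it does today and presents no client certificate.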

Member

@robscott robscott left a comment

Thanks @mkosieradzki! A few more comments, but mostly LGTM.

@robscott
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 31, 2024
Contributor

@youngnick youngnick left a comment

I've been picky about wording, but once the two wording changes I've asked for are handled, and we have an example of what we are adding to Gateway, then this LGTM.

Member

@robscott robscott left a comment

Thanks @mkosieradzki! Added a few formatting nits but otherwise LGTM.

/approve

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 31, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mkosieradzki, robscott, youngnick

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@youngnick
Contributor

With that last round of changes, this LGTM. I'll leave the final unhold for @robscott though, since he has some outstanding format changes.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 1, 2024
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 1, 2024
mkosieradzki and others added 3 commits August 1, 2024 14:44
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
@robscott
Member

robscott commented Aug 1, 2024

Thanks @mkosieradzki!

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 1, 2024
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 1, 2024
@k8s-ci-robot k8s-ci-robot merged commit b1cfe92 into kubernetes-sigs:main Aug 1, 2024
8 checks passed
mikemorris added a commit to mikemorris/gateway-api that referenced this pull request Aug 1, 2024
* Initial commit

* Pending changes exported from your codespace

* Update geps/gep-3155/index.md

Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com>

* Initial commit

* Initial commit

* Initial commit

* Initial commit

* Fix enum values

* Initial commit

* Pending changes exported from your codespace

* Fix indentation

* Fix indentation

* Update geps/gep-3155/index.md

Co-authored-by: Nick Young <inocuo@gmail.com>

* Update geps/gep-3155/index.md

Co-authored-by: Nick Young <inocuo@gmail.com>

* Removed per-service override from the proposal

* - Reverted back changes necessary for the GatewaySpec
- Added clarifications

* Update geps/gep-3155/index.md

Co-authored-by: Rob Scott <rob.scott87@gmail.com>

* Update geps/gep-3155/index.md

Co-authored-by: Rob Scott <rob.scott87@gmail.com>

* Update geps/gep-3155/index.md

Co-authored-by: Rob Scott <rob.scott87@gmail.com>

* Update geps/gep-3155/index.md

Co-authored-by: Rob Scott <rob.scott87@gmail.com>

---------

Co-authored-by: Mike Morris <mikemorris@users.noreply.github.com>
Co-authored-by: Nick Young <inocuo@gmail.com>
Co-authored-by: Rob Scott <rob.scott87@gmail.com>
xtineskim pushed a commit to xtineskim/gateway-api that referenced this pull request Aug 8, 2024
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/gep PRs related to Gateway Enhancement Proposal(GEP) lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
10 participants