[WIP do not merge] Kube-Proxy Library: Breakout a KPNG-like interface "kube proxy lib" from #2104 #3649

Closed. Wants to merge 19 commits.
`keps/sig-network/2104-kube-proxy-lib/README.md` (280 additions, 0 deletions)
<!--
**Note:** When your KEP is complete, all of these comment blocks should be removed.

To get started with this template:

- [ ] **Pick a hosting SIG.**
Make sure that the problem space is something the SIG is interested in taking
up. KEPs should not be checked in without a sponsoring SIG.
- [ ] **Create an issue in kubernetes/enhancements**
When filing an enhancement tracking issue, please make sure to complete all
fields in that template. One of the fields asks for a link to the KEP. You
can leave that blank until this KEP is filed, and then go back to the
enhancement and add the link.
- [ ] **Make a copy of this template directory.**
Copy this template into the owning SIG's directory and name it
`NNNN-short-descriptive-title`, where `NNNN` is the issue number (with no
leading-zero padding) assigned to your enhancement above.
- [ ] **Fill out as much of the kep.yaml file as you can.**
At minimum, you should fill in the "Title", "Authors", "Owning-sig",
"Status", and date-related fields.
- [ ] **Fill out this file as best you can.**
At minimum, you should fill in the "Summary" and "Motivation" sections.
These should be easy if you've preflighted the idea of the KEP with the
appropriate SIG(s).
- [ ] **Create a PR for this KEP.**
Assign it to people in the SIG who are sponsoring this process.
- [ ] **Merge early and iterate.**
Avoid getting hung up on specific details and instead aim to get the goals of
the KEP clarified and merged quickly. The best way to do this is to just
start with the high-level sections and fill out details incrementally in
subsequent PRs.

Just because a KEP is merged does not mean it is complete or approved. Any KEP
marked as `provisional` is a working document and subject to change. You can
denote sections that are under active debate as follows:

```
<<[UNRESOLVED optional short context or usernames ]>>
Stuff that is being argued.
<<[/UNRESOLVED]>>
```

When editing KEPS, aim for tightly-scoped, single-topic PRs to keep discussions
focused. If you disagree with what is already in a document, open a new PR
with suggested changes.

One KEP corresponds to one "feature" or "enhancement" for its whole lifecycle.
You do not need a new KEP to move from beta to GA, for example. If
new details emerge that belong in the KEP, edit the KEP. Once a feature has become
"implemented", major changes should get new KEPs.

The canonical place for the latest set of instructions (and the likely source
of this file) is [here](/keps/NNNN-kep-template/README.md).

**Note:** Any PRs to move a KEP to `implementable`, or significant changes once
it is marked `implementable`, must be approved by each of the KEP approvers.
If none of those approvers are still appropriate, then changes to that list
should be approved by the remaining approvers and/or the owning SIG (or
SIG Architecture for cross-cutting KEPs).
-->
# KEP-2104: kube-proxy library (breakout)
> **Contributor:** we'll probably want a new KEP number for this

> **Contributor:** Fixing....

> **Contributor:** I guess we'd need a new issue for this, right @jayunit100?

> **Member Author:** Ya, I guess we can clean that up as the last thing we do... let's get consensus on the impl first, though, before we gift-wrap that part.

# Index
<!--
A table of contents is helpful for quickly jumping to sections of a KEP and for
highlighting any additional information provided beyond the standard KEP
template.

Ensure the TOC is wrapped with
<code>&lt;!-- toc --&gt;&lt;!-- /toc --&gt;</code>
tags, and then generate with `hack/update-toc.sh`.
-->

<!-- toc -->
- [Release Signoff Checklist](#release-signoff-checklist)
- [Summary](#summary)
- [Motivation](#motivation)
- [Goals](#goals)
- [Non-Goals](#non-goals)
- [Proposal](#proposal)
- [User Stories (Optional)](#user-stories-optional)
- [Story 1](#story-1)
- [Story 2](#story-2)
- [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
- [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
- [Test Plan](#test-plan)
- [Unit tests](#unit-tests)
- [Integration tests](#integration-tests)
- [e2e tests](#e2e-tests)
- [Graduation Criteria](#graduation-criteria)
- [Production Readiness Review Questionnaire](#production-readiness-review-questionnaire)
- [Dependencies](#dependencies)
- [Scalability](#scalability)
- [Troubleshooting](#troubleshooting)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
- [Infrastructure Needed (Optional)](#infrastructure-needed-optional)
<!-- /toc -->

## Release Signoff Checklist

<!--
**ACTION REQUIRED:** In order to merge code into a release, there must be an
issue in [kubernetes/enhancements] referencing this KEP and targeting a release
milestone **before the [Enhancement Freeze](https://git.k8s.io/sig-release/releases)
of the targeted release**.

For enhancements that make changes to code or processes/procedures in core
Kubernetes—i.e., [kubernetes/kubernetes], we require the following Release
Signoff checklist to be completed.

Check these off as they are completed for the Release Team to track. These
checklist items _must_ be updated for the enhancement to be released.
-->

Items marked with (R) are required *prior to targeting to a milestone / release*.

- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements] (not the initial KEP PR)
- [ ] (R) KEP approvers have approved the KEP status as `implementable`
- [ ] (R) Design details are appropriately documented
- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input (including test refactors)
- [ ] e2e Tests for all Beta API Operations (endpoints)
- [ ] (R) Ensure GA e2e tests meet requirements for [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Minimum Two Week Window for GA e2e tests to prove flake free
- [ ] (R) Graduation criteria is in place
- [ ] (R) [all GA Endpoints](https://github.com/kubernetes/community/pull/1806) must be hit by [Conformance Tests](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md)
- [ ] (R) Production readiness review completed
- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

<!--
**Note:** This checklist is iterative and should be reviewed and updated every time this enhancement is being considered for a milestone.
-->

[kubernetes.io]: https://kubernetes.io/
[kubernetes/enhancements]: https://git.k8s.io/enhancements
[kubernetes/kubernetes]: https://git.k8s.io/kubernetes
[kubernetes/website]: https://git.k8s.io/website

## Summary

After about a year and a half of testing a new kube-proxy implementation (https://github.com/kubernetes-sigs/kpng/) and
collecting sig-network and community feedback, it became clear that an interface for building new service proxies, without an
opinionated implementation, is desired by the Kubernetes networking community attempting to build specialized networking
tools. This KEP distills the goals of such an interface and proposes its lifecycle and official support policy for sig-network.

This distillation of a "KPNG-like interface" was originally presented by Tim Hockin in a sig-network breakout session, informally, in https://docs.google.com/presentation/d/1Y-tZ4fFC9L2NvtBeiIXD1MiJ0ieg_zg4Q6SlFmIax8w/edit?hl=en&resourcekey=0-SFhIGTpnJT5fo6ZSzQC57g#slide=id.g16976fedf03_0_221.

## Motivation

There have been several presentations, issues, and projects dedicated to reusing kube-proxy logic while extending it to embrace
different backend technologies (e.g. nftables, eBPF, Open vSwitch, and so on). This KEP proposes a library which will facilitate
this type of work.

A general solution to this problem is explored in the KPNG project (https://github.com/kubernetes-sigs/kpng/), which exhibits many properties
> **Contributor:** I'm not sure it makes sense to talk about other KPNG features here, given that the whole point of this KEP is to not include those parts.

> **Member Author:** ok, removing details, but will still give it a "nod" for historical purposes if that's ok :)

that allow for such goals to be accomplished. These are enabled by:

- A generic *"diff store"* which provides a client-side, in-memory data model for performant, declarative, generic calculation of differences in the Kubernetes networking state space from one point in time to another (i.e. a replacement for the implicit change-tracking functionality in k/k/pkg/proxy, https://github.com/kubernetes/kubernetes/blob/master/pkg/proxy/endpoints.go#L161).
- A *library that simplifies consumption* of the Kubernetes network and its topology for an individual node, abstracting away API semantics (like topology) from underlying network routing rules.
- *Definition of types* which have a minimal amount of boilerplate, can be created without using the Kubernetes API data model (and thus can be used to extend proxying behaviour to things outside of the Kubernetes Pod, Service, and Endpoint semantics), and which can describe routing of Kubernetes VIPs (Services) to endpoints in a generic way that is understandable to people who don't work on Kubernetes day-to-day.
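
The "diff store" idea above can be illustrated as a small snapshot-diff routine. This is a sketch only, assuming a keyed map of proxy-relevant state; the names (`Snapshot`, `Diff`, `DiffSnapshots`) are hypothetical and not part of KPNG or any proposed API:

```go
package main

import "fmt"

// Snapshot is a keyed view of proxy-relevant state, e.g. Services
// indexed by "namespace/name". The string value stands in for a real
// object; the concrete types are still to be decided in this KEP.
type Snapshot map[string]string

// Diff is an explicit record of what changed between two snapshots.
type Diff struct {
	Added, Updated, Deleted []string
}

// DiffSnapshots computes the difference between two states declaratively,
// replacing the implicit bookkeeping done by in-tree change trackers.
func DiffSnapshots(before, after Snapshot) Diff {
	var d Diff
	for key, val := range after {
		prev, ok := before[key]
		switch {
		case !ok:
			d.Added = append(d.Added, key)
		case prev != val:
			d.Updated = append(d.Updated, key)
		}
	}
	for key := range before {
		if _, ok := after[key]; !ok {
			d.Deleted = append(d.Deleted, key)
		}
	}
	return d
}

func main() {
	before := Snapshot{"default/a": "10.0.0.1", "default/b": "10.0.0.2"}
	after := Snapshot{"default/a": "10.0.0.1", "default/b": "10.0.0.9", "default/c": "10.0.0.3"}
	fmt.Printf("%+v\n", DiffSnapshots(before, after))
}
```

A backend would consume the resulting `Diff` to converge its dataplane, rather than re-deriving changes from raw API events.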

### Goals

- Build a vendorable repository, "kube-proxy-lib", which can be used to make new service proxies.
> **Contributor:** "kube-proxy" is the default implementation of the Kubernetes service proxy. If it had a plural, it would be "kube-proxies", not "kube-proxy's", but it's better to say "service proxies" or "service proxy implementations" or "kube-proxy alternatives". (Although we don't want to say "kube-proxy alternatives" here, since part of the goal is to eventually move the actual kube-proxy onto this.)
>
> (Assume I made this comment again below everywhere you say "kube-proxy" or "kube proxy".)

> **Member Author:** service proxies ftw :)

> **Contributor:** IMO we should be targeting k8s.io/kube-proxy for this, not a new kube-proxy-lib. I guess that's something we'll need to hash out in this PR.

- Exemplify the use of such a repository in a mock "backend" which uses the repository to process and respond to changes in the Kubernetes networking state.
- Define a policy around versioning and releasing of "kube-proxy-lib".
- Create a CI system, running in TestGrid, that continuously tests kube-proxy-lib compatibility with the latest Kubernetes releases, and which can be used to flag API changes that cause regressions breaking "kube-proxy-lib".
- Enable the eventual *replacement* of the k/k/pkg/proxy ServiceChangeTracker- and EndpointsChangeTracker-related caching structures inside the in-tree kube-proxy with this generic library.
> **Contributor:** Maybe point out that the existing code has hard-to-fix bugs. E.g., kubernetes/kubernetes#112604

> **Member Author:** done


### Non-Goals

- Rewrite or decouple the core, in-tree Linux kube-proxy implementations, which are relied on by too many users to tolerate major changes.
> **Contributor:** This is incompatible with the goal of replacing ServiceChangeTracker/EndpointsChangeTracker above.
>
> In fact, I think that rebasing the iptables, ipvs, and winkernel implementations on top of this library is a requirement, not a non-goal. We don't want to be maintaining two separate implementations of "proxy business logic". (The entire point of this project is to reduce the maintenance burden of people keeping up with "proxy business logic".)
>
> Certainly we need to be careful to not destabilize the existing implementations, but it seems to me that the way to do that is to use the existing shared proxy code (ServiceChangeTracker, etc) as the initial seed of the library, and replace it with new code bit by bit (ensuring that unit, integration, and e2e tests pass all along the way).

> **Contributor (@astoycos, Dec 8, 2022):**
>
> > In fact, I think that rebasing the iptables, ipvs, and winkernel implementations on top of this library is a requirement, not a non-goal.
>
> I agree here; however, some initial changes, like adding shared bits for starting and stopping API server handlers and watchers, will also need to be ported, i.e. something like the Provided interface, right?
>
> IMO at the very minimum the first POC/seed should still give external users the ability to start writing new backends quickly, not only replace the client-side caching mechanisms.

- Force a new architecture for the standard kube-proxy onto naive users.
> **Contributor:** I assume this means "don't force Kubernetes end users to deploy kube-proxy in a 'split-brain' architecture as in Mikael's original KEP", but that's not at all clear unless you're already aware of that KEP / kpng.
>
> Maybe "Making any incompatible architectural changes to the existing kube-proxy implementations" or something like that?

> **Contributor:** I'm adding a few here.


## Proposal

We propose to build a kubernetes-sigs/kube-proxy-lib repository. This repository will be vendored by people wanting to build a new service proxy, and will provide them with:
- A vendorable golang library that defines a few interfaces, easily implemented as a new service proxy, which respond to EndpointSlice and Service changes.
- Documentation on how to build a service proxy, based on https://github.com/kubernetes-sigs/kpng/blob/master/doc/service-proxy.md and other similar documents.
- Integration test tooling, similar to the KPNG project's, which allows users to locally implement network routing logic in a small golang program which is not
> **Contributor:** not clear what "similar to the KPNG project's" means. You need to explain that

> **Contributor:** Done, removed that bit

directly connected to the Kubernetes API server, for local, iterative development of Kubernetes network proxy tooling.
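
The interfaces described above could look something like the following sketch. This is purely illustrative, under the assumption of a callback-style surface: the type names (`Backend`, `SetService`, `Sync`, etc.) are hypothetical, not a proposed or existing kube-proxy-lib API. The library would watch the API server, diff state, and invoke the backend with changes:

```go
package main

import "fmt"

// Service and Endpoint are minimal illustrative types; the real
// library's types are exactly what this KEP is meant to hash out.
type Service struct {
	Name      string // "namespace/name"
	ClusterIP string
	Port      int
}

type Endpoint struct {
	Service string
	IP      string
}

// Backend is a hypothetical sketch of what a proxy implementation
// would provide: the library calls these methods with state changes.
type Backend interface {
	SetService(s Service)
	DeleteService(name string)
	SetEndpoint(e Endpoint)
	DeleteEndpoint(e Endpoint)
	Sync() // called after a batch of changes, so rules can be applied atomically
}

// logBackend records the calls it receives, standing in for a real
// dataplane (iptables, eBPF, nftables, ...). It doubles as the kind of
// mock backend the Goals section proposes shipping as an example.
type logBackend struct{ events []string }

func (b *logBackend) SetService(s Service) {
	b.events = append(b.events, fmt.Sprintf("set-service %s -> %s:%d", s.Name, s.ClusterIP, s.Port))
}
func (b *logBackend) DeleteService(name string) {
	b.events = append(b.events, "delete-service "+name)
}
func (b *logBackend) SetEndpoint(e Endpoint) {
	b.events = append(b.events, "set-endpoint "+e.Service+" "+e.IP)
}
func (b *logBackend) DeleteEndpoint(e Endpoint) {
	b.events = append(b.events, "delete-endpoint "+e.Service+" "+e.IP)
}
func (b *logBackend) Sync() { b.events = append(b.events, "sync") }

func main() {
	var be Backend = &logBackend{}
	be.SetService(Service{Name: "default/web", ClusterIP: "10.96.0.10", Port: 80})
	be.SetEndpoint(Endpoint{Service: "default/web", IP: "10.244.1.5"})
	be.Sync()
	for _, e := range be.(*logBackend).events {
		fmt.Println(e)
	}
}
```

A vendoring project would implement only the `Backend` methods with its own routing technology, leaving API watching and caching to the library.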

### User Stories (Optional)

#### Story 1

As a networking technology startup, I want to make my own service proxy implementation, but I don't want to maintain the logic of talking to the API server, caching its data, or calculating an abbreviated/proxy-focused representation of the Kubernetes networking state space. I'd like a wholesale framework I can simply plug my logic into.

#### Story 2

As a Kubernetes maintainer, I don't want to have to understand the internals of a networking backend in order to simulate or write core updates to the logic of the kube-proxy locally.
> **Contributor:** This is a fairly small use case. In fact, it's pretty much limited to topology-related features, where all you care about is restricting the endpoints that get used on a given node. For just about anything else, any change to the core is going to need to be matched with new code in the backends as well.

> **Member Author:** yeah, updated as a service proxy maintainer; pushing shortly


### Notes/Constraints/Caveats (Optional)

- Sending the full state could be resource-intensive on big clusters, but it should still be O(1) relative to
the actual kernel definitions (the complexity of what the node has to handle cannot be reduced
without losing functionality or correctness).
> **Contributor:** It is completely unclear what this means, in the context of this KEP.

> **Member Author:** deleting


### Risks and Mitigations

There are no immediate risks, because we aren't removing the in-tree proxy as part of this repository, but rather proposing a library
for service proxy extenders to use optionally. There may be risks eventually, if we write another KEP to *replatform* sig-network's
k/k/pkg/proxy implementations on top of this library, but that would be a separate KEP.
> **Contributor:** I don't think it makes sense for that to be a separate KEP. IMO this project is only interesting for SIG Network if it is going to replace our existing shared proxy code. (Yes, the code would be useful for third parties even if SIG Network wasn't using it in the k8s core, but it wouldn't make sense for SIG Network to own the code in that case; it should just be a third-party project.)

> **Member Author:** Fair point, but in the interest of "building something that is uncontroversial as a first step", how strongly do you feel about this? If I made this argument:
>
> - "sig-network needs b"
> - "b needs a"
> - thus we're going to make "sig-network/a" as a stepping stone to us being able to build "sig-network/b"
> - by the way, "sig-network/b" is something lots of people want, so even if we get hit by a bus after making "b", and we never get around to "a", we still have a useful intermediate deliverable...
>
> Is that digestible? Or does it sit funny w/ you?

> **Contributor:** What I'm saying is that I don't think the library can be definitely "uncontroversial" and "a useful intermediate deliverable" unless we know that it is suited to replacing pkg/proxy.
>
> I'm not saying we have to deliver the library and the updated iptables proxy in the same release; I'm saying that rather than doing:
>
> 1. come up with a plan for the library
> 2. implement the library
> 3. come up with a plan to replace pkg/proxy with the new library
> 4. replace pkg/proxy with the new library
>
> we need to do:
>
> 1. come up with a plan for the library
> 2. come up with a plan to replace pkg/proxy with the new library
> 3. implement the library
> 4. replace pkg/proxy with the new library

> **Member Author:** gotcha, ok, updated as a risk (and updated the goals to reflect)


## Design Details

### Test Plan

##### Unit tests

##### Integration tests

##### e2e tests

### Graduation Criteria

We will version "kube-proxy-lib" as 1.0 once it is known to be powering at least one backend proxy which can be run by an end user and is able to pass the Kubernetes sig-network (non-disruptive) and Conformance suites, including Layer 7 and serviceType=LoadBalancer tests which currently run in the prow sig-network e2es.

## Production Readiness Review Questionnaire

### Dependencies

This project will depend on the Kubernetes client-go library to acquire Service, Endpoints, EndpointSlices, and other
networking API objects.


### Scalability

The scalability of proxies built on this library will be equivalent to that of the existing kube-proxy, insofar as they will
watch the same endpoints (as the existing kube-proxy) and generally be used to forward traffic to a single
backend load-balancing technology (e.g. eBPF, nftables, iptables, ...), as the existing kube-proxy daemonset does.

###### Will enabling / using this feature result in any new API calls?

No

###### Will enabling / using this feature result in introducing new API types?

No new Kubernetes API types. There will be an in-memory golang API supported by this library, which is incremented over time
> **Contributor:** No. The question is about Kubernetes API types, not about golang APIs.

> **Member Author:** ok, added "No" + clarified

to reflect changes in the Kubernetes API. Upgrading a version of this library may require users to change
the way they consume its various data structures. We will provide migration guides when this happens between
versions.

###### Will enabling / using this feature result in any new calls to the cloud provider?

No

###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?

### Troubleshooting

###### How does this feature react if the API server and/or etcd is unavailable?

The API server going down will prevent this library from working as expected in normal cases, where all incoming
Kubernetes networking data is polled from the API server. However, since this library will be flexible, there are other ways
of providing it with networking information, and thus an API server outage doesn't have to render the library itself entirely unusable.
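
The "other ways of providing it with networking information" could be modeled as a pluggable source. The sketch below is an assumption-laden illustration (the `Source`/`Update`/`staticSource` names are hypothetical, not a proposed API): a static source replays canned state, e.g. loaded from a file, so a backend keeps working without a reachable API server.

```go
package main

import "fmt"

// Update is one change event delivered to a proxy backend.
type Update struct {
	Kind string // "service" or "endpoint"
	Key  string // "namespace/name"
	Data string // illustrative payload
}

// Source abstracts where networking state comes from. A real library
// would ship an API-server-backed implementation alongside others.
type Source interface {
	Updates() <-chan Update
}

// staticSource replays a fixed set of updates, allowing local,
// iterative development and testing with no running API server.
type staticSource struct{ updates []Update }

func (s *staticSource) Updates() <-chan Update {
	ch := make(chan Update)
	go func() {
		defer close(ch)
		for _, u := range s.updates {
			ch <- u
		}
	}()
	return ch
}

func main() {
	var src Source = &staticSource{updates: []Update{
		{Kind: "service", Key: "default/web", Data: "10.96.0.10:80"},
		{Kind: "endpoint", Key: "default/web", Data: "10.244.1.5"},
	}}
	// A backend would consume this channel regardless of which Source
	// implementation produced it.
	for u := range src.Updates() {
		fmt.Printf("%s %s -> %s\n", u.Kind, u.Key, u.Data)
	}
}
```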

###### What are other known failure modes?

###### What steps should be taken if SLOs are not being met to determine the problem?

## Implementation History


"Librarification" PR into KPNG: https://github.com/kubernetes-sigs/kpng/pull/389.

## Drawbacks

## Alternatives

We could retain the existing kube-proxy, but that would require copying and pasting a lot of code, and continuing to document the data structures and golang maps for diffing, which were never originally designed for external consumption. The existing kube-proxy's non-explicit mapping and diffing of Kubernetes API objects originally inspired the KPNG project.

We could also leverage the KPNG project as an overall framework for solving these problems. The only drawback is that it is opinionated towards a raw gRPC implementation, and other users (e.g. xDS) possibly want something more decoupled. This realization inspired this KEP.

## Infrastructure Needed (Optional)

None
`keps/sig-network/2104-kube-proxy-lib/kep.yaml` (61 additions, 0 deletions)
title: rework kube-proxy architecture
kep-number: 2104
authors:
- "@mcluseau"
- "@rajaskakodar"
- "@astoycos"
- "@jayunit100"
- "@nehalohia"
owning-sig: sig-network
participating-sigs:
- sig-network
- sig-windows
status: provisional
creation-date: 2020-10-10
reviewers:
- "@thockin"
- "@danwinship"
approvers:
- "@thockin"
- "@danwinship"

##### WARNING !!! ######
# prr-approvers has been moved to its own location
# You should create your own in keps/prod-readiness
# Please make a copy of keps/prod-readiness/template/nnnn.yaml
# to keps/prod-readiness/sig-xxxxx/00000.yaml (replace with kep number)
#prr-approvers:

see-also:
- "/keps/sig-aaa/1234-we-heard-you-like-keps"
- "/keps/sig-bbb/2345-everyone-gets-a-kep"
replaces:
- "/keps/sig-ccc/3456-replaced-kep"

# The target maturity stage in the current dev cycle for this KEP.
stage: alpha|beta|stable

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v1.19"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
alpha: "v1.19"
beta: "v1.20"
stable: "v1.22"

# The following PRR answers are required at alpha release
# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: MyFeature
components:
- kube-apiserver
- kube-controller-manager
disable-supported: true

# The following PRR answers are required at beta release
metrics:
- my_feature_metric