Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validation webhook #233

Merged
merged 2 commits into from
Sep 6, 2021
Merged

Validation webhook #233

merged 2 commits into from
Sep 6, 2021

Conversation

leonnicolas
Copy link
Collaborator

Add a sub command to kg to start a web server for validation of Kilo peers in Kubernetes.

@leonnicolas leonnicolas force-pushed the validation-webhook branch 3 times, most recently from 032a8e3 to f0149ea Compare August 24, 2021 10:57
cmd/kg/webhook.go Outdated Show resolved Hide resolved
@leonnicolas leonnicolas marked this pull request as ready for review August 27, 2021 14:30
"github.com/prometheus/client_golang/prometheus/promhttp"
"github.com/spf13/cobra"
kilo "github.com/squat/kilo/pkg/k8s/apis/kilo/v1alpha1"
"github.com/squat/kilo/pkg/version"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could you separate these Kilo imports from the rest to follow convention?

cmd/kg/webhook.go Outdated Show resolved Hide resolved
cmd/kg/webhook.go Outdated Show resolved Hide resolved
cmd/kg/webhook.go Outdated Show resolved Hide resolved
cmd/kg/webhook.go Outdated Show resolved Hide resolved
var (
validationCounter = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "https_admission_requests_total",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally in prometheus we standardize that all HTTP request counters use the same metric name http_requests_total. Do differentiate we add a label, e.g. handler="validate". The way this metric is used, seems to be less about HTTP actually and more about the business logic of validation. This is because it doesn't track all requests to the validate endpoint: all HTTP requests that error early don't increment the counter. I would say let's change the name to admission_requests_total. If we want a metric to track the HTTP requests, then we can wrap the handler.

cmd/kg/webhook.go Outdated Show resolved Hide resolved
}
if *gvk != v1.SchemeGroupVersion.WithKind("AdmissionReview") {
errorCounter.Inc()
level.Error(logger).Log("err", "only api v1 is supported")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this consistent with a msg variable like above

Suggested change
level.Error(logger).Log("err", "only api v1 is supported")
msg := "only API v1 is supported"

cmd/kg/webhook.go Outdated Show resolved Hide resolved
cmd/kg/webhook.go Outdated Show resolved Hide resolved
namespace: kilo
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty wide permission to read all secrets in the cluster. Is it not enough to make this a Role that is scoped only to the secrets in the kilo namespace? Otherwise it's a bit of a security concern

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and use one cluster role for the validatingwebhookconfiguration and one role for the secret.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could be even more conservative here and add:

resourceNames:
- kilo-webhook

so that the ServiceAccount can only modify the Kilo webhook configuration

name: kilo-peer-validation
---
apiVersion: batch/v1
kind: Job
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job here 💪

Copy link
Owner

@squat squat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @leonnicolas!! I can't wait to deploy this. In a follow up i think it would be great to put this in the e2e tests as well to show everything working together

apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: "peer-validation.kilo.svc"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a bit funny to name the webhook as a DNS name. Normally these resources have names like kilo-webhook or something

metadata:
name: "peer-validation.kilo.svc"
webhooks:
- name: "peer-validation.kilo.svc"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this name is normally namespaced to the domain of the project rather than the DNS name of a service within the project. In our case it could be peers.kilo.squat.ai, e.g. https://github.com/kubernetes-sigs/multi-tenancy/blob/master/incubator/hnc/config/webhook/manifests.yaml#L18

This commit uses cobra instead of pflags in kg to handle flags in preparation  to add a new subcommand
for the webhook server.

Signed-off-by: leonnicolas <leonloechner@gmx.de>
Copy link
Owner

@squat squat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks amazing 😻 one small naming nit and I'm totally happy

cancel()
}()

select {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this looks totally correct and does pretty much the same things that rungroups we use in kg do. However the rungroups provide some syntactic sugar so they're a little bit easier to maintain, eg you don't have to manually add new cases and bump the count in the wait groups every time that we add a new concurrent thread. I think it might be worth to switch to rungroups for consistency eventually. We can do this later tho

}
}

func metricsMiddleWare(path string, next func(http.ResponseWriter, *http.Request)) func(http.ResponseWriter, *http.Request) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

func init() {
webhookCmd.Flags().StringVar(&certPath, "cert-file", "", "The path to a certificate file")
webhookCmd.Flags().StringVar(&keyPath, "key-file", "", "The path to a key file")
webhookCmd.Flags().StringVar(&metricsAddr, "metrics-address", ":1107", "The metrics server will be listening to that address")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, what do you think about naming this listen-metrics?

cancel()
}()
if err := msrv.Shutdown(ctx); err != nil {
level.Error(logger).Log("msg", "failed to shut down webhook server gracefully", "err", err.Error())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
level.Error(logger).Log("msg", "failed to shut down webhook server gracefully", "err", err.Error())
level.Error(logger).Log("msg", "failed to shut down metrics server gracefully", "err", err.Error())

if ok := errors.As(err, &serr); ok {
level.Info(logger).Log("msg", "received signal", "signal", serr.Signal.String(), "err", err.Error())
} else {
level.Error(logger).Log("msg", "reveived error", "err", err.Error())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
level.Error(logger).Log("msg", "reveived error", "err", err.Error())
level.Error(logger).Log("msg", "received error", "err", err.Error())

if ok := errors.As(err, &serr); ok {
level.Info(logger).Log("msg", "received signal", "signal", serr.Signal.String(), "err", err.Error())
} else {
level.Error(logger).Log("msg", "reveived error", "err", err.Error())
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
level.Error(logger).Log("msg", "reveived error", "err", err.Error())
level.Error(logger).Log("msg", "received error", "err", err.Error())

cancel()
}()
var g run.Group
g.Add(run.SignalHandler(ctx, syscall.SIGINT, syscall.SIGTERM))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙀 I didn't know about this feature

squat
squat previously approved these changes Sep 6, 2021
Copy link
Owner

@squat squat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 small typos and let's merge it!

This commit adds a sub command `webhook` to Kilo.
It will start a https web server that answeres request from a Kubernetes
API server to validate updates and creations of Kilo peers.

It also updates the "Peer Validation" docs to enable users to
install the web hook server and generate the self signed certificates in
the cluster by only applying a manifest.

Signed-off-by: leonnicolas <leonloechner@gmx.de>

Apply suggestions from code review

Co-authored-by: Lucas Servén Marín <lserven@gmail.com>
Copy link
Owner

@squat squat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️❤️❤️❤️

@squat squat merged commit 6b93cc2 into squat:main Sep 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants