kruise-rollout-webhook-service is not accessible in GKE, and any webhook ports other than 9876 aren't working #169

Open
denist-huma opened this issue Aug 16, 2023 · 0 comments

denist-huma commented Aug 16, 2023

I had a problem following the Kubernetes Manifest CD > Canary Rollout documentation.

$ vela addon enable kruise-rollout rollout.webhook.port=10250 replicaCount=1
Addon kruise-rollout enabled successfully.
Please access addon-kruise-rollout from the following endpoints:
+---------+----------------+-------------------------------------------------------+-------------------------------------------------------+-------+
| CLUSTER |   COMPONENT    |               REF(KIND/NAMESPACE/NAME)                |                       ENDPOINT                        | INNER |
+---------+----------------+-------------------------------------------------------+-------------------------------------------------------+-------+
| local   | kruise-rollout | Service/kruise-rollout/kruise-rollout-webhook-service | https://kruise-rollout-webhook-service.kruise-rollout | true  |
+---------+----------------+-------------------------------------------------------+-------------------------------------------------------+-------+
$ kubectl get endpoints -n kruise-rollout kruise-rollout-webhook-service -o yaml | grep port
  ports:
  - port: 9876
kubevela-vela-core-b8c58f8b5-7dzzv kubevela E0816 16:18:12.096566       1 task.go:252] "do steps" err="run step(provider=oam,do=component-apply): Dispatch: pre-dispatch dryrun failed: Found 1 errors. [(cannot create object: Internal error occurred: failed calling webhook \"vrollout.kb.io\": failed to call webhook: Post \"https://kruise-rollout-webhook-service.kruise-rollout.svc:443/validate-rollouts-kruise-io-rollout?timeout=10s\": dial tcp 10.156.5.192:9876: i/o timeout)]" application="default/py-gallery-redis-mongo-s3" controller="application" resource_version="734089104" generation=4 publish_version="" step_name="gallery" step_type="builtin-apply-component" spanID="i-ibingrwb.execute application workflow.axq9ptom85"
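Note that the addon was enabled with rollout.webhook.port=10250, yet both the endpoints object and the error above show port 9876. To pull the dialed address out of such log lines quickly, a small filter like the following may help. It is only a sketch: the function name is made up for illustration, and the `vela-system` namespace in the commented usage line is an assumption, not something shown in the output above.

```shell
# Read "failed calling webhook" log lines on stdin and print the
# "ip:port" that the API server actually dialed.
webhook_dial_target() {
  sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\):.*/\1/p'
}

# Possible usage (namespace is an assumption):
# kubectl -n vela-system logs deploy/kubevela-vela-core | webhook_dial_target | sort -u
```

If the printed port differs from the one requested at install time, the requested value never reached the webhook service.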

While debugging this webhook problem, I concluded that the root cause is port blocking in the private GKE cluster I'm on.
I used The Definitive Debugging Guide for the cert-manager Webhook Pod as a reference.

In a separate console:

kubectl -n kruise-rollout port-forward deploy/kruise-rollout-controller-manager 9876

Probe the deployment's port:

$ curl -vsS --resolve kruise-rollout-webhook-service.kruise-rollout.svc:9876:127.0.0.1 \
    -H 'Content-Type: application/json' \
    --service-name kruise-rollout-webhook-service \
    --cacert <(kubectl -n kruise-rollout get secret kruise-rollout-webhook-certs -ojsonpath='{.data.ca-cert\.pem}' | base64 -d) \
    https://kruise-rollout-webhook-service.kruise-rollout.svc:9876/validate-rollouts-kruise-io-rollout 2>&1 -d@- <<'EOF' | sed '/^* /d; /bytes data]$/d; s/> //; s/< //'
{"kind":"AdmissionReview","apiVersion":"admission.k8s.io/v1","request":{"requestKind":{"group":"kruise.io","version":"v1alpha1","kind":"Rollout"},"requestResource":{"group":"kruise.io","version":"v1alpha1","resource":"rollouts"},"name":"rollouts-demo","namespace":"default","operation":"CREATE","object":{"apiVersion":"rollouts.kruise.io/v1alpha1","kind":"Rollout","metadata":{"name":"rollouts-demo"},"spec":{"objectRef":{"workloadRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"echoserver"}},"strategy":{"canary":{"steps":[{"weight":20,"pause":{}},{"weight":40,"pause":{"duration":10}},{"weight":60,"pause":{"duration":10}},{"weight":80,"pause":{"duration":10}},{"weight":100,"pause":{"duration":0}}],"trafficRoutings":[{"service":"echoserver","ingress":{"classType":"nginx","name":"echoserver"}}]}}}}}}
EOF

POST /validate-rollouts-kruise-io-rollout HTTP/2
Host: kruise-rollout-webhook-service.kruise-rollout.svc:9876
user-agent: curl/7.81.0
accept: */*
content-type: application/json
content-length: 813

HTTP/2 200 
content-type: text/plain; charset=utf-8
content-length: 135
date: Wed, 16 Aug 2023 20:20:12 GMT

{"kind":"AdmissionReview","apiVersion":"admission.k8s.io/v1","response":{"uid":"","allowed":true,"status":{"metadata":{},"code":200}}}
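The same probe is repeated further down for ports 10250 and 8443, so it may be convenient to wrap it in a function that takes the port as a parameter. The sketch below is a simplified variant of the command above: it drops `-v`, `--service-name` and the `sed` filter, writes the CA certificate to a temporary file instead of using process substitution, and reads the AdmissionReview JSON from a file. The function name and the payload file argument are illustrative assumptions.

```shell
# probe_webhook PORT PAYLOAD_FILE
# Sends the AdmissionReview in PAYLOAD_FILE to the webhook on 127.0.0.1:PORT.
# Assumes "kubectl port-forward ... PORT" is already running in another console.
probe_webhook() (
  port=$1
  payload=$2
  host=kruise-rollout-webhook-service.kruise-rollout.svc
  ca=$(mktemp)
  trap 'rm -f "$ca"' EXIT
  # Fetch the webhook CA so curl can verify the serving certificate.
  kubectl -n kruise-rollout get secret kruise-rollout-webhook-certs \
    -o jsonpath='{.data.ca-cert\.pem}' | base64 -d > "$ca"
  curl -sS --resolve "$host:$port:127.0.0.1" \
    -H 'Content-Type: application/json' \
    --cacert "$ca" \
    -d @"$payload" \
    "https://$host:$port/validate-rollouts-kruise-io-rollout"
)
```

For example, `probe_webhook 9876 review.json` would correspond to the probe above, where `review.json` is a hypothetical file holding the JSON body.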

I tried to change the webhook port without using vela.

Patching the deployment and service ports from 9876 to the recommended 10250 doesn't help:

ns="kruise-rollout"

kubectl patch deployment -n $ns "${ns}-controller-manager" --type=json -p \
    '[{"op": "replace", "path": "/spec/template/spec/containers/0/ports/0/containerPort", "value": 10250}]'

kubectl patch service -n ${ns} "${ns}-webhook-service" --type=json -p \
    '[{"op": "replace", "path": "/spec/ports/0/targetPort", "value": 10250}]'
kubevela-vela-core-b8c58f8b5-7dzzv kubevela E0816 18:17:31.971150       1 task.go:252] "do steps" err="run step(provider=oam,do=component-apply): Dispatch: pre-dispatch dryrun failed: Found 1 errors. [(cannot create object: Internal error occurred: failed calling webhook \"vrollout.kb.io\": failed to call webhook: Post \"https://kruise-rollout-webhook-service.kruise-rollout.svc:443/validate-rollouts-kruise-io-rollout?timeout=10s\": dial tcp 10.156.5.197:10250: connect: connection refused)]" application="default/canary-demo" controller="application" resource_version="734280907" generation=1 publish_version="v1" step_name="canary-demo" step_type="builtin-apply-component" spanID="i-l9r9jzu8.execute application workflow.t36ft4ewso
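The two log lines fail in different ways: the first one (port 9876) ends in "i/o timeout", the second one (port 10250) in "connection refused". A timeout usually means the packets were dropped on the way, which fits a firewall block, while a refusal usually means the pod was reached but nothing is listening on that port. The latter is consistent with the patch above: `containerPort` in a pod spec is primarily informational, so changing it does not by itself make the process listen on 10250. A rough sketch of that distinction as a helper (the function name is made up):

```shell
# Classify the tail of a "failed to call webhook" error message.
#   timeout -> packets likely dropped (firewall or routing)
#   refused -> pod reachable, but no listener on the dialed port
classify_dial_error() {
  case "$1" in
    *"i/o timeout"*)        echo "timeout" ;;
    *"connection refused"*) echo "refused" ;;
    *)                      echo "unknown" ;;
  esac
}
```

This is a heuristic only; other network setups can produce the same messages for different reasons.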

Probing the deployment's port 10250 (allowed by default) does not work either:

kubectl -n kruise-rollout port-forward deploy/kruise-rollout-controller-manager 10250
$ curl -vsS --resolve kruise-rollout-webhook-service.kruise-rollout.svc:10250:127.0.0.1 \
    -H 'Content-Type: application/json' \
    --service-name kruise-rollout-webhook-service \
    --cacert <(kubectl -n kruise-rollout get secret kruise-rollout-webhook-certs -ojsonpath='{.data.ca-cert\.pem}' | base64 -d) \
    https://kruise-rollout-webhook-service.kruise-rollout.svc:10250/validate-rollouts-kruise-io-rollout 2>&1 -d@- <<'EOF' | sed '/^* /d; /bytes data]$/d; s/> //; s/< //'
{"kind":"AdmissionReview","apiVersion":"admission.k8s.io/v1","request":{"requestKind":{"group":"kruise.io","version":"v1alpha1","kind":"Rollout"},"requestResource":{"group":"kruise.io","version":"v1alpha1","resource":"rollouts"},"name":"rollouts-demo","namespace":"default","operation":"CREATE","object":{"apiVersion":"rollouts.kruise.io/v1alpha1","kind":"Rollout","metadata":{"name":"rollouts-demo"},"spec":{"objectRef":{"workloadRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"echoserver"}},"strategy":{"canary":{"steps":[{"weight":20,"pause":{}},{"weight":40,"pause":{"duration":10}},{"weight":60,"pause":{"duration":10}},{"weight":80,"pause":{"duration":10}},{"weight":100,"pause":{"duration":0}}],"trafficRoutings":[{"service":"echoserver","ingress":{"classType":"nginx","name":"echoserver"}}]}}}}}}
EOF
curl: (35) error:0A000126:SSL routines::unexpected eof while reading

Next I tried the installation method from https://openkruise.io/rollouts/installation, setting the port and the vela object selector:

rollout:
  webhook:
    port: 8443
    objectSelector:
      - key: kruise-rollout.oam.dev/webhook
        operator: Exists
replicaCount: 1
resources: {}

Install the chart.

helm install kruise-rollout openkruise/kruise-rollout --version 0.3.0 -f examples/kubevela/kruise-rollout-values.yaml
kubectl -n kruise-rollout port-forward deploy/kruise-rollout-controller-manager 8443

Probing the deployment's port 8443 (allowed in my firewall) does not work either:

$ curl -vsS --resolve kruise-rollout-webhook-service.kruise-rollout.svc:8443:127.0.0.1 \
    -H 'Content-Type: application/json' \
    --service-name kruise-rollout-webhook-service \
    --cacert <(kubectl -n kruise-rollout get secret kruise-rollout-webhook-certs -ojsonpath='{.data.ca-cert\.pem}' | base64 -d) \
    https://kruise-rollout-webhook-service.kruise-rollout.svc:8443/validate-rollouts-kruise-io-rollout 2>&1 -d@- <<'EOF' | sed '/^* /d; /bytes data]$/d; s/> //; s/< //'
{"kind":"AdmissionReview","apiVersion":"admission.k8s.io/v1","request":{"requestKind":{"group":"kruise.io","version":"v1alpha1","kind":"Rollout"},"requestResource":{"group":"kruise.io","version":"v1alpha1","resource":"rollouts"},"name":"rollouts-demo","namespace":"default","operation":"CREATE","object":{"apiVersion":"rollouts.kruise.io/v1alpha1","kind":"Rollout","metadata":{"name":"rollouts-demo"},"spec":{"objectRef":{"workloadRef":{"apiVersion":"apps/v1","kind":"Deployment","name":"echoserver"}},"strategy":{"canary":{"steps":[{"weight":20,"pause":{}},{"weight":40,"pause":{"duration":10}},{"weight":60,"pause":{"duration":10}},{"weight":80,"pause":{"duration":10}},{"weight":100,"pause":{"duration":0}}],"trafficRoutings":[{"service":"echoserver","ingress":{"classType":"nginx","name":"echoserver"}}]}}}}}}
EOF
curl: (35) error:0A000126:SSL routines::unexpected eof while reading
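To see where a configured port does or does not end up, it can help to print the port as recorded in the Service, the Endpoints object and the container spec side by side. The following is a sketch using the resource names from this issue; the function name is an assumption. None of these fields proves which port the process actually binds to, they only show what the manifests declare.

```shell
# Print the webhook port as declared by the Service, the Endpoints object
# and the container spec, so a mismatch is easy to spot.
show_webhook_ports() {
  ns=kruise-rollout
  echo "service targetPort: $(kubectl -n "$ns" get service kruise-rollout-webhook-service -o jsonpath='{.spec.ports[0].targetPort}')"
  echo "endpoints port: $(kubectl -n "$ns" get endpoints kruise-rollout-webhook-service -o jsonpath='{.subsets[*].ports[*].port}')"
  echo "containerPort: $(kubectl -n "$ns" get deployment kruise-rollout-controller-manager -o jsonpath='{.spec.template.spec.containers[0].ports[*].containerPort}')"
}
```

If all three show the requested port but the probe still fails, the remaining suspect is the port the operator process itself listens on.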

UPD:
Although I have mitigated it 👍 by adding port tcp:9876 to the GKE VPC firewall, this is still a bug in the rollouts chart and the operator: one cannot assign any rollout.webhook.port other than 9876! 👎
My usual strategy is to change the webhook port of all operators to one of the 1-2 ports allowed in my firewall. I don't want to keep more than 3-5 ports in the firewall rules; it becomes complicated and easy to mess up. 💯
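For reference, one way the firewall mitigation could be expressed is a dedicated ingress rule from the control plane range to the nodes. The helper below only prints the `gcloud` command so it can be reviewed first. The function name and all arguments (rule name, network, control plane CIDR, node tag) are placeholders, not values from my cluster.

```shell
# print_webhook_firewall_rule RULE_NAME NETWORK CONTROL_PLANE_CIDR NODE_TAG [PORT]
# Prints (does not run) a gcloud command that allows the GKE control plane
# to reach a webhook port on the nodes. PORT defaults to 9876.
print_webhook_firewall_rule() {
  rule_name=$1
  network=$2
  control_plane_cidr=$3
  node_tag=$4
  port=${5:-9876}
  printf 'gcloud compute firewall-rules create %s --network %s --direction INGRESS --action ALLOW --rules tcp:%s --source-ranges %s --target-tags %s\n' \
    "$rule_name" "$network" "$port" "$control_plane_cidr" "$node_tag"
}
```

For example, `print_webhook_firewall_rule allow-kruise-webhook my-vpc 172.16.0.0/28 gke-my-cluster-node` prints a rule for tcp:9876 with those hypothetical names.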

@denist-huma changed the title from "kruise-rollout-webhook-service is not accesible in GKE" to "kruise-rollout-webhook-service is not accesible in GKE, and webhook port other that 9876 not working" on Aug 16, 2023
@denist-huma changed the title from "kruise-rollout-webhook-service is not accesible in GKE, and webhook port other that 9876 not working" to "kruise-rollout-webhook-service is not accesible in GKE, and any webhook ports other that 9876 aren't working" on Aug 17, 2023
@furykerry furykerry assigned furykerry and zmberg and unassigned furykerry Oct 25, 2023