Service controller doesn't populate TargetGroups #915

Open
jacekn opened this issue May 16, 2024 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@jacekn

jacekn commented May 16, 2024

What happened:

I deployed the controller and configured an NLB-type Service.
The Service was created in AWS with an associated target group, but the target group is empty.
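For reference, the empty target group can be confirmed from the AWS CLI. This is a sketch; the target group ARN is a placeholder for the one the controller created:

# Placeholder ARN; substitute the target group created for this Service
aws elbv2 describe-target-health \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:111111111111:targetgroup/example/0123456789abcdef
# TargetHealthDescriptions comes back as an empty list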

What you expected to happen:

I expected the service-lb-controller to populate the target group with the cluster nodes.

How to reproduce it (as minimally and precisely as possible):

I deployed the controller using a manifest generated like this:

cat << EOF | helm template --values=- aws-cloud-controller-manager aws-cloud-controller-manager/aws-cloud-controller-manager > aws-cloud-controller.yaml
args:
  - --v=2
  - --cloud-provider=aws
  - --cluster-name=mycluster
  - --controllers=service-lb-controller,cloud-node
  - --allocate-node-cidrs=false
  - --configure-cloud-routes=false

image:
  repository: registry.k8s.io/provider-aws/cloud-controller-manager
  tag: v1.30.0
EOF
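(This assumes the cloud-provider-aws chart repository was already added, along the lines of the project docs; a sketch:)

helm repo add aws-cloud-controller-manager https://kubernetes.github.io/cloud-provider-aws
helm repo update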

I used the IAM policy from the docs. I then created the Service object like this:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  name: test-service
  namespace: mynamespace
spec:
  allocateLoadBalancerNodePorts: true
  externalTrafficPolicy: Local
  healthCheckNodePort: 30309
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - appProtocol: http
    name: http
    nodePort: 31199
    port: 80
    protocol: TCP
    targetPort: http
  selector:
    app.kubernetes.io/name: myapp
  sessionAffinity: None
  type: LoadBalancer

Once applied, the load balancer was created together with health checks and target groups. However, the target groups are empty.
I also noticed that security group entries were not added.
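A quick way to see the missing rules (sketch; the security group ID is a placeholder for the worker-node security group):

# Placeholder ID; substitute the node security group
aws ec2 describe-security-groups \
  --group-ids sg-0123456789abcdef0 \
  --query 'SecurityGroups[0].IpPermissions'
# No ingress rules for NodePorts 31199/30309 show up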

Anything else we need to know?:

This used to work with the in-tree controller. We disabled the in-tree provider and moved to the external one, and the Service controller no longer works in the same cluster.
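For context, the switch to the external provider was the standard one; a sketch of the relevant flags (not our exact configuration):

# kubelet on every node:
kubelet --cloud-provider=external ...
# kube-controller-manager, so the in-tree cloud loops stay disabled:
kube-controller-manager --cloud-provider=external ...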

Logs show successful calls to retrieve node details from AWS, for example:

I0516 09:28:28.565710       1 log_handler.go:37] AWS API ValidateResponse: ec2 DescribeInstances &{DescribeInstances POST / 0xc0004fee60 <nil>} {
  InstanceIds: ["i-xyz"]
} 200 OK

I also confirmed with CloudTrail that there are no permission errors with the API calls.
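(A sketch of the kind of CloudTrail query used; RegisterTargets is the call the controller would make to populate the target group:)

aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=RegisterTargets \
  --max-results 20
# None of the returned events carry an errorCode such as AccessDenied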

If I add nodes to the target group manually, they are removed from the target group again.
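Manual registration was done along these lines (the ARN and instance ID are placeholders):

aws elbv2 register-targets \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:111111111111:targetgroup/example/0123456789abcdef \
  --targets Id=i-0123456789abcdef0
# Shortly afterwards the instance is deregistered again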

Environment:

  • Kubernetes version (use kubectl version):
Client Version: v1.28.9
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.9
  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Ubuntu 20.04.6 LTS
  • Kernel (e.g. uname -a): 5.15.0-1058-aws #64~20.04.1-Ubuntu SMP
  • Install tools: kubeadm
  • Others:

/kind bug

k8s-ci-robot added the kind/bug label May 16, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the needs-triage label May 16, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Aug 14, 2024