service IPs and ports are not released when deleting a service via a finalizer-removing update #87603

Closed
chrischdi opened this issue Jan 28, 2020 · 27 comments · Fixed by #96684
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. sig/network Categorizes an issue or PR as relevant to SIG Network.

Comments

@chrischdi
Member

chrischdi commented Jan 28, 2020

What happened:

  • Created a service including a finalizer
  • Triggered deletion of the service
  • Removed finalizer
  • Service got deleted by apiserver (not visible anymore via kubectl)
  • Tried to create service again
  • Creation got denied: The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
    The apiserver logs the following lines prior to this happening:
    E0128 08:15:36.920788       1 repair.go:145] the node port 30003 for service foo/default is not allocated; repairing
    E0128 08:15:36.920837       1 repair.go:237] the cluster IP 10.0.0.81 for service foo/default is not allocated; repairing
    
  • After about 10 minutes I'm able to create the service again; the apiserver shows the following log lines when it repairs it:
    E0128 08:28:51.429642       1 repair.go:184] the node port 30003 appears to have leaked: cleaning up
    E0128 08:28:51.436350       1 repair.go:311] the cluster IP 10.0.0.81 appears to have leaked: cleaning up
    

What you expected to happen:

  • The service can be created again a few seconds after deletion

How to reproduce it (as minimally and precisely as possible):

cd $(mktemp -d)
mkdir etcd
docker run -d -p 2379:2379 --name=kube-etcd -v $(pwd)/etcd:/tmp/ --rm k8s.gcr.io/etcd:3.3.15 /usr/local/bin/etcd --data-dir /tmp/etcd --advertise-client-urls=http://0.0.0.0:2379 --listen-client-urls=http://0.0.0.0:2379
docker run -d --net=host --name=kube-apiserver --rm k8s.gcr.io/kube-apiserver:v1.17.2 kube-apiserver --etcd-servers http://127.0.0.1:2379 --insecure-port 8080 --authorization-mode=RBAC

export KUBECONFIG=$(pwd)/kubeconfig
touch $KUBECONFIG
kubectl config set-cluster etcd-local --server=http://localhost:8080
kubectl config set-context etcd-local --cluster=etcd-local
kubectl config use-context etcd-local

cat <<EOF > service.yaml
apiVersion: v1
kind: Service
metadata:
  name: foo
  finalizers:
  - foo.bar/some-finalizer
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080
    nodePort: 30003
  selector:
    app: kuard
  type: NodePort

EOF

for i in {1..200}; do
  echo "[$(date +%Y-%m-%d-%H:%M:%S)] # $i"
  kubectl apply -f service.yaml
  kubectl delete svc foo --wait=false
  sleep 1
  kubectl patch svc foo --type='json' -p='[{"op":"remove","path":"/metadata/finalizers"}]'
  kubectl delete svc foo --ignore-not-found
  sleep 1
done

Example output:

[2020-01-28-08:55:25] # 1                                                                        
service/foo unchanged                                                                                         
service "foo" deleted                                 
service/foo patched                                                      
...
[2020-01-28-08:58:21] # 77
service/foo created
service "foo" deleted
service/foo patched
[2020-01-28-08:58:23] # 78
service/foo created
service "foo" deleted
service/foo patched
[2020-01-28-08:58:26] # 79
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-08:58:28] # 80
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-08:58:30] # 81
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
...
[2020-01-28-09:07:23] # 5
The Service "foo" is invalid: spec.ports[0].nodePort: Invalid value: 30003: provided port is already allocated
Error from server (NotFound): services "foo" not found
Error from server (NotFound): services "foo" not found
[2020-01-28-09:07:26] # 6
service/foo created
service "foo" deleted
service/foo patched

Anything else we need to know?:

  • This does not always happen; it is flaky.

  • The problem also gets auto-resolved by the apiserver after some time (but this can take about 10 minutes):
    E0128 08:01:24.562044 1 repair.go:300] the cluster IP 10.0.0.215 may have leaked: flagging for later clean up

  • Background for us: we want to run a custom controller for services of type LoadBalancer and want to use a finalizer. We occasionally hit this issue during development.

Environment:

  • Kubernetes version (use kubectl version):
    $ kubectl version
    Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.6", GitCommit:"72c30166b2105cd7d3350f2c28a219e6abcd79eb", GitTreeState:"clean", BuildDate:"2020-01-18T23:31:31Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:22:30Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}
    
  • Cloud provider or hardware configuration: none / locally reproducible / all are affected
  • OS (e.g: cat /etc/os-release):
    $ cat /etc/os-release
    NAME="Ubuntu"
    VERSION="18.04.3 LTS (Bionic Beaver)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 18.04.3 LTS"
    VERSION_ID="18.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=bionic
    UBUNTU_CODENAME=bionic
    
  • Kernel (e.g. uname -a):
    $ uname -a
    Linux 5.3.0-26-generic #28~18.04.1-Ubuntu SMP Wed Dec 18 16:40:14 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@chrischdi chrischdi added the kind/bug Categorizes issue or PR as related to a bug. label Jan 28, 2020
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jan 28, 2020
@chrischdi
Member Author

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 28, 2020
@chrischdi
Member Author

chrischdi commented Jan 28, 2020

Maybe also helpful: the output of kubectl get events:

$ k get events -o wide
LAST SEEN   TYPE      REASON                  OBJECT        SUBOBJECT   SOURCE                            MESSAGE                                             FIRST SEEN   COUNT   NAME
42m         Warning   ClusterIPNotAllocated   service/foo               ipallocator-repair-controller     Cluster IP 10.0.0.100 is not allocated; repairing   42m          1       foo.15edfdd270043df5
42m         Warning   PortNotAllocated        service/foo               portallocator-repair-controller   Port 30003 is not allocated; repairing              42m          1       foo.15edfdd270314403
39m         Warning   ClusterIPNotAllocated   service/foo               ipallocator-repair-controller     Cluster IP 10.0.0.215 is not allocated; repairing   39m          1       foo.15edfdfc5961dbfc
39m         Warning   PortNotAllocated        service/foo               portallocator-repair-controller   Port 30003 is not allocated; repairing              39m          1       foo.15edfdfc59620b05
22m         Warning   PortNotAllocated        service/foo               portallocator-repair-controller   Port 30003 is not allocated; repairing              22m          1       foo.15edfeecb777d9c1
22m         Warning   ClusterIPNotAllocated   service/foo               ipallocator-repair-controller     Cluster IP 10.0.0.81 is not allocated; repairing    22m          1       foo.15edfeecb778ae2c
6m24s       Warning   PortNotAllocated        service/foo               portallocator-repair-controller   Port 30003 is not allocated; repairing              6m24s        1       foo.15edffcf9d0dae46
6m24s       Warning   ClusterIPNotAllocated   service/foo               ipallocator-repair-controller     Cluster IP 10.0.0.136 is not allocated; repairing   6m24s        1       foo.15edffcf9d9e2f30

When these events are emitted, the problem occurs - so the repair loop seems to be what breaks things in this case.
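
For context on why the self-heal takes roughly 10 minutes: the allocator repair loops only release an allocation after it has looked leaked for several consecutive runs. Below is a rough, self-contained Go sketch of that behaviour; the threshold, interval, and all names here are assumptions inferred from the log messages above, not the actual kube-apiserver code.

package main

import "fmt"

// Illustrative sketch of the leak handling in the allocator repair loops.
const repairsBeforeLeakCleanup = 3 // assumed: freed only after several consecutive repair runs (~3 min apart)

var leaked = map[string]int{} // allocation -> consecutive repair runs with no owning Service

func runRepair(allocated []string, inUse map[string]bool) {
    for _, ip := range allocated {
        if inUse[ip] {
            delete(leaked, ip) // still referenced by a Service: not a leak
            continue
        }
        leaked[ip]++
        if leaked[ip] >= repairsBeforeLeakCleanup {
            fmt.Printf("the cluster IP %s appears to have leaked: cleaning up\n", ip)
            delete(leaked, ip) // release the allocation so it can be reused
        } else {
            fmt.Printf("the cluster IP %s may have leaked: flagging for later clean up\n", ip)
        }
    }
}

func main() {
    // The IP stays allocated after the finalizer-removing delete even though the
    // Service object is gone, so several repair runs pass before it is released.
    for run := 0; run < 3; run++ {
        runRepair([]string{"10.0.0.81"}, map[string]bool{})
    }
}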

@liggitt
Member

liggitt commented Jan 28, 2020

The service registry overrides the Delete implementation to free allocated IPs/ports after a Delete API request:

func (rs *REST) Delete(ctx context.Context, id string, deleteValidation rest.ValidateObjectFunc, options *metav1.DeleteOptions) (runtime.Object, bool, error) {
    // TODO: handle graceful
    obj, _, err := rs.services.Delete(ctx, id, deleteValidation, options)
    if err != nil {
        return nil, false, err
    }
    svc := obj.(*api.Service)
    // Only perform the cleanup if this is a non-dryrun deletion
    if !dryrun.IsDryRun(options.DryRun) {
        // TODO: can leave dangling endpoints, and potentially return incorrect
        // endpoints if a new service is created with the same name
        _, _, err = rs.endpoints.Delete(ctx, id, rest.ValidateAllObjectFunc, &metav1.DeleteOptions{})
        if err != nil && !errors.IsNotFound(err) {
            return nil, false, err
        }
        rs.releaseAllocatedResources(svc)
    }

but it does not implement an AfterDelete hook in its strategy, so it does not participate in the object deletion that happens when an Update API request removes the last finalizer from a service that already has a deletionTimestamp set:

    // Check the default delete-during-update conditions, and store-specific conditions if provided
    if ShouldDeleteDuringUpdate(ctx, key, obj, existing) &&
        (e.ShouldDeleteDuringUpdate == nil || e.ShouldDeleteDuringUpdate(ctx, key, obj, existing)) {
        deleteObj = obj
        return nil, nil, errEmptiedFinalizers
    }
    ttl, err := e.calculateTTL(obj, res.TTL, true)
    if err != nil {
        return nil, nil, err
    }
    if int64(ttl) != res.TTL {
        return obj, &ttl, nil
    }
    return obj, nil, nil
}, dryrun.IsDryRun(options.DryRun))

if err != nil {
    // delete the object
    if err == errEmptiedFinalizers {
        return e.deleteWithoutFinalizers(ctx, name, key, deleteObj, storagePreconditions, dryrun.IsDryRun(options.DryRun))
    }

// deleteWithoutFinalizers handles deleting an object ignoring its finalizer list.
// Used for objects that are either been finalized or have never initialized.
func (e *Store) deleteWithoutFinalizers(ctx context.Context, name, key string, obj runtime.Object, preconditions *storage.Preconditions, dryRun bool) (runtime.Object, bool, error) {
    out := e.NewFunc()
    klog.V(6).Infof("going to delete %s from registry, triggered by update", name)
    // Using the rest.ValidateAllObjectFunc because the request is an UPDATE request and has already passed the admission for the UPDATE verb.
    if err := e.Storage.Delete(ctx, key, out, preconditions, rest.ValidateAllObjectFunc, dryRun); err != nil {
        // Deletion is racy, i.e., there could be multiple update
        // requests to remove all finalizers from the object, so we
        // ignore the NotFound error.
        if storage.IsNotFound(err) {
            _, err := e.finalizeDelete(ctx, obj, true)
            // clients are expecting an updated object if a PUT succeeded,
            // but finalizeDelete returns a metav1.Status, so return
            // the object in the request instead.
            return obj, false, err
        }
        return nil, false, storeerr.InterpretDeleteError(err, e.qualifiedResourceFromContext(ctx), name)
    }
    _, err := e.finalizeDelete(ctx, out, true)

// finalizeDelete runs the Store's AfterDelete hook if runHooks is set and
// returns the decorated deleted object if appropriate.
func (e *Store) finalizeDelete(ctx context.Context, obj runtime.Object, runHooks bool) (runtime.Object, error) {
    if runHooks && e.AfterDelete != nil {
        if err := e.AfterDelete(obj); err != nil {
            return nil, err
        }
    }
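
As a minimal, illustrative sketch (not the actual change that later landed in #96684), a hook with the AfterDelete signature shown above could release the allocations the same way the Delete override does. serviceStore is a placeholder name and releaseAllocatedResources is the existing helper from the first excerpt:

// Illustrative only: release the cluster IP and node ports when the object is
// deleted via a finalizer-removing update, mirroring the Delete override above.
serviceStore.AfterDelete = func(obj runtime.Object) error {
    svc, ok := obj.(*api.Service)
    if !ok {
        return fmt.Errorf("unexpected object type %T", obj)
    }
    rs.releaseAllocatedResources(svc)
    return nil
}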

/sig network
/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Jan 28, 2020
@k8s-ci-robot
Contributor

@liggitt: Those labels are not set on the issue: sig/api-machinery

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@athenabot

/triage unresolved

Comment /remove-triage unresolved when the issue is assessed and confirmed.

🤖 I am a bot run by vllry. 👩‍🔬

@k8s-ci-robot k8s-ci-robot added the triage/unresolved Indicates an issue that can not or will not be resolved. label Jan 28, 2020
@liggitt liggitt changed the title kube-apiserver sometimes fails to cleanup services when using finalizers and nodePorts service IPs and ports are not released when deleting a service via a finalizer-removing update Jan 28, 2020
@danwinship
Contributor

/remove-triage unresolved

@k8s-ci-robot k8s-ci-robot removed the triage/unresolved Indicates an issue that can not or will not be resolved. label Feb 6, 2020
@thockin thockin added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Apr 2, 2020
@thockin
Member

thockin commented Apr 2, 2020

@MrHohn this one seems important.

@MrHohn
Member

MrHohn commented Apr 2, 2020

Thanks for the great analysis. My understanding is that we need to implement an AfterDelete hook for Service - I will take a stab :)
/assign

@sparkoo

sparkoo commented Apr 23, 2020

@MrHohn hello, any updates on this?

@MrHohn
Member

MrHohn commented Apr 23, 2020

@sparkoo Sorry for the delay, I implemented a fix locally and am working on a test at the moment. Looking to have a PR out this week.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 22, 2020
@BenTheElder
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 28, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 26, 2020
@aojea
Member

aojea commented Oct 26, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 26, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 24, 2021
@chrischdi
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 24, 2021
@thockin
Member

thockin commented Mar 4, 2021

We need to revisit this soon

@tomerleib

tomerleib commented Apr 11, 2021

I've seen this as well:

  • K8s 1.17.6
  • Calico
  • Ingress-Nginx 0.45 (chart 3.29)

After deleting the chart from the system, I'm left with the controller service intact.
The events for this service show the same output as above:

11m         Normal    EnsuringLoadBalancer      service/nginx-ingress-ingress-nginx-controller                                   Ensuring load balancer
11m         Normal    EnsuredLoadBalancer       service/nginx-ingress-ingress-nginx-controller                                   Ensured load balancer
3m19s       Normal    DeletingLoadBalancer      service/nginx-ingress-ingress-nginx-controller                                   Deleting load balancer
3m18s       Warning   FailedToCreateEndpoint    endpoints/nginx-ingress-ingress-nginx-controller                                 Failed to create endpoint for service default/nginx-ingress-ingress-nginx-controller: endpoints "nginx-ingress-ingress-nginx-controller" already exists
2m37s       Warning   PortNotAllocated          service/nginx-ingress-ingress-nginx-controller                                   Port 31861 is not allocated; repairing
2m37s       Warning   PortNotAllocated          service/nginx-ingress-ingress-nginx-controller                                   Port 32175 is not allocated; repairing
2m37s       Warning   ClusterIPNotAllocated     service/nginx-ingress-ingress-nginx-controller                                   Cluster IP 10.233.15.55 is not allocated; repairing

@lkoniecz

lkoniecz commented Jun 12, 2021

Any updates on this?
I just got hit by it; a service got stuck in deletion:

  ----     ------                 ----                ----                             -------
  Warning  PortNotAllocated       21m (x2 over 16h)   portallocator-repair-controller  Port 31704 is not allocated; repairing
  Warning  PortNotAllocated       21m (x2 over 16h)   portallocator-repair-controller  Port 30651 is not allocated; repairing
  Warning  PortNotAllocated       21m (x2 over 16h)   portallocator-repair-controller  Port 32757 is not allocated; repairing
  Warning  PortNotAllocated       21m (x2 over 16h)   portallocator-repair-controller  Port 31217 is not allocated; repairing
  Warning  PortNotAllocated       21m (x2 over 16h)   portallocator-repair-controller  Port 31105 is not allocated; repairing
  Normal   EnsuringLoadBalancer   10m (x236 over 8d)  service-controller               Ensuring load balancer
  Warning  PortNotAllocated       10m (x10 over 18h)  portallocator-repair-controller  Port 30651 is not allocated; repairing
  Normal   Type                   5m26s               service-controller               LoadBalancer -> NodePort
  Warning  PortNotAllocated       67s (x12 over 18h)  portallocator-repair-controller  Port 32757 is not allocated; repairing
  Warning  PortNotAllocated       67s (x12 over 18h)  portallocator-repair-controller  Port 31217 is not allocated; repairing
  Warning  PortNotAllocated       67s (x12 over 18h)  portallocator-repair-controller  Port 31105 is not allocated; repairing
  Warning  PortNotAllocated       67s (x12 over 18h)  portallocator-repair-controller  Port 31704 is not allocated; repairing
  Warning  ClusterIPNotAllocated  53s (x14 over 18h)  ipallocator-repair-controller    Cluster IP 172.20.39.20 is not allocated; repairing

@rvillane

I was impacted by this issue today as well; I got stuck trying to delete a service in a Kubernetes 1.18 cluster:

Warning ClusterIPNotAllocated service/myservice-stage-internal Cluster IP 10.32.19.1 is not allocated; repairing

@lkoniecz

lkoniecz commented Jun 16, 2021

For those who are only interested in deleting the service:

kubectl delete svc <your_service>
kubectl patch service/<your_service> --type json --patch='[ { "op": "remove", "path": "/metadata/finalizers" } ]'
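
For reference, the same patch issued programmatically with client-go (a minimal sketch: removeServiceFinalizers is a hypothetical helper, clientset construction and retries are omitted, and it assumes a client-go release with a context-aware Patch, i.e. v0.18+):

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/types"
    "k8s.io/client-go/kubernetes"
)

// removeServiceFinalizers sends the same JSON patch as the kubectl command above,
// clearing all finalizers so the pending delete can complete.
func removeServiceFinalizers(ctx context.Context, c kubernetes.Interface, namespace, name string) error {
    patch := []byte(`[{"op":"remove","path":"/metadata/finalizers"}]`)
    _, err := c.CoreV1().Services(namespace).Patch(ctx, name, types.JSONPatchType, patch, metav1.PatchOptions{})
    return err
}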

@aojea
Member

aojea commented Jun 16, 2021

For those who are only interested in deleting the service

This bug is a "temporary" problem when using finalizers on services, but it doesn't cause the service to get stuck on deletion. If you have to remove a finalizer manually, you should check which controller is supposed to remove it and why it is failing to do so.

@aojea
Member

aojea commented Jun 17, 2021

heh, I managed to reproduce it in #102955
the key to making it deterministic was to wait for the repair loop; it is hardcoded to 3 mins ...

@aojea
Member

aojea commented Jun 17, 2021

/assign

@aojea
Member

aojea commented Jul 1, 2021

/unassign
/assign @thockin
/milestone 1.23
This will be fixed as part of Tim's PR #96684,
but not in 1.22 for sure, sorry

@k8s-ci-robot k8s-ci-robot assigned thockin and unassigned aojea Jul 1, 2021
@k8s-ci-robot
Contributor

@aojea: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@collinjlesko

Thought I would add to this:

Having the exact same issue with v1.18.16 on EKS. Initially, namespaces would be stuck in a "Terminating" state. Further research led me to find that services (load balancers) in Kubernetes were hanging around or would persist for 10+ minutes. If you left one alone for a while, it would go away... eventually, although this deletion is normally instant. When running describe, I noticed it would initially say:

Normal DeletingLoadBalancer 102s service-controller Deleting load balancer

Followed by the below about 60 seconds later:

Warning PortNotAllocated 19s portallocator-repair-controller Port 31015 is not allocated; repairing
Warning ClusterIPNotAllocated 19s ipallocator-repair-controller Cluster IP 172.20.131.10 is not allocated; repairing

Then the service would just sit there... until it got deleted ~10 minutes later.

@lkoniecz's solution of:

kubectl patch service/<your_service> --type json --patch='[ { "op": "remove", "path": "/metadata/finalizers" } ]'

works perfectly in the meantime, but we're going to have to overhaul a lot of automation.

Going to downgrade to 1.17, as the issue seems to appear in any version above 1.17.
