client-go memory leak #102718

Closed · xiaoanyunfei opened this issue Jun 9, 2021 · 21 comments

Labels: help wanted · kind/bug · lifecycle/rotten · sig/api-machinery · triage/accepted

@xiaoanyunfei (Contributor) commented Jun 9, 2021

What happened:

We use a client-go informer factory to handle pod information:

informerFactory.Core().V1().Pods().Informer().AddEventHandler(
	cache.ResourceEventHandlerFuncs{
		AddFunc:    addPodToCache,
		UpdateFunc: updatePodInCache,
		DeleteFunc: deletePodFromCache,
	})

cache := make(map[string]*v1.Pod)

// addPodToCache eventually calls AddPod
func AddPod(pod *v1.Pod) {
	key, _ := framework.GetPodKey(pod)
	cache[key] = pod
}

// deletePodFromCache eventually calls RemovePod
func RemovePod(pod *v1.Pod) error {
	key, _ := framework.GetPodKey(pod)
	delete(cache, key)
	return nil
}
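
For completeness, a self-contained sketch of roughly this setup (assumptions: in-cluster config, a mutex-guarded plain map, and cache.MetaNamespaceKeyFunc standing in for framework.GetPodKey; the handler bodies are illustrative, not our actual code):

package main

import (
	"sync"

	v1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)
	factory := informers.NewSharedInformerFactory(client, 0)

	var mu sync.Mutex
	podCache := make(map[string]*v1.Pod)

	factory.Core().V1().Pods().Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			pod := obj.(*v1.Pod)
			key, _ := cache.MetaNamespaceKeyFunc(pod)
			mu.Lock()
			podCache[key] = pod
			mu.Unlock()
		},
		UpdateFunc: func(_, newObj interface{}) {
			pod := newObj.(*v1.Pod)
			key, _ := cache.MetaNamespaceKeyFunc(pod)
			mu.Lock()
			podCache[key] = pod
			mu.Unlock()
		},
		DeleteFunc: func(obj interface{}) {
			// DeleteFunc may receive a cache.DeletedFinalStateUnknown tombstone,
			// so this sketch only handles the plain *v1.Pod case.
			if pod, ok := obj.(*v1.Pod); ok {
				key, _ := cache.MetaNamespaceKeyFunc(pod)
				mu.Lock()
				delete(podCache, key)
				mu.Unlock()
			}
		},
	})

	stopCh := make(chan struct{})
	defer close(stopCh)
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	select {} // block; the handlers keep podCache up to date
}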

But we recently came across a memory leak problem.

I added some logging and finally found the cause.

Our Kubernetes version is 1.17 with more than 5500 nodes, but the default-watch-cache-size of kube-apiserver is 100, which is too small for our cluster.
(The latest Kubernetes already uses a dynamically sized watch cache, see #90058.)

Reflector.ListAndWatch

The Reflector's List call takes more than 35 seconds, and the Reflector then uses the last resourceVersion from that List to call Watch. The Watch fails with "too old resource version: 6214379869 (6214383056)" because that resourceVersion has already been evicted from the kube-apiserver's watch cache (default-watch-cache-size).

Reflector.Run therefore keeps cycling: List -> Watch -> "too old resource version" error -> List again.

This is where the memory leak appears. We use the cache map to store pod information, so over time the cache holds pods that belong to different PodLists, and those retained pointers prevent Go from garbage-collecting the whole PodList objects.

The code causing the memory leak is meta.EachListItem:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apimachinery/pkg/api/meta/help.go#L115

Replacing found = append(found, item) with found = append(found, item.DeepCopyObject()) can fix the problem:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/cache/reflector.go#L453
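
To illustrate the retention effect in isolation, here is a standalone Go sketch with made-up sizes (no Kubernetes types involved): holding a pointer to a single slice element keeps the slice's entire backing array reachable, while holding a copy of the element does not.

package main

import (
	"fmt"
	"runtime"
)

type item struct{ payload [1 << 20]byte } // ~1 MiB per element

func pinned() *item {
	list := make([]item, 64) // one ~64 MiB backing array
	return &list[0]          // retaining this pointer keeps all 64 MiB alive
}

func copied() *item {
	list := make([]item, 64)
	c := list[0] // copy the element out of the backing array
	return &c    // only this ~1 MiB copy stays reachable; the array can be collected
}

func main() {
	p := pinned()
	c := copied()
	runtime.GC()

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	// Roughly 64 MiB is still live because of p; dropping p (or deep-copying,
	// as suggested above) lets the whole backing array go away.
	fmt.Println(p != nil, c != nil, m.HeapAlloc>>20, "MiB live (approximate)")
}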

What you expected to happen:

Calling DeepCopyObject has some memory overhead; I wonder whether there is a better solution for this.
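
One consumer-side mitigation that needs no apimachinery change (a sketch under the assumption that we control the handlers; names like podCache are illustrative): deep-copy the pod before putting it into the long-lived map, so the map no longer pins the PodList backing arrays produced by each relist.

package main

import (
	"fmt"
	"sync"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/tools/cache"
)

var (
	mu       sync.Mutex
	podCache = make(map[string]*v1.Pod)
)

// addPodToCache stores a deep copy instead of the pointer handed out by the
// informer, at the cost of one extra allocation per event.
func addPodToCache(obj interface{}) {
	pod, ok := obj.(*v1.Pod)
	if !ok {
		return
	}
	key, _ := cache.MetaNamespaceKeyFunc(pod)
	mu.Lock()
	podCache[key] = pod.DeepCopy() // do not retain a pointer into the relisted PodList
	mu.Unlock()
}

func main() {
	addPodToCache(&v1.Pod{ObjectMeta: metav1.ObjectMeta{Namespace: "default", Name: "example"}})
	fmt.Println(len(podCache)) // 1
}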

How to reproduce it (as minimally and precisely as possible):

package main

import (
        "fmt"
        goruntime "runtime"
        "time"

        v1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/api/meta"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/runtime"
)

func main() {
        leak()
}

func unleak() {
        store := make(map[string]*v1.Pod)
        var index int
        for {
                plist := generatePod(index)
                for _, pod := range plist.Items {
                        pod := pod
                        store[pod.Name] = &pod
                }
                time.Sleep(time.Second * 2)
                index++
                goruntime.GC()
                fmt.Println("==unleak==", index, len(store))
        }
}

func leak() {
        store := make(map[string]*v1.Pod)
        var index int
        var items []runtime.Object
        for {
                items = items[:0]
                plist := generatePod(index)
                meta.EachListItem(plist, func(obj runtime.Object) error {
                        items = append(items, obj)
                        return nil
                })
                for _, item := range items {
                        pod := item.(*v1.Pod)
                        store[pod.Name] = pod
                }
                time.Sleep(time.Second * 2)
                index++
                goruntime.GC()
                fmt.Println("==leak==", index, len(store))
        }
}

func generatePod(num int) *v1.PodList {
        var plist v1.PodList
        for i := 0; i < 100000-num; i++ {
                pod := v1.Pod{
                        ObjectMeta: metav1.ObjectMeta{
                                Name: fmt.Sprintf("pod-%d", i),
                        },
                }
                plist.Items = append(plist.Items, pod)
        }
        return &plist
}

[image: leak]

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Network plugin and version (if this is a network-related bug):
  • Others:
@xiaoanyunfei xiaoanyunfei added the kind/bug Categorizes issue or PR as related to a bug. label Jun 9, 2021
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 9, 2021
@xiaoanyunfei (Contributor, Author)

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jun 9, 2021
@fedebongio (Contributor)

/help
/triage accepted

@k8s-ci-robot (Contributor)

@fedebongio:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help
/triage accepted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jun 10, 2021
@novahe (Contributor) commented Jun 12, 2021

I think this is the same problem. Issue: #102564, PR: #102565.

@novahe (Contributor) commented Jun 12, 2021

I formatted the code used for reproduction:

package main

import (
	"fmt"
	goruntime "runtime"
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	leak()
}

func unleak() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for _, pod := range plist.Items {
			pod := pod // copy before taking the address (pre-Go 1.22 loop variable semantics)
			store[pod.Name] = &pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleak==", index, len(store))
	}
}

func leak() {
	store := make(map[string]*v1.Pod)
	var index int
	var items []runtime.Object
	for {
		items = items[:0]
		plist := generatePod(index)
		meta.EachListItem(plist, func(obj runtime.Object) error {
			items = append(items, obj)
			return nil
		})
		for _, item := range items {
			pod := item.(*v1.Pod)
			store[pod.Name] = pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==leak==", index, len(store))
	}
}

func generatePod(num int) *v1.PodList {
	var plist v1.PodList
	for i := 0; i < 100000-num; i++ {
		pod := v1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name: fmt.Sprintf("pod-%d", i),
			},
		}
		plist.Items = append(plist.Items, pod)
	}
	return &plist
}

@249043822 (Member)

Cc @jpbetz @wojtek-t

@wojtek-t (Member)

Re the code above - the leak is not coming from the ForEach, but from the fact that you're not really releasing all the references here:

items = items[:0]

Going back to the original comment, I don't think we have a problem with the ForEach function:

100 which is too small for our cluster.

You should adjust the watch cache size - 100 items in a 5k-node cluster is certainly not enough. There was a function that adjusts it based on the number of nodes, but it was gated on some flag (apiserver-ram-mb or something like that), which I bet you don't set.

Then a memory leak occurred. We use cache map to store pod information, and the cache will contain pods in different PodList. This will prevent golang from gc the whole PodList

I don't understand this comment. If you're storing the data in a cache, then it's expected that it won't be GC-ed. That's the whole point of storing it in a cache, right?
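
A standalone sketch of that point (illustrative sizes): re-slicing with items = items[:0] keeps the old backing array, and any pointers still sitting in that array stay reachable until they are overwritten, nilled out, or the slice itself is dropped.

package main

import "fmt"

func main() {
	items := make([]*[1 << 20]byte, 0, 8)
	for i := 0; i < 8; i++ {
		items = append(items, new([1 << 20]byte)) // ~8 MiB referenced in total
	}

	items = items[:0] // length is 0, but the backing array still holds all 8 pointers

	// To actually release the references, nil out the elements first...
	full := items[:cap(items)]
	for i := range full {
		full[i] = nil
	}
	// ...or simply start over with a fresh slice: items = nil

	fmt.Println(len(items), cap(items)) // 0 8
}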

@sxllwx (Member) commented Jun 14, 2021

The leak here is caused by Go's slice GC behavior: there is no leak in the for...range version because v in for _, v := range slice is a copy. If the unleak function is changed to the following code (equivalent to the reflect-based pattern used in meta.EachListItem), the process's RES value grows monotonically. I suggest adding DeepCopy to meta.EachListItem to mitigate this.

func unleak() {
	store := make(map[string]*v1.Pod)
	var index int
	for {
		plist := generatePod(index)
		for i := 0; i < len(plist.Items); i++  {
			store[plist.Items[i].GetName()] = &plist.Items[i]
			//store[pod.Name] = &pod
		}
		time.Sleep(time.Second * 2)
		index++
		goruntime.GC()
		fmt.Println("==unleak==", index, len(store))
	}
}

@wojtek-t PTAL

@xiaoanyunfei (Contributor, Author) commented Jun 15, 2021

@sxllwx is right: the leak here is caused by Go's slice GC behavior.
Once we take the address of a slice element, the whole slice cannot be garbage-collected, and meta.EachListItem takes the address of each slice element.
100000 pods should not use 6.9 GB of memory, only about 300-500 MB.

I added a new test case in which the store caches pods from lists with different Generation values.

package main

import (
        "fmt"
        goruntime "runtime"
        "time"

        v1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
        leak()
}

func leak() {
        store := make(map[string]*v1.Pod)
        var index int
        for {
                plist := generatePod(index)
                for index, pod := range plist {
                        store[pod.Name] = &plist[index]
                }
                time.Sleep(time.Second * 2)
                index++
                goruntime.GC()
                fmt.Println("==leak==", index, len(store))
        }
}

func generatePod(num int) []v1.Pod {
        var items []v1.Pod
        for i := 0; i < 100000-num; i++ {
                pod := v1.Pod{
                        ObjectMeta: metav1.ObjectMeta{
                                Name:       fmt.Sprintf("pod-%d", i),
                                Generation: int64(num),
                        },
                }
                items = append(items, pod)
        }
        return items
}

@sxllwx (Member) commented Jun 15, 2021

(quoting @xiaoanyunfei's previous comment in full)

@xiaoanyunfei

There is already a PR working on a similar issue; see the discussion at #102565 (comment). PTAL.

@wojtek-t (Member)

OK - that makes sense to me. Sorry for the confusion, I was looking too much at #102565 (comment), and I still think there is no bug in that PR, because we're not taking the address of a slice item there, right? The runtime.Objects are already pointers, right?

That said, the ExtractList call that happens before it has this problem:

func ExtractList(obj runtime.Object) ([]runtime.Object, error) {

@liggitt @deads2k - FYI

@sxllwx (Member) commented Jun 15, 2021

After testing, only two kinds of operation affect GC this way:

  • store[plist.Items[i].GetName()] = &plist.Items[i]
  • items.Index(i).Addr().Interface()

The two functions with this kind of operation are meta.ExtractList and meta.EachListItem.


Note that corev1.Pod's DeepCopyObject method has a pointer receiver.

But in the PodList obtained from kube-apiserver, the Items field is a slice of non-pointer Pod structs:

type PodList struct {
	metav1.TypeMeta `json:",inline"`
	// Standard list metadata.
	// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
	// +optional
	metav1.ListMeta `json:"metadata,omitempty" protobuf:"bytes,1,opt,name=metadata"`
	// List of pods.
	// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md
	Items []Pod `json:"items" protobuf:"bytes,2,rep,name=items"`
}

So the branch

	if list[i], found = raw.Addr().Interface().(runtime.Object); !found {

is taken every time.
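
A standalone sketch of these two observations (it only needs the k8s.io/api and k8s.io/apimachinery modules): a Pod value does not satisfy runtime.Object, its pointer does, and the pointer produced by ExtractList-style code points straight into the list's backing array.

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/runtime"
)

func main() {
	list := v1.PodList{Items: make([]v1.Pod, 3)}

	// Only *v1.Pod implements runtime.Object, because DeepCopyObject has a
	// pointer receiver; the value type would not compile here.
	var obj runtime.Object = &list.Items[0]
	// var obj runtime.Object = list.Items[0] // compile error: Pod does not implement runtime.Object

	// The extracted object is literally a pointer into list.Items, so keeping
	// it alive keeps the whole backing array (and hence the PodList) alive.
	fmt.Println(obj.(*v1.Pod) == &list.Items[0]) // true
}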

@sxllwx (Member) commented Jun 27, 2021

@liggitt @deads2k Is there anything I can do to help this move forward?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 25, 2021
@yuvaldolev

/assign

@yuvaldolev

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 18, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the rules listed above.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 15, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot (Contributor)

@k8s-triage-robot: Closing this issue.

In response to this:

(quoting the triage robot's /close comment above in full)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xigang (Contributor) commented Sep 8, 2022

@deads2k @wojtek-t @sxllwx we also had a kube-controller-manager memory leak, suspected to be related to client-go.
Our cluster has 5600 nodes, and kube-controller-manager's memory usage reached 197.5 GiB:

[screenshot: kube-controller-manager memory usage]

After restarting kube-controller-manager, memory usage drops to only 15 GiB.

Kubernetes version: v1.17.4
