
concurrent.Session closed immediately after creation #13675

Closed
Tracked by #13894
liberize opened this issue Feb 9, 2022 · 4 comments
liberize commented Feb 9, 2022

What happened?

I have an etcd cluster of 3 nodes. Each node was started like this:

/opt/etcd/etcd \
    --name node-1 \
    --data-dir /data/etcd/node-1.etcd \
    --listen-client-urls http://172.25.21.36:2379 \
    --advertise-client-urls http://172.25.21.36:2379 \
    --initial-advertise-peer-urls http://172.25.21.36:2380 \
    --listen-peer-urls http://172.25.21.36:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster node-0=http://172.25.23.46:2380,node-1=http://172.25.21.36:2380,node-2=http://172.25.23.15:2380 \
    --initial-cluster-state new

Then I use this code to create a lot of clients:

package main

import (
	"fmt"
	"syscall"
	"time"

	"github.com/sirupsen/logrus"
	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func checkError(err error, msg string) {
	if err != nil {
		panic(fmt.Sprintf("%v. error=`%v`", msg, err))
	}
}

// newEtcdClient creates a client plus a session with a 10s TTL,
// then panics if the session is ever closed.
func newEtcdClient(etcdHosts []string) *concurrency.Session {
	etcdClient, err := clientv3.New(clientv3.Config{
		Endpoints:   etcdHosts,
		DialTimeout: 5 * time.Second,
	})
	checkError(err, "create etcd client failed")
	etcdSession, err := concurrency.NewSession(etcdClient, concurrency.WithTTL(10))
	checkError(err, "create etcd session failed")
	logrus.WithField("lease", etcdSession.Lease()).Info("etcd session created")
	go func() {
		<-etcdSession.Done()
		logrus.WithField("lease", etcdSession.Lease()).Panic("etcd session disconnected")
	}()
	return etcdSession
}

func main() {
	rLimit := syscall.Rlimit{Max: 999999, Cur: 999999}
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rLimit); err != nil {
		fmt.Println("Error Setting Rlimit ", err)
	}

	etcdHosts := []string{"172.25.21.36:2379", "172.25.23.46:2379", "172.25.23.15:2379"}
	taskCh := make(chan int, 16)
	exitCh := make(chan struct{})
	for i := 0; i < 2; i++ {
		go func() {
			for {
				select {
				case k := <-taskCh:
					logrus.WithField("n", k).Info("create client")
					newEtcdClient(etcdHosts)
				case <-exitCh:
					return
				}
			}
		}()
	}
	for i := 0; i < 10000; i++ {
		taskCh <- i
	}
}

It panics every time with only 10-100 clients:

...

INFO[0000] create client                                 n=102
INFO[0000] etcd session created                          lease=6232558402277502693
INFO[0000] create client                                 n=103
INFO[0000] etcd session created                          lease=6485322931366268654
INFO[0000] create client                                 n=104
INFO[0001] etcd session created                          lease=3530117151077574211
INFO[0001] create client                                 n=105
PANI[0001] etcd session disconnected                     lease=3530117151077574211

goroutine 2278 [running]:
github.com/sirupsen/logrus.Entry.log(0xc0000b6000, 0xc004469e60, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.6.0/entry.go:259 +0x335
github.com/sirupsen/logrus.(*Entry).Log(0xc003f24fc0, 0x0, 0xc005de8fb8, 0x1, 0x1)
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.6.0/entry.go:287 +0xeb
github.com/sirupsen/logrus.(*Entry).Panic(0xc003f24fc0, 0xc00528dfb8, 0x1, 0x1)
        /root/go/pkg/mod/github.com/sirupsen/logrus@v1.6.0/entry.go:325 +0x55
main.newEtcdClient.func1(0xc004ec5cb0)
        /mnt/e/Service/m-isvs/tools/etcd_bug_reproduce/reproduce_example.go:31 +0xdc
created by main.newEtcdClient
        /mnt/e/Service/m-isvs/tools/etcd_bug_reproduce/reproduce_example.go:28 +0x229

What did you expect to happen?

The code above runs without problems.

How can we reproduce it (as minimally and precisely as possible)?

Use the code pasted above.

Anything else we need to know?

It seems the server returned a keep-alive response with TTL = 0, and the channel returned by Client.KeepAlive was then closed.

https://github.com/etcd-io/etcd/blob/main/client/v3/lease.go#L512

Wireshark confirms this:

[screenshot: Wireshark capture showing a lease keep-alive response with TTL = 0]

172.25.21.36 is the leader node:

[screenshot]

Seems like a consistency issue?

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.5.2
Git SHA: 99018a77b
Go Version: go1.16.3
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.5.2
API version: 3.5

Etcd configuration (command line flags or environment variables)

node-0:

/opt/etcd/etcd \
    --name node-0 \
    --data-dir /data/etcd/node-0.etcd \
    --listen-client-urls http://172.25.23.46:2379 \
    --advertise-client-urls http://172.25.23.46:2379 \
    --initial-advertise-peer-urls http://172.25.23.46:2380 \
    --listen-peer-urls http://172.25.23.46:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster node-0=http://172.25.23.46:2380,node-1=http://172.25.21.36:2380,node-2=http://172.25.23.15:2380 \
    --initial-cluster-state new

node-1:

/opt/etcd/etcd \
    --name node-1 \
    --data-dir /data/etcd/node-1.etcd \
    --listen-client-urls http://172.25.21.36:2379 \
    --advertise-client-urls http://172.25.21.36:2379 \
    --initial-advertise-peer-urls http://172.25.21.36:2380 \
    --listen-peer-urls http://172.25.21.36:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster node-0=http://172.25.23.46:2380,node-1=http://172.25.21.36:2380,node-2=http://172.25.23.15:2380 \
    --initial-cluster-state new

node-2:

/opt/etcd/etcd \
    --name node-2 \
    --data-dir /data/etcd/node-2.etcd \
    --listen-client-urls http://172.25.23.15:2379 \
    --advertise-client-urls http://172.25.23.15:2379 \
    --initial-advertise-peer-urls http://172.25.23.15:2380 \
    --listen-peer-urls http://172.25.23.15:2380 \
    --initial-cluster-token etcd-cluster \
    --initial-cluster node-0=http://172.25.23.46:2380,node-1=http://172.25.21.36:2380,node-2=http://172.25.23.15:2380 \
    --initial-cluster-state new

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
+------------------+---------+--------+--------------------------+--------------------------+------------+
|        ID        | STATUS  |  NAME  |        PEER ADDRS        |       CLIENT ADDRS       | IS LEARNER |
+------------------+---------+--------+--------------------------+--------------------------+------------+
| deb864ebfb3ed67e | started | node-1 | http://172.25.21.36:2380 | http://172.25.21.36:2379 |      false |
| e495e97c1553b0fd | started | node-2 | http://172.25.23.15:2380 | http://172.25.23.15:2379 |      false |
| f424e2551763da00 | started | node-0 | http://172.25.23.46:2380 | http://172.25.23.46:2379 |      false |
+------------------+---------+--------+--------------------------+--------------------------+------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|     ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 172.25.21.36:2379 | deb864ebfb3ed67e |   3.5.2 |   46 MB |      true |      false |         5 |     318171 |             318171 |        |
| 172.25.23.46:2379 | f424e2551763da00 |   3.5.2 |   46 MB |     false |      false |         5 |     318171 |             318171 |        |
| 172.25.23.15:2379 | e495e97c1553b0fd |   3.5.2 |   46 MB |     false |      false |         5 |     318171 |             318171 |        |
+-------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

@ahrtr ahrtr self-assigned this Feb 9, 2022

ahrtr commented Feb 9, 2022

Thanks @liberize for reporting this. It's a real issue, and I reproduced it in my local environment on both 3.5.2 and the main branch.

It took me a couple of hours to figure out the root cause: the leader hasn't finished applying the LeaseGrantRequest when it receives the following LeaseKeepAliveRequest.

The rough workflow is as follows (assuming a 3-member cluster):

  1. A follower receives a LeaseGrantRequest and forwards it to the leader;
  2. The follower returns success to the client. At this point only the two followers have finished applying the request;
  3. The leader receives a LeaseKeepAliveRequest for the lease. Since it hasn't finished applying the earlier LeaseGrantRequest, it returns ErrLeaseNotFound (see lessor.go#L404), so the client receives a TTL of 0 (see lease.go#L138).

Since all 3 members run on one local server, communication between members is very fast: the leader finishes applying the LeaseGrantRequest only 1~2 microseconds after it receives the LeaseKeepAliveRequest.

In a real production environment it rarely happens, but it is indeed a real issue.

I think it is a generic issue. When etcdserver processes a message coming from raft, it applies the change locally in parallel with replicating it to the followers (see L212-L224). So in theory, the leader may not have finished applying a request by the time the client receives a SUCCESS response and sends the next request.

Any thoughts on the solution? Or do you think there is no need to fix this, since it is unlikely to be reproduced in a real production environment? cc @ptabor @serathius @spzala @xiang90 @gyuho

@liberize
Author

fixed by #13690.

@ahrtr

ahrtr commented Feb 12, 2022

Let's keep this issue open until the PR is reviewed & merged.

@serathius

The fix was merged and will be released in v3.5.3.
