pd can not select new leader which lead qps drop to zero when inject pdleader io delay 500ms last for 5mins #7251
Labels
affects-7.3
affects-7.4
affects-7.5
This bug affects the 7.5.x(LTS) versions.
affects-7.6
severity/major
type/bug
The issue is confirmed as a bug.
Comments
/type bug
Lily2025 changed the title from "pd can not select new leader when inject pdleader io delay 500ms last for 5mins" to "pd can not select new leader which lead qps drop to zero when inject pdleader io delay 500ms last for 5mins" on Oct 27, 2023
Situation: It looks like pd0 kept trying to elect a leader but timed out for five minutes. Finally, pd2 was elected.
Solution: Maybe we need to maintain a slice that records election times.
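The slice idea above could look roughly like the following. This is only a minimal sketch under stated assumptions, not PD's actual code: the type `campaignTracker`, the helper `simulate`, and the thresholds (3 campaigns within 60 seconds) are all hypothetical names and values chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// campaignTracker keeps the timestamps of recent leader campaigns.
// Hypothetical type for illustration; not taken from the PD code base.
type campaignTracker struct {
	window time.Duration // how far back campaigns are counted
	limit  int           // campaigns tolerated inside the window
	times  []time.Time   // recent campaign timestamps, oldest first
}

// record stores one campaign attempt and drops entries older than the window.
func (c *campaignTracker) record(now time.Time) {
	c.times = append(c.times, now)
	cutoff := now.Add(-c.window)
	i := 0
	for i < len(c.times) && c.times[i].Before(cutoff) {
		i++
	}
	c.times = c.times[i:]
}

// tooFrequent reports whether at least `limit` campaigns fall inside the
// window that ends at the most recently recorded campaign.
func (c *campaignTracker) tooFrequent() bool {
	return len(c.times) >= c.limit
}

// simulate replays campaigns at the given second offsets and returns whether
// the member should stop campaigning after the last one (assumed policy).
func simulate(limit, windowSec int, offsetsSec []int) bool {
	base := time.Unix(0, 0)
	tr := &campaignTracker{
		window: time.Duration(windowSec) * time.Second,
		limit:  limit,
	}
	for _, off := range offsetsSec {
		tr.record(base.Add(time.Duration(off) * time.Second))
	}
	return tr.tooFrequent()
}

func main() {
	// Three campaigns within 60 s: the member should step aside.
	fmt.Println(simulate(3, 60, []int{0, 10, 20}))
	// Three campaigns spread over 200 s: campaigning is still allowed.
	fmt.Println(simulate(3, 60, []int{0, 100, 200}))
}
```

When `tooFrequent` returns true, a member in pd0's position would skip the next campaign and let another member take over, instead of retrying for the full five minutes.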
This was referenced Nov 13, 2023
ti-chi-bot bot added a commit that referenced this issue on Nov 16, 2023
close #7251, ref #7377
When the PD leader frequently campaigns for leader but the etcd leader does not change, we need to prevent this PD leader from campaigning and have it resign to another member.
Signed-off-by: husharp <jinhao.hu@pingcap.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
/open
The phenomenon still exists.
This was referenced Jan 12, 2024
Fixed by #7737.
JmPotato added the affects-7.3, affects-7.4, and affects-7.5 labels on Jan 30, 2024
ti-chi-bot bot added a commit that referenced this issue on Feb 2, 2024
close #7251, ref #7377
When the PD leader frequently campaigns for leader but the etcd leader does not change, we need to prevent this PD leader from campaigning and have it resign to another member.
Signed-off-by: husharp <jinhao.hu@pingcap.com>
Co-authored-by: husharp <jinhao.hu@pingcap.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
Bug Report
What did you do?
1. Run a workload.
2. Inject a 500 ms I/O delay on the PD leader for 5 minutes:
apiVersion: chaos-mesh.org/v1alpha1
kind: IOChaos
metadata:
  name: kv-timeout-data
  namespace: testbed-xxx
spec:
  action: latency
  mode: one
  selector:
    namespaces:
      - testbed-xxx
    labelSelectors:
      statefulset.kubernetes.io/pod-name: tc-pd-0
  volumePath: /var/lib/pd
  path: "/var/lib/pd/data/**/*"
  delay: "500ms"
  percent: 100
  duration: "300s"
What did you expect to see?
1. PD can elect a new leader while a 500 ms I/O delay is injected on the PD leader for 5 minutes.
2. QPS recovers within 2 minutes after the 500 ms I/O delay is injected on the PD leader.
What did you see instead?
PD could not elect a new leader while the 500 ms I/O delay was injected on the PD leader for 5 minutes.
Write QPS dropped to zero.
What version of PD are you using (pd-server -V)?
./pd-server -V
Release Version: v6.5.0-nightly
Edition: Community
Git Commit Hash: 77d6f5b
Git Branch: heads/refs/tags/v6.5.0-nightly
UTC Build Time: 2023-10-07 10:21:09
2023-10-19T09:33:53.650+0800