-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
domain: fast new a etcd session when the session is stale in the schemaVersionSyncer #7774
Conversation
@@ -88,7 +90,7 @@ type SchemaSyncer interface { | |||
type schemaVersionSyncer struct { | |||
selfSchemaVerPath string | |||
etcdCli *clientv3.Client | |||
session *concurrency.Session | |||
session unsafe.Pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
session is accessed in multi-thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you talk about it in detail?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
loadSnapshotInfoSchemaIfNeeded will use the session, and loadSchemaLoop will too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha.
@zimulala @crazycs520 @shenli PTAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@zimulala PTAL. |
err := do.mustRestartSyncer() | ||
if err != nil { | ||
log.Errorf("[ddl] reload schema in loop, schema syncer restart err %v", errors.ErrorStack(err)) | ||
break | ||
} | ||
do.SchemaValidator.Restart() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add metrics here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we already have metrics.NewSessionHistogram and metrics.DeploySyncerHistogram, I think there is no need to add here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it.
/run-all-tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -88,7 +90,7 @@ type SchemaSyncer interface { | |||
type schemaVersionSyncer struct { | |||
selfSchemaVerPath string | |||
etcdCli *clientv3.Client | |||
session *concurrency.Session | |||
session unsafe.Pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha.
Fix issue that when the pd leader is down, the etcd session will be stale, and tidb may costs several minutes to create a new session to the etcd in some case.
I have done a manual test using ansible, with inventory.ini:
There are 2 pd, 1 tidb, 1 tikv in the test cluster, I start a client which continue inserting data into tidb, and I kill -9 the pd leader, we can see the tidb report errors:
After the pd recover:
2018/09/28 11:54:59.562 leader.go:269: [info] PD cluster leader pd1 is ready to serve
, the etcd session is recovered:It costs several millseconds to recover.