This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

MySQL connection is full, orchestrator does not switch #1015

Open
liuqian1990 opened this issue Nov 12, 2019 · 11 comments

Comments

@liuqian1990

The MySQL master's connections are full. orchestrator displays the MySQL master as down, but does not fail over.

@yangeagle
Contributor

yangeagle commented Nov 12, 2019

There are two types of connections: connections for probing instances and connections for topology changes.

A connection for a topology change is established only when there is a failure.

When the MySQL master's connections are full, orchestrator can't connect to the master.

Without a MySQL connection, orchestrator can do nothing.

ignore this.

@liuqian1990
Author

@yangeagle How should I handle a problem like this?

@shlomi-noach
Collaborator

orchestrator should actually handle this situation in some cases. What is the analysis orchestrator shows? Are all replicas happy and are in fact replicating?

@shlomi-noach
Collaborator

@liuqian1990
Author

@shlomi-noach cc

@yangeagle
Contributor

> @yangeagle How should I handle a problem like this?

In my opinion, orchestrator cannot currently resolve this problem. There may be ways to handle it in the future.

@shlomi-noach
Collaborator

@liuqian1990 please try #1010

(you will need to build a binary based on that branch; you can use the dockerfiles to build a binary)

@yangeagle
Contributor

> https://github.com/github/orchestrator/pull/1010/files#diff-86933c5afedc1f1a02e011c53af7b039R1448 might be related

The PR is for UnreachableMasterWithLaggingReplicas.

The problem in this issue may be UnreachableMaster.

If we do the same thing for UnreachableMaster, the problem may be resolved.

@shlomi-noach
Collaborator

But if you notice the link I pasted earlier: https://github.com/github/orchestrator/pull/1010/files#diff-86933c5afedc1f1a02e011c53af7b039R1448

as part of the PR there is a fix to what could be a concurrency iteration problem.

@yangeagle
Contributor

yangeagle commented Nov 14, 2019

I mean: add emergentlyRestartReplicationOnTopologyInstanceReplicas to the code for UnreachableMaster:
https://github.com/github/orchestrator/blob/8ac16d37d2ff0aeffddb1f1525cf393c715c1a01/go/logic/topology_recovery.go#L1521
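
A minimal sketch of that suggestion, written as a toy model rather than orchestrator's actual code: the analysis names and `emergentlyRestartReplicationOnTopologyInstanceReplicas` come from this thread, while `AnalysisCode`, `emergencyActions`, and the `restartOnUnreachable` switch are hypothetical names invented for illustration. The presumed rationale (an assumption, not stated in the thread) is that restarting replication forces replicas to reconnect to the master, so a master that refuses new connections would show up as broken replication.

```go
package main

import "fmt"

// AnalysisCode mirrors the analysis names mentioned in this thread.
// The type itself is hypothetical, not orchestrator's definition.
type AnalysisCode string

const (
	UnreachableMaster                    AnalysisCode = "UnreachableMaster"
	UnreachableMasterWithLaggingReplicas AnalysisCode = "UnreachableMasterWithLaggingReplicas"
)

// restartReplicas names the emergency operation discussed in the thread.
const restartReplicas = "emergentlyRestartReplicationOnTopologyInstanceReplicas"

// emergencyActions is a hypothetical helper that returns the names of the
// emergency operations to run for a given analysis. With restartOnUnreachable
// set, UnreachableMaster gets the same treatment that the thread says PR #1010
// applies to UnreachableMasterWithLaggingReplicas.
func emergencyActions(code AnalysisCode, restartOnUnreachable bool) []string {
	switch code {
	case UnreachableMasterWithLaggingReplicas:
		return []string{restartReplicas}
	case UnreachableMaster:
		if restartOnUnreachable {
			return []string{restartReplicas}
		}
	}
	return nil
}

func main() {
	// Behavior as described for the PR: only the lagging-replicas case restarts replication.
	fmt.Println(emergencyActions(UnreachableMaster, false))
	// Behavior proposed in this comment: UnreachableMaster restarts replication too.
	fmt.Println(emergencyActions(UnreachableMaster, true))
}
```

Whether this is safe for every UnreachableMaster case is not settled in the thread; the flag in the sketch only makes the two behaviors easy to compare.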

@shlomi-noach
Collaborator

Wait, someone edited the original comment and now I am not sure what the problem looked like?
