GTID not found properly (5.7) and some graceful-master-takeover issues #78
Comments
Can you please issue: select @@global.gtid_mode, @@global.gtid_purged on your master?
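For anyone following along, the two values returned by that query can be read roughly as below. This is a minimal sketch and not orchestrator's own detection logic: `describe_gtid_state` and the dictionary keys are hypothetical names, and the only fact relied on is that `gtid_mode` in MySQL 5.7 is one of OFF, OFF_PERMISSIVE, ON_PERMISSIVE or ON.

```python
# Query quoted from the comment above; run it on the master with any MySQL client.
GTID_QUERY = "select @@global.gtid_mode, @@global.gtid_purged"

# The four values gtid_mode can take in MySQL 5.7.
VALID_GTID_MODES = ("OFF", "OFF_PERMISSIVE", "ON_PERMISSIVE", "ON")


def describe_gtid_state(gtid_mode, gtid_purged):
    """Classify one result row of GTID_QUERY (hypothetical helper)."""
    mode = gtid_mode.strip().upper()
    if mode not in VALID_GTID_MODES:
        raise ValueError("unexpected gtid_mode: %r" % gtid_mode)
    return {
        "gtid_mode": mode,
        # Only ON enforces GTIDs for every transaction; the PERMISSIVE
        # modes are transitional states.
        "gtid_fully_on": mode == "ON",
        # An empty gtid_purged means no GTID sets have been purged
        # from the binary logs.
        "has_purged_gtids": bool((gtid_purged or "").strip()),
    }
```

Feeding it the row returned by the master, for example `describe_gtid_state("ON", "")`, gives a quick yes/no on whether GTID is fully enabled before looking at what orchestrator reports.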
That makes sense. We need to solve the GTID recognition problem.
By default I apologize for the inconvenient name. I'll be working to minimize the number of configuration params.
As per #57:
The problem is
This can be easily scripted on the user's side. I really think that in the event of planned takeover the user should choose the identity of the new master. If To this end, when things go bad,
looking into!
It cannot. You cannot reveal the password by
More than anything, I'd appreciate help with documentation!
@shlomi-noach: slave username and password information are available via
So in theory you could try these credentials. However, depending on existing grants this may or may not work as expected as the grant may be for
It may be worth having an option to try or check the configuration, but specific site configs may vary.
For what it's worth, for planned topology changes (of the master) I don't use orchestrator but custom scripts. This gives a bit more control and reduces downtime, but orchestrator is nearly always used manually both before and afterwards to arrange the topology as needed to minimise the impact of the master changeover. I guess I could use orchestrator, and most of what's described here is what I do already, but I have more freedom to check things both before and afterwards, which makes me feel more comfortable. Maybe I need to look again at how well orchestrator handles this task, as it simplifies things if the amount of software used is reduced.
That's right: the orchestrator user has select permissions on slave_master_info and the information is there; it just needs to be read and used. If you expect orchestrator to execute this task cleanly, the user should ensure that the replication user has permissions on all the nodes involved.
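The idea discussed above, reading the replication user from slave_master_info on a replica and reusing it when repointing the demoted master, can be sketched as follows. This is an illustration under stated assumptions and not how orchestrator (or #93) implements it: `quote_sql_string` and `build_change_master` are hypothetical helpers, the table is only populated when `master_info_repository=TABLE`, and `MASTER_AUTO_POSITION=1` assumes GTID-based replication.

```python
# Columns as they exist in mysql.slave_master_info on MySQL 5.7.
CREDENTIALS_QUERY = "select User_name, User_password from mysql.slave_master_info"


def quote_sql_string(value):
    """Quote a value as a MySQL string literal (hypothetical helper)."""
    escaped = value.replace("\\", "\\\\").replace("'", "''")
    return "'" + escaped + "'"


def build_change_master(host, port, user, password):
    """Build the statement to run on the demoted master (hypothetical helper).

    `user` and `password` are the values returned by CREDENTIALS_QUERY
    on an existing replica; `host` and `port` identify the new master.
    """
    return (
        "CHANGE MASTER TO "
        f"MASTER_HOST={quote_sql_string(host)}, "
        f"MASTER_PORT={int(port)}, "
        f"MASTER_USER={quote_sql_string(user)}, "
        f"MASTER_PASSWORD={quote_sql_string(password)}, "
        "MASTER_AUTO_POSITION=1"
    )
```

As noted in the comments above, whether the resulting statement actually works depends on the grants: the replication user must be allowed to connect from the demoted master's host.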
Reading credentials from
Hello, just wanted to report the same issue with GTID based replication (Percona 5.7.16) not being recognized on a simple master - 4 slaves topology. On the master:
@fuyar thank you!
no problem @shlomi-noach :) It seems Orchestrator was finally able to detect GTIDs on the 4 slaves (I rechecked this morning, having changed nothing in the meantime). The master still shows oracle_gtid: 0 in the 'database_instance' table, but as the master is not a slave of anyone, I suppose that should be OK?
OK, have taken a closer look into GTID recoveries:
@ecortestws my last comment suggests that:
is wrong. Are you able to show that the recovery was not based on GTID? To be clear, it's a completely valid assumption on your side, but I believe it is incorrect. The logs actually specify the type of recovery. Look for:
I realize that was
Hi @shlomi-noach:
@ecortestws thank you. Then, indeed,
Applying replication-credentials on demoted master is addressed by #93
Hi @shlomi-noach,
@ecortestws Perfect timing. I am setting up an environment for this now.
@ecortestws can you confirm your servers are Percona Server? If so, this is identified in #96 and solved via #98 (no release yet). My current GTID testing environment is happily identifying GTID topologies. #106 makes the web interface recognize a GTID master as "using GTID" -- but this is a visualization matter only; recoveries are using a lower level logic.
@shlomi-noach my servers are Oracle MySQL. Will try the new release and let you know.
@shlomi-noach I have tested it but it didn't work as expected.
The web interface now shows GTID enabled in the master.
@ecortestws thank you. Are you again looking at a I'll run some more checks and may come back with more questions.
@shlomi-noach yes, the same approach, the same topology. Moved C from A to B before the takeover, and verified that the replication chain was healthy. I have all the logs, let me know if you need anything else. I understand that #93 hasn't been merged yet, so the issue with the credentials after the takeover is expected. Thanks.
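The topology change described in this comment (moving C from under A to under B, giving the chain A-B-C) can be pictured with a small in-memory model. This is only an illustrative sketch: `move_below` and `direct_replicas` are hypothetical helpers operating on a plain replica-to-master map, and they do not touch MySQL or orchestrator.

```python
def move_below(topology, replica, new_master):
    """Return a copy of the replica->master map with `replica` repointed."""
    if replica not in topology or new_master not in topology:
        raise KeyError("unknown instance")
    # Walk up from new_master; meeting `replica` on the way would mean
    # the move creates a replication loop.
    node = new_master
    while node is not None:
        if node == replica:
            raise ValueError("move would create a replication loop")
        node = topology[node]
    updated = dict(topology)
    updated[replica] = new_master
    return updated


def direct_replicas(topology, master):
    """List the instances replicating directly from `master`."""
    return sorted(r for r, m in topology.items() if m == master)


# A is the master; B and C both replicate from A.
before = {"A": None, "B": "A", "C": "A"}
# After the move, C replicates from B: A -> B -> C.
after = move_below(before, "C", "B")
```

With this model, `direct_replicas(after, "A")` contains only B, which matches the chain that was in place before the takeover was attempted.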
#93 is now merged
@ecortestws I'm happy if you can share the logs. If they contain sensitive data, can you please share them with me via email? My address is shlomi-noach@-youknowhichcompany-.com
OK I'm able to reproduce this. The reason this happens: the
@ecortestws can you please confirm https://github.com/github/orchestrator/releases/tag/v2.1.1-BETA works for you? Make sure that the replicas are on
@shlomi-noach it worked, but replication was not started on the demoted master. Is this expected behavior? The credentials were in place, and after executing "START SLAVE" on the old master it started syncing with the new master.
@ecortestws This is expected behavior. I see advantages and reasons for both starting and not starting replication automatically; "not starting" is on the safer side.
Hi,
I am testing orchestrator with 5.7.17, Master and two slaves. Have moved one of the slaves to change the topology like A-B-C and then executed orchestrator -c graceful-master-takeover -alias myclusteralias
The issues found are:
mysql.slave_master_info in the cluster).
Thanks for this amazing tool!
Regards,
Eduardo