This repository has been archived by the owner on Sep 30, 2024. It is now read-only.

actualize Vagrant and fix problems on orchestrator-agent api with sqlite backend #445

Merged
merged 18 commits into from
Apr 25, 2018
Changes from 8 commits
5 changes: 5 additions & 0 deletions .gitignore
@@ -2,6 +2,7 @@
conf/orchestrator.conf.json
.DS_Store
.vagrant/
vagrant/.sqlite
vagrant/admin-post-install.sh
vagrant/db-post-install.sh
vagrant/db1-post-install.sh
@@ -13,3 +14,7 @@ vagrant/vagrant-ssh-key.pub
Godeps/_workspace
.gopath/
main
.idea/
*.deb
*.pcap
*.log
Collaborator

👍

4 changes: 2 additions & 2 deletions Vagrantfile
@@ -16,8 +16,8 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
config.vm.box = BOX
config.vm.box_download_insecure = true
config.vm.box_check_update = false
config.vm.synced_folder '.', '/orchestrator', type: 'rsync',
rsync__auto: true
config.vm.synced_folder '.', '/orchestrator'
#, type: 'rsync', rsync__auto: true
Collaborator

Can you please explain this change?

Contributor Author

rsync__auto won't work in my Vagrant environment ;(
When I change sources on the host OS, they are not rsynced to the guest machine.
OK, I will roll back this change.

Collaborator

Sorry, please elaborate. Did you introduce rsync__auto in the first place?

Contributor Author

I tried the following variants:
1)

 config.vm.synced_folder '.', '/orchestrator', type: 'rsync'

This requires running vagrant reload whenever I change any file on my host machine.
2)

 config.vm.synced_folder '.', '/orchestrator', type: 'rsync',  rsync__auto: true

This did not work, and I didn't investigate why.
So I just commented out these options, and the default sharing mode over VirtualBox shared folders (vboxsf) works as I expected.


(0..4).each do |n|
name = (n > 0 ? ("db" + n.to_s) : "admin")
16 changes: 9 additions & 7 deletions build.sh
@@ -87,7 +87,7 @@ function oinstall() {
cd $mydir
gofmt -s -w go/
rsync -qa ./resources $builddir/orchestrator${prefix}/orchestrator/
rsync -qa ./conf/orchestrator-sample.* $builddir/orchestrator${prefix}/orchestrator/
rsync -qa ./conf/orchestrator-sample*.conf.json $builddir/orchestrator${prefix}/orchestrator/
Collaborator

👍

cp etc/init.d/orchestrator.bash $builddir/orchestrator/etc/init.d/orchestrator
chmod +x $builddir/orchestrator/etc/init.d/orchestrator
}
@@ -134,12 +134,14 @@ function package() {
esac

echo "---"
if cat /etc/centos-release | grep 'CentOS release 6' ; then
rm ${TOPDIR:-?}/orchestrator*.deb
rm ${TOPDIR:-?}/orchestrator*.tar.gz
# n CentOD 6 box: we only want the rpms for CentOS6
# Add "-centos6" to the file name.
ls ${TOPDIR:-?}/*.rpm | while read f; do centos_file=$(echo $f | sed -r -e "s/^(.*)-${RELEASE_VERSION}(.*)/\1-centos6-${RELEASE_VERSION}\2/g") ; mv $f $centos_file ; done
if [[ -e /etc/centos-release ]]; then
Collaborator

better: -f

if cat /etc/centos-release | grep 'CentOS release 6' ; then
rm ${TOPDIR:-?}/orchestrator*.deb
rm ${TOPDIR:-?}/orchestrator*.tar.gz
# n CentOD 6 box: we only want the rpms for CentOS6
# Add "-centos6" to the file name.
ls ${TOPDIR:-?}/*.rpm | while read f; do centos_file=$(echo $f | sed -r -e "s/^(.*)-${RELEASE_VERSION}(.*)/\1-centos6-${RELEASE_VERSION}\2/g") ; mv $f $centos_file ; done
fi
Collaborator

👍

fi
echo "Done. Find releases in $TOPDIR"
}
120 changes: 120 additions & 0 deletions conf/orchestrator-sample-sqlite.conf.json
@@ -0,0 +1,120 @@
{
"Debug": true,
"EnableSyslog": false,
"ListenAddress": ":3000",
"MySQLTopologyUser": "orc_client_user",
"MySQLTopologyPassword": "orc_client_password",
"MySQLTopologyCredentialsConfigFile": "",
"MySQLTopologySSLPrivateKeyFile": "",
"MySQLTopologySSLCertFile": "",
"MySQLTopologySSLCAFile": "",
"MySQLTopologySSLSkipVerify": true,
"MySQLTopologyUseMutualTLS": false,
"BackendDB": "sqlite",
"SQLite3DataFile": "/usr/local/orchestrator/orchestrator.sqlite3",
"DefaultInstancePort": 3306,
"DiscoverByShowSlaveHosts": true,
"InstancePollSeconds": 5,
"UnseenInstanceForgetHours": 240,
"SnapshotTopologiesIntervalHours": 0,
"InstanceBulkOperationsWaitTimeoutSeconds": 10,
"HostnameResolveMethod": "default",
"MySQLHostnameResolveMethod": "@@hostname",
"SkipBinlogServerUnresolveCheck": true,
"ExpiryHostnameResolvesMinutes": 60,
"RejectHostnameResolvePattern": "",
"ReasonableReplicationLagSeconds": 10,
"ProblemIgnoreHostnameFilters": [],
"VerifyReplicationFilters": false,
"ReasonableMaintenanceReplicationLagSeconds": 20,
"CandidateInstanceExpireMinutes": 60,
"AuditLogFile": "",
"AuditToSyslog": false,
"RemoveTextFromHostnameDisplay": ".mydomain.com:3306",
"ReadOnly": false,
"AuthenticationMethod": "",
"HTTPAuthUser": "",
"HTTPAuthPassword": "",
"AuthUserHeader": "",
"PowerAuthUsers": [
"*"
],
"ClusterNameToAlias": {
"127.0.0.1": "test suite"
},
"SlaveLagQuery": "",
"DetectClusterAliasQuery": "SELECT SUBSTRING_INDEX(@@hostname, '.', 1)",
"DetectClusterDomainQuery": "",
"DetectInstanceAliasQuery": "",
"DetectPromotionRuleQuery": "",
"DataCenterPattern": "[.]([^.]+)[.][^.]+[.]mydomain[.]com",
"PhysicalEnvironmentPattern": "[.]([^.]+[.][^.]+)[.]mydomain[.]com",
"PromotionIgnoreHostnameFilters": [],
"DetectSemiSyncEnforcedQuery": "",
"ServeAgentsHttp": false,
"AgentsServerPort": ":3001",
"AgentsUseSSL": false,
"AgentsUseMutualTLS": false,
"AgentSSLSkipVerify": false,
"AgentSSLPrivateKeyFile": "",
"AgentSSLCertFile": "",
"AgentSSLCAFile": "",
"AgentSSLValidOUs": [],
"UseSSL": false,
"UseMutualTLS": false,
"SSLSkipVerify": false,
"SSLPrivateKeyFile": "",
"SSLCertFile": "",
"SSLCAFile": "",
"SSLValidOUs": [],
"URLPrefix": "",
"StatusEndpoint": "/api/status",
"StatusSimpleHealth": true,
"StatusOUVerify": false,
"AgentPollMinutes": 60,
"UnseenAgentForgetHours": 6,
"StaleSeedFailMinutes": 60,
"SeedAcceptableBytesDiff": 8192,
"PseudoGTIDPattern": "",
"PseudoGTIDPatternIsFixedSubstring": false,
"PseudoGTIDMonotonicHint": "asc:",
"DetectPseudoGTIDQuery": "",
"BinlogEventsChunkSize": 10000,
"SkipBinlogEventsContaining": [],
"ReduceReplicationAnalysisCount": true,
"FailureDetectionPeriodBlockMinutes": 60,
"RecoveryPeriodBlockSeconds": 3600,
"RecoveryIgnoreHostnameFilters": [],
"RecoverMasterClusterFilters": [
"_master_pattern_"
],
"RecoverIntermediateMasterClusterFilters": [
"_intermediate_master_pattern_"
],
"OnFailureDetectionProcesses": [
"echo 'Detected {failureType} on {failureCluster}. Affected replicas: {countSlaves}' >> /tmp/recovery.log"
],
"PreFailoverProcesses": [
"echo 'Will recover from {failureType} on {failureCluster}' >> /tmp/recovery.log"
],
"PostFailoverProcesses": [
"echo '(for all types) Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostUnsuccessfulFailoverProcesses": [],
"PostMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Promoted: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"PostIntermediateMasterFailoverProcesses": [
"echo 'Recovered from {failureType} on {failureCluster}. Failed: {failedHost}:{failedPort}; Successor: {successorHost}:{successorPort}' >> /tmp/recovery.log"
],
"CoMasterRecoveryMustPromoteOtherCoMaster": true,
"DetachLostSlavesAfterMasterFailover": true,
"ApplyMySQLPromotionAfterMasterFailover": false,
"MasterFailoverDetachSlaveMasterHost": false,
"MasterFailoverLostInstancesDowntimeMinutes": 0,
"PostponeSlaveRecoveryOnLagMinutes": 0,
"OSCIgnoreHostnameFilters": [],
"GraphiteAddr": "",
"GraphitePath": "",
"GraphiteConvertHostnameDotsToUnderscores": true
}
Collaborator

👍

1 change: 1 addition & 0 deletions docs/configuration-sample.md
@@ -93,6 +93,7 @@ The following is a production configuration file, with some details redacted.
"HttpTimeoutSeconds": 60,
"StaleSeedFailMinutes": 60,
"SeedAcceptableBytesDiff": 8192,
"SeedWaitSecondsBeforeSend": 2,
"PseudoGTIDPattern": "drop view if exists `meta`.`_pseudo_gtid_hint__asc:",
"PseudoGTIDPatternIsFixedSubstring": true,
"PseudoGTIDMonotonicHint": "asc:",
3 changes: 2 additions & 1 deletion etc/init.d/orchestrator.bash
@@ -54,8 +54,9 @@ start_daemon () {
post_start_daemon_hook 1>&2
}

# The file /etc/orchestrator_profile can be used to inject pre-service execution
# This files can be used to inject pre-service execution
# scripts, such as exporting variables or whatever. It's yours!
[ -f /etc/default/orchestrator ] && . /etc/default/orchestrator
Collaborator

Please explain why /etc/default/orchestrator? If anything, I'd add /etc/profile.d/orchestrator, which is what I should have done in the first place.

Contributor Author

/etc/default is the default location on Debian-based distributions.

Contributor Author

OK, I added support for both /etc/profile.d/orchestrator and /etc/default/orchestrator.

/etc/profile.d/orchestrator will take priority.

[ -f /etc/orchestrator_profile ] && . /etc/orchestrator_profile

case "$1" in
18 changes: 9 additions & 9 deletions go/agent/agent_dao.go
@@ -164,7 +164,7 @@ func ReadOutdatedAgentsHosts() ([]string, error) {
from
host_agent
where
IFNULL(last_checked < now() - interval ? minute, true)
IFNULL(last_checked < now() - interval ? minute, 1)
Contributor Author

This change is needed for SQLite compatibility, because SQLite doesn't have a BOOLEAN literal.

`
err := db.QueryOrchestrator(query, sqlutils.Args(config.Config.AgentPollMinutes), func(m sqlutils.RowMap) error {
hostname := m.GetString("hostname")
@@ -461,7 +461,7 @@ func MountLV(hostname string, lv string) (Agent, error) {
return executeAgentCommand(hostname, fmt.Sprintf("mountlv?lv=%s", lv), nil)
}

// RemoveLV requests an agent to remvoe a snapshot
// RemoveLV requests an agent to remove a snapshot
func RemoveLV(hostname string, lv string) (Agent, error) {
return executeAgentCommand(hostname, fmt.Sprintf("removelv?lv=%s", lv), nil)
}
@@ -547,8 +547,8 @@ func AbortSeed(seedId int64) error {
}

// PostCopy will request an agent to invoke post-copy commands
func PostCopy(hostname string) (Agent, error) {
return executeAgentCommand(hostname, "post-copy", nil)
func PostCopy(hostname, sourceHostname string) (Agent, error) {
Collaborator

Is this change going to break behavior for existing users?

Contributor Author

Could you help me figure out how to add an optional parameter to martini routes?

Collaborator

Contributor Author

@Slach, Mar 21, 2018

@shlomi-noach OK.
I added backwards compatibility to orchestrator-agent here:
https://github.com/github/orchestrator-agent/pull/20/files#diff-fbdc8cde28a71961e53c2e83e2483c9aL619

I also added some other improvements and tested them with percona-xtrabackup.
Currently, in my setup, data seeds successfully between nodes over netcat + xtrabackup + xbstream.

return executeAgentCommand(hostname, fmt.Sprintf("post-copy/?sourceHost=%s", sourceHostname), nil)
Contributor Author

This remains backwards compatible.

}
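
The new `PostCopy` interpolates `sourceHostname` directly into the query string. Hostnames normally contain nothing that needs escaping, so this is only a defensive sketch and not a change the PR makes: `postCopyPath` is a hypothetical helper that builds the same command path and escapes the value with the standard library's `url.QueryEscape`.

```go
package main

import (
	"fmt"
	"net/url"
)

// postCopyPath is a hypothetical helper (not part of orchestrator).
// It builds the agent command path shown in the diff and escapes the
// source hostname so it cannot alter the query string.
func postCopyPath(sourceHostname string) string {
	return fmt.Sprintf("post-copy/?sourceHost=%s", url.QueryEscape(sourceHostname))
}

func main() {
	// An ordinary hostname passes through unchanged.
	fmt.Println(postCopyPath("db1.mydomain.com"))
	// Reserved characters are percent-encoded; a space becomes "+".
	fmt.Println(postCopyPath("db 1&x=y"))
}
```

An agent that ignores unknown query parameters would treat this path the same as a plain `post-copy` call, which is the compatibility property discussed in the thread.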

// SubmitSeedEntry submits a new seed operation entry, returning its unique ID
@@ -693,7 +693,7 @@ func executeSeed(seedId int64, targetHostname string, sourceHostname string) err
}

seedFromLogicalVolume := sourceAgent.LogicalVolumes[0]
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("Mounting logical volume: %s", seedFromLogicalVolume.Path), "")
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("%s Mounting logical volume: %s", sourceHostname, seedFromLogicalVolume.Path), "")
Collaborator

👍

_, err = MountLV(sourceHostname, seedFromLogicalVolume.Path)
if err != nil {
return updateSeedStateEntry(seedStateId, err)
@@ -722,8 +722,8 @@ func executeSeed(seedId int64, targetHostname string, sourceHostname string) err
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("%s will now receive data in background", targetHostname), "")
ReceiveMySQLSeedData(targetHostname, seedId)

seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("Waiting some time for %s to start listening for incoming data", targetHostname), "")
time.Sleep(2 * time.Second)
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("Waiting %d seconds for %s to start listening for incoming data", config.Config.SeedWaitSecondsBeforeSend, targetHostname), "")
time.Sleep(time.Duration(config.Config.SeedWaitSecondsBeforeSend) * time.Second)
Collaborator

👍


seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("%s will now send data to %s in background", sourceHostname, targetHostname), "")
SendMySQLSeedData(sourceHostname, targetHostname, seedId)
@@ -774,12 +774,12 @@ func executeSeed(seedId int64, targetHostname string, sourceHostname string) err

// Cleanup:
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("Executing post-copy command on %s", targetHostname), "")
_, err = PostCopy(targetHostname)
_, err = PostCopy(targetHostname, sourceHostname)
if err != nil {
return updateSeedStateEntry(seedStateId, err)
}

seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("Unmounting logical volume: %s", seedFromLogicalVolume.Path), "")
seedStateId, _ = submitSeedStateEntry(seedId, fmt.Sprintf("%s Unmounting logical volume: %s", sourceHostname, seedFromLogicalVolume.Path), "")
_, err = Unmount(sourceHostname)
if err != nil {
return updateSeedStateEntry(seedStateId, err)
2 changes: 2 additions & 0 deletions go/config/config.go
@@ -204,6 +204,7 @@ type Configuration struct {
UnseenAgentForgetHours uint // Number of hours after which an unseen agent is forgotten
StaleSeedFailMinutes uint // Number of minutes after which a stale (no progress) seed is considered failed.
SeedAcceptableBytesDiff int64 // Difference in bytes between seed source & target data size that is still considered as successful copy
SeedWaitSecondsBeforeSend int64 // Number of seconds for waiting before start send data command on agent
AutoPseudoGTID bool // Should orchestrator automatically inject Pseudo-GTID entries to the masters
PseudoGTIDPattern string // Pattern to look for in binary logs that makes for a unique entry (pseudo GTID). When empty, Pseudo-GTID based refactoring is disabled.
PseudoGTIDPatternIsFixedSubstring bool // If true, then PseudoGTIDPattern is not treated as regular expression but as fixed substring, and can boost search time
@@ -363,6 +364,7 @@ func newConfiguration() *Configuration {
UnseenAgentForgetHours: 6,
StaleSeedFailMinutes: 60,
SeedAcceptableBytesDiff: 8192,
SeedWaitSecondsBeforeSend: 2,
AutoPseudoGTID: false,
PseudoGTIDPattern: "",
PseudoGTIDPatternIsFixedSubstring: false,
2 changes: 1 addition & 1 deletion go/db/generate_base.go
@@ -138,7 +138,7 @@ var generateSQLBase = []string{
last_checked timestamp NULL DEFAULT NULL,
last_seen timestamp NULL DEFAULT NULL,
mysql_port smallint(5) unsigned DEFAULT NULL,
count_mysql_snapshots smallint(5) unsigned NOT NULL,
count_mysql_snapshots smallint(5) unsigned NOT NULL DEFAULT 0,
Collaborator

@shlomi-noach, Mar 29, 2018

🚫 Must change: existing SQL must never ever change. The most we can do is add a modification in generate_patches.go. However, modifying a column is not supported by sqlite so this change must unfortunately go away.

The value is implicitly 0 anyway, what is the concern?

Contributor Author

Oh, OK. Sorry, I will roll back this change.

Contributor Author

Yes, the concern is that count_mysql_snapshots is not supplied when inserting into the host_agent table,
so the SQL query fails when an agent submits itself.

PRIMARY KEY (hostname)
) ENGINE=InnoDB DEFAULT CHARSET=ascii
`,
21 changes: 16 additions & 5 deletions vagrant/admin-build.sh
@@ -1,23 +1,34 @@
#!/bin/bash

set -xeuo pipefail
# Install orchestrator
rpm -i /tmp/orchestrator-release/orchestrator*.rpm
if [[ -e /etc/redhat-release ]]; then
rpm -i /tmp/orchestrator-release/orchestrator*.rpm
fi

if [[ -e /etc/debian_version ]]; then
dpkg -i /tmp/orchestrator-release/orchestrator*.deb
fi

if [[ -e /orchestrator/vagrant/.sqlite ]]; then
cp -fv /usr/local/orchestrator/orchestrator-sample-sqlite.conf.json /etc/orchestrator.conf.json
else
cp -fv /usr/local/orchestrator/orchestrator-sample.conf.json /etc/orchestrator.conf.json
fi

if [[ -e /etc/redhat-release ]]; then

/sbin/chkconfig orchestrator on
cp /usr/local/orchestrator/orchestrator-sample.conf.json /etc/orchestrator.conf.json
/sbin/service orchestrator start

elif [[ -e /etc/debian_version ]]; then

update-rc.d orchestrator defaults
cp /usr/local/orchestrator/orchestrator-sample.conf.json /etc/orchestrator.conf.json
/usr/sbin/service orchestrator start

fi

echo '* * * * * root /usr/bin/orchestrator -c discover -i db1' > /etc/cron.d/orchestrator-discovery

# Discover instances
/usr/bin/orchestrator -c discover -i localhost
/usr/bin/orchestrator --verbose --debug --stack -c redeploy-internal-db
Collaborator

What happened that the above became necessary? Current logic should make it safe to run orchestrator without specifying redeploy-internal-db.

Contributor Author

Oh, sorry. I ran vagrant up multiple times and added redeploy-internal-db while trying to fix errors with the sqlite backend.
I am reverting this change right now.

/usr/bin/orchestrator --verbose --debug --stack -c discover -i localhost