Generic handling of server region errors #4087

Merged (2 commits) on Mar 30, 2018
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -58,6 +58,8 @@ BUG FIXES:
  * core: Fix an issue in which multiple servers could be acting as a leader. A
    prominent side-effect being nodes TTLing incorrectly [[GH-3890](https://github.com/hashicorp/nomad/issues/3890)]
  * core: Fix an issue where jobs with the same name in a different namespace were not being blocked correctly [[GH-3972](https://github.com/hashicorp/nomad/issues/3972)]
+ * cli: server member command handles failure to retrieve leader in remote
+   regions [[GH-4087](https://github.com/hashicorp/nomad/issues/4087)]
  * client: Support IP detection of wireless interfaces on Windows [[GH-4011](https://github.com/hashicorp/nomad/issues/4011)]
  * client: Migrated ephemeral_disk's maintain directory permissions [[GH-3723](https://github.com/hashicorp/nomad/issues/3723)]
  * client: Always advertise driver IP when in driver address mode [[GH-3682](https://github.com/hashicorp/nomad/issues/3682)]
25 changes: 14 additions & 11 deletions command/server_members.go
@@ -6,6 +6,7 @@ import (
 	"sort"
 	"strings"

+	multierror "github.com/hashicorp/go-multierror"
 	"github.com/hashicorp/nomad/api"
 	"github.com/posener/complete"
 	"github.com/ryanuber/columnize"
@@ -92,11 +93,7 @@ func (c *ServerMembersCommand) Run(args []string) int {
 	sort.Sort(api.AgentMembersNameSort(srvMembers.Members))

 	// Determine the leaders per region.
-	leaders, err := regionLeaders(client, srvMembers.Members)
-	if err != nil {
-		c.Ui.Error(fmt.Sprintf("Error determining leaders: %s", err))
-		return 1
-	}
+	leaders, leaderErr := regionLeaders(client, srvMembers.Members)

 	// Format the list
 	var out []string
@@ -108,6 +105,14 @@ func (c *ServerMembersCommand) Run(args []string) int {

 	// Dump the list
 	c.Ui.Output(columnize.SimpleFormat(out))
+
+	// If there were leader errors display a warning
+	if leaderErr != nil {
+		c.Ui.Output("")
+		c.Ui.Warn(fmt.Sprintf("Error determining leaders: %s", leaderErr))
+		return 1
+	}
+
 	return 0
 }
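
The two `Run` hunks above move the leader-error check from before the member list is formatted to after it is printed, so the table is still shown when a region's leader cannot be determined, and the command still exits non-zero. A minimal sketch of that ordering follows. The `render` helper and `errDemo` value are hypothetical names for illustration only and are not part of the Nomad code base; the real command writes through `c.Ui` rather than returning strings.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errDemo is a made-up leader error used only for this illustration.
var errDemo = errors.New(`Region "eu": no path to region`)

// render is a hypothetical helper: it always produces the member output,
// and only afterwards turns a leader error into a warning and exit code 1.
func render(rows []string, leaderErr error) (output, warning string, code int) {
	output = strings.Join(rows, "\n")
	if leaderErr != nil {
		warning = fmt.Sprintf("Error determining leaders: %s", leaderErr)
		return output, warning, 1
	}
	return output, "", 0
}

func main() {
	rows := []string{
		"Name         Region  Leader",
		"srv1.global  global  true",
		"srv2.eu      eu      false",
	}
	output, warning, code := render(rows, errDemo)
	fmt.Println(output)
	if warning != "" {
		fmt.Println()
		fmt.Println(warning)
	}
	fmt.Println("exit code:", code)
}
```

Returning the exit code instead of calling `os.Exit` keeps the sketch easy to test; the non-zero code mirrors the `return 1` that the diff keeps in the warning branch.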

@@ -181,19 +186,17 @@ func regionLeaders(client *api.Client, mem []*api.AgentMember) (map[string]strin
 		return leaders, nil
 	}

+	var mErr multierror.Error
 	status := client.Status()
 	for reg := range regions {
 		l, err := status.RegionLeader(reg)
 		if err != nil {
-			// This error means that region has no leader.
-			if strings.Contains(err.Error(), "No cluster leader") {
-				continue
-			}
-			return nil, err
+			multierror.Append(&mErr, fmt.Errorf("Region %q: %v", reg, err))
+			continue
 		}

 		leaders[reg] = l
 	}

-	return leaders, nil
+	return leaders, mErr.ErrorOrNil()
 }
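
The `regionLeaders` change above replaces "return on the first failing region" with "record the failure, keep going, and return whatever leaders were found plus a combined error". The sketch below shows the same collect-and-continue pattern in a self-contained form. To avoid an external dependency it uses the standard library's `errors.Join` (assumes Go 1.20 or later) in place of `go-multierror`, so it is an analogue of the PR's approach and not the PR's code. `leaderLookup`, `collectLeaders`, and `demoLookup` are hypothetical names standing in for `status.RegionLeader` and its caller.

```go
package main

import (
	"errors"
	"fmt"
)

// leaderLookup is a hypothetical stand-in for status.RegionLeader.
type leaderLookup func(region string) (string, error)

// collectLeaders queries every region, keeps the leaders it could resolve,
// and combines the per-region failures into one error (nil if none failed).
func collectLeaders(regions []string, lookup leaderLookup) (map[string]string, error) {
	leaders := make(map[string]string, len(regions))
	var errs []error
	for _, reg := range regions {
		l, err := lookup(reg)
		if err != nil {
			// Record the failure and move on to the next region.
			errs = append(errs, fmt.Errorf("Region %q: %w", reg, err))
			continue
		}
		leaders[reg] = l
	}
	// errors.Join returns nil when there is nothing to join.
	return leaders, errors.Join(errs...)
}

// demoLookup is a made-up lookup in which only the "eu" region fails.
func demoLookup(region string) (string, error) {
	if region == "eu" {
		return "", errors.New("no path to region")
	}
	return "10.0.0.1:4647", nil
}

func main() {
	leaders, err := collectLeaders([]string{"global", "eu"}, demoLookup)
	fmt.Println("leaders:", leaders)
	fmt.Println("error:", err)
}
```

In this sketch `errors.Join` plays the role that `mErr.ErrorOrNil()` plays in the diff: the caller sees a nil error only when every region lookup succeeded, and otherwise gets partial results alongside a non-nil error.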