"nomad server-members" from a server results in 500 #1515

Closed
jshaw86 opened this issue Aug 3, 2016 · 6 comments · Fixed by #3941

Comments

@jshaw86

jshaw86 commented Aug 3, 2016

Nomad version

nomad 0.4.0

Operating system and Environment details

ubuntu 14.04

Issue

The nomad server-members command returns a 500 if an entire region is gone. Possibly related to #1382.

Reproduction steps

  1. Set up 2 regions with federation
  2. Take down one region
  3. Run nomad server-members on the remaining region

Nomad Client logs (if appropriate)

ubuntu@nomad-sched1:~$ nomad server-members
Error determining leaders: Unexpected response code: 500 (No path to region)
@camerondavison
Contributor

Marking #1382 as possibly related. They are not the same issue, but anyone working on one could tackle both.

@jshaw86
Author

jshaw86 commented Aug 16, 2016

@dadgar this is actually from the Nomad server, not the client.

@jshaw86 jshaw86 changed the title from "nomad server-members" from a client results in 500 to "nomad server-members" from a server results in 500 on Aug 16, 2016
@dadgar dadgar added this to the v0.5.1 milestone Nov 10, 2016
@cetex

cetex commented Jan 24, 2017

So, I tried Nomad again (0.5.2). By accident I set up a server in the wrong region, then fixed the config and rejoined it to the cluster with the right region set.
After this I got the following error when listing server members, no matter what I did:
'''
cetex@master-s14:~$ nomad server-members
Error determining leaders: Unexpected response code: 500 (No path to region)
'''
The logs show:
'''
2017/01/24 03:51:25.520362 [WARN] nomad.rpc: RPC request for region 'global', no path found
2017/01/24 03:51:25.520387 [ERR] http: Request /v1/status/leader?region=global, error: No path to region
2017/01/24 03:51:34.751509 [WARN] nomad.rpc: RPC request for region 'global', no path found
2017/01/24 03:51:34.751533 [ERR] http: Request /v1/status/leader?region=global, error: No path to region
'''
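
The log lines above pair each failed /v1/status/leader request with a WARN line naming the region that has no path. As a rough aid for spotting which regions are affected, here is a minimal Python sketch that counts those warnings per region. The regular expression is an assumption based only on the line format quoted in this comment, and `unreachable_regions` is a hypothetical helper name, not part of Nomad.

```python
import re
from collections import Counter

# Assumption: WARN lines look exactly like the ones quoted in this thread.
NO_PATH_RE = re.compile(
    r"\[WARN\] nomad\.rpc: RPC request for region '([^']+)', no path found"
)


def unreachable_regions(log_lines):
    """Count 'no path found' warnings per region name."""
    counts = Counter()
    for line in log_lines:
        match = NO_PATH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2017/01/24 03:51:25.520362 [WARN] nomad.rpc: RPC request for region 'global', no path found",
        "2017/01/24 03:51:25.520387 [ERR] http: Request /v1/status/leader?region=global, error: No path to region",
        "2017/01/24 03:51:34.751509 [WARN] nomad.rpc: RPC request for region 'global', no path found",
        "2017/01/24 03:51:34.751533 [ERR] http: Request /v1/status/leader?region=global, error: No path to region",
    ]
    print(dict(unreachable_regions(sample)))  # {'global': 2}
```

Only the WARN lines are counted, so each failed request is tallied once even though it also produces an ERR line.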

I tried server-force-leave on the broken node, deleted its storage again, and restarted it, but the error remains.

The only solution I found was to stop all running instances of Nomad globally, delete all storage (state, the data dir) on all Nomad nodes, and then restart from scratch with the right config.

@michaelw

We hit the same issue with 0.7.1 last week. server-force-leave did not help resolve this (i.e., get rid of the "ghost" server in the defunct region), nor did any other API calls we tried (e.g., node/:nodeid/purge).

We ended up fixing this by changing the client-id of the server that moved regions, then taking down the entire new region (which happened to be the smallest we have). It came back clean (without the ghost node), and then we did a rolling restart of all servers listed in server-members (i.e., globally) to clean the other regions. So, not as bad as taking down the entire infrastructure and removing all state, but nevertheless inconvenient.

I have a small patch that makes server-members work, but there are more commands that bail out. It would still be good to get nomad to resolve the situation itself.
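
The patch itself is not shown in this thread. As an illustration of the general idea only (keep listing members even when one region's leader lookup fails), here is a minimal Python sketch against the HTTP API. It assumes the /v1/status/leader?region=... endpoint seen in the logs above, the /v1/regions endpoint, and a local agent at http://127.0.0.1:4646. The helper names `http_get_json`, `leaders_by_region`, and `main` are hypothetical and are not Nomad code.

```python
import json
import urllib.parse
import urllib.request


def http_get_json(base_url, path, params=None):
    """GET a JSON document; urllib raises HTTPError (an OSError) on a 500."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    with urllib.request.urlopen(base_url + path + query, timeout=5) as resp:
        return json.load(resp)


def leaders_by_region(regions, fetch_leader):
    """Look up each region's leader, recording failures instead of aborting."""
    leaders, errors = {}, {}
    for region in regions:
        try:
            leaders[region] = fetch_leader(region)
        except OSError as exc:
            # e.g. a 500 "No path to region" for a region that is gone
            errors[region] = str(exc)
    return leaders, errors


def main(base_url="http://127.0.0.1:4646"):
    # Assumption: a Nomad agent is listening on this address.
    regions = http_get_json(base_url, "/v1/regions")
    leaders, errors = leaders_by_region(
        regions,
        lambda r: http_get_json(base_url, "/v1/status/leader", {"region": r}),
    )
    for region in regions:
        print(region, leaders.get(region, "unreachable: " + errors.get(region, "")))
```

Calling `main()` against a live agent would print one line per region. A region with no path is reported as unreachable rather than failing the whole listing, which is the behavior this issue asks for from server-members.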

@chelseakomlo
Contributor

Thanks for the update. We are tracking this in our team's future roadmap. In the interim, please feel free to submit PRs for specific commands; we'll complete the remainder when we pick up this task.

@github-actions

github-actions bot commented Dec 3, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 3, 2022