When a whole additional region goes down, it is impossible to get nomad server-members #3600

Closed
tantra35 opened this issue Nov 29, 2017 · 7 comments

Comments

@tantra35
Contributor

tantra35 commented Nov 29, 2017

Nomad version

0.6.3

After some time we decided to remove one of our regions. We simply removed all Nomad servers from that region. After that, when we run nomad server-members in the regions that are still alive, we get the following error:

root@vol-cl-control-01:/home/ruslan# nomad server-members
Error determining leaders: Unexpected response code: 500 (No path to region)

On the one hand this is correct, because we have a dead region, but how can we make the nomad server-members command work?

@chelseakomlo
Contributor

Hi, thanks for submitting this issue. Would you mind submitting logs from the Nomad client when you experience this issue?

This issue appears similar to #1515 and looks to be a duplicate.

@tantra35
Contributor Author

@chelseakomlo yes, you are right, this is a duplicate

and we have the same logs:

Nov 29 20:55:21 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:21.063985 [DEBUG] http: Request /v1/status/peers (676.627µs)
Nov 29 20:55:22 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:22.092648 [DEBUG] http: Request /v1/agent/members (220.102µs)
Nov 29 20:55:22 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:22.203128 [DEBUG] http: Request /v1/status/leader?region=aws-analitics-eu-central (109.92
Nov 29 20:55:22 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:22.203460 [WARN] nomad.rpc: RPC request for region 'aws-analitics-us-east', no path found
Nov 29 20:55:22 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:22.203472 [ERR] http: Request /v1/status/leader?region=aws-analitics-us-east, error: No p
Nov 29 20:55:22 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:22.203562 [DEBUG] http: Request /v1/status/leader?region=aws-analitics-us-east (121.821µs
Nov 29 20:55:27 vol-cl-control-01 nomad[1769]:     2017/11/29 20:55:27 [DEBUG] memberlist: TCP connection from=127.0.0.1:44264

But what can we do now to clean up, and how can we verify that the cleanup is complete?
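
For readers hitting the same symptom: the log excerpt above suggests that the command asks for the member list and then requests /v1/status/leader once per region, and that the lookup for the removed region is the call that fails. Below is a minimal Python sketch of the same per-region check. It assumes the agent's HTTP API is reachable at a base URL you pass in (http://127.0.0.1:4646 is Nomad's default HTTP address). The helper names `fetch_leader` and `split_regions` are hypothetical and not part of Nomad.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def fetch_leader(base_url, region, timeout=5.0):
    """Ask the local agent for the leader of one region.

    Raises urllib.error.HTTPError on a non-2xx answer, such as the
    500 "No path to region" response shown in the logs above.
    """
    query = urllib.parse.urlencode({"region": region})
    url = f"{base_url}/v1/status/leader?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def split_regions(regions, leader_lookup):
    """Partition regions into (reachable, unreachable).

    leader_lookup is any callable taking a region name; it is injected
    so the logic can be exercised without a running cluster.
    """
    reachable, unreachable = {}, {}
    for region in regions:
        try:
            reachable[region] = leader_lookup(region)
        except (urllib.error.URLError, OSError, ValueError) as exc:
            # Record the failure instead of aborting on the first dead region.
            unreachable[region] = str(exc)
    return reachable, unreachable
```

Against a live agent this could be called as `split_regions(names, lambda r: fetch_leader("http://127.0.0.1:4646", r))`. Once the removed region no longer appears in the unreachable set, that is one indication that the cleanup has taken effect.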

@chelseakomlo
Contributor

Thanks for confirming this is a duplicate, and the logs!

See here for documentation on outage recovery: https://www.nomadproject.io/guides/outage.html

It might be a good idea to try bringing up one server in the "down" region, and allow it to rejoin. You could then have the server leave cleanly, for example by using the following command: https://www.nomadproject.io/docs/commands/server-force-leave.html

Let us know how this goes.
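
The force-leave step suggested above can also be issued over HTTP. The sketch below only builds the request and does not send it. It assumes the agent endpoint /v1/agent/force-leave with a `node` query parameter, and that the server name is known. The function name `force_leave_request` and the example node name are hypothetical.

```python
import urllib.parse
import urllib.request


def force_leave_request(base_url, node_name):
    """Build (but do not send) a force-leave request for one server name."""
    query = urllib.parse.urlencode({"node": node_name})
    url = f"{base_url}/v1/agent/force-leave?{query}"
    # An explicit method is required because urllib defaults to GET
    # when no request body is supplied.
    return urllib.request.Request(url, method="POST")


# Hypothetical node name, for illustration only.
request = force_leave_request("http://127.0.0.1:4646", "server-1.us-east")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would actually send it. Keeping construction separate from sending makes it easy to review the list of names before anything is removed.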

@tantra35
Contributor Author

@chelseakomlo Sorry, but that guide does not describe outage recovery in multi-region configurations. Bringing up a server in the lost region was the first idea that came to us, but for some reasons this is not possible. We tried to remove the servers from memory, but without any success; it seems that we do not remember all of their names ;-(. Maybe there is some hidden API that can list all servers in all regions?
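
One possible way to recover the forgotten names is the /v1/agent/members response, which the logs earlier in this thread show the agent already serving. The sketch below groups members by region. It assumes the payload has a "Members" list whose entries carry "Name", "Status" and a "Tags" map with a "region" key. It also assumes the removed servers are still present in the gossip member list, which may no longer hold if they have been reaped. The function names and the sample payload are hypothetical.

```python
from collections import defaultdict


def members_by_region(payload):
    """Group members by their 'region' tag.

    Returns {region: [(name, status), ...]} with entries sorted by name.
    """
    grouped = defaultdict(list)
    for member in payload.get("Members", []):
        tags = member.get("Tags") or {}
        region = tags.get("region", "<unknown>")
        grouped[region].append((member.get("Name", ""), member.get("Status", "")))
    return {region: sorted(entries) for region, entries in grouped.items()}


def stale_members(payload, alive_status="alive"):
    """Return the sorted names of members whose status is not alive."""
    return sorted(
        member.get("Name", "")
        for member in payload.get("Members", [])
        if member.get("Status") != alive_status
    )


# Hypothetical payload shaped like the assumed response, for illustration only.
sample = {
    "Members": [
        {"Name": "srv-a.eu", "Status": "alive", "Tags": {"region": "eu"}},
        {"Name": "srv-b.us", "Status": "failed", "Tags": {"region": "us"}},
        {"Name": "srv-c.us", "Status": "left", "Tags": {"region": "us"}},
    ]
}
print(members_by_region(sample))
print(stale_members(sample))
```

The names returned by `stale_members` are candidates for a force-leave, to be checked by hand before removing anything.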

@chelseakomlo
Contributor

Sure, that is a good suggestion. We will add a section on outage recovery in a multi-region cluster to our documentation.

Here is a link to our HTTP api for nodes: https://www.nomadproject.io/api/nodes.html as well as system maintenance: https://www.nomadproject.io/api/system.html

@chelseakomlo
Contributor

Closing as this issue is a duplicate, let us know if you continue to experience difficulty with this.

@github-actions

github-actions bot commented Dec 3, 2022

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 3, 2022