Configurable Serf ReconnectTimeout and TombstoneTimeout #333

Closed
gregory-m opened this issue Oct 25, 2015 · 3 comments

Comments

@gregory-m
Contributor

Part of our infrastructure runs on AWS spot instances. I experimented with Nomad on spot instances and ended up with 50 members in the "failed" state. I only run 3 instances concurrently, so I can imagine what will happen when I run 50 agents on spot instances. The default 3-day timeout is huge if you are running on spot instances.

If you agree, I can open a PR.

Thanks.

@dadgar
Contributor

dadgar commented Oct 26, 2015

Hey,

Is there a concern with this? Nomad won't schedule to the failed nodes; they just remain in the system in case they reconnect.

But I agree, we need to expose more of the configuration so a PR would be appreciated!

@gregory-m
Contributor Author

It's only a user interface issue. And yes, it's only important to those who run Nomad in highly changeable environments like spot instances. For example, we bid on 10 spot instances. After an hour or two somebody outbids us, and we decide to launch 10 regular instances. Five hours later spot instance prices go down, so we decide to run 10 spot instances again and shut down the regular ones.

This will lead to 20 instances in the "leave" state that will never rejoin the cluster.

Not related to this ticket, but one more question:
Serf doesn't provide any public methods to remove instances from the cluster. Maybe it would be a good idea to make the "reap" function public?

This was referenced Oct 28, 2015
benbuzbee pushed a commit to benbuzbee/nomad that referenced this issue Jul 21, 2022
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 29, 2022

2 participants