
Bumps nomad timeout, adds a few missing dependencies, speeds up Nomad… #617

Merged
merged 10 commits into from
Sep 14, 2018

Conversation

kurtwheeler
Contributor

… job killing in deploy process.

Issue Number

N/A; this came up during crunch.

Purpose/Implementation Notes

Most of these are pretty minor/obvious things. The one significant change is in deploy.sh: it now kills all the Nomad jobs in the background and then waits for them, which is a lot faster than killing them one at a time.
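
For anyone reading along, the background-then-wait pattern described above looks roughly like the sketch below. It is not the actual deploy.sh code: `stop_job` is a hypothetical stand-in that only sleeps and echoes (the real script would presumably call something like `nomad stop "$job"` there), and the job names are made up for illustration.

```shell
#!/bin/sh
# Sketch only: stop_job is a hypothetical stand-in for whatever command
# deploy.sh uses to stop one Nomad job (presumably something like
# `nomad stop "$1"`). Here it just simulates the latency of one request.
stop_job() {
    sleep 1
    echo "stopped $1"
}

# Hypothetical job names; a real script would list them from the Nomad server.
jobs_to_stop="JOB_A JOB_B JOB_C"

pids=""
for job in $jobs_to_stop; do
    stop_job "$job" &          # launch each stop in the background
    pids="$pids $!"            # remember its PID so we can wait on it
done

failed=0
for pid in $pids; do
    # wait returns the exit status of that background process
    wait "$pid" || failed=$((failed + 1))
done

echo "failures: $failed"
```

Waiting on each PID individually (rather than a bare `wait`) keeps the exit status of every stop, so the deploy can notice if any of them failed. The total time is roughly that of the slowest stop instead of the sum of all of them.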

Here is an image of the CPU utilization of the lead nomad server:
[image: CPU utilization graph of the lead Nomad server]

The long period at the start at ~18% was us killing them one by one. After the change to deploy.sh you can see it peak in the mid-50% range as it kills all the jobs very quickly. I think the last high point, which then trails off, is probably us re-registering the base jobs.

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Functional tests

I have tested all of this by running it against production.

Checklist

  • Lint and unit tests pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

@Miserlou
Contributor

Is this PR going to include killing the telemetry stuff? Ideally just comment it out, since it shouuuuuld hopefully be improved in a future Nomad, but we want to disable it for prod for now.

@kurtwheeler
Contributor Author

I have commented out the graphite code for the lead Nomad server and reverted the EBS-volume discovery changes I made.

@Miserlou
Contributor

Related: hashicorp/nomad#4422

@Miserlou
Contributor

You have removed the telemetry statsd servers, but not the telemetry configuration from Nomad itself.

@kurtwheeler kurtwheeler merged commit 51fb291 into dev Sep 14, 2018
@kurtwheeler kurtwheeler deleted the kurtwheeler/minor-fixes branch September 14, 2018 16:41
kurtwheeler added a commit that referenced this pull request Jan 10, 2019
#617)

* Bumps nomad timeout, adds a few missing dependencies, speeds up Nomad job killing in deploy process.

* Comment out graphite from lead nomad server, revert EBS volume discovery change.

* Hopefully speed up Nomad job registration time.

* Prevent old or local distributions of common from getting installed while deploying.

* Remove telemetry terraform blocks.