Implement script checks #986

Merged
merged 26 commits into from
Mar 26, 2016
Changes from 18 commits
Commits
4faf16b
Added implementation to run checks for docker, exec and raw_exec
diptanu Mar 24, 2016
42dc8be
Added a check type for consul service to delegate certain checks
diptanu Mar 24, 2016
2b60e8b
Updated consul related dependencies
diptanu Mar 24, 2016
1e4da82
Running script checks periodically
diptanu Mar 24, 2016
f1cf87b
Enabling script checks
diptanu Mar 24, 2016
e8995be
Fixed merge conflicts
diptanu Mar 24, 2016
588666d
Introducing ConsulContext
diptanu Mar 24, 2016
fe9e271
Creating the docker driver in the executor properly
diptanu Mar 24, 2016
52d4b01
Added a test for the exec script check
diptanu Mar 24, 2016
8342d7d
Updated the vendored copy of github.com/fsouza/go-dockerclient
diptanu Mar 25, 2016
62249fe
Added an impl for Nomad Checks
diptanu Mar 25, 2016
12dc439
Changing the logic of keep services
diptanu Mar 25, 2016
52f7f93
Added some docs
diptanu Mar 25, 2016
0fed52c
Removing non relevant tests
diptanu Mar 25, 2016
9a27bf4
Added some more docs to the executor
diptanu Mar 25, 2016
e866d3b
Vendoring circbuf
diptanu Mar 25, 2016
7454ffd
Renamed NomadChecks to CheckRunner
diptanu Mar 25, 2016
4c27660
Renamed NomadChecks to CheckRunner and a fix for checkrunner start
diptanu Mar 25, 2016
f8db6a4
Using tickers instead of creating new timers
diptanu Mar 25, 2016
bfc2a0d
using switch to determine the state of checks
diptanu Mar 25, 2016
e0b9f03
Using a single timer to run checks
diptanu Mar 25, 2016
644710a
Added more tests for the checks
diptanu Mar 25, 2016
f1d9b2c
Removing the container after running script check
diptanu Mar 26, 2016
d4a5f07
Moved the dockerIsConnected to testutils
diptanu Mar 26, 2016
6ee99af
Fixing the exec script check to run within the chroot
diptanu Mar 26, 2016
59e91e1
Using latest busybox
diptanu Mar 26, 2016
49 changes: 26 additions & 23 deletions Godeps/Godeps.json

Generated file; diff not rendered.

3 changes: 2 additions & 1 deletion api/tasks.go
@@ -19,7 +19,8 @@ type ServiceCheck struct {
 	Id string
 	Name string
 	Type string
-	Script string
+	Cmd string
+	Args []string
 	Path string
 	Protocol string
 	Interval time.Duration
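The hunk above replaces the single `Script` string with a command plus an argument list. A minimal sketch of how a script check might be described with the new fields follows. The struct here mirrors only the fields visible in the hunk, and `newScriptCheck`, the `"script"` type string, and the example command are assumptions for illustration, not part of the PR.

```go
package main

import (
	"fmt"
	"time"
)

// ServiceCheck mirrors only the fields visible in the api/tasks.go hunk;
// the real struct may carry more fields beyond the shown context.
type ServiceCheck struct {
	Id       string
	Name     string
	Type     string
	Cmd      string
	Args     []string
	Path     string
	Protocol string
	Interval time.Duration
}

// newScriptCheck is a hypothetical helper (not in the PR) that fills in
// the command and its arguments separately instead of one script string.
func newScriptCheck(name, cmd string, args []string, interval time.Duration) ServiceCheck {
	return ServiceCheck{
		Name:     name,
		Type:     "script", // assumed type string for script checks
		Cmd:      cmd,
		Args:     args,
		Interval: interval,
	}
}

func main() {
	// The command and arguments below are made up for the example.
	c := newScriptCheck("db-alive", "/bin/check_db", []string{"--host", "127.0.0.1"}, 10*time.Second)
	fmt.Printf("%s runs %s %v every %v\n", c.Name, c.Cmd, c.Args, c.Interval)
}
```

Keeping the arguments as a slice avoids having to split a script string on whitespace before executing it.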
9 changes: 6 additions & 3 deletions client/client.go
@@ -1184,7 +1184,7 @@ func (c *Client) syncConsul() {
 	for {
 		select {
 		case <-sync:
-			var runningTasks []*structs.Task
+			services := make(map[string]struct{})
 			// Get the existing allocs
 			c.allocLock.RLock()
 			allocs := make([]*AllocRunner, 0, len(c.allocs))
@@ -1199,14 +1199,17 @@ func (c *Client) syncConsul() {
 				for taskName, taskState := range taskStates {
 					if taskState.State == structs.TaskStateRunning {
 						if tr, ok := ar.tasks[taskName]; ok {
-							runningTasks = append(runningTasks, tr.task)
+							for _, service := range tr.task.Services {
+								services[service.ID(ar.alloc.ID, tr.task.Name)] = struct{}{}
+							}
 						}
 					}
 				}
 			}
-			if err := c.consulService.KeepServices(runningTasks); err != nil {
+			if err := c.consulService.KeepServices(services); err != nil {
 				c.logger.Printf("[DEBUG] client: error removing services from non-running tasks: %v", err)
 			}
+			sync = time.After(consulSyncInterval)
Contributor

You are creating a lot of timers. Just outside the for loop create one https://golang.org/pkg/time/#NewTimer and then use sync.Reset()

Contributor Author

I think a Ticker would be more appropriate.

 		case <-c.shutdownCh:
 			c.logger.Printf("[INFO] client: shutting down consul sync")
 			return
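The review thread on this hunk discusses replacing the per-iteration `time.After` call with a single reusable timer or a ticker. A minimal sketch of the ticker variant is shown below, assuming a fixed sync interval. `syncLoop`, `demoSyncLoop`, and the callback are hypothetical names for illustration and do not come from the PR.

```go
package main

import (
	"fmt"
	"time"
)

// syncLoop is a hypothetical stand-in for the sync loop in the hunk.
// It calls syncFn once per interval using a single time.Ticker, instead
// of allocating a new timer with time.After on every iteration.
// It returns the number of times syncFn ran once shutdownCh is closed.
func syncLoop(interval time.Duration, shutdownCh <-chan struct{}, syncFn func()) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop() // release the ticker's resources on return
	runs := 0
	for {
		select {
		case <-ticker.C:
			syncFn()
			runs++
		case <-shutdownCh:
			return runs
		}
	}
}

// demoSyncLoop runs the loop with a short interval and closes the
// shutdown channel from inside the callback after n invocations.
func demoSyncLoop(n int) int {
	shutdownCh := make(chan struct{})
	calls := 0
	return syncLoop(time.Millisecond, shutdownCh, func() {
		calls++
		if calls == n {
			close(shutdownCh)
		}
	})
}

func main() {
	fmt.Println("sync ran at least 3 times:", demoSyncLoop(3) >= 3)
}
```

A ticker fits here because every wait uses the same interval. If a tick and the shutdown signal are both ready, `select` picks one at random, so the loop may run one extra sync before exiting.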
87 changes: 87 additions & 0 deletions client/consul/check.go
@@ -0,0 +1,87 @@
+package consul
+
+import (
+	"log"
+	"math/rand"
+	"sync"
+	"time"
+
+	cstructs "github.com/hashicorp/nomad/client/driver/structs"
+)
+
+// CheckRunner runs a given check in a specific interval and update a
+// corresponding Consul TTL check
+type CheckRunner struct {
+	check Check
+	runCheck func(Check)
+	logger *log.Logger
+	stop bool
+	stopCh chan struct{}
+	stopLock sync.Mutex
+
+	started bool
+	startedLock sync.Mutex
+}
+
+// NewCheckRunner configures and returns a CheckRunner
+func NewCheckRunner(check Check, runCheck func(Check), logger *log.Logger) *CheckRunner {
+	cr := CheckRunner{
+		check: check,
+		runCheck: runCheck,
+		logger: logger,
+		stopCh: make(chan struct{}),
+	}
+	return &cr
+}
+
+// Start is used to start the check. The check runs until stop is called
+func (r *CheckRunner) Start() {
+	r.startedLock.Lock()
+	defer r.startedLock.Unlock()
+	if r.started {
+		return
+	}
+	r.stopLock.Lock()
+	defer r.stopLock.Unlock()
+	go r.run()
+	r.started = true
+}
+
+// Stop is used to stop the check.
+func (r *CheckRunner) Stop() {
+	r.stopLock.Lock()
+	defer r.stopLock.Unlock()
+	if !r.stop {
+		r.stop = true
+		close(r.stopCh)
+	}
+}
+
+// run is invoked by a goroutine to run until Stop() is called
+func (r *CheckRunner) run() {
+	// Get the randomized initial pause time
+	initialPauseTime := randomStagger(r.check.Interval())
+	r.logger.Printf("[DEBUG] agent: pausing %v before first invocation of %s", initialPauseTime, r.check.ID())
+	next := time.After(initialPauseTime)
+	for {
+		select {
+		case <-next:
+			r.runCheck(r.check)
+			next = time.After(r.check.Interval())
Contributor

Same here

Contributor Author

@dadgar Can't use a single timer or ticker here since we are sleeping for a different duration for the first time, and from then onwards sleeping for the duration of the check interval.

Contributor

Yes you can: https://golang.org/pkg/time/#Timer.Reset

Reset takes a duration

Contributor Author

Oh cool.

+		case <-r.stopCh:
+			return
+		}
+	}
+}
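The thread in `run()` settles on `Timer.Reset`, which takes a duration, so one timer can cover both the staggered first wait and the regular interval afterwards. A minimal sketch of that pattern follows. `runWithStagger` and `demoStagger` are hypothetical names, and the durations are arbitrary values chosen for the example.

```go
package main

import (
	"fmt"
	"time"
)

// runWithStagger is a hypothetical variant of the run loop. It waits
// initialPause before the first call to runCheck, then waits interval
// between later calls, reusing one time.Timer throughout.
// It returns when stopCh is closed.
func runWithStagger(initialPause, interval time.Duration, stopCh <-chan struct{}, runCheck func()) {
	timer := time.NewTimer(initialPause)
	defer timer.Stop()
	for {
		select {
		case <-timer.C:
			runCheck()
			// The timer has fired and its channel was drained by the
			// receive above, so calling Reset here is safe.
			timer.Reset(interval)
		case <-stopCh:
			return
		}
	}
}

// demoStagger runs the loop and closes stopCh from inside the callback
// after n invocations, returning how many times the callback ran.
func demoStagger(n int) int {
	stopCh := make(chan struct{})
	calls := 0
	runWithStagger(5*time.Millisecond, time.Millisecond, stopCh, func() {
		calls++
		if calls == n {
			close(stopCh)
		}
	})
	return calls
}

func main() {
	fmt.Println("check ran at least 3 times:", demoStagger(3) >= 3)
}
```

Unlike a ticker, the timer is re-armed only after the check returns, so a slow check delays the next run rather than queuing a tick behind it.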

+// Check is an interface which check providers can implement for Nomad to run
+type Check interface {
+	Run() *cstructs.CheckResult
+	ID() string
+	Interval() time.Duration
+}
+
+// Returns a random stagger interval between 0 and the duration
+func randomStagger(intv time.Duration) time.Duration {
Contributor

This is a duplicate from somewhere. Can you put it in a utils package or something to share?

+	return time.Duration(uint64(rand.Int63()) % uint64(intv))
+}
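The reviewer asks for the stagger helper to live in a shared package. One possible shape for such a helper is sketched below. The exported name `RandomStagger` and the guard for non-positive intervals are assumptions added here, not part of the PR. The guard matters because the modulo in the version above panics with an integer divide-by-zero when the interval is 0.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// RandomStagger is a hypothetical shared helper. It returns a random
// duration in the half-open range [0, intv). For a non-positive intv it
// returns 0 instead of panicking; this guard is an addition in this
// sketch and is not present in the diff.
func RandomStagger(intv time.Duration) time.Duration {
	if intv <= 0 {
		return 0
	}
	// rand.Int63n requires a positive argument, which the guard ensures.
	return time.Duration(rand.Int63n(int64(intv)))
}

func main() {
	d := RandomStagger(10 * time.Second)
	fmt.Println("stagger within interval:", d >= 0 && d < 10*time.Second)
	fmt.Println("zero interval stagger:", RandomStagger(0))
}
```

Callers in both packages could then import the helper rather than carrying their own copy.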