drain: use client status to determine drain is complete
If an allocation is slow to stop because of `kill_timeout`, the node drain is
marked as complete prematurely, even though drain monitoring will continue to
report allocation migrations. This affects the UI and any API clients that monitor
node draining in order to shut down nodes.
tgross committed Aug 26, 2022
1 parent 6d99ca1 commit 720123f
Showing 2 changed files with 5 additions and 1 deletion.
2 changes: 1 addition & 1 deletion nomad/drainer/draining_node.go
@@ -73,7 +73,7 @@ func (n *drainingNode) IsDone() (bool, error) {
 		}

 		// If there is a non-terminal we aren't done
-		if !alloc.TerminalStatus() {
+		if !alloc.ClientTerminalStatus() {
 			return false, nil
 		}
 	}
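
The one-line change above matters because the two predicates answer different questions. The following is a minimal, self-contained sketch of that distinction, not Nomad's actual code: the `Alloc` type and the status constants are simplified stand-ins, and it assumes that `TerminalStatus` is true when either the server's desired status or the client's reported status is terminal, while `ClientTerminalStatus` looks only at what the client reports.

```go
package main

import "fmt"

// Simplified stand-ins for allocation status values (assumed for illustration).
const (
	DesiredRun   = "run"
	DesiredStop  = "stop"
	DesiredEvict = "evict"

	ClientRunning  = "running"
	ClientComplete = "complete"
	ClientFailed   = "failed"
	ClientLost     = "lost"
)

// Alloc is a hypothetical, trimmed-down allocation.
type Alloc struct {
	DesiredStatus string // what the server wants
	ClientStatus  string // what the client reports
}

// ClientTerminalStatus is true only once the client reports the alloc stopped.
func (a Alloc) ClientTerminalStatus() bool {
	switch a.ClientStatus {
	case ClientComplete, ClientFailed, ClientLost:
		return true
	}
	return false
}

// TerminalStatus is also true as soon as the server has asked for a stop.
func (a Alloc) TerminalStatus() bool {
	switch a.DesiredStatus {
	case DesiredStop, DesiredEvict:
		return true
	}
	return a.ClientTerminalStatus()
}

func main() {
	// An alloc that was told to stop but is still waiting out kill_timeout.
	stopping := Alloc{DesiredStatus: DesiredStop, ClientStatus: ClientRunning}
	fmt.Println(stopping.TerminalStatus(), stopping.ClientTerminalStatus())
}
```

Under these assumptions, an allocation that is still shutting down counts as terminal for the old check but not for the new one, which is why the old check could report the drain as done too early.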
4 changes: 4 additions & 0 deletions nomad/drainer/watch_jobs.go
@@ -363,6 +363,10 @@ func handleTaskGroup(snap *state.StateSnapshot, batch bool, tg *structs.TaskGrou
 			drainingNodes[alloc.NodeID] = onDrainingNode
 		}

+		if onDrainingNode && !alloc.ClientTerminalStatus() {
+			result.done = false
+		}
+
 		// Check if the alloc should be considered migrated. A migrated
 		// allocation is one that is terminal, is on a draining
 		// allocation, and has only happened since our last handled index to
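
The check added to `handleTaskGroup` can be read as: a task group is not done while any of its allocations on a draining node is still running on the client. The sketch below illustrates that rule in isolation. The `groupAlloc` type, the `groupDone` helper, and the `draining` map are hypothetical names introduced for this example and do not appear in the commit.

```go
package main

import "fmt"

// groupAlloc is a hypothetical, trimmed-down allocation for this sketch.
type groupAlloc struct {
	NodeID       string
	ClientStatus string
}

// clientTerminal mirrors the idea of ClientTerminalStatus: only the
// client-reported status decides whether the alloc has really stopped.
func clientTerminal(a groupAlloc) bool {
	switch a.ClientStatus {
	case "complete", "failed", "lost":
		return true
	}
	return false
}

// groupDone returns false if any alloc on a draining node has not yet
// reached a terminal client status. Allocs on other nodes are ignored.
func groupDone(allocs []groupAlloc, draining map[string]bool) bool {
	done := true
	for _, a := range allocs {
		onDrainingNode := draining[a.NodeID]
		if onDrainingNode && !clientTerminal(a) {
			done = false
		}
	}
	return done
}

func main() {
	draining := map[string]bool{"node-a": true}
	allocs := []groupAlloc{
		{NodeID: "node-a", ClientStatus: "running"}, // still stopping
		{NodeID: "node-b", ClientStatus: "running"}, // not on a draining node
	}
	fmt.Println(groupDone(allocs, draining))
}
```

In this sketch the group only becomes done once the allocation on `node-a` reports a terminal client status, regardless of what the server has already requested.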
