Handle Nomad leadership flapping (attempt 2) #6977

Merged
merged 6 commits on Jan 28, 2020

Commits on Jan 22, 2020

  1. extract leader step function

    Mahmood Ali committed Jan 22, 2020 (ccd9c14)
  2. Handle Nomad leadership flapping

    Fixes a deadlock in leadership handling if leadership flapped.
    
    Raft propagates leadership transitions to Nomad through a NotifyCh channel.
    Raft blocks when writing to this channel, so the channel must be buffered or
    aggressively consumed[1]. Otherwise, Raft blocks indefinitely in
    `raft.runLeader`[2] until the channel is consumed and does not move on to
    executing the follower-related logic (in `raft.runFollower`). A sketch of
    the aggressive-consumption pattern follows this commit list.
    
    While the Raft `runLeader` defer function blocks, Raft cannot process any
    other operations. For example, the `run{Leader|Follower}` methods consume
    `raft.applyCh`, so while the `runLeader` defer is blocked, all Raft log
    applications and configuration lookups block indefinitely.
    
    Sadly, `leaderLoop` and `establishLeadership` make a few Raft calls!
    `establishLeadership` attempts to auto-create the autopilot/scheduler
    config [3], and `leaderLoop` attempts to check the Raft configuration [4].
    All of these calls occur without a timeout.
    
    Thus, if leadership flaps quickly while `leaderLoop`/`establishLeadership`
    is running and hits any of these Raft calls, the Raft handler _deadlocks_
    forever.
    
    Depending on how many times leadership flapped and where exactly we get
    stuck, I suspect it's possible to end up in the following state:
    
    * Agent metrics/stats HTTP and RPC calls hang as they check raft.Configurations
    * raft.State remains in the Leader state, so the server attempts to handle
      RPC calls (e.g. node/alloc updates), and these hang as well
    
    As we create a goroutine per RPC call, the number of goroutines grows over
    time and may trigger out-of-memory errors in addition to missed updates.
    
    [1] https://github.com/hashicorp/raft/blob/d90d6d6bdacf1b35d66940b07be515b074d89e88/config.go#L190-L193
    [2] https://github.com/hashicorp/raft/blob/d90d6d6bdacf1b35d66940b07be515b074d89e88/raft.go#L425-L436
    [3] https://github.com/hashicorp/nomad/blob/2a89e477465adbe6a88987f0dcb9fe80145d7b2f/nomad/leader.go#L198-L202
    [4] https://github.com/hashicorp/nomad/blob/2a89e477465adbe6a88987f0dcb9fe80145d7b2f/nomad/leader.go#L877
    Mahmood Ali committed Jan 22, 2020 (2810bf3)
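
A minimal sketch of the aggressive-consumption pattern referenced in the commit
message above, assuming illustrative names (`monitorLeadership`, with print
statements standing in for the real establish/revoke logic); this is not the
code merged in this PR:

```go
package main

import (
	"fmt"
	"sync"
)

// monitorLeadership drains every leadership transition as soon as Raft
// publishes it and acts only on real state changes, so rapid flaps are
// coalesced instead of backing up into raft.runLeader's notify write.
func monitorLeadership(notifyCh <-chan bool, shutdownCh <-chan struct{}, wg *sync.WaitGroup) {
	defer wg.Done()

	var isLeader bool
	for {
		select {
		case next, ok := <-notifyCh:
			if !ok {
				return
			}
			if next == isLeader {
				continue // coalesced flap; nothing changed
			}
			isLeader = next
			if isLeader {
				fmt.Println("gained leadership: run establishLeadership-style setup")
			} else {
				fmt.Println("lost leadership: tear leader state back down")
			}
		case <-shutdownCh:
			return
		}
	}
}

func main() {
	// Raft's docs ask for a buffered notify channel; the monitor also reads
	// it eagerly so Raft's write never blocks for long.
	notifyCh := make(chan bool, 1)
	shutdownCh := make(chan struct{})

	var wg sync.WaitGroup
	wg.Add(1)
	go monitorLeadership(notifyCh, shutdownCh, &wg)

	// Simulate leadership flapping.
	for _, v := range []bool{true, false, true, true, false} {
		notifyCh <- v
	}
	close(notifyCh)
	wg.Wait()
	close(shutdownCh)
}
```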

Commits on Jan 28, 2020

  1. include test and address review comments

    Mahmood Ali committed Jan 28, 2020 (0912400)
  2. handle channel close signal

    Always deliver the last value, then send the close signal (see the sketch
    after this commit list).
    Mahmood Ali committed Jan 28, 2020 (97f20bd)
  3. tweak leadership flapping log messages

    Mahmood Ali committed Jan 28, 2020 (94a75b4)
  4. tests: defer closing shutdownCh

    Mahmood Ali committed Jan 28, 2020 (8ae03c3)
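
A minimal sketch of the "always deliver the last value, then send the close
signal" behavior described in commit 2 above, with an assumed helper name
(`dropButLast`); this is not necessarily the exact helper added by the PR:

```go
package main

import "fmt"

// dropButLast forwards values from src to the returned channel, discarding
// stale intermediate values when the consumer is slow, so only the most
// recent leadership state is observed. When src closes, any pending value is
// delivered first and only then is the returned channel closed.
func dropButLast(src <-chan bool) <-chan bool {
	dst := make(chan bool)

	go func() {
		defer close(dst)

		var last bool
		var pending bool

		for {
			if !pending {
				// Nothing buffered: block until a value arrives or src closes.
				v, ok := <-src
				if !ok {
					return
				}
				last, pending = v, true
				continue
			}

			// A value is buffered: try to deliver it, but keep replacing it
			// with newer values while the consumer is busy.
			select {
			case v, ok := <-src:
				if !ok {
					dst <- last // deliver the final value before closing
					return
				}
				last = v
			case dst <- last:
				pending = false
			}
		}
	}()

	return dst
}

func main() {
	src := make(chan bool)
	out := dropButLast(src)

	go func() {
		for _, v := range []bool{true, false, true, false} {
			src <- v
		}
		close(src)
	}()

	// The consumer may miss intermediate flaps but always sees the last value.
	for v := range out {
		fmt.Println("leadership:", v)
	}
}
```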