client: do not restart restored tasks until server is contacted #5669

Merged: 6 commits on May 14, 2019

Commits on May 14, 2019

  1. 846b482
  2. client: do not restart dead tasks until server is contacted

    Fixes #1795
    
    Running restored allocations and pulling what allocations to run from
    the server happen concurrently. This means that if a client is rebooted,
    and has its allocations rescheduled, it may restart the dead allocations
    before it contacts the server and determines they should be dead.
    
    This commit makes tasks that fail to reattach on restore wait until the
    server is contacted before restarting.
    schmichael committed May 14, 2019
    e7042b6
  3. 4b854cc
  4. client: do not restart dead tasks until server is contacted (try 2)

    Refactoring of 104067b
    
    Switch the MarkLive method for a chan that is closed by the client.
    Thanks to @notnoop for the idea!
    
    The old approach called a method on most existing alloc runners (ARs)
    and task runners (TRs) on every runAllocs call. The new approach does a
    once.Do call in runAllocs to accomplish the same thing with less work.
    This also made it possible to remove the gate abstraction, which did
    much more than was needed.
    schmichael committed May 14, 2019
    6a2792a
  5. client: register before restoring

    Registration and restoring allocs don't share state or depend on each
    other in any way (syncing allocs with servers is done outside of
    registration).
    
    Since restoring is synchronous, start the registration goroutine first.
    
    For nodes with lots of allocs to restore or close to their heartbeat
    deadline, this could be the difference between becoming "lost" or not.
    schmichael committed May 14, 2019
    796c05b
  6. abd809d