client: wait for batched driver updates before registering nodes #5585

Merged 4 commits on Apr 19, 2019

Commits on Apr 19, 2019

  1. client: wait for batched driver updates

    Here we retain the 0.8.7 behavior of waiting for driver fingerprints before
    registering a node, with some timeout.  This is needed for system jobs,
    as system job scheduling for a node occurs at node registration, and the
    race might mean that a system job does not get placed on the node because
    of missing drivers.
    
    The timeout isn't strictly necessary, but this raises it to 1 minute, as
    that is closer to blocking indefinitely than 1 second.  We need to keep the
    value high enough to capture as many drivers/devices as possible, but low
    enough that it doesn't risk blocking too long due to a misbehaving plugin.
    
    Fixes #5579
    Mahmood Ali committed Apr 19, 2019
    7a68d76
  2. client: avoid registering node twice right away

    I noticed that, almost immediately after `registerAndHeartbeat()` (well,
    after 5 seconds), `watchNodeUpdates()` calls `retryRegisterNode()`.
    
    This call is unnecessary and made debugging a bit harder.  So here, we
    ensure that we only re-register the node for new node events, not for
    the initial registration.
    Mahmood Ali committed Apr 19, 2019
    9dcebcd
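
One way to avoid a redundant second registration is to remember what was last registered and skip updates that carry nothing new. The sketch below uses that approach purely as an illustration; it is an assumption, not necessarily how this PR implements the fix, and `nodeWatcher`, `register`, and `onUpdate` are hypothetical names.

```go
package main

// nodeWatcher is a hypothetical helper that tracks the node state most
// recently sent to the servers. It is not Nomad's actual type.
type nodeWatcher struct {
	registeredState string // summary (for example a hash) of the last registered node state
	registrations   int    // number of registrations performed so far
}

// register records that the given node state has been sent to the servers.
func (w *nodeWatcher) register(state string) {
	w.registeredState = state
	w.registrations++
}

// onUpdate is called when the update watcher wakes up. It re-registers only
// if the node state differs from what was last registered, and reports
// whether a re-registration happened.
func (w *nodeWatcher) onUpdate(state string) bool {
	if state == w.registeredState {
		// Nothing new since the last registration, so skip the extra call.
		return false
	}
	w.register(state)
	return true
}
```

With this shape, the watcher waking up shortly after the initial registration is a no-op, while a genuinely new node event still triggers a re-registration.
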
  3. client: log detected driver health state

    Noticed that the `detected drivers` log line was misleading - when a
    driver doesn't fingerprint before the timeout, its health status is the
    empty string `""`, which we would mark as detected.
    
    Now, we log all drivers along with their state to ease driver
    fingerprint debugging.
    Mahmood Ali committed Apr 19, 2019
    Configuration menu
    Copy the full SHA
    9a2f46f View commit details
    Browse the repository at this point in the history
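
Logging every driver together with its state, rather than a bare list of "detected" names, might look like the sketch below. The helper `describeDrivers` and the health strings in the usage note are illustrative assumptions, not the actual log format introduced by this PR.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// describeDrivers renders each driver name with its health state in a stable,
// sorted order. States are quoted with %q so that a driver that never
// fingerprinted shows up visibly as "" instead of looking detected.
func describeDrivers(health map[string]string) string {
	names := make([]string, 0, len(health))
	for name := range health {
		names = append(names, name)
	}
	sort.Strings(names)

	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, fmt.Sprintf("%s=%q", name, health[name]))
	}
	return strings.Join(parts, ", ")
}
```

For example, calling `describeDrivers` with `docker` mapped to `healthy` and `exec` mapped to the empty string yields `docker="healthy", exec=""`, which makes the driver that missed the timeout easy to spot.
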
  4. clarify cryptic log line

    Mahmood Ali committed Apr 19, 2019
    8041b0c