WIP: Uncordon the node during failed updates #1572

Closed
wants to merge 1 commit into from

Commits on Mar 19, 2020

  1. Uncordon the node during failed updates

    Today we cordon the node before we write updates to the node. This
    means that if a file write fails (e.g. failed to create a directory),
    we fail the update but the node stays cordoned. This will cause
    deadlocks as the node annotation for desired config will no longer
    be updated.
    
    With the rollback added, if you delete the erroneous machineconfig
    in question, we will be able to auto-recover from failed writes,
    like we do for failed reconciliation. The side effect is that the
    node will flip between Ready and Ready,Unschedulable, because each
    node event triggers another update attempt that goes through the
    full process.
    
    Signed-off-by: Yu Qi Zhang <jerzhang@redhat.com>
    yuqi-zhang committed Mar 19, 2020
    8ee8efc