[FIXED] Clustering: leadership acquired actions could get stuck #1287

Merged: 2 commits, Apr 3, 2023

Commits on Mar 30, 2023

  1. [FIXED] Clustering: leadership acquired actions could get stuck

    If a leadership change occurred while the leadership acquired
    actions were executing, before the raft.Barrier() call was made,
    the server would get stuck in that call. This is because the RAFT
    library notifies the Streaming server code of a leadership change
    through a go channel that had a capacity of just 1. Since the
    streaming server reads from the channel and then executes the
    leadership acquired code, it could not read from the notification
    channel again, which caused the RAFT library to block on a go
    channel send, which in turn made the Barrier() call block.
    
    I believe the right approach is to have a bigger notification
    go channel instead of making Barrier() time out. If it does
    time out, the server should then transfer leadership, which
    I am afraid could cause a cascading effect if every server
    that gets elected needs longer than the chosen timeout to
    apply all the preceding entries to the FSM.
    
    Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
    kozlovic committed Mar 30, 2023
    Commit e48a0c7

Commits on Mar 31, 2023

  1. Change travis to exclude staticcheck on Go 1.18

    Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
    kozlovic committed Mar 31, 2023
    Commit ee84146