Fix "broker received out of order sequence" when brokers die #1661

Merged 2 commits on May 4, 2020

Commits on Apr 14, 2020

  1. Fix "broker received out of order sequence" when brokers die

    When the following three conditions are satisfied, the producer code can
    skip message sequence numbers and cause the broker to complain that the
    sequences are out of order:
    
    * config.Producer.Idempotent is set
    * The producer loses, and then regains, its connection to a broker
    * The client code continues to attempt to produce messages whilst the
    broker is unavailable.
    
    For every message the client attempts to send while the broker is
    unavailable, the transaction manager sequence number is incremented,
    but these messages eventually fail and return an error to the caller.
    When the broker reappears and another message is published, its
    sequence number is higher than the last one the broker remembers,
    because the values that were attempted while it was down were never
    seen. Thus, from the broker's perspective, it is seeing out-of-order
    sequence numbers.
    
    The fix to this has a few parts:
    
    * Don't obtain a sequence number from the transaction manager until
    we're sure we want to try publishing the message
    * Affix the producer ID and epoch to the message once the sequence is
    generated
    * Increment the transaction manager epoch (and reset all sequence
    numbers to zero) when we permanently fail to publish a message. That
    represents a sequence that the broker will never see, so the only safe
    thing to do is to roll over the epoch number.
    * Ensure we don't publish message sets that contain messages from
    multiple transaction manager epochs.
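
The four parts of the fix listed above can be sketched as follows. This is an illustrative model only, not the code from this pull request: `txnManager`, `nextSequence`, `bumpEpoch`, `message`, and `batch` are hypothetical names, the producer ID is omitted, and the topic-partition key is a plain string. The assumption is that a sequence number is handed out together with the epoch it belongs to, that a permanent failure bumps the epoch and clears all sequences, and that a batch refuses messages stamped with a different epoch.

```go
package main

import "fmt"

// txnManager is a simplified, hypothetical transaction manager.
type txnManager struct {
	epoch int16
	seqs  map[string]int32 // next sequence number per topic-partition
}

func newTxnManager() *txnManager {
	return &txnManager{seqs: map[string]int32{}}
}

// nextSequence is called only once the producer is sure it will try to
// publish. It returns the sequence and the epoch the sequence is valid in.
func (t *txnManager) nextSequence(topicPartition string) (int32, int16) {
	seq := t.seqs[topicPartition]
	t.seqs[topicPartition] = seq + 1
	return seq, t.epoch
}

// bumpEpoch is called after a permanent publish failure: the broker will
// never see the failed sequence, so all sequences restart in a new epoch.
func (t *txnManager) bumpEpoch() {
	t.epoch++
	t.seqs = map[string]int32{}
}

// message carries the sequence and epoch affixed at allocation time.
type message struct {
	value string
	seq   int32
	epoch int16
}

// batch holds messages that must all share one epoch.
type batch struct {
	epoch int16
	msgs  []*message
}

// add reports false when m belongs to a different epoch than the batch,
// so the caller can flush this batch and start a new one.
func (b *batch) add(m *message) bool {
	if len(b.msgs) > 0 && b.epoch != m.epoch {
		return false
	}
	b.epoch = m.epoch
	b.msgs = append(b.msgs, m)
	return true
}

func main() {
	tm := newTxnManager()
	seq, epoch := tm.nextSequence("events-0")
	first := &message{value: "a", seq: seq, epoch: epoch}

	tm.bumpEpoch() // pretend an earlier message failed permanently
	seq, epoch = tm.nextSequence("events-0")
	second := &message{value: "b", seq: seq, epoch: epoch}

	b := &batch{}
	fmt.Println(b.add(first), b.add(second))
}
```

Keeping the epoch on each message, rather than reading it from the manager at send time, is what lets the batch detect that two messages were sequenced on either side of an epoch bump.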
    KJTsanaktsidis committed Apr 14, 2020
    9df3038

Commits on Apr 16, 2020

  1. ca14191