New `box.ctl.on_replication_split_brain_rollback` event
#10943
CuriousGeorgiy started this conversation in RFC
The unconfirmed asynchronous transactions from the old term will be rolled back just as unconfirmed synchronous transactions are currently rolled back: by `txn_limbo_read_rollback` after the `PROMOTE` request is written.

To give users more flexibility and allow them to take action to save their data when `replication_split_brain_handling_mode` is not set to `none`, we will introduce a new system event, `box.ctl.on_replication_split_brain_rollback`.
The event will be delivered before writing a `PROMOTE` request, once for each asynchronous transaction in the synchronous queue (that is, each transaction with the `TXN_EARLY_ACK` flag set) that will be rolled back by the `PROMOTE` request. This will ensure that if the event trigger fails, the `PROMOTE` request will also fail, `ER_SPLIT_BRAIN` will be raised, and the asynchronously committed data will not be lost.

Event trigger arguments
Information about the asynchronous transaction that will be rolled back by a `PROMOTE` request will be passed to the trigger.

The trigger will receive a transaction statement iterator similar to that of other transaction event triggers. The iterator will yield transaction statement information as described in the format of the `_repair_queue` space (see Space format). The information will be decoded from `stmt->row` and pushed onto the Lua stack in a format similar to that of the `xlog` module: tuples will be pushed where possible, and MsgPack objects otherwise.

Event trigger failure
If the trigger fails, the corresponding `PROMOTE` request will also fail with an `ER_SPLIT_BRAIN` error.

Transactions in event trigger
It will be guaranteed that, before the event trigger is called, the active fiber has no active transaction. I.e., the transaction being rolled back will be detached from the active fiber.
Therefore, the trigger will be able to start new transactions. It will only be able to write to local spaces, which will be ensured by the `limbo->is_in_rollback` flag.

When all the asynchronous transactions have been processed by the trigger, the last transaction, if any, will be committed. If the commit of the last transaction fails, the corresponding `PROMOTE` request will also fail with an `ER_SPLIT_BRAIN` error.

Retrying a failed `PROMOTE` request

The semantics of the `box.ctl.on_replication_split_brain_rollback` event mean that its trigger may be called multiple times for the same transaction if the `PROMOTE` request or the trigger fails. Therefore, the trigger must be idempotent with respect to transactions.

The `transaction_id` value (namely, the `txn->id` field) that will be yielded by the transaction statement iterator passed to the trigger will be sufficient to uniquely identify transactions.

Handling synchronous `PROMOTE` requests

With synchronous `PROMOTE` requests, the promote effect takes place only after the corresponding `CONFIRM` request.

To handle synchronous `PROMOTE` requests, the event will need to be delivered before writing the corresponding `CONFIRM` request, rather than before writing the `PROMOTE` request. The same semantics for failure and retrying a failed request will apply.
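
The failure and retry semantics described above can be illustrated with a small standalone model. This is only a sketch in Python, not Tarantool code: `promote`, `SplitBrainError`, and `make_saving_trigger` are hypothetical names invented for the illustration, `SplitBrainError` stands in for `ER_SPLIT_BRAIN`, and statements are plain strings rather than decoded rows. It shows the event being delivered per asynchronous transaction before the request is "written", a trigger failure aborting the request, and a retry calling the trigger again for a transaction it has already handled, which is why the trigger deduplicates by transaction id.

```python
class SplitBrainError(Exception):
    """Stands in for ER_SPLIT_BRAIN in this model (hypothetical name)."""


def promote(async_txns, trigger):
    """Model one PROMOTE attempt (hypothetical helper, not a Tarantool API).

    The trigger is called once per asynchronous transaction *before* the
    request is considered written. Any trigger failure aborts the attempt.
    """
    for txn_id, statements in async_txns:
        try:
            trigger(txn_id, iter(statements))
        except Exception as exc:
            raise SplitBrainError(f"trigger failed on txn {txn_id}") from exc
    return "PROMOTE written"


def make_saving_trigger(saved, fail_once_on=None):
    """Build an idempotent trigger that copies statements into `saved`.

    `fail_once_on` simulates a one-off failure on the given transaction id.
    """
    pending_failure = {fail_once_on}

    def trigger(txn_id, stmt_iter):
        if txn_id in saved:
            return  # already saved by an earlier attempt: do nothing
        if txn_id in pending_failure:
            pending_failure.discard(txn_id)
            raise RuntimeError("simulated trigger failure")
        saved[txn_id] = list(stmt_iter)

    return trigger


# Three asynchronous transactions that the request would roll back.
txns = [(101, ["insert A"]), (102, ["replace B"]), (103, ["delete C"])]
saved = {}
trigger = make_saving_trigger(saved, fail_once_on=102)

# First attempt: the trigger fails on txn 102, so the request fails too.
try:
    promote(txns, trigger)
    first_attempt_failed = False
except SplitBrainError:
    first_attempt_failed = True
saved_after_first_attempt = sorted(saved)

# Retry: txn 101 is delivered again, but the trigger skips it.
result = promote(txns, trigger)
```

In this model the first attempt leaves only transaction 101 saved. The retry delivers 101 a second time without duplicating it, then saves 102 and 103 before the request succeeds.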