How to fix data loss when using transactions and some Kafka partitions are down #2016
Comments
The design of maxwell's So if you have: row 1 <- succeeded rows 1 and 3 will reach kafka but
Hi osheroff, I don't think my point came across, so let me explain again. InflightMessage does not record a row that is not the final tx-commit row of a local transaction, even if that row is an insert. So in my case above, insert A (no tx commit) is never added to InflightMessage. Say insert A's position is 1 and insert B's (tx commit) position is 2. Following the timeline through to the end, when maxwell restarts it will continue from position 2 (insert B) rather than position 1 (insert A), even though insert A was never successfully sent to Kafka, right? @osheroff
If rows in a transaction are flowing to different partitions, data loss was possible if one partition got stuck while the "commit" message made progress. This has the net effect of serializing maxwell's ability to make progress in a binlog. I think that's mostly a good thing.
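The idea of tracking every row, and the serializing effect mentioned above, can be sketched roughly as follows. This is a simplified in-memory model, not Maxwell's actual code: the class `AllRowsTrackerSketch` and its methods `send`, `ack` and `storedPosition` are hypothetical names chosen for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model: every row is tracked, not only the tx-commit row.
final class AllRowsTrackerSketch {
    // binlog position -> acknowledged by Kafka?, kept in send order
    private final Map<Long, Boolean> inflight = new LinkedHashMap<>();
    private long storedPosition = 0L; // position a restart would resume from

    // Record a row as in flight when it is handed to the producer.
    void send(long position) {
        inflight.put(position, Boolean.FALSE);
    }

    // Producer callback: mark the row as acknowledged, then advance the
    // stored position only across an unbroken run of acknowledged rows
    // at the head of the list.
    void ack(long position) {
        inflight.replace(position, Boolean.TRUE);
        Iterator<Map.Entry<Long, Boolean>> it = inflight.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Boolean> head = it.next();
            if (!head.getValue()) {
                break; // an earlier row is still pending, so progress stops here
            }
            storedPosition = head.getKey();
            it.remove();
        }
    }

    long storedPosition() {
        return storedPosition;
    }
}
```

In this model, acknowledging position 2 while position 1 is still pending leaves the stored position unchanged, so a stuck partition holds back all later rows. That is the serialization cost, and the pending entries are also where the extra memory usage would come from.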
Ah, yeah, sorry. I missed that the rows inside a transaction are sent to different positions. #2019 addresses this, but I do want to be a bit cautious about the increased memory usage possible here.
Thank you, I get it. Would you also make DEFAULT_CAPACITY in InflightMessageList configurable? With this change we will need to increase that capacity.
Right idea, but dumb implementation. This reverts commit c20370c.
Background:
As we all know, inflightMessages only adds the tx-commit record. That means if we start a local transaction like the following:
begin transaction
insert A (no tx commit)
insert B (tx commit)
end transaction
If "insert A" and "insert B" are dispatched to different Kafka partitions, the producer callback for "insert B" can complete before the one for "insert A", i.e. out of order.
The event timeline is as follows:
1 "insert B" completes its callback and updates this tx's position (tx commit)
2 some Kafka partitions go down
3 those partitions have not received the "insert A" data yet
So does this mean we lose the data for "insert A"? Is there any way to fix this problem?
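The timeline above can be reproduced with a small model. This is only a sketch under the assumption stated in the background (only tx-commit rows are tracked). The class `CommitOnlyTrackerSketch` and its methods are hypothetical and are not taken from Maxwell's source.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the behaviour described in this issue:
// only rows flagged as tx commit are recorded as in flight.
final class CommitOnlyTrackerSketch {
    // binlog position -> acknowledged by Kafka?, kept in send order
    private final Map<Long, Boolean> inflight = new LinkedHashMap<>();
    private long storedPosition = 0L; // position a restart would resume from

    // Rows that are not the tx commit are sent but never tracked.
    void send(long position, boolean isTxCommit) {
        if (isTxCommit) {
            inflight.put(position, Boolean.FALSE);
        }
    }

    // Producer callback: acks for untracked rows have no effect here.
    void ack(long position) {
        inflight.replace(position, Boolean.TRUE);
        Iterator<Map.Entry<Long, Boolean>> it = inflight.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Boolean> head = it.next();
            if (!head.getValue()) {
                break;
            }
            storedPosition = head.getKey();
            it.remove();
        }
    }

    long storedPosition() {
        return storedPosition;
    }
}
```

Sending "insert A" at position 1 (no tx commit) and "insert B" at position 2 (tx commit), then acknowledging only position 2, moves the stored position to 2 in this model even though position 1 was never acknowledged. A restart from that position would skip "insert A".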