feat: transactions #323

rkrishn7 · 2024-08-17T02:02:06Z

This PR adds client support for Pulsar Transactions.

Resolves #253

Notes:

This PR tries to mimic the Java client where possible. More details on transactions in Pulsar can be found here.
We will likely need to edit the Pulsar service container in CI to enable the transaction coordinator (TC) and bootstrap transaction metadata.
This PR necessitates a breaking version change because it adds a new error variant.

Open Questions:

Should these changes be gated behind a feature flag?
This change likely has implications when buffering in the producer. Thoughts on how we should handle that?
The main Transaction API uses internal synchronization to provide a "no mut" public interface. Are there any alternative approaches that may be better?
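To make the last question concrete, here is a minimal sketch of the "no mut" pattern, assuming a synchronous `std::sync::Mutex` and a simplified three-state transaction. `TxnHandle`, `TxnState`, and the method names are hypothetical and are not the types introduced by this PR; the real client is asynchronous and would likely need an async-aware lock or a different design.

```rust
use std::sync::{Arc, Mutex};

/// Simplified transaction states (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TxnState {
    Open,
    Committed,
    Aborted,
}

/// Hypothetical handle: clones share one state behind an `Arc<Mutex<_>>`,
/// so every method can take `&self` instead of `&mut self`.
#[derive(Debug, Clone)]
pub struct TxnHandle {
    state: Arc<Mutex<TxnState>>,
}

impl TxnHandle {
    pub fn new() -> Self {
        Self { state: Arc::new(Mutex::new(TxnState::Open)) }
    }

    /// Returns a copy of the current state.
    pub fn state(&self) -> TxnState {
        *self.state.lock().unwrap()
    }

    pub fn commit(&self) -> Result<(), TxnState> {
        self.finish(TxnState::Committed)
    }

    pub fn abort(&self) -> Result<(), TxnState> {
        self.finish(TxnState::Aborted)
    }

    /// Moves an open transaction to `target`; otherwise returns the
    /// state that blocked the transition.
    fn finish(&self, target: TxnState) -> Result<(), TxnState> {
        let mut state = self.state.lock().unwrap();
        let current = *state;
        if current == TxnState::Open {
            *state = target;
            Ok(())
        } else {
            Err(current)
        }
    }
}
```

The trade-off in this sketch is that misuse (for example, committing twice) surfaces as a runtime error rather than a compile-time one, which is what an alternative based on `&mut self` or consuming `self` would buy.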

rkrishn7 · 2024-08-17T02:03:09Z

@FlorentinDUBOIS Tagging you for visibility. I know this is a lot so no rush here!

rkrishn7 · 2024-08-25T17:54:59Z

Bumping and tagging some additional folks - @freeznet @BewareMyPower

BewareMyPower · 2024-10-18T01:43:45Z

I'm going to review this PR soon.

FlorentinDUBOIS · 2024-10-18T09:04:20Z

Thank you @rkrishn7, I will take a look at this pull request and test it in production next week. Sorry for the delay in my answer.

BewareMyPower · 2024-10-18T12:13:13Z

src/transaction/meta_store_handler.rs

+                        current_backoff = std::cmp::min(
+                            Self::OP_MAX_BACKOFF * 2u32.saturating_pow(current_retries),
+                            Self::OP_MAX_BACKOFF,


Why use cmp::min here? OP_MAX_BACKOFF should always be the smaller one. It looks like you should use OP_MIN_BACKOFF * 2u32.saturating_pow(current_retries).

BTW, could you reuse the operation_retry_parameters in Pulsar::new instead of hard coding backoff parameters? And it would be better to abstract a Backoff class like Java so that we can reuse the logic in connect_inner rather than rewriting the same logic again

rkrishn7 · 2024-11-17

> Why use cmp::min here? OP_MAX_BACKOFF should always be the smaller one. It looks like you should use OP_MIN_BACKOFF * 2u32.saturating_pow(current_retries).

Thanks for catching that! Updated to use OP_MIN_BACKOFF there.

> BTW, could you reuse the operation_retry_parameters in Pulsar::new instead of hard coding backoff parameters?

The same operation_retry_parameters as declared in Pulsar::new should already be getting used here because these operations rely on ConnectionSender::send_message.

The retry logic here is solely for TransactionCoordinatorNotFound errors. I think it's possible to make the backoff parameters here configurable, but they should be different from operation_retry_parameters.

> And it would be better to abstract a Backoff class like Java so that we can reuse the logic in connect_inner rather than rewriting the same logic again

Agreed, but I think maybe we can make another issue for this? May not be a great idea to increase the scope of this PR since it's already quite large.
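For reference, the capped exponential backoff discussed in this thread could be sketched roughly as follows. This is only an illustration under assumptions: `Backoff`, `next_delay`, and `reset` are hypothetical names, not the crate's API and not the code in this PR. The point it shows is that the delay grows from the minimum and is capped by the maximum, which is the fix suggested in the review.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of the crate): capped exponential backoff.
#[derive(Debug, Clone)]
pub struct Backoff {
    min: Duration,
    max: Duration,
    retries: u32,
}

impl Backoff {
    pub fn new(min: Duration, max: Duration) -> Self {
        Self { min, max, retries: 0 }
    }

    /// Returns `min * 2^retries`, capped at `max`, then counts the retry.
    /// Saturating arithmetic avoids overflow panics for large retry counts.
    pub fn next_delay(&mut self) -> Duration {
        let factor = 2u32.saturating_pow(self.retries);
        let delay = std::cmp::min(self.min.saturating_mul(factor), self.max);
        self.retries = self.retries.saturating_add(1);
        delay
    }

    /// Starts over from the minimum delay, e.g. after a successful attempt.
    pub fn reset(&mut self) {
        self.retries = 0;
    }
}
```

With a 100 ms minimum and a 1 s maximum, successive calls to `next_delay` would yield 100 ms, 200 ms, 400 ms, 800 ms, and then stay at 1 s until `reset` is called.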

BewareMyPower · 2024-10-18T12:26:27Z

> Should these changes be gated behind a feature flag?

I don't think so.

> This change likely has implications when buffering in the producer. Thoughts on how we should handle that?

Sorry I don't get it. It only registers the partition with the transaction id via the ADD_PARTITION_TO_TXN command before adding the message to the buffer. It's necessary for transactional messages and has no impact on the regular send without transaction.

rkrishn7 · 2024-11-17T20:30:45Z

> > Should these changes be gated behind a feature flag?
>
> I don't think so.

👍🏾

> > This change likely has implications when buffering in the producer. Thoughts on how we should handle that?
>
> Sorry I don't get it. It only registers the partition with the transaction id via the ADD_PARTITION_TO_TXN command before adding the message to the buffer. It's necessary for transactional messages and has no impact on the regular send without transaction.

Sorry! To clarify, I think there's a potential footgun lurking here. For example:

Let's say we've enabled transactions and start producing transactional messages with a batching producer. If we haven't hit the batching threshold before the transaction's timeout, then all those buffered messages will fail to be produced. This is subtle, but definitely seems undesirable.

It seems like either transactional messages should bypass batching or we should warn the user if both transactions and batching are enabled.
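A rough sketch of the two options, under assumptions: `Message`, `Route`, `route`, and `config_warning` are hypothetical names for illustration and do not exist in the crate, and the `u64` transaction id is a simplified placeholder for the real id type.

```rust
/// Hypothetical outgoing message; `txn_id` is `Some` for transactional sends.
/// The `u64` id is a placeholder, not the real transaction id type.
pub struct Message {
    pub payload: Vec<u8>,
    pub txn_id: Option<u64>,
}

/// Where the producer would put a message.
#[derive(Debug, PartialEq, Eq)]
pub enum Route {
    SendNow,
    Buffer,
}

/// Option 1: transactional messages bypass the batch buffer, so they
/// cannot sit below the batching threshold past the transaction timeout.
pub fn route(msg: &Message, batching_enabled: bool) -> Route {
    if batching_enabled && msg.txn_id.is_none() {
        Route::Buffer
    } else {
        Route::SendNow
    }
}

/// Option 2: keep batching as-is, but surface a warning when both
/// features are enabled on the same producer.
pub fn config_warning(
    batching_enabled: bool,
    transactions_enabled: bool,
) -> Option<&'static str> {
    if batching_enabled && transactions_enabled {
        Some("batched transactional messages may not be sent before the transaction times out")
    } else {
        None
    }
}
```

Option 1 gives up batching throughput for transactional sends; option 2 keeps current behavior and leaves the decision to the user.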

rkrishn7 added 3 commits August 16, 2024 18:48

transaction mod

22fa96f

update supporting code

155608b

add txn example

b67cd0a

rkrishn7 and others added 3 commits August 26, 2024 09:58

fix: clippy errors

8b075cc

update Pulsar standalone in CI

e66b5f3

Merge branch 'master' into feat/transactions

e2b416d

BewareMyPower assigned rkrishn7 Oct 18, 2024

BewareMyPower reviewed Oct 18, 2024

BewareMyPower and others added 4 commits October 22, 2024 20:34

Merge branch 'master' into feat/transactions

409d4a5

fix: formatting errors

e655d12

fix: new_txn doctest

81e1d86

fix: backoff logic

07babe3
