Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message chunking might not work in certain cases after PIP 132 #16195

Closed
BewareMyPower opened this issue Jun 23, 2022 · 0 comments · Fixed by #16196
Closed

Message chunking might not work in certain cases after PIP 132 #16195

BewareMyPower opened this issue Jun 23, 2022 · 0 comments · Fixed by #16196
Assignees
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@BewareMyPower
Copy link
Contributor

Describe the bug
Sometimes a large message cannot be split into chunks and then failed to pass the isMessageSizeExceeded check. It's a bug introduced from PIP-132 (#14007).

To Reproduce

Add the following unit test to MessageChunkingTest:

    @Test
    public void testChunkSize() throws Exception {
        this.conf.setMaxMessageSize(100);

        final Producer<String> producer = pulsarClient.newProducer(Schema.STRING)
                .topic("my-property/my-ns/test")
                .enableChunking(true)
                .enableBatching(false)
                .create();
        producer.send(createMessagePayload(60));
    }

It will fail with

The producer test-0-0 of the topic persistent://my-property/my-ns/test sends a message with 102 bytes that exceeds 100 bytes

Expected behavior
It should not fail. Instead, this message should be split into two chunks and sent successfully.

@BewareMyPower BewareMyPower added the type/bug The PR fixed a bug or issue reported a bug label Jun 23, 2022
@BewareMyPower BewareMyPower self-assigned this Jun 23, 2022
BewareMyPower added a commit to BewareMyPower/pulsar that referenced this issue Jun 23, 2022
…hunks after PIP-132

Fixes apache#16195

### Motivation

[PIP-132](apache#14007) considers the
message metadata size when computing the payload chunk size and the
number of chunks. However, it could make some messages whose size is
less than `maxMessageSize` cannot be sent. There are two reasons:
1. The `MessageMetadata` will be updated after computing the payload
   chunk size, i.e. the actual metadata size would be greater.
2. `OpSendMsg#getMessageHeaderAndPayloadSize` doesn't exclude all bytes
   other than the metadata and payload, e.g. the 4 bytes checksum field.

For example, if the max message size is 100, send a string whose size is
60 with chunking enabled.
1. The initial metadata size is 25 so the chunk size is 75, the message
   won't be spit into chunks.
2. After `serializeAndSendMessage`, the metadata size becomes 32, so the
   serialized header's total size is 4 + 8 + 6 + 4 + 32 = 54, and the
  total size is 54 + 60 = 114, see `headerContentSize` in
  `serializeCommandSendWithSize`.
3. In `getMessageHeaderAndPayloadSize`, the returned value is computed
   by 114 - 8 - 4 = 102 > 100. The 6 bytes magic and checksum and 4
   bytes metadata length field are not included.

### Modifications

- Update the message metadata before computing the chunk size.
- Compute the correct size in `getMessageHeaderAndPayloadSize`.

### Verifying this change

Add `testChunkSize` to verify all sizes in range [1, maxMessageSize] can
be sent successfully when chunking is enabled.
BewareMyPower added a commit that referenced this issue Jun 28, 2022
…hunks after PIP-132 (#16196)

Fixes #16195

### Motivation

[PIP-132](#14007) considers the
message metadata size when computing the payload chunk size and the
number of chunks. However, it could make some messages whose size is
less than `maxMessageSize` cannot be sent. There are two reasons:
1. The `MessageMetadata` will be updated after computing the payload
   chunk size, i.e. the actual metadata size would be greater.
2. `OpSendMsg#getMessageHeaderAndPayloadSize` doesn't exclude all bytes
   other than the metadata and payload, e.g. the 4 bytes checksum field.

For example, if the max message size is 100, send a string whose size is
60 with chunking enabled.
1. The initial metadata size is 25 so the chunk size is 75, the message
   won't be spit into chunks.
2. After `serializeAndSendMessage`, the metadata size becomes 32, so the
   serialized header's total size is 4 + 8 + 6 + 4 + 32 = 54, and the
  total size is 54 + 60 = 114, see `headerContentSize` in
  `serializeCommandSendWithSize`.
3. In `getMessageHeaderAndPayloadSize`, the returned value is computed
   by 114 - 8 - 4 = 102 > 100. The 6 bytes magic and checksum and 4
   bytes metadata length field are not included.

### Modifications

- Update the message metadata before computing the chunk size.
- Compute the correct size in `getMessageHeaderAndPayloadSize`.

### Verifying this change

Add `testChunkSize` to verify all sizes in range [1, maxMessageSize] can
be sent successfully when chunking is enabled.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
1 participant