This repository has been archived by the owner on Jun 23, 2022. It is now read-only.

fix: message body size unset after parsing, which leads to large IO throughput #1008

Merged · 4 commits merged into XiaoMi:master on Jan 10, 2022

Conversation

@padmejin (Contributor) commented Jan 6, 2022

Issue

See apache/incubator-pegasus#866.

Description

We noticed that Pegasus 2.0 uses much more network/disk bandwidth than Pegasus 1.12.3. After weeks of debugging, we finally found that this is a bug introduced in #255, where a new Thrift message parser, compatible with both the old and the new message format, was refactored to replace the old one. This PR fixes it.

The main cause is that when parsing the message body, the parser returns the whole buffer holding all received messages instead of exactly the size of the body. Thus, when many write requests are piled up, every message carries extra trailing bytes with it, from the moment the write request arrives to the moment it is written to the mutation log and sent to other nodes, so the throughput is significantly amplified.

To fix this, we merely add one line that sets the message body size before calling create_message_from_request_blob().

We also noticed that when a message header is found to be invalid, the bad message is not consumed and discarded; instead, it stays in the buffer forever. We think this is a bug too. To fix it, we consume the buffer before calling create_message_from_request_blob().
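A minimal sketch of both fixes, using stand-in types (only create_message_from_request_blob() is a name taken from this PR; the blob/message_reader types and the surrounding structure here are assumed for illustration, not the actual rDSN code):

```cpp
// Simplified sketch, NOT the actual rDSN parser: blob and message_reader
// are stand-in types; only create_message_from_request_blob is a real name.
#include <cstdint>
#include <vector>

struct blob {
    const uint8_t *data = nullptr;
    size_t len = 0;
    blob prefix(size_t n) const { return blob{data, n}; }  // first n bytes only
};

struct message_reader {
    std::vector<uint8_t> buffer;  // everything received so far
    size_t consumed = 0;

    blob readable() const {
        return blob{buffer.data() + consumed, buffer.size() - consumed};
    }
    void consume(size_t n) { consumed += n; }  // drop parsed or bad bytes
};

struct message_ex { blob body; };

// Stand-in for the function named in the PR description.
message_ex *create_message_from_request_blob(const blob &body) {
    return new message_ex{body};
}

message_ex *parse_body(message_reader &reader, uint32_t body_length) {
    blob whole = reader.readable();
    if (whole.len < body_length)
        return nullptr;  // incomplete message, wait for more bytes

    // Fix 1: pass exactly body_length bytes, not the whole receive
    // buffer (which may still hold many piled-up messages behind it).
    blob body = whole.prefix(body_length);

    // Fix 2: consume the bytes before constructing the message, so a
    // message rejected later cannot sit in the buffer forever.
    reader.consume(body_length);

    return create_message_from_request_blob(body);
}
```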

@acelyc111 (Member)

Better to check that there is no extra buffer space after message_ex has been constructed, i.e., at the end of parse_request_body_v0/v1, check that msg->header->body_length equals msg->buffers[1].length().
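Something like the following, sketched with minimal stand-in types (plain assert() used here; the real code would use rDSN's own assertion macros, and message_ex/message_header are simplified, not the rDSN structs):

```cpp
// Sketch of the suggested invariant check, with stand-in types.
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct message_header { uint32_t body_length = 0; };

struct buffer_piece {
    std::string bytes;
    size_t length() const { return bytes.size(); }
};

struct message_ex {
    message_header *header = nullptr;
    std::vector<buffer_piece> buffers;  // [0] = header, [1] = body
};

// Call at the end of parse_request_body_v0/v1: the body buffer must hold
// exactly body_length bytes, with no extra receive-buffer space behind it.
void check_no_extra_buffer(const message_ex &msg) {
    assert(msg.header->body_length == msg.buffers[1].length());
}
```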

@hycdong (Contributor) commented Jan 6, 2022

I raised a new issue to describe this bug: apache/incubator-pegasus#866. Could you please fill it in according to your pull request?

@padmejin (Contributor, Author) commented Jan 6, 2022

> I raised a new issue to describe this bug: apache/incubator-pegasus#866. Could you please fill it in according to your pull request?

Done.

@acelyc111 (Member) previously approved these changes Jan 6, 2022

LGTM

@foreverneverer (Contributor) left a comment

@levy5307 needs to re-check it

@levy5307 (Contributor) commented Jan 7, 2022

> We also noticed that when a message header is found to be invalid, the bad message is not consumed and discarded; instead, it stays in the buffer forever. We think this is a bug too.

Could you add a unit test for this bug?

@padmejin padmejin closed this Jan 7, 2022
@acelyc111 acelyc111 reopened this Jan 7, 2022
@padmejin (Contributor, Author) commented Jan 7, 2022

> > We also noticed that when a message header is found to be invalid, the bad message is not consumed and discarded; instead, it stays in the buffer forever. We think this is a bug too.
>
> Could you add a unit test for this bug?

I meant to add a new test case, but then I found an old case that already covers this scenario. For some unknown reason, the author truncated the buffer before the next message was parsed. I added a new line to check whether the buffer has been consumed.

Before this patch the case failed on that assert, and it passes after this patch, so I believe the fix works.
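Roughly, the new check amounts to asserting that nothing is left in the reader after a bad message is rejected. A toy illustration reusing the stand-in message_reader from the sketch above (not the actual test in the PR):

```cpp
// Toy check, not the real unit test: after the parser rejects a bad
// header, the reader must have consumed the bytes instead of keeping them.
#include <cassert>

int main() {
    message_reader reader;                     // stand-in type from above
    reader.buffer = {0xde, 0xad, 0xbe, 0xef};  // garbage "message"

    // ... parser inspects the header, finds it invalid, and (with this
    // patch) consumes the bad bytes instead of leaving them behind ...
    reader.consume(reader.buffer.size());

    assert(reader.readable().len == 0);        // buffer fully consumed
    return 0;
}
```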

@foreverneverer (Contributor)

> the parser returns the whole buffer holding all received messages instead of exactly the size of the body. Thus, when many write requests are piled up ...

Do you mean:

The request sizes piled up in the queue are as follows:
1KB 1KB 1KB 1KB 1KB 1KB

Then the mutation log flush sizes will be: 1KB, 1KB+1KB, 1KB+1KB+1KB, 1KB+1KB+1KB+1KB, ...

@padmejin (Contributor, Author) commented Jan 7, 2022

> > the parser returns the whole buffer holding all received messages instead of exactly the size of the body. Thus, when many write requests are piled up ...
>
> Do you mean:
>
> The request sizes piled up in the queue are as follows: 1KB 1KB 1KB 1KB 1KB 1KB
>
> Then the mutation log flush sizes will be: 1KB, 1KB+1KB, 1KB+1KB+1KB, 1KB+1KB+1KB+1KB, ...

Yes.

Actually, I printed the buffer sizes out and used awk, sort, and uniq to analyze the distribution. The buffer size went up to 10000+, and the numbers roughly conformed to an arithmetic sequence; not exactly, but they followed some kind of pattern.

I also wrote a script to analyze all the slog files, and it turned out that a certain key pair appeared multiple times in adjacent blocks; but when I dumped them with mlog_dump, I found no duplicated key pairs. I guessed those were just dirty data.
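To make the amplification concrete: if n requests of s bytes each are piled up and every parsed message drags the rest of the buffer along, the bytes flushed grow as an arithmetic series, roughly s·n(n+1)/2 in total instead of s·n. A toy calculation (illustration only, not code from the PR):

```cpp
// Toy calculation of the write amplification described above: with the
// bug, the i-th piled-up message carries i requests' worth of bytes.
#include <cstdio>

int main() {
    const long n = 6;     // piled-up requests, as in the example above
    const long s = 1024;  // 1 KB each
    long correct = 0, buggy = 0;
    for (long i = 1; i <= n; ++i) {
        correct += s;      // exact body size: always 1 KB
        buggy   += s * i;  // whole buffer: 1KB, 2KB, 3KB, ...
    }
    std::printf("correct = %ld bytes, buggy = %ld bytes (%.1fx amplification)\n",
                correct, buggy, (double)buggy / (double)correct);
    return 0;
}
```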

@padmejin (Contributor, Author) commented Jan 7, 2022

The following is part of an mlog_dump result. The log_length values of these three mutations were [19971, 19649, 19327]:

mutation [4.0.3.3873136]: gpid=4.0, ballot=3, decree=3873136, timestamp=1640322274780591, last_committed_decree=3873133, log_offset=22216353421, log_length=19971, update_count=1
  [MULTI_PUT] 1
    [PUT] "hashKeyPrefix_00000000000000000000000000000000000000000261387976" : "hashKeyPrefix_00000000000000000000000000000000000000000261387976" => 0 : "pElLpiaC0juYshL86KXI30N5mYvNlwQNkO5k44hKInyhyomgdO4Eswz2RWDrODyP"
mutation [4.0.3.3873137]: gpid=4.0, ballot=3, decree=3873137, timestamp=1640322274781203, last_committed_decree=3873135, log_offset=22216373392, log_length=19649, update_count=1
  [MULTI_PUT] 1
    [PUT] "hashKeyPrefix_00000000000000000000000000000000000000000054160672" : "hashKeyPrefix_00000000000000000000000000000000000000000054160672" => 0 : "drsrinPuO8Zty7nP0YU2RRyQKoSQCBBBcGUaFKOz4Ufk5C2pgPHf0pekCVDCuE7I"
mutation [4.0.3.3873138]: gpid=4.0, ballot=3, decree=3873138, timestamp=1640322274781608, last_committed_decree=3873135, log_offset=22216393041, log_length=19327, update_count=1
  [MULTI_PUT] 1
    [PUT] "hashKeyPrefix_00000000000000000000000000000000000000000336035963" : "hashKeyPrefix_00000000000000000000000000000000000000000336035963" => 0 : "c8kxhTIEw6c6TSAf0jDlV8R01QulIeHSIK3O364HDD7PP8CQGTNs4PldV9mz1PLr"

@acelyc111 acelyc111 merged commit 3d1c988 into XiaoMi:master Jan 10, 2022