-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
read the enitre message in one go without copies #108
Conversation
I've run |
rustbus/src/connection/ll_conn.rs
Outdated
|
||
const BUFSIZE: usize = 512; | ||
let mut tmpbuf = [0u8; BUFSIZE]; | ||
if self.msg_buf_in.len() < max_buffer_size { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to be if self.msg_buf_in.len() != max_buffer_size {
so that msg_buf_in
also gets resized to be smaller if the next message is smaller than the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? I specifically didn't do this to avoid unnecessary memset
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the msg_buf_in
should only ever contain at most as many bytes as the message that is currently being read.
I think the msg_buf_in
should also be resized after the read to match how many bytes actually where received
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information is stored in msg_buf_filled
. Imagine you receive a sequence of messages with the following sizes: 1K
, 128B
, 512B
, 1K
. The way this PR works right now, resizes would write 1K
of zeroes. With your suggestion it would write at least 2K-128B
of zeroes (best case, if each message is read in one recvmsg).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true but I would accept that cost for the sake of an easier invariant and making the msg_buf_filled
field unnecessary. The few memsets are unlikely to be a bottleneck in the library.
Another thing I don't really like is that the code currently "reuses" the msg_buf_in
by cloning (part of) it when a full message has been received. I cases where there are typically small messages received one big message will forever bloat the memory usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you point where the cloning is?
rustbus/src/wire/unmarshal.rs:175
I think it would be better to do a swap here with a newly allocated vec of a sensible default size instead of doing it like this. Or maybe there is an even better way?
Right. Exactly as it was. .clear() does not release any memory
Yes, I am aware. I wanted to say that it's currently unsatisfying and that, while we are working on this anyways we could improve this aspect too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you point where the cloning is?
rustbus/src/wire/unmarshal.rs:175
I think it would be better to do a swap here with a newly allocated vec of a sensible default size instead of doing it like this. Or maybe there is an even better way?
Thanks! I completely missed this. A swap, even with an empty vector, sounds reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, should I .drain(..offset)
bytes from the buffer, or keep the offset to avoid shifting?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think draining is fine as a first step towards improvement. Ideally the initial offset would be passed through to the unmarshalling code though. But that could be a separate PR, or you could leave it for me :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented the take+drain part, but left the msg_buf_filled
for now. The annoying thing is that without it the message must be resized back in case of any error, not just partial reads.
No description provided.