
Allow the message frame to create chunks as large as transport will allow #313

Closed (3 commits)

Conversation

louiscryan
Contributor

For #312

@louiscryan
Contributor Author

FYI - this PR is a work in progress; it initially just demonstrates the idea.

@nmittler
Member

@louiscryan One potential issue I see with this is that OutputStreamAdapter calls writeRaw for every call to write(byte[], int, int), which could lead to a lot of buffer allocations. I guess this won't be a problem in most cases, since we'll be using one of the Deferred input streams, which I think would make a single call to write.

It would be interesting to see if the number of allocations increases significantly with this change.
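A minimal sketch of the concern above, using assumed names (Framer, writeRaw) modeled on the PR discussion rather than the actual grpc-java sources: an OutputStream adapter that forwards every write(byte[], int, int) straight into the framer. If each forwarded write can trigger a buffer allocation, a caller issuing many small writes may allocate heavily.

```java
import java.io.OutputStream;

class FramerOutputStreamAdapter extends OutputStream {
  /** Minimal stand-in for the message framer's raw-write entry point. */
  interface Framer {
    void writeRaw(byte[] b, int off, int len);
  }

  private final Framer framer;

  FramerOutputStreamAdapter(Framer framer) {
    this.framer = framer;
  }

  @Override
  public void write(int b) {
    write(new byte[] {(byte) b}, 0, 1);
  }

  @Override
  public void write(byte[] b, int off, int len) {
    // One writeRaw per call: N small writes mean N chances to allocate.
    framer.writeRaw(b, off, len);
  }
}
```

A stream written to with a single large write makes one writeRaw call, which is why the deferred-stream path mentioned above would mostly avoid the problem.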

@ejona86
Member

ejona86 commented Apr 19, 2015

Note that right now this guarantees two buffer allocations per message: one for the header and one for the body. That isn't all that hard to fix, though, so it can still be useful for seeing how things behave with really large messages.
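For context, the gRPC message frame is a 5-byte header (a 1-byte compressed flag plus a 4-byte big-endian length) followed by the body. A sketch of the single-allocation fix being discussed, with an illustrative class and method name rather than the real framer API:

```java
// Write the 5-byte gRPC message header and the body into one buffer,
// so a message costs one allocation instead of two.
class SingleAllocFramer {
  static byte[] frame(byte[] body, boolean compressed) {
    byte[] buf = new byte[5 + body.length]; // one allocation for header + body
    buf[0] = (byte) (compressed ? 1 : 0);
    buf[1] = (byte) (body.length >>> 24);   // big-endian message length
    buf[2] = (byte) (body.length >>> 16);
    buf[3] = (byte) (body.length >>> 8);
    buf[4] = (byte) body.length;
    System.arraycopy(body, 0, buf, 5, body.length);
    return buf;
  }
}
```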

@louiscryan
Contributor Author

@nmittler The buffer allocator can, and probably should, choose to allocate a buffer with a minimum size when messages are small, to address this issue when we fall back to ByteStreams.copy in the degenerate case.

Similarly the allocator can place an upper bound on the size of the buffer allocated so we don't fragment buffer pools or thrash the heap for very large messages.

The other optimization for more sophisticated bindings is to allow the frame to receive messages that are already in the right buffer format for the transport layer. Right now we only admit byte[] in flushTo, but we could let a deferred stream act as a factory for transport buffers.
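The min/max bounds described above can be sketched as an allocator that clamps the requested capacity to a fixed range. The interface and the bound values here are assumptions for illustration, not the actual grpc-java allocator:

```java
// Clamp the requested capacity hint to [MIN_BUFFER, MAX_BUFFER]: tiny writes
// (e.g. the ByteStreams.copy fallback) don't allocate one buffer per chunk,
// and huge messages don't fragment buffer pools or thrash the heap.
class BoundedAllocator {
  static final int MIN_BUFFER = 4096;        // assumed lower bound
  static final int MAX_BUFFER = 1024 * 1024; // assumed upper bound

  static byte[] allocate(int capacityHint) {
    int size = Math.max(MIN_BUFFER, Math.min(capacityHint, MAX_BUFFER));
    return new byte[size];
  }
}
```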

@louiscryan
Contributor Author

@ejona86 Yes, using one allocation for the header and the message instead of two makes a difference too. For context, some numbers from running QpsClient with 16k payloads, 10s warmup, 20s run:

baseline --> 8574 qps
one alloc for header & body --> 9082 qps
one alloc for header & body & alloc message len --> 10429 qps

which is a pretty decent improvement.
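Quick arithmetic on the numbers above, as plain percentages over the 8574 qps baseline (nothing project-specific here):

```java
// Relative improvement of a measured qps over a baseline, in percent.
class QpsDelta {
  static double improvementPct(double baseline, double measured) {
    return (measured - baseline) / baseline * 100.0;
  }
}
```

By this measure, 9082 qps is roughly a 5.9% gain and 10429 qps roughly a 21.6% gain over the baseline.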

@louiscryan
Contributor Author

@ejona86 Ready for review

@@ -32,6 +32,7 @@
package io.grpc.testing.integration;

import io.grpc.ChannelImpl;
import io.grpc.transport.netty.NegotiationType;
Member


Revert the changes to this file?

Contributor Author


Done

Member


Unused import is the only change to this file.

Contributor Author


Done

@ejona86
Member

ejona86 commented Apr 21, 2015

@louiscryan, LGTM after reverting the changes to Http2NettyTest. Once that's reverted, Travis should turn green.

@@ -181,7 +184,8 @@ private void writeRaw(byte[] b, int off, int len) {
commitToSink(false);
}
if (buffer == null) {
buffer = bufferAllocator.allocate(maxFrameSize);
// Allocate as big a buffer as the allocator will allow
Member


Is this comment correct? Shouldn't it be something like "Allocate a buffer that is at least big enough for this write"?

Contributor Author


Not exactly, no; see the discussion below. Renamed the param in the interface to 'capacityHint'.
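A sketch of the capacityHint idea against the diff above: the framer hints at the size of the pending write, and the allocator is free to clamp that hint to its own bounds. Types and names here are illustrative stand-ins, not the exact grpc-java APIs:

```java
interface WritableBufferAllocator {
  // capacityHint is a suggestion; implementations may return a buffer
  // smaller or larger than requested.
  byte[] allocate(int capacityHint);
}

class FramerSketch {
  private final WritableBufferAllocator bufferAllocator;
  private byte[] buffer;
  private int bufferPos;

  FramerSketch(WritableBufferAllocator bufferAllocator) {
    this.bufferAllocator = bufferAllocator;
  }

  void writeRaw(byte[] b, int off, int len) {
    if (buffer == null) {
      // Hint at the full write size; the allocator decides the real capacity.
      buffer = bufferAllocator.allocate(len);
      bufferPos = 0;
    }
    int n = Math.min(len, buffer.length - bufferPos);
    System.arraycopy(b, off, buffer, bufferPos, n);
    bufferPos += n;
    // A real framer would commit the full buffer to the sink and loop on
    // any remainder; omitted here for brevity.
  }

  int buffered() {
    return bufferPos;
  }
}
```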

@louiscryan
Contributor Author

PTAL

* outside of the arena-pool which are orders of magnitude slower. The Netty transport can receive
* buffers of arbitrary size and will chunk them based on flow-control so there is no transport
* requirement for an upper bound.
* <p>
Member


This should be on the same line as note:

Note:

Contributor Author


Done

@louiscryan
Contributor Author

PTAL

@ejona86
Member

ejona86 commented Apr 23, 2015

@louiscryan, LGTM, after fixing Travis failure (checkstyle failure).

Louis Ryan added 2 commits April 23, 2015 16:45

- Set upper and lower bounds for Netty & OkHttp allocators based on transport limitations and benchmark results.
- Fix OkHttp OutboundFlowController to chunk payloads larger than frameWriter.maxDataLength
- Allow OkHttp to allocate buffers to FrameWriter larger than max DATA length
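The chunking fix in the commit message above can be sketched as splitting an outbound payload into DATA-frame-sized pieces no larger than maxDataLength. This is a standalone illustration, not OkHttp's OutboundFlowController:

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSketch {
  // Sizes of the DATA frames needed to carry payloadLen bytes when each
  // frame can carry at most maxDataLength bytes.
  static List<Integer> chunkSizes(int payloadLen, int maxDataLength) {
    List<Integer> sizes = new ArrayList<>();
    int remaining = payloadLen;
    while (remaining > 0) {
      int n = Math.min(remaining, maxDataLength);
      sizes.add(n);
      remaining -= n;
    }
    return sizes;
  }
}
```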
@louiscryan
Contributor Author

Merged. Note that the performance numbers shown above depend on the ability to send larger messages in a single DATA frame, and thus need larger flow-control windows. The ability to configure that is coming in a separate PR from @buchgr.

@lock (lock bot) locked as resolved and limited the conversation to collaborators on Jan 23, 2019