clientv3 requires two copies of byte data for a Put() #7249
Comments
We can probably use a buffer pool or preallocate a byte array to avoid the frequent allocations.
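One way to read the buffer-pool suggestion is sketched below, using `sync.Pool` from the standard library. This is only an illustration of the general technique, not code from etcd: `bufPool`, `encodeInto`, and `withEncoded` are hypothetical names, and the sketch assumes the consumer does not retain the slice after it returns.

```go
package main

import (
	"fmt"
	"strconv"
	"sync"
)

// bufPool hands out reusable byte slices. Storing *[]byte rather than
// []byte avoids allocating a new slice header on every Put.
var bufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 4096)
		return &b
	},
}

// encodeInto is a hypothetical serializer that appends "name=n" to dst.
func encodeInto(dst []byte, name string, n int) []byte {
	dst = append(dst, name...)
	dst = append(dst, '=')
	return strconv.AppendInt(dst, int64(n), 10)
}

// withEncoded encodes into a pooled buffer, passes the bytes to send, and
// returns the buffer to the pool once send has returned.
// Assumption: send must not keep a reference to val after it returns.
func withEncoded(name string, n int, send func(val []byte) error) error {
	bp := bufPool.Get().(*[]byte)
	buf := encodeInto((*bp)[:0], name, n)
	err := send(buf)
	*bp = buf[:0] // keep any capacity gained while encoding
	bufPool.Put(bp)
	return err
}

func main() {
	err := withEncoded("replicas", 3, func(val []byte) error {
		fmt.Printf("sending %d bytes: %s\n", len(val), val)
		return nil
	})
	if err != nil {
		fmt.Println("send failed:", err)
	}
}
```

A pool like this only helps if the API consuming the bytes accepts `[]byte`; a string-based API would still force a copy when the pooled bytes are converted.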
Can you clarify this? I do not quite get it.
Generally, interfaces that deal with binary data accept []byte to avoid the copy that the compiler adds (to keep the string immutable). On the second point, Op.key and Op.val are private, so it's not possible for an advanced client to add a "WithOptions" func that sets them directly, although that might still result in a copy if the compiler can't be certain the values don't escape.
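To make the point about private fields concrete, here is a minimal sketch of an option that sets the value bytes directly. The types and functions (`op`, `opOption`, `newPut`, `withValueBytes`) are simplified, hypothetical stand-ins rather than the actual clientv3 definitions. Because the fields are unexported, an option like this could only be written inside the package that defines the type.

```go
package main

import "fmt"

// op is a simplified, hypothetical stand-in for a client operation whose
// key and value fields are unexported.
type op struct {
	key []byte
	val []byte
}

// opOption mutates an op while it is being constructed.
type opOption func(*op)

// newPut mirrors a string-based constructor: converting each string to
// []byte copies the data.
func newPut(key, val string, opts ...opOption) op {
	o := op{key: []byte(key), val: []byte(val)}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

// withValueBytes is a hypothetical option that stores the caller's slice
// as-is, without copying. The caller must not modify val until the
// operation has completed.
func withValueBytes(val []byte) opOption {
	return func(o *op) { o.val = val }
}

func main() {
	payload := []byte("binary-payload")
	o := newPut("/registry/pods/a", "", withValueBytes(payload))
	// Reports whether the op shares the caller's backing array.
	fmt.Println("shares backing array:", &o.val[0] == &payload[0])
}
```

The trade-off is the usual one for `[]byte` APIs: the copy goes away, but so does the guarantee that the caller cannot change the bytes afterwards.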
I.e.
@smarterclayton is this an end-to-end profile or a microbenchmark? Is there a major performance penalty in terms of latency/throughput? The overhead of doing a write to etcd is already very expensive compared to an allocation; it's not clear if saving the alloc here is a big win for overall system performance.
In this case I was demonstrating a microbenchmark, but it has also come up in my tests for high range transactions. I've been prototyping changes for Kubernetes that allow high-cardinality range selection from a Txn (hundreds of key subsets), and the 3x penalty on keys contributes materially to the construction of the query. In embedded cases, this is a nontrivial amount of allocations and copies and becomes more noticeable (I'm investigating local indices with embedded that mirror a primary server under a different cluster). While it may not be a huge issue in most cases, having an API that requires those allocations and copies for a kv store is unusual.
@smarterclayton the embedded case is still sort of pricey without #4709 since the client will still need to go through a socket (e.g., unix socket) to get to the embedded server. Ideally the embedded client would have implementations of KV/Txn/etc that make direct method calls into Anyway, the reasoning behind the copy, just to clarify that some thought was put behind it, is that strings communicate/enforce constness while
This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.
clientv3 OpPut requires the value as a string. For clients that produce binary output, if OpPut is not inlined then two copies are required (one to turn []byte into a string and one to turn it back into []byte). This causes significant additional garbage collection in write-heavy workloads.
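The round trip described above can be sketched in isolation. This is not the etcd code: `putRequest` and `opPut` are hypothetical stand-ins for a string-based constructor that ends up storing `[]byte`, and `//go:noinline` is used only to mimic the non-inlined case. `testing.AllocsPerRun` reports the average number of allocations per call; the exact figure depends on the compiler version and payload size, so none is claimed here.

```go
package main

import (
	"fmt"
	"testing"
)

// putRequest is a hypothetical stand-in for the request message, which
// holds the key and value as []byte.
type putRequest struct {
	Key   []byte
	Value []byte
}

// sink keeps results reachable so the compiler cannot discard the work.
var sink putRequest

// opPut mimics a string-based constructor. Each []byte(...) conversion
// copies the string's contents into a new slice.
//
//go:noinline
func opPut(key, val string) putRequest {
	return putRequest{Key: []byte(key), Value: []byte(val)}
}

func main() {
	payload := make([]byte, 1024) // binary value produced by the client

	allocs := testing.AllocsPerRun(1000, func() {
		// First copy: []byte -> string at the call site.
		// Second copy: string -> []byte inside opPut.
		sink = opPut("/registry/pods/a", string(payload))
	})
	fmt.Printf("allocations per put: %.0f\n", allocs)

	// The stored value does not share memory with the original payload.
	payload[0] = 0xFF
	fmt.Println("value unaffected by later writes:", sink.Value[0] == 0)
}
```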
I am not seeing OpPut inlined without an explicit call in Kubernetes. For very small keys and values the overhead is 10% of total space:
and is high for objects:
Since key and val are private it is not possible to adjust the client library without dropping into raw protobuf.