RFE: Native Client3 - no socket for embedded servers. #4435
Comments
Talked with @bgrant0607 about embedding etcd inside the API server to avoid the extra network/CPU path, but he mentioned this is not a good way to go. He might want to share some opinions. |
@xiang90 If it's hidden behind the client library, it shouldn't matter. It's a deployment constraint. |
@timothysc I understand that part. I am just not sure if k8s would actually embed etcd... |
Upstream may not, but that doesn't prevent anyone downstream. |
@timothysc I am fine with this idea. The etcd server's/client's API layer is pretty decoupled from the other components, so it is quite easy to add support. But this will not be a high-priority item until we make v3 stable. |
ack. |
Long-lived sessions could reduce handshake overhead. There are multiple disadvantages of linking etcd into the apiserver in real clusters (would be great for hyperkube, though):
Borg uses a paxos log, not a key-value store. It has a highly compressed representation and supports multi-object transactions, but assumes that the state can only be written and read by the elected master, that all the state is reconstructed in memory before it can serve requests (i.e., post-crash recovery takes a while), and doesn't support Watch. In Omega, we split out the store. |
I don't believe this is any different from today, for openshift deploys.
Yup, that would be a trade-off.
The strategy outlined would "allow" for the embedding, not force it, meaning dealer's choice on the matter. Depending on the deployment, we can allow for multiple strategies. |
/cc @jeremyeder |
@xiang90 Do the caches make any sense in the native client case? Wouldn't etcd hold the copy in memory in this case anyway? |
@timothysc etcd will hold all recent data in memory. It does not make a lot of sense to do client side caching in this case unless you are going to have a very tight loop. |
Keep them separated, but do kernel bypass for loopback connections...this is called tcp_friends and we failed, repeatedly, to get it upstream 4 years ago. Solaris and Windows can do it. @netoptimizer -- any thoughts from Spain? Onbox/hot-path etcd<-->kube via unix domain socket...can we do that now? |
@jeremyeder It's still overhead + extra caching. Making the client native makes little difference to the higher-level code. We've already smashed the executables into a monolith as is, so why not make the simple transition and stop pretending otherwise in the smashed deployment config? It "should be" a single initialization config change. As it is today, it's like talking over cell phones even though you're in the same room: buffering + encode + decode + latency. |
Ah if it's that simple then great. tcp_friends was to accelerate apps that weren't so simply dealt with. |
We (unfortunately) don't have any plans for tcp_friends ... I'm fixing
|
@timothysc I am closing this in favor of #4709. This is a dup of #4709, and #4709 provides a more convincing reason to do it. It would benefit you the same if implemented. |
Currently the client interface(s) to etcd all assume that the client is talking over a socket, which imposes a number of performance constraints if you wish to embed etcd into a master/worker.
The main premise behind this would be to eliminate the encode<>decode<>transport + security handshakes, etc., from the critical path of distributed systems.
In the case of Kubernetes, it would mean smashing etcd into the api-server, very similar to what is outlined for the borg-master in the paper. This would also allow the api-server to simply reference the client library and still enable an external etcd if needed.
/cc @wojtek-t @hongchaodeng @xiang90