This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

Use gRPC to communicate the engine and agents #1426

Merged
merged 34 commits into from
Aug 12, 2016

Conversation

hectorj2f
Contributor

This PR provides a new communication mechanism that improves performance, data transmission, and unit-state sharing between the fleet engine and the agents in a fleet cluster.

Motivation: In our infrastructure, we have experienced some issues with fleet in terms of scalability, performance and fault-tolerance. Therefore we'd like to present our ideas to help improve those areas.

We use gRPC/HTTP2 as the framework to expose all the required operations (schedule unit, destroy unit, save state, ...) needed to coordinate the engine (the fleet node elected as leader) with the agents. In this implementation, we provide a new registry that stores all the information in memory. Nevertheless, it is also possible to keep using the etcd registry.
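For illustration only, here is a rough sketch of what such a set of operations could look like as a Go interface. None of these names are taken from this PR; they only hint at the kind of calls the agents would make on the engine over gRPC:

```go
// Hypothetical sketch of the operations an elected engine could expose to the
// agents over gRPC. All type and method names here are assumptions for
// illustration, not the PR's actual API.
package registry

// UnitState mirrors the transient state an agent reports for a unit.
type UnitState struct {
	UnitName    string
	ActiveState string
	MachineID   string
}

// EngineRegistry is what agents would call, via gRPC, on the elected engine.
type EngineRegistry interface {
	ScheduleUnit(unitName, machineID string) error        // engine assigns a unit to a machine
	DestroyUnit(unitName string) error                    // remove a unit from the cluster
	SaveUnitState(state UnitState) error                  // agent reports a unit's current state
	UnitsScheduledTo(machineID string) ([]string, error)  // agent asks for its workload
}
```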

Generally, this implementation provides the two options mentioned above. You can use etcd if that fits your architecture or requirements better, or you can use the in-memory registry to reduce the dependency on etcd (though not to avoid etcd altogether). In that direction, we found in our infrastructure that a high workload on etcd leads to poor or incorrect behavior of fleet. Besides, we believe that using etcd to provide inter-process communication for the agents could turn into a bottleneck, and it also affects fleet's fault tolerance.

Additional information and plots about the motivation of this PR can be found below: #1426 (comment)

This PR has been labeled as WIP; we are still working on improvements, fault tolerance, bug fixes, etc. :)

NOTE: If you want to try this PR, you need to rebuild the Go dependencies, preferably using Go v1.5.1. This PR was so big that we were forced to exclude our new Go dependencies from it.

@kayrus
Contributor

kayrus commented Feb 5, 2016

@hectorj2f could you provide more info on the use cases of this PR? Are you trying to implement a new protocol for fleet daemon communication and avoid etcd because it is slow? Is that correct?

@hectorj2f
Contributor Author

could you provide more info on the use cases of this PR?

@kayrus Yes, we'll add more information to this PR description such as motivation, use cases, etc...

Are you trying to implement a new protocol for fleet daemon communication and avoid etcd because it is slow? Is that correct?

Generally, this implementation provides the two options mentioned above. You can use etcd if that fits your architecture or requirements better, or you can use the in-memory registry to reduce the dependency on etcd (though not to avoid etcd altogether). In that direction, we found in our infrastructure that a high workload on etcd leads to poor or incorrect behavior of fleet.

@mischief
Contributor

mischief commented Feb 5, 2016

@hectorj2f perhaps it would be worth your time to investigate using fleet with the etcd v3 api, which uses gRPC.

@hectorj2f
Contributor Author

@mischief yes, that wouldn't be a bad idea, although we believe that using etcd to provide inter-process communication for the agents could turn into a bottleneck, and it also affects fleet's fault tolerance. Perhaps @htr can shed more light on this to clarify our intention with this PR.

On the other hand, and related to your suggestion: do you have benchmarking results comparing the benefits of etcd v2 vs. v3? We also use etcd in other projects :).

@kayrus
Contributor

kayrus commented Feb 8, 2016

@htr
Contributor

htr commented Feb 9, 2016

This PR addresses issues related to our fleet usage pattern. Usually our fleet cluster isn't composed of many nodes (<50), but it is expected to handle a non-trivial (at this time) number of units (>3000).

Currently, fleet's only registry implementation uses etcd to store both transient and persistent data. More units mean more unitStates being updated per second, which after some time causes instability (an update can be very quick (<50 ms) but also quite slow (~1 s)).

This PR implements a multiplexing layer on top of the former EtcdRegistry (a rough sketch follows the list below). This means:

  • from the agent perspective:
    • each time an engine change is detected, a gRPC connection to the new engine is established
    • any Registry operation is forwarded (via gRPC) to the engine
  • from the engine perspective:
    • every time a new engine gets "elected", it starts a gRPC server
    • when an engine loses its leadership, it stops the gRPC server, terminating existing connections
    • all the transient data is stored in memory
    • the persistent data (units, desired state and schedule) is stored in etcd
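As a rough sketch of that lifecycle (not the code in this PR; the generated service registration and the registry types are assumptions, marked in comments):

```go
// Sketch only: an engine starts a gRPC server while it holds the lease and
// stops it when leadership is lost; an agent re-dials whenever a new engine
// is elected and forwards Registry operations over that connection.
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
)

// Engine side: one gRPC server per leadership term.
type rpcEngine struct {
	server *grpc.Server
}

func (e *rpcEngine) onLeadershipAcquired(addr string) error {
	lis, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	e.server = grpc.NewServer()
	// pb.RegisterRegistryServer(e.server, newInMemoryRegistry()) // hypothetical generated API
	go func() {
		if err := e.server.Serve(lis); err != nil {
			log.Printf("registry server stopped: %v", err)
		}
	}()
	return nil
}

func (e *rpcEngine) onLeadershipLost() {
	if e.server != nil {
		e.server.Stop() // terminates existing agent connections
		e.server = nil
	}
}

// Agent side: re-dial whenever a new engine is elected.
type rpcAgent struct {
	conn *grpc.ClientConn
}

func (a *rpcAgent) onEngineChanged(engineAddr string) error {
	if a.conn != nil {
		a.conn.Close()
	}
	conn, err := grpc.Dial(engineAddr, grpc.WithInsecure())
	if err != nil {
		return err
	}
	a.conn = conn
	// a.registry = pb.NewRegistryClient(conn) // hypothetical generated client stub
	return nil
}

func main() {} // placeholder so this sketch compiles standalone
```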

Another issue that we encountered was the lack of cleanup (or GC) in fleet's etcd directory. This PR doesn't address that issue, but having a single writer makes the development of a solution much easier.

This implementation can coexist with the traditional etcd-centric implementation.

We have a micro benchmark (note the emphasis on micro benchmark) that we use to compare tweaks, configuration parameters, etc:

  • start NN units spaced by 100ms and wait until NN-10 units are actually started

The time between the start API call (or desired-state change) and the actual start (each unit curls an endpoint with a unique identifier) is measured and stored. This delay is expected to increase with the number of units running in the system.
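For illustration, a tiny collector along these lines could record those delays. This is an assumed sketch, not the benchmark code that produced the plots below; the port, path, and parameter names are made up:

```go
// Hypothetical benchmark collector: the driver calls markStarted right after
// asking fleet to start a unit, and each unit curls
// http://collector:8080/started?unit=<name> once it is actually running.
package main

import (
	"fmt"
	"log"
	"net/http"
	"sync"
	"time"
)

var (
	mu        sync.Mutex
	startedAt = map[string]time.Time{} // unit name -> time the start API call was issued
)

func markStarted(unit string) {
	mu.Lock()
	defer mu.Unlock()
	startedAt[unit] = time.Now()
}

func main() {
	http.HandleFunc("/started", func(w http.ResponseWriter, r *http.Request) {
		unit := r.URL.Query().Get("unit")
		mu.Lock()
		t0, ok := startedAt[unit]
		mu.Unlock()
		if ok {
			fmt.Printf("%s started after %v\n", unit, time.Since(t0))
		}
		w.WriteHeader(http.StatusOK)
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```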

[benchmark plot: fleet 0.11.3 with some caching ("current fleet")]

[benchmark plot: this branch ("github com-htr-grpc-wip")]

@htr
Contributor

htr commented Feb 9, 2016

perhaps it would be worth your time to investigate using fleet with the etcd v3 api, which uses gRPC.

@mischief we might, but I still think separating the storage of transient and persistent data is important.

"github.com/coreos/fleet/registry"
)

func (e *Engine) rpcLeadership(leaseTTL time.Duration, machID string) lease.Lease {
@hectorj2f
Contributor Author
@xiang90 Yes, this fleet version doesn't use the etcd clientv3 yet. But I always keep one eye on your changes.

@hectorj2f hectorj2f changed the title [WIP] Use gRPC to communicate the engine and agents Use gRPC to communicate the engine and agents Apr 4, 2016
@hectorj2f
Contributor Author

@kayrus @tixxdz @jonboulle I will update the PR to the latest version. But do you want me to include the grpc dependencies in the Godeps folder? That could make this PR harder to review due to the number of files changed.

@kayrus
Contributor

kayrus commented Apr 4, 2016

@hectorj2f Just create an additional PR for the godeps.

@hectorj2f
Contributor Author

@kayrus Alright, thanks 👍

@jonboulle
Contributor

hmm, an additional commit should be fine..?

On Mon, Apr 4, 2016 at 5:02 PM, kayrus notifications@github.com wrote:

@hectorj2f https://github.com/hectorj2f Just create additional PR for
godeps.



@kayrus
Contributor

kayrus commented Apr 4, 2016

@jonboulle it will be hard to review the whole PR

@jonboulle
Contributor

Eh, having it in a separate commit seems sufficient to me, but whatever

On Mon, Apr 4, 2016 at 5:24 PM kayrus notifications@github.com wrote:

@jonboulle https://github.com/jonboulle it will be hard to review whole
PR



@hectorj2f
Contributor Author

@jonboulle If we do so, there would be approx. 230 files changed, and GitHub doesn't work well with that kind of PR :/.

@dongsupark
Contributor

Recently I have been looking into this PR.
First of all, I totally agree with the original problem analysis and with the approach of using gRPC. This seems to be the best way to avoid the bottleneck on the etcd side. Given that etcd v3 is being implemented on top of gRPC, I think this is the way to go.

On the other hand, it was not easy for me to follow each commit in this PR, from the older to the newer ones. Is there any chance to reorganize the PR, to reduce the number of commits? (Though I can already see that it would not be trivial...)

And today I rebased this PR on top of master, fixing a minor thing to make it compile (see the branch endocode/dongsu/grpc_engine_communication), and I ran the functional tests.
The good news is that it runs without errors with enable_grpc disabled.
The bad news is that 16 of 20 tests failed with enable_grpc enabled. I suppose that could be the first issue for us to investigate.

@hectorj2f
Contributor Author

The bad news is that 16 of 20 tests failed with enable_grpc enabled. I suppose that could be the first issue for us to investigate.

@dongsupark I will update this branch against the latest version and fix the tests ASAP. I'll let you know if I find any blocking error. The goal is to have all the tests passing, or to adapt them if needed.

Is there any chance to reorganize the PR, to reduce the number of commits? (Though I can already see that it would not be trivial...)

I feel your pain. I can squash our commits, but let's do that once the tests are happy and the code is LGTMed.

@antrik
Contributor

antrik commented Apr 12, 2016

@hectorj2f I think the point is that having a messy history actually makes the review harder... While GitHub favours reviewing all commits as one big diff, that's not the traditional way to deal with pull requests, and it can be quite daunting for complex changes. Thus it's better to clean up the history before submitting upstream, to help the review.

Note that "clean up" doesn't typically mean to squash everything into a single commit -- that wouldn't really help. Such complex changes can usually be made as a logical series of individual commits, each representing some meaningful change that can be explained on its own, and doesn't result in anything breaking...

@hectorj2f
Contributor Author

@antrik Alright! It makes sense; I'll group the commits and add helpful messages to them.

@dongsupark
Contributor

@hectorj2f

16 of 20 tests failed with enable_grpc enabled

Ah, sorry. This was due to my mistake. For some reason I set ExecStart=/opt/fleet/fleetd --enable_grpc=1 in util/config.go, which is not supported. I should have configured enable_grpc=1 in /opt/fleet/fleet.conf. Doing that, the functional tests work correctly. Sorry for the noise. (Fundamentally I don't like nc.Fleetctl failing silently without printing stderr, but that's another issue...)

As for the logical structure of the commits, I think it was not that bad in the beginning. The first 10 commits were relatively easy to understand and well-structured. It's just that at some point it started to grow like that. :-)

@hectorj2f
Contributor Author

@dongsupark I rebased against master, but I am getting some errors in the fleetctl tests: https://gist.github.com/hectorj2f/c53edf850f67f6279559749a30727862

Does it ring any bells? Where should I look?

@hectorj2f
Contributor Author

@dongsupark Should I add the LICENSE header to the new files? Or would you do that once it is merged?

@dongsupark
Contributor

@hectorj2f That looks like an error from the unit tests. I haven't seen such an error.
Sorry, but I'm on vacation this week. I'll probably be able to look into it next week.

@hectorj2f
Contributor Author

Alright @dongsupark! I'll see what I can do.

@dongsupark
Contributor

FTR, the error in the unit tests was already solved earlier this week.
The merge conflict between #1426 and #1657 was also resolved yesterday.

LGTM.
I'll create another PR just for running the functional tests on semaphoreci.
If that passes, I'd like to merge #1426, #1657, and #1656 tomorrow.

@hectorj2f
Contributor Author

Great @dongsupark
