reworking routing for performance and being more powerful #23
Comments
Maybe you can get rid of use case E by adding a "weight" to each endpoint. Anyway, balancing with regexps is very hard and inefficient, and I doubt that anybody will use it if they have A + weights. |
I’m using carbon-relay-ng to feed old graphite-compatible apps into an influxdb server, so I’m not using any of the functionality you’re talking about (right now). I don’t plan to use it either, as I’ll remove everything graphite as soon as all my clients have been migrated to native influxdb… Thanks anyway.
|
@vladimir-smirnov-sociomantic yeah, frankly i'm not so sure about E. I just added it because i assume that's why @rcrowley added the "first_only" parameter. i'd love to know @rcrowley's background/context so we can maybe get rid of the first_only stuff. |
looking at the initial commit, it seems @rcrowley did this so you could send all your metrics of staging and prod to the same relay, but send them to separate servers, like so:
so that's basically E, it probably makes sense to keep support for this, but as a pool of endpoints within 1 route. |
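To make the "pool of endpoints within 1 route" idea concrete, here is a minimal Go sketch. It is not the relay's actual code: `Endpoint`, `FirstMatchPool`, `Pick`, and the server addresses are hypothetical names chosen for illustration, and matching is assumed to be a plain prefix check. Each line goes to the first endpoint in the pool that matches, so nothing is stored twice, while other routes remain free to receive the same line.

```go
package main

import (
	"fmt"
	"strings"
)

// Endpoint is a hypothetical pool member: a destination address plus
// the metric prefix it accepts. An empty Prefix accepts everything.
type Endpoint struct {
	Addr   string
	Prefix string
}

// FirstMatchPool is a hypothetical route-local pool. Only the first
// matching endpoint receives a line (the old first_only behaviour,
// scoped to a single route).
type FirstMatchPool struct {
	Endpoints []Endpoint
}

// Pick returns the address of the first endpoint whose prefix matches
// the line, or false if no endpoint in the pool matches.
func (p *FirstMatchPool) Pick(line string) (string, bool) {
	for _, e := range p.Endpoints {
		if strings.HasPrefix(line, e.Prefix) {
			return e.Addr, true
		}
	}
	return "", false
}

func main() {
	// Addresses are made up for the example.
	pool := &FirstMatchPool{Endpoints: []Endpoint{
		{Addr: "staging-graphite:2003", Prefix: "staging."},
		{Addr: "prod-graphite:2003", Prefix: "prod."},
	}}
	for _, line := range []string{
		"staging.web1.cpu 0.5 1411934220",
		"prod.web1.cpu 0.7 1411934220",
		"dev.web1.cpu 0.1 1411934220",
	} {
		addr, ok := pool.Pick(line)
		fmt.Println(line, "->", addr, ok)
	}
}
```

Because the first-match decision lives inside the pool, the global dispatcher never needs a first_only flag.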
I am very interested in having load balancing similar to the C carbon relay floating around out there. I have several cyanite instances running (4 currently, but I expect that to double) and I am taking in steady streams of data from several applications in the company and from statsd, so being able to load balance that traffic (round robin) is very useful as a scale-out for me. |
ok so the more I think about it, the more the proposed design makes sense to me. note: when I say "match" i mean you can match on one or more of: prefix_string, substring, or regex (you can save lots of performance by bypassing regex checks)
the key benefit here is that we can combine various use cases without the route matching and expected behaviors getting in each other's ways. thoughts/feedback ? |
think i'm going to start working on this in a |
+1
|
so I've been doing some work and pushed it to https://github.com/graphite-ng/carbon-relay-ng/tree/next I would like fellow developers to look at https://github.com/graphite-ng/carbon-relay-ng/blob/next/HACKING.txt which lists the main todo's. |
i've pushed a bunch of work and did some very basic testing. |
Can you give an example of new config? I'll try to test it. |
i just run it with the included config for testing |
pushed a bunch of updates again. it needs some more work around resending lines that were sent when a connection broke (there's also some test cases for this, that "almost" work), but for the most part it seems to work fine. i just ran it in our pipeline and it held up fine :) (without using the admin interfaces). |
it's been a month, progress has been going well enough. i merged |
the new relay (in master) is ready for more testing. the web and tcp admin interfaces are a WIP but you can run the relay from a config file and test. |
The main readme states that round robin of a list of destinations in a route is not implemented. Is that accurate? Did this issue not include round-robin support? |
it may not have included it initially but the relay has supported it for a while now |
Hi everybody.
I'd like to hear your thoughts on this. 1 and 2 are fairly obvious, but 3 could really use input on how you are using, or want to use, the relay.
regex matching has a lot of overhead. I don't think much can be done about this, so we should try to avoid regex matching when we can. I'm thinking of introducing "contains" and "starts_with" options for string checks in addition to regex, so that in many cases we could use those instead of regex, which will perform much better.
specifically, if the regex pattern is "", we shouldn't even do the regex Match() like we currently do.
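A minimal Go sketch of that idea, assuming a hypothetical `Matcher` type (not the relay's existing one) with `Prefix` and `Sub` fields standing in for the proposed "starts_with" and "contains" options. Cheap string checks run first, and the regex is neither compiled nor evaluated when the pattern is empty.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matcher is a hypothetical type for illustration. An empty Prefix,
// an empty Sub, or a nil Regex each mean "no constraint".
type Matcher struct {
	Prefix string         // "starts_with" check
	Sub    string         // "contains" check
	Regex  *regexp.Regexp // nil when no pattern was configured
}

// NewMatcher compiles the regex only when a non-empty pattern is given,
// so an empty pattern never costs a regex match at runtime.
func NewMatcher(prefix, sub, pattern string) (*Matcher, error) {
	m := &Matcher{Prefix: prefix, Sub: sub}
	if pattern != "" {
		re, err := regexp.Compile(pattern)
		if err != nil {
			return nil, err
		}
		m.Regex = re
	}
	return m, nil
}

// Match runs the cheap string checks first and the regex last, and
// returns as soon as any configured check fails.
func (m *Matcher) Match(line string) bool {
	if m.Prefix != "" && !strings.HasPrefix(line, m.Prefix) {
		return false
	}
	if m.Sub != "" && !strings.Contains(line, m.Sub) {
		return false
	}
	if m.Regex != nil && !m.Regex.MatchString(line) {
		return false
	}
	return true
}

func main() {
	// Prefix-only matcher: no regex is compiled or evaluated.
	m, err := NewMatcher("prod.", "", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Match("prod.web1.cpu 0.5 1411934220"))
	fmt.Println(m.Match("staging.web1.cpu 0.5 1411934220"))
}
```

A matcher with all three fields empty matches every line without touching the regex engine, which covers the empty-pattern case.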
I've noticed that first_only, while useful for some setups (see E below), prevents us from "also sending traffic elsewhere". so i started thinking about getting rid of first_only and making the routing/dispatching more powerful. that said, it's a neat mechanism because you can make routing decisions with just a bool check, no pattern checking needed.
so, taking a step back, i'd like to collect use cases of how people want to use carbon-relay-ng, especially since i know some people are interested in (and working on) sharding/consistent hashing and round robin, and i've been pondering how to keep routing/dispatching performant, yet viable for various/all use cases (ideally without running multiple carbon-relay-ng instances).
I see the following use cases
A consistent hashing / round robin to a pool of servers, for loadbalancing and HA (each pool is just one route)
B mirroring the same load to multiple machines, possibly a subset of traffic using regex pattern (for example to test influxdb next to graphite)
C sending a subset of metrics to anomaly detection
D sending all, or some, of the metrics to carbon-tagger
E poor man's load balancing: sending certain metrics to server A, some others to B and maybe some others to C, using regexes to control what goes where (maybe some storage is faster than others), and using first_only to make sure we don't store metrics on more than one system.
am i missing a case? first_only is really only used for E, but it doesn't really work anymore as soon as you want to send some metrics also elsewhere like in B, C or D.
What I'm thinking is, to keep the system powerful, and still efficient, without overcomplicating things too much, we should implement E as a multi-endpoint route just like we are approaching round robin and hashing/sharding. the key is that these are multiple endpoints within one route, instead of multiple routes in the "global level"
this way, on the global routing/dispatching level, we check the pattern (and string check) of every route, and if it matches, the route gets it. but a route can be a special one with multiple endpoints,
and send traffic to those endpoints using a dedicated mechanism:
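As a rough illustration of this two-level design, the Go sketch below offers every line to every route at the global level, and lets each matching route choose one of its endpoints with its own mechanism. All names (`Picker`, `RoundRobin`, `HashMod`, `Route`, `Dispatch`) and addresses are hypothetical, route matching is reduced to a prefix check, and `HashMod` is plain hash-modulo sharding, which is simpler than real consistent hashing.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// Picker is a hypothetical per-route mechanism that chooses one of n
// endpoints for a given metric key.
type Picker interface {
	Pick(key string, n int) int
}

// RoundRobin cycles through the endpoints and ignores the key.
// This sketch is not safe for concurrent use.
type RoundRobin struct{ next int }

func (r *RoundRobin) Pick(_ string, n int) int {
	i := r.next % n
	r.next = (r.next + 1) % n
	return i
}

// HashMod shards by FNV-1a hash of the key modulo n. Note: this is
// plain hashing, not consistent hashing; changing n remaps most keys.
type HashMod struct{}

func (HashMod) Pick(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(n))
}

// Route matches on a prefix (an assumption for brevity) and owns one
// or more endpoints. A nil Picker means "always use the first endpoint".
type Route struct {
	Prefix    string
	Endpoints []string
	Picker    Picker
}

// Dispatch offers the line to every route. Each matching route forwards
// it to exactly one of its endpoints, so routes never block each other.
func Dispatch(routes []*Route, line string) []string {
	key := line
	if i := strings.IndexByte(line, ' '); i >= 0 {
		key = line[:i] // the metric name is the first field of the line
	}
	var dests []string
	for _, r := range routes {
		if len(r.Endpoints) == 0 || !strings.HasPrefix(line, r.Prefix) {
			continue
		}
		idx := 0
		if r.Picker != nil {
			idx = r.Picker.Pick(key, len(r.Endpoints))
		}
		dests = append(dests, r.Endpoints[idx])
	}
	return dests
}

func main() {
	routes := []*Route{
		{Endpoints: []string{"graphite-a:2003", "graphite-b:2003"}, Picker: &RoundRobin{}},
		{Prefix: "prod.", Endpoints: []string{"influxdb:2003"}}, // mirror of a subset
	}
	for _, line := range []string{
		"prod.web1.cpu 0.5 1411934220",
		"staging.web1.cpu 0.4 1411934220",
	} {
		fmt.Println(line, "->", Dispatch(routes, line))
	}
}
```

With this split, mirroring (B, C, D) is just another route, while load balancing and first-match behaviour (A, E) stay inside a single route.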
thoughts?
cc @robinbowes @rcrowley @pauloconnor @willowpet @pwielgolaski @shiaho @vladimir-smirnov-sociomantic @prune998 @curtisgithub