topic capacity #307
+1 for the question. Would love to hear an updated answer on this question from the great minds https://groups.google.com/forum/#!topic/nsq-users/uR_-vjQkTbs
@mediafinance yea, I really need to get around to writing something longer form as this question comes up a lot. Another recent conversation on the mailing list, for example: https://groups.google.com/forum/#!topic/nsq-users/AeoiVMt37eE @visionmedia The answer is really context specific. It sounds like you're trying to route messages on the publishing side to specific consumers (perhaps external clients?). We didn't explicitly design NSQ for this use case. In fact we did quite the opposite - we try really hard to decouple publishers from consumers. This doesn't mean that you can't architect a system where things work harmoniously, though. First, to answer your question, But, one topic per client isn't the only way that this can be structured. As I had suggested in the link above, you can introduce multiplexing components that would be better suited to routing/targeting specific clients. I whipped up a quick visual of what this topology might look like: I think there are a lot of benefits to structuring the problem like this... How does this match up with your requirements? Can you provide a little more information as to what you're building? Hope this helps.
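For readers trying to picture the multiplexing idea, here is a rough, in-memory sketch in Python. It does not use NSQ or any client library. It assumes that every mux receives its own copy of each message published to the single topic, and that each message carries a `client_id` field; the `Mux` class, `publish` function, and field names are all hypothetical.

```python
class Mux:
    """Hypothetical multiplexer: sees every message, forwards only to its own clients."""

    def __init__(self, name, connected_clients):
        self.name = name
        self.connected = set(connected_clients)
        self.delivered = []  # (client_id, payload) pairs forwarded to connected clients
        self.discarded = 0   # messages addressed to clients this mux does not serve

    def handle(self, message):
        client_id = message["client_id"]
        if client_id in self.connected:
            self.delivered.append((client_id, message["payload"]))
        else:
            # The redundant work: the message was received only to be dropped.
            self.discarded += 1


def publish(muxes, message):
    """Stand-in for publishing to one topic: every mux gets its own copy."""
    for mux in muxes:
        mux.handle(dict(message))
```

The publisher never needs to know which mux a client is connected to, which keeps publishers and consumers decoupled. The cost is that every mux handles every message.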
thanks for the reply! yeah we can totally service the jobs that way, that's similar to what we're doing now through rabbitmq - hashing to 100 different topics and pulling from those in an attempt to handle busy producers more fairly. We could always bump that up to 1000 or so to help mitigate but it would be killer if the supported it.
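A minimal sketch of the hashing scheme described above, for anyone following along. It is not the poster's actual code: the function name, the `jobs` prefix, and the choice of CRC32 are assumptions made for illustration.

```python
import zlib


def bucket_topic(customer_id: str, num_buckets: int = 100, prefix: str = "jobs") -> str:
    """Map a customer to one of `num_buckets` topics using a stable hash.

    zlib.crc32 is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would break routing across restarts.
    """
    bucket = zlib.crc32(customer_id.encode("utf-8")) % num_buckets
    return f"{prefix}_{bucket}"
```

Raising `num_buckets` to 1000 spreads customers more thinly, but any customers that land in the same bucket still share one queue.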
The primary and most important difference between this proposal and your rabbitmq setup would be that this would not require any hashing. The Also notice there's only one topic. Each There is certainly a bit of redundant work being performed by
re-read your suggestion, sounds good to me, but wouldn't there still be a fairness issue? For example if producer A sent 15,000 messages at once, and then B / C sent 100 each, we'd be stuck processing A for a while if it's all in one topic (unless our maxInFlight chews through them fast enough)
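To put numbers on that concern, here is a toy calculation. It assumes strictly ordered, one-at-a-time delivery from a single queue, which is a simplification for illustration and not a statement about how NSQ actually delivers messages; the helper name is hypothetical.

```python
from collections import deque


def messages_ahead_of(queue, producer):
    """Return how many messages are dequeued before the first one from `producer`."""
    for position, owner in enumerate(queue):
        if owner == producer:
            return position
    return None  # producer has nothing in the queue


# Producer A sends 15,000 messages at once, then B and C send 100 each.
queue = deque(["A"] * 15000 + ["B"] * 100 + ["C"] * 100)

print(messages_ahead_of(queue, "B"))  # 15000
print(messages_ahead_of(queue, "C"))  # 15100
```

Under those assumptions B waits behind all of A's backlog, so how long B waits depends on how quickly the consumers drain it.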
jumping in on this thread. NSQ message flow control is driven by the clients; This means that it's up to the client to distribute its If however you had a single consumer and maxInFlight of 1, you'd get be connected to [1]: since this is client dependent; i'm speaking mostly of go-nsq behavior. semantics might be slightly different amongst other client languages.
I should clarify some more, all the terms come off sounding like a client/producer node haha. What I really meant was a client as in one of our customers. They send us data (20,000+ of them), and we need a mechanism to mitigate the issue of one customer sending a ton of data at once while the rest are stuck waiting in line as we process it. I can see the mux technique working, but with our number of customers growing every day I imagine all those copies might become a decent burden on the system? I was thinking maybe as an alternative (since we know beforehand) we could group all the small customers that don't produce much into one topic, and then the larger enterprise customers could each have their own, but it's hard to say how large that will get as well. Currently when one gets saturated in rabbit, all of the others that are hashed into that queue are left waiting, which is definitely not ideal; delivery time in our case definitely matters
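The grouping alternative floated above could look roughly like this. The topic names and the idea of keeping a known set of large customers are assumptions for illustration; nothing here is NSQ-specific.

```python
def topic_for(customer_id, large_customers, shared_topic="customers_shared"):
    """Pick a topic: a dedicated one for known large customers, a shared one otherwise."""
    if customer_id in large_customers:
        return f"customer_{customer_id}"
    return shared_topic


# Hypothetical set of enterprise customers known ahead of time.
large = {"acme", "globex"}

print(topic_for("acme", large))       # customer_acme
print(topic_for("small-shop", large))  # customers_shared
```

The number of topics then grows with the number of large customers only, though small customers sharing one topic can still delay each other.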
@visionmedia according to your description, it doesn't sound like you even need the I suspect some of your current issues aren't as relevant in the suggested NSQ setup (everything but the Now, this is all predicated on the fact that your consumer layer has sufficient resources to handle varying aggregate volume. At least, in this situation, either everything backs up or nothing, which I believe is desirable. Specifically, I'm suggesting:
cool thanks for the feedback! I'll do some more testing and see how things behave
hey! just a general question since I'm not super familiar with the internals yet, would you say it would be bad practice to use hundreds of thousands of topics? I was planning on using one per-client, in the past we've just hashed but that obviously has visibility and fair-queue cons, but I can't see things scaling too well if we go with per-project topic mapping either haha. really enjoying nsq so far! thanks :D