-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nsqd: channel sampling #286
Conversation
It is my opinion that sampling is a property of the client. This is similar to the state we're in now where sampling can only be implemented in the client by dropping messages on the floor. My proposal for this implementation is to move this to an equivalent-in-semantics, server-side feature that simply makes everything more efficient (which is often one of the goals of sampling in the first place). For the record, as things stand today, there exists no guarantee that 2 clients subscribed to the same channel will receive 50/50 distribution. It's likely that they will, but there's nothing that guarantees it. I think the claim that this approach makes it possible that two clients can be subscribed to the same channel with different sampling rates is equivalent to the likelihood of subscribing to the wrong channel, i.e. not likely. I acknowledge that it's obviously possible, but I believe naming convention and an improved |
❤️ it I know you're not going to like this old friend, but we need a test :) |
The test has been added and tests for sampling within a 3% variance of the sample rate. The final verdict was that this is being done through the |
The last commit was an addition assertion to ensure we've seen a minimum number of messages. The original test only checked for variance above the sample rate not below. |
I'm happy with this... any objections @jehiah ? |
I'm good. |
outputBufferTimeoutUpdateChan = nil | ||
case interval := <-heartbeatUpdateChan: | ||
client.Heartbeat.Stop() | ||
if interval > 0 { | ||
client.Heartbeat = time.NewTicker(interval) | ||
} | ||
// you can't update heartbeat anymore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you undo these unrelated changes that I had added from this line up to 226?
I want it in a separate commit with an explanation as to why it's happening... I will follow up with a separate PR.
This is still a work in progress, but I think a few things merit discussion before finishing.
The basics are done here for adding sampling through the
IDENTIFY
. I added a channelProperties struct like in the original commit to store meta information about the channel. The reason I added this again was for new clients. This way when a new client connects, if they have a different sampleRate from the another client on the same channel, either we can choose to throw an error (so all clients maintain the same sample rate) or we can allow it.I believe @mreiferson was suggesting having the channel have no knowledge of a client's sampling rate. I'm not sure if this is a good idea. Take for example, clientA sampling only 10% of the messages on channelA and clientB sampling 50% of the messages on channelA, what would we expect each client to get? 60%, 40%, 10% (because that client came in first), 50% because the client came last. I think it's not obvious what is to be expected by the client in this scenario. I like the idea of setting channelProperties (using the struct) because can then subscribe to the channel and be "upgraded" (or downgraded depending on how you view it) to sampling or to the correct samplingRate if the channel is already being sampled.