Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nsqd: fixing race with sending messages and topic/channel associations changing #828

Closed
wants to merge 1 commit into from

Conversation

stephensearles
Copy link

fixes #826

case msg = <-memoryMsgChan:
case buf = <-backendChan:
msg, err = decodeMessage(buf)
if err != nil {
t.ctx.nsqd.logf("ERROR: failed to decode message - %s", err)
continue
}
case <-t.channelUpdateChan:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the crux of the problem was actually that the select statement here could possibly never choose this path, or more likely, choose one of the message chan paths before choosing this one.

@@ -263,34 +269,49 @@ func (t *Topic) messagePump() {
goto exit
}

for i, channel := range chans {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then this loop happens while some other process could be trying to change the set of channels. I'm not sure that's a problem, but putting this under the topic's lock will help make channel operations more atomic.

@mreiferson
Copy link
Member

@stephensearles thanks for the PR and the investigation.

You're correct that it's theoretically possible for channel creation to be handled out-of-order from message routing (in the Go runtime we trust), but I don't think it's a practical issue worth solving and certainly not at the expense of larger critical sections between locks. More specifically, the edge cases where this would occur imply that other conditions exist that could cause messages to be "lost" to a new channel (e.g. publishing to the topic while creating said channel).

I'm 👎

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nsqd: not properly creating channels registered in lookupd on seeing a new topic
2 participants