nsqd: panic when running two instances with same -data-path #583
The way diskqueue filenames are currently implemented, it's not going to work. I don't have strong feelings on this either. It sounds like something we should probably just document? It isn't too hard a requirement to ask that data paths be unique. Thoughts @jehiah?
I'm not sure there is a portable way to solve this, but could we open
Easy enough on platforms with flock(2): https://github.com/cespare/kvcache/blob/master/db.go#L376-L391
@cespare cool. Would you be interested in contributing that for the platforms where we can support it?
I don't think it's as simple as locking on the metadata file - the nsqd instances would be competing to deliver and clean up any persisted message backlog. To me, the question is: do we actually want you to be able to point two nsqd at the same data path?
Oh, @mreiferson you mean that because we write to a temp file and replace the metadata, locking it is somewhat of a fool's errand. We don't want to allow two nsqd to point at the same data path; I was thinking of this as a way to ensure that you don't accidentally end up with that happening (fail fast, fail early).
@jehiah got it - maybe we just explicitly add a lock file to the data path that we can rely on to detect and fail fast?
@mreiferson @jehiah
You create the file if it doesn't exist, and then try to lock it. You don't need to delete it. When nsqd exits (maybe because it crashes), the flock is released.
(Also if you have a dir to put the lock file, you might as well just lock the dir instead.)
👍 cool :)
👍 to locking dir
RFR @jehiah Windows isn't implemented - if someone who's running on Windows wants to contribute those code paths that would be great!
}
n.swapOpts(opts)

err := n.dl.Lock()
if err != nil {
	n.logf("FATAL: --data-path=%s in use", dataPath)
I think it would be helpful if this error gave a little bit of a hint that it's in use by another nsqd, or that the resolution is to use a different/unique dirpath.
Thoughts?
sure
updated
I'm getting:
nsqd --lookupd-tcp-address=localhost:4160 --data-path=/data
[nsqd] 2017/04/21 16:18:59.356983 FATAL: --data-path=/data in use (possibly by another instance of nsqd)
This is on a macOS machine, but that's the only process starting nsqd. Could someone help me set up a directory to store disk-backed messages?
Please ask this kind of question at https://groups.google.com/d/forum/nsq-users
This panic above is caused by
I assume that maybe msgSize is negative, and running multiple instances in the same directory is possible?