nsqd: option to bound disk footprint #549
Comments
Thoughts, @jehiah?
To make sure I follow: you are suggesting that past this limit nsqd throws away messages from the oldest file, not that it refuses new PUBs, right?
Yes.
Interesting. I can think of some desire to bound disk size, but I would normally equate that to wanting backpressure. It feels like throwing away messages on disk would be more natural based on the count of messages on disk. Would that target the same need here, or is there a demonstrable difference in use case between the two approaches?
@jehiah Here's the IRC conversation I had with @mreiferson: https://gist.github.com/cespare/a353b739e4511842aeb1 I would prefer to bound disk space by bytes, rather than by number of messages, since I know how much disk I would like to be made available to NSQ. (To be honest that's what I'd want for the size of the in-memory channel bounds, too, which are currently specified with |
The only argument I can think of for using count would be consistency with the existing
Yea, me too 😦 - it makes the most sense operationally. Assuming we're all on the same page that this feature is reasonable from an operational perspective, I think my only real concern is for it to be as future-proof as possible. P.S. The other thing I forgot to mention was that this new config option would be disabled by default (i.e. retain the current "infinite" behavior).
@mreiferson Sleeping on this issue has made me think about the value of a cleaner separation between ephemeral (not persisting topic/channel structure beyond connections) and max-message depths where overflows are discarded (ignoring for the moment whether the most recent, the oldest, or any message is thrown away), either by size or by number, in memory or on disk or combined. That separation could also open up the opportunity to have a backpressure setting that determines whether to refuse new PUBs (like our current disk write errors) or to discard messages beyond set limits. (I haven't completely thought through how these do or don't play nice with planned future disk storage changes.) @cespare I mention message count as a limit because there are several spots where I would find that more useful than byte size limits. We also use a script like this to drop messages beyond a threshold in some cases (like a dev instance).
@jehiah these are interesting ideas - I think there was some IRC discussion recently debating some of this separation. I think the real tricky part is that all of this "configuration" needs to happen at runtime, so talking about the potential ways to implement that is important. Do you feel like this conversation needs to happen as it pertains to this issue (a knob to bound disk footprint)? I lean towards these being separate. I do like the idea of being able to use either of size/count to configure these options, though. That does have some relevance to this discussion (and resolves the potential inconsistency we were about to introduce).
I think if you don't give the user an option, the proper way is to stop accepting new messages, with an error, once the limit is hit.
@earwin I agree. We're okay with getting a failed PUB; tossing out old messages would not be desirable. Both use cases seem valid.
I would be in favor of no longer accepting new messages. Deleting old messages is not desirable.
The option to delete old messages and the option to stop accepting new messages both have their use cases.
There are two separate concerns: functional (how the application wants to deal with overflows) and operational (what happens to the machine when an overflow occurs). Applications might want to delete old ones, new ones, low-priority ones, every second one, and so on; supporting this is a long and painful road, which I'm not sure nsq should take. As for operational concerns, we have a de facto contract: nsq eats up disk space, then stops accepting new messages. Rejecting incoming messages when hitting a disk usage limit breaks nothing and fixes two points above.
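A minimal Go sketch of the "reject new messages at the limit" behavior argued for above. Everything here is hypothetical and only illustrates the accounting: `cappedQueue`, `errQueueFull`, `put`, and `release` are names invented for this example, not part of nsqd or its disk queue, and no real files are touched.

```go
package main

import (
	"errors"
	"fmt"
)

// errQueueFull is a hypothetical error value; nsqd does not define it.
var errQueueFull = errors.New("disk queue byte limit reached")

// cappedQueue models only the byte accounting of one topic/channel queue.
type cappedQueue struct {
	maxBytes int64 // 0 keeps the current unbounded behavior
	used     int64 // bytes currently held by the queue
}

// put accepts a message only if it still fits under the limit.
func (q *cappedQueue) put(msg []byte) error {
	n := int64(len(msg))
	if q.maxBytes > 0 && q.used+n > q.maxBytes {
		// The publisher sees the failure and decides what to do:
		// retry later, drop the message, or route it elsewhere.
		return errQueueFull
	}
	q.used += n
	return nil
}

// release frees space once a consumer has finished with n bytes.
func (q *cappedQueue) release(n int64) {
	if n > q.used {
		n = q.used
	}
	q.used -= n
}

func main() {
	q := &cappedQueue{maxBytes: 10}
	for _, m := range []string{"hello!", "world!"} {
		if err := q.put([]byte(m)); err != nil {
			fmt.Printf("PUB %q rejected: %v\n", m, err)
			continue
		}
		fmt.Printf("PUB %q accepted, used=%d\n", m, q.used)
	}
}
```

With this shape the queue never decides which messages matter; a rejected publisher keeps that choice, which is the point made in the comment above.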
I see two different use cases with different requirements.
So we have three options
The option to drop the newest messages instead of rejecting them is stupid (for lack of a better word). A client is perfectly capable of dropping the message itself after rejection (or maybe putting it into another queue, or a gazillion other options); it's not nsq's place to decide. While you see two use cases, there are many more.
I agree completely. This work would definitely make me happy regarding this problem. #625
An option to bound the disk queue size would be useful for us. We have staging and production consumers. Currently, if staging stops consuming, all the disk space could be consumed by the staging channel, which would affect production as well. Are there any plans for implementing the limit for disk usage? What changes would be necessary to implement a global option to limit channel/topic disk size?
In a conversation with @cespare on IRC, it would be useful to have a configuration option to bound the on-disk footprint of a given DiskQueue (topic/channel).
I propose we add a configuration option/flag --max-bytes-per-<INSERT GOOD NAME HERE> that will denote the maximum size for a DiskQueue. This means that you will effectively configure this at the nsqd level, but it will apply to an individual topic / channel's DiskQueue.
Since DiskQueue is chunked by --max-bytes-per-file (and it probably doesn't make sense to require this new maximum to be divisible by the chunk size), the easiest implementation would ephemerally track the aggregate size of those files (rounding down), and unlink the oldest file (updating metadata as appropriate).
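The proposal can be sketched in Go roughly as follows. This is an in-memory model under stated assumptions, not the actual DiskQueue code: `segment`, `boundedQueue`, `newBoundedQueue`, and `put` are hypothetical names, file rollover is simplified (a new file is started before a write that would overflow the current one), and "unlinking" is represented by removing an entry from a slice rather than deleting a file.

```go
package main

import "fmt"

// segment is a hypothetical stand-in for one on-disk queue file.
type segment struct {
	num  int64 // file number; lower numbers are older
	size int64 // bytes written to this file so far
}

// boundedQueue tracks the aggregate size of a queue's files in memory only.
type boundedQueue struct {
	maxBytes     int64     // proposed per-queue limit; <= 0 means unbounded
	maxFileBytes int64     // chunk size, in the spirit of --max-bytes-per-file
	segments     []segment // oldest first
	total        int64     // sum of all segment sizes
	nextNum      int64
}

func newBoundedQueue(maxBytes, maxFileBytes int64) *boundedQueue {
	return &boundedQueue{maxBytes: maxBytes, maxFileBytes: maxFileBytes}
}

// put records an n-byte write and returns the numbers of any files that
// were dropped to get back under the limit.
func (q *boundedQueue) put(n int64) (dropped []int64) {
	last := len(q.segments) - 1
	if last < 0 || q.segments[last].size+n > q.maxFileBytes {
		// Roll over to a fresh file.
		q.segments = append(q.segments, segment{num: q.nextNum})
		q.nextNum++
		last++
	}
	q.segments[last].size += n
	q.total += n

	// Drop whole files, oldest first, but never the file being written.
	// A real implementation would unlink the file and update metadata here.
	for q.maxBytes > 0 && q.total > q.maxBytes && len(q.segments) > 1 {
		old := q.segments[0]
		q.segments = q.segments[1:]
		q.total -= old.size
		dropped = append(dropped, old.num)
	}
	return dropped
}

func main() {
	q := newBoundedQueue(250, 100)
	for i := 0; i < 5; i++ {
		dropped := q.put(60)
		fmt.Printf("write %d: total=%d bytes, files=%d, dropped=%v\n",
			i, q.total, len(q.segments), dropped)
	}
}
```

Because whole files are dropped, the limit does not need to be a multiple of the chunk size; the footprint simply settles at or below the limit after each write, as long as more than one file exists.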