-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add queue inspector facility #29
Conversation
I mentioned this in the gitter in passing but am more than aware I'm going outside protocol by barging in with a PR with not even an issue to validate the idea; Please do not hesitate to bat this back if a more appropriate solution should be used. The doc update is speculative in nature; please feel free to deny it. Likewise, I'm happy to remove the formatting changes, as they're a touch on the obnoxious side ;) |
7cf874d
to
b56e8e0
Compare
This is more or less ready to review from my perspective. The third commit (below) has the meat - the preceding two are debatable, and I'm open to removing them; nobody likes a sticky-beak ;) If the tests should be moved to another file (or should lean less on excessive sleeping; I could change to use signalling if you wish), or you'd like me to add a short test to the core tests in |
b56e8e0
to
01aeb00
Compare
Thanks for the PR! A couple of initial thoughts - Is it possible (and not too much trouble) to send the For the monitoring mechanism, is the regular reporting a necessity or a nice-to-have for you? E.g. would a simpler callback-based If the regular report is the main aim, could we abstract away the underlying queue design, and just pass the queue's size and maximum capacity to the callback? Is there any action the callback could take that would alleviate the problem or modify the queue? Thanks again! |
(The formatting/xUnit updates also look like a win, to me 👍 ) |
Thanks Nick! Will split off the Re monitoring, having the pushing be managed/owned by sink was more an "easiest way to get something going" assume more people are interested thing. An My stretch goal for this is to be able to use a similar pattern to batch audit transmissions going to kafka with a way to alert on delays (i.e. if the design can be similar, tjat would be a good sign that it's a flexible scheme). Another thing that springs to mind is that having a monitor (esp on a push basis, all-but-covers the In my initial thinking I was going to pass 2 args to the action (count and limit), but replaced that with the whole object on the basis that there might be other things people would be interested to grab. However I can see that wrapping it would give important isolation, and it also takes some namespaces out of the picture. One complication is that the caller needs to consider the lifetime / ObjectDisposedException possibilities (on reflection they probably had to anyway, and an encapsulating wrapper can concea; that.) Is there prior art for having the config DSL to hand out a ref to an inner object? I'll get cracking on calving off the tidying into a separate PR for now... |
@jose3m @cocowalla Do either of you guys happen to have ideas as to what you'd like to see in an API such as this - can this scheme possibly address the desire behind #13 ? |
01aeb00
to
cd58cbd
Compare
cd58cbd
to
579a0cf
Compare
Having written far better tests in the preceding PR, I'm hating the one I wrote here ;) I'm thinking that regardless of whether the API remains as an active Timer-driven callback or we provide a way to reference the inner buffer in the course of configuration, I'm happy to work on that when we reach an agreement as to how it should look; right now all I'm thinking is "you don't want to go emitting refs to buffers as an out arg", without having any ideas that are not in line with the API that I have as it is. (Spiking an |
@nblumhardt Should hopefully be more acceptable now - the names can probably benefit from a second set of eyes and the readme perhaps needs to be less preachy, but this provides a more flexible hook without consuming a Timer or adding another target as I had previously done. I've left the commit that implements the first scheme on the branch for posterity but am more than open to eliding it - your call. Wild stab: Arguably the test is informative enough that it merits moving into the core Test Class ? (though instinct tells me this is equally likely to just be confusing) |
3534d8e
to
7ff190f
Compare
e78f22f
to
efe288c
Compare
Main thing I'd want is the possibility to do out-of-band alerting, which your changes certainly cover - I also like how simple this change is, while still providing a mechanism for the dev to plugin whatever monitoring mechanism they choose.
I wonder about this - is this done in any other sinks? Mind, I'm not sure Serilog actually exposes any mechanisms for the dev to otherwise get a reference to an individual ISink! @nblumhardt also asked if there was anything a callback could do that mutates the queue in some way as to alleviate the problem. |
@cocowalla I missed the point about the mutation, thanks for raising this. My general view on this sort of thing is that we're best off keeping it simple in this impl (call it CQS if you wish!). Having the role of I can absolutely picture such a facility being useful for any number of tricks (one thing that springs to mind is doing compression out of band iff there is a tailback), I think starting off with the constraint of not having it is a good thing - it might force fanning out into some form of The other aspect is (if you look at my first commit), things get more hairy the more you expose in terms of portability and general multitargetting busywork; exposing the info only via a constraining wrapper which honours the ISP already takes a potentially complex interaction with a specific implementation out of the picture (and one might in time imagine another sink implementing such an introspection interface to allow one to inspect backlogs). Right now, this is an easy to reason about/review/trust implementation and I'd like it to remain that way (I'm here as a refuge from what happens if your Async trickery gets too complex in a competing system). Re the specific idea of dumping the contents of the queue somewhere, that's a useful idea though; the way the Seq Sink works in Durable mode has overlaps with this too. In my case, the sort of problem I'm looking to solve for is for an ephemeral instance in a cluster being able to store and forward telemetry type info but being able to deal with a) an ingester being down b) not letting stuff get orphaned when shutting down an instance c) using the RAM thats available under normal conditions and not going to disk to optimize throughput given one would be using Disk more as a fallback when problems happen than as a first class strategy for Durability. As part of this, you want an Async to isolate one from the latency variation, and then a varying strategy behind that. In general doing another As for the |
I'm thinking this PR (primarily in service of clarifying the tests - asserting that messages actually got dropped not just be asserting they don't make it through), could even cover #13 by maintaining a forever-accumulating |
Actually on reflection, having this count is as valuable to me as being able to glean the point in time state with regard to queue backlog. @nblumhardt what are your thoughts on this - is it breaking the PR SRP too much to add this in or do you think we should :kill: :allthebirds: ? |
efe288c
to
dcd24ac
Compare
Added |
06b8a92
to
d3b5309
Compare
/// <summary> | ||
/// Provides a way to inspect the current state of Async wrapper's ingestion queue. | ||
/// </summary> | ||
public interface IQueueState |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit- would be preferable to have this in its own file.
Another nit :-) ... it's closer to being the sink state than the queue state, since the dropped count isn't logically a property of the queue..?
Just looking at namespacing etc., could we tuck this away under Serilog.Sinks.Async.IAsyncLogEventSinkState
or something along those lines, so that it's less likely to conflict with other types? I can't imagine there are too many IQueueState
interfaces out there, but since it's not something a user will type regularly, a more descriptive name might not hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my language of choice it should not be a separate file ;) Agree and will do.
The specific name is way better (the original was a speculative generality land grab) - dropped events are not a thing one wants as a universal 'feature'
Looks great, Ruben 👍 I've just added some minor comments on naming/namespacing, open to ideas. Also did wonder whether the Any thoughts? |
Not quite getting how a Can you expand a little on how you can see it being shaped ? ( The other open question is what the out param (if it is to remain one) should be called given a type name of |
/// <summary> | ||
/// Accumulated number of messages dropped due to attempted submission having breached <see cref="BufferSize"/> limit. | ||
/// </summary> | ||
long DroppedMessagesCount { get; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any advance on this name?
This is ready to merge from my perspective, but then I don't really understand what this needs to do to play well with the config mechanism; awaiting further guidance ;) |
Thanks for the follow-up :-) I'm not wedded to this idea BTW, but just to clarify my suggestion about turning the mechanism inside out so that it works with config (which I didn't think through or explain very well), here's a bit more detail: First, just to clarify, the PR as it's currently shaped won't (shouldn't) break configuration - it just won't be possible to use the new monitor feature via configuration. The internals in a config-friendly version would be the same as they are in the PR, but instead of receiving the sink as an var monitor = new MyAsyncBufferMonitor();
// <snip>
// Passing the monitor in, instead of getting the state out
.WriteTo.Async(..., monitor: monitor) I.e. we pass something in, rather than out. Inside the monitor.MonitorState(queueState); The monitor interface itself would get a single method, and the sink would dispose the implementation, if it implements The config tie-in then allows a fully-qualified type name to be specified for the Rushing this reply out in transit, apologies if this is still vague, please let me know what you think :-) Cheers! |
Ah, thanks for taking the time to explain clearly; thankfully I [hope I] get it ;) I agree this scheme should work well and it does resolve the lifetime management more clearly. |
Added a skeleton implementation. Some observations:
This makes me think we should avoid adding the complexity (surfacing and documenting another interface that there are no known users of) at this point for now. The other thing for me is that the specific way one configures an Async wrapper is not the sort of thing one actually wants to be coding in a config file given the subtle and complex behaviour and perf implications. For me, such work is best done programmatically in a We could park this last commit in a PR in my repo (or the main repo if you prefer), with an issue marking the fact that a config interface is a TODO. Then, at the point where someone actually wants/needs this, the finessing can be done with much more context as to how someone is going to actually wire it up in practice. Or am I wrong ? (I do hope I am and hence arrive back here in a few months with the perfect abstraction to straddle multiple hosting environments.) |
f74117f
to
6546958
Compare
@nblumhardt Let me know your thoughts - if you still believe this approach is more idiomatic and/or feel I'm worrying too much, I am of course happy to add the tests and/or remove the |
Hey Ruben, thanks for all of the analysis and thoughts 👍 I would personally have not expected async to be used much from configuration, either, but it ended up being the number one feature request we received from when the async wrapper was created through to the addition of configuration support, e.g. here, here, here and here. Hence, although I agree that it's not the prettiest API in the world 😀, I think it will avoid future work if we can preempt the inevitable requests for configuration support, and add it here.
Very good point; I think it was a mistake for me to suggest we dispose the monitor, since we didn't (always) create it; calling
Agreed; we'd be better off with only the one way of doing things, as you've implemented in the PR.
👍 - I think what's there is probably fine for now. What do you think? |
For avoidance of doubt: the branch as it is has both in situ It's certainly hard to argue that people don't want it. By the same token, I'm pretty confident it gets misused more than a little too! So we're saying I should
|
Ah, thanks - sorry, I'd missed that. Yes, I think that's the way to go 👍 |
bcbf490
to
59452ab
Compare
OK, I think this feels pretty final, including updating the README and the test - I think the README example works out OK, or at least better than I was expecting yesterday. Let me know if there's more to do (NB esp the README and the xmldoc need a thorough nitpicking; I've traversed it way too many times, but we'll obviously pay the price N times if we don't take the time to polish it - i.e., please be as harsh as you can on it so I can (ideally) get it polished and put to bed at some point tomorrow). |
59452ab
to
ab156fe
Compare
ab156fe
to
17e0779
Compare
Epic! Thank you for sticking with this, I think (hope?! :-) ) that we got a great solution out of all the back-and-forth. Dev build incoming! 🎉 |
This PR adds a Health Monitoring hook, as described in the included
README.md
edit:-UPDATE: at the moment, it also addresses #13 as a side-effect