-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Try to drain incoming and peer queues for an amount of time #1440
Conversation
|
||
// let't try to process as many spans as we can in the next 100ms | ||
// TODO: make timer configurable? | ||
timer := time.NewTimer(time.Millisecond * 100) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now each collect loop iteration takes between 10ms to 20ms. Setting the timer to 100ms means we probably are going to process 5-10 items off of the queue each time. Is that enough?
I wonder if we should drain the queue based on the number of items process instead of a timer to make sure we are always draining X amount of items for each iteration.
I don't know which strategy is better. We can try the timer based first to see how it goes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the timer approach is worth trying. If we want it to be more flexible, we could add a temporary config option while we're testing it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, maybe we can add a metric here to see how many items we drain from the queue each time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to give this a try in production!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks interesting. Let's try it!
Which problem is this PR solving?
We're seeing that incoming and peer queues can sometimes build up and it takes a long time to recover. A theory is that this is because the collect loop has so many options, we should allow multiple messages to be read from the incoming and peer queues whenever that it's selected to be worked on.
Short description of the changes