
#1109 Make queue size of MetricJsonListener configurable #1114

Merged
merged 3 commits into from
Mar 11, 2016

Conversation

Contributor

@anuragw anuragw commented Mar 2, 2016

The queue size of the MetricJsonListener used by the HystrixMetricsStreamServlet has been made configurable in two ways:

  • new property "hystrix.stream.defaultMetricListenerQueueSize", which if unspecified will be set to 1000
  • optional request parameter "queueSize", which can override the default queue size for each request

The changes have been made on the 1.4.x branch, but can also be propagated to the 1.5.x RC branch. Please let me know if you need any further information or would like to suggest changes.
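The resolution order described above (property-backed default, optionally overridden per request) can be sketched as follows. This is an illustrative standalone sketch, not the PR's exact code; the names `QueueSizeResolver`, `resolveQueueSize`, and `DEFAULT_QUEUE_SIZE` are hypothetical.

```java
// Hypothetical sketch of the queue-size resolution described in this PR:
// a property-backed default (1000) that an optional "queueSize" request
// parameter may override on a per-request basis.
public class QueueSizeResolver {
    // Default value of hystrix.stream.defaultMetricListenerQueueSize
    static final int DEFAULT_QUEUE_SIZE = 1000;

    // param is the raw value of the optional "queueSize" request
    // parameter; it is null when the parameter is absent.
    static int resolveQueueSize(String param) {
        if (param == null) {
            return DEFAULT_QUEUE_SIZE;
        }
        try {
            int requested = Integer.parseInt(param);
            // Reject non-positive sizes and keep the configured default.
            return requested > 0 ? requested : DEFAULT_QUEUE_SIZE;
        } catch (NumberFormatException e) {
            // Malformed parameter: fall back to the configured default.
            return DEFAULT_QUEUE_SIZE;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveQueueSize(null));    // 1000
        System.out.println(resolveQueueSize("5000"));  // 5000
        System.out.println(resolveQueueSize("bogus")); // 1000
    }
}
```

Falling back to the default on malformed input (rather than failing the request) matches how a metrics endpoint would typically prefer degraded configuration over a 4xx response.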

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #369 SUCCESS
This pull request looks good

MetricJsonListener jsonListener = new MetricJsonListener();
int queueSize = defaultMetricListenerQueueSize.get();
try {
    String q = request.getParameter("queueSize");
    if (q != null) queueSize = Integer.parseInt(q); // optional per-request override
} catch (NumberFormatException e) {
    // malformed "queueSize" parameter: keep the property-backed default
}
Contributor

Why would you let the request set the queue size?

Contributor Author

If a system talks to multiple, dynamically scalable endpoints, it could be cumbersome to change the queue size only via a property. Providing a request parameter would give a visualization service (hystrix-dashboard) or an aggregation service (turbine) an easy way to reconfigure the queue size by just updating parameters.

Contributor

IMO, the velocity of metrics is independent of who is consuming the metrics. What problem do you intend to solve by allowing the request to set the queue size?

Contributor Author

As I mentioned in my previous comment, there doesn't seem to be a way of scaling the queue size when the number of backend endpoints is scaled up or down.

For instance, a user-facing email service may only need 25 instances of a backend email retrieval service during off-peak but auto-scale to 100-150 during peak periods. In such a case, we would need to estimate the peak load, and set the queue size accordingly. If the volume of metrics exceeded that, the hystrix stream servlet would need to be reconfigured each time.

The consumer of metrics would be in a better position to anticipate typical load patterns, and could adjust accordingly if it noticed that stats collection was failing because the queue size cap was being hit.

@mattrjacobs
Contributor

@anuragw I don't follow your reasoning.

The HystrixMetricsPoller gets invoked within an infinite loop in HystrixMetricsStreamServlet. It loops over all of the metrics and gets the current value of the metric. There's a delay parameter in the URL that determines how frequently to generate these metrics (default is 500ms). Call this value D.

The HystrixMetricsPoller produces 1 String for each command + threadpool + collapser in your running system. Call that value M. So the rate at which the queue fills up is M/D.

The metrics consumption happens in the servlet thread and also has the same delay. It reads all values in the queue.

So the delay parameter to the servlet governs how quickly to both produce/consume metrics. But the number of metrics in the system doesn't really change over time, unless you're adding commands/threadpools/collapsers dynamically.
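The M/D rate above can be made concrete with a back-of-envelope calculation. The figures here are illustrative assumptions, not measurements from a real deployment:

```java
// Back-of-envelope for the fill rate described above: the poller produces
// M metric strings every D milliseconds, so the queue receives M/D strings
// per millisecond. M = 150 and D = 500 below are illustrative values
// (D = 500 ms is the servlet's default delay).
public class MetricRate {
    public static void main(String[] args) {
        int m = 150;       // commands + threadpools + collapsers (M)
        int delayMs = 500; // poller delay in ms (D)

        // Strings produced per second = M * (1000 / D)
        double stringsPerSecond = m * (1000.0 / delayMs);
        System.out.println((int) stringsPerSecond); // 300
    }
}
```

Because the servlet thread drains the whole queue on the same cadence, the queue depth stays near M per cycle as long as the consumer keeps up; a backlog only accumulates if the consumer stalls or M grows.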

@anuragw
Contributor Author

anuragw commented Mar 10, 2016

@mattrjacobs your last line says it all: "unless you're adding commands/threadpools/collapsers dynamically". Our system does have this behavior.

One other option might be to configure the queueSize property to 100K or 1M, something huge, then use a very tiny delay, say 1ms. But wouldn't this tiny delay be very close to normal processing overheads in the servlet, making it hard to interpret the reported metrics?

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #378 SUCCESS
This pull request looks good

@anuragw anuragw closed this Mar 10, 2016
@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #379 SUCCESS
This pull request looks good

awazalwar added 2 commits March 10, 2016 13:59
…tional queueSize=<int> parameter in stream query
…y hystrix.stream.defaultMetricListenerQueueSize (still defaults to 1000 if unspecified)
@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #380 FAILURE
Looks like there's a problem with this pull request

@anuragw anuragw reopened this Mar 10, 2016
@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #382 SUCCESS
This pull request looks good

@anuragw
Contributor Author

anuragw commented Mar 10, 2016

I've fixed the issue I had on my forked branch, and have pushed again. The PR got auto-closed when I reset to the base fork before replaying my two commits, and so I've reopened it. I'm working on the build failure now and will update once that's done.

Can we make a call soon on whether or not to include the queueSize in the request parameters? I wouldn't mind keeping the property as the sole way of setting the queueSize, but I do think the request parameter could be useful in some scenarios.

@anuragw
Contributor Author

anuragw commented Mar 10, 2016

I'm not sure what the issue is; the Gradle build passes on my system. Any ideas why this may be failing?

@mattrjacobs
Contributor

Even in the case where commands are being added dynamically, there's still a single set of metrics for the JVM. So I still don't see how defining a queueSize per request helps you.

I'm trying to clean up a bunch of the unit tests to get rid of the flaky failures. They almost always have to do with Travis running things more slowly than our local machines. I hope to have that significantly cleaned up in the next couple of days.

@anuragw
Contributor Author

anuragw commented Mar 11, 2016

@mattrjacobs Let's agree to disagree on whether including the queueSize in the request makes sense. I'll remove the request parameter logic and push the new code today. Once the unit tests are fixed, is it safe to assume that the PR merge shouldn't take much longer?

@cloudbees-pull-request-builder

NetflixOSS » Hystrix » Hystrix-pull-requests #408 SUCCESS
This pull request looks good

mattrjacobs added a commit that referenced this pull request Mar 11, 2016
#1109 Make queue size of MetricJsonListener configurable
@mattrjacobs mattrjacobs merged commit 6f55c0a into Netflix:1.4.x Mar 11, 2016
@mattrjacobs
Contributor

Thanks for the good discussion and the contribution, @anuragw. I plan on getting a release with this change out early next week.

@anuragw
Contributor Author

anuragw commented Mar 14, 2016

@mattrjacobs, I'm always looking forward to contributing! Thanks for your help too, and I'll keep an eye out for the next 1.4.x release.
