Home
The Hystrix Dashboard allows you to monitor Hystrix metrics in real time.
When Netflix began to use this dashboard, its operations improved because the time needed to discover and recover from operational events dropped. Most production incidents (already less frequent due to Hystrix) became far shorter and had less impact, thanks to the real-time insight into system behavior that the Hystrix Dashboard provides.
When a “circuit” is failing it changes colors (on a gradient from green through yellow, orange, and red) like this:
The diagram below shows one circuit from the dashboard along with explanations of what all of the data represents.
Hystrix packs a lot of information into the dashboard so that engineers can quickly consume and correlate data.
The Hystrix Dashboard allows you to monitor a single server or a cluster of servers aggregated using Turbine, with low latency (typically around 1 or 2 seconds when aggregating a cluster, subsecond with a single server).
Here is another example from the Netflix API dashboard monitoring 476 servers aggregated using Turbine:
Here is how to run the dashboard by issuing a Gradle command:
$ git clone git@github.com:Netflix-Skunkworks/hystrix-dashboard.git
$ cd hystrix-dashboard
$ ./gradlew jettyRun
> Building > :jettyRun > Running at http://localhost:7979/hystrix-dashboard
Once you see that the dashboard has reached the “Running” state, open http://localhost:7979/hystrix-dashboard.
@mlabouardy has published a Docker image for the Hystrix dashboard. See https://github.com/mlabouardy/hystrix-dashboard-docker. This is an external project, so please send feedback to @mlabouardy directly. Thanks @mlabouardy!
- Download hystrix-dashboard-#.#.#.war
- Install it in a servlet container such as Apache Tomcat 7
The usage examples below assume that you install it into /webapps/hystrix-dashboard.war.
To build the Hystrix Dashboard with Gradle and then install it into the servlet container, issue the following commands:
./gradlew build
cp build/libs/hystrix-dashboard-*.war ./apache-tomcat-7.*/webapps/hystrix-dashboard.war  # or the webapps directory of another servlet container
The hystrix-metrics-event-stream module exposes metrics in a text/event-stream formatted stream that continues as long as a client holds the connection.
The Hystrix Dashboard expects data in the format that this module emits.
See its README for installation instructions.
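To illustrate the text/event-stream format, here is a minimal sketch of a parser for a chunk of such a stream. It assumes each event's payload is a single JSON document, it ignores fields other than data, and it only handles chunks that contain complete events. The function name parseEventStream and the sample payload are hypothetical and not part of Hystrix.

```javascript
// Minimal parser for a chunk of text/event-stream data (illustrative sketch).
// Lines starting with ":" are comments (often used as keep-alive pings),
// "data:" lines carry the payload, and a blank line ends one event.
function parseEventStream(text) {
  const events = [];
  let dataLines = [];
  for (const line of text.split(/\r?\n/)) {
    if (line === "") {
      // Blank line: emit the event collected so far, if any.
      if (dataLines.length > 0) {
        events.push(JSON.parse(dataLines.join("\n")));
        dataLines = [];
      }
    } else if (line.startsWith(":")) {
      continue; // comment line such as ": ping"
    } else if (line.startsWith("data:")) {
      // One optional space after the colon is not part of the value.
      dataLines.push(line.slice(5).replace(/^ /, ""));
    }
    // Other fields (event, id, retry) are ignored in this sketch.
  }
  return events;
}

// Hypothetical sample input, much shorter than a real metrics payload.
const sample =
  ': ping\n\ndata: {"type":"HystrixCommand","name":"ExampleCommand","errorCount":0}\n\n';
const parsed = parseEventStream(sample);
console.log(parsed.length, parsed[0].name);
```

In a browser you would normally not parse the stream by hand, because EventSource does this for you. A parser like this is mainly useful for scripts that inspect the stream outside a browser.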
- Download turbine-web-1.0.0.war
- Install it in a servlet container such as Apache Tomcat 7
The usage examples below assume that you install it to /webapps/turbine.war.
To build Turbine with Gradle and then install it into the servlet container, issue the following commands:
git clone https://github.com/Netflix/Turbine.git
cd Turbine
./gradlew build
cp turbine-web/build/libs/turbine-web-*.war ./apache-tomcat-7.*/webapps/turbine.war  # or the webapps directory of another servlet container
You can find Turbine configuration details on its Configuration Wiki. It also supports custom plugins for Instance Discovery.
To get started with a “Hello World!” example, you can use a static configuration file that points to specific instances, such as the following.
Create a file, config.properties, that lists hosts to aggregate. This example includes two EC2 instances:
turbine.ConfigPropertyBasedDiscovery.default.instances=ec2-72-44-38-203.compute-1.amazonaws.com,ec2-23-20-84-255.compute-1.amazonaws.com
turbine.instanceUrlSuffix=:8080/hystrix.stream
The value of the turbine.instanceUrlSuffix property is appended to each hostname to form the URL of that instance's hystrix-metrics-event-stream.
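As a rough illustration of how the two properties combine, the sketch below splits the instance list and appends the suffix to each hostname. The http:// scheme prefix and the helper name buildStreamUrls are assumptions made for this example. This is not Turbine's actual code.

```javascript
// Values taken from the config.properties example above.
const instancesProperty =
  "ec2-72-44-38-203.compute-1.amazonaws.com,ec2-23-20-84-255.compute-1.amazonaws.com";
const urlSuffix = ":8080/hystrix.stream";

// Hypothetical helper: turn the comma-separated host list into stream URLs.
// Assumption: the scheme is plain http.
function buildStreamUrls(instances, suffix) {
  return instances
    .split(",")
    .map((host) => host.trim())
    .filter((host) => host.length > 0)
    .map((host) => "http://" + host + suffix);
}

const streamUrls = buildStreamUrls(instancesProperty, urlSuffix);
console.log(streamUrls.join("\n"));
```

Each resulting URL should be one that you can open with curl to see that instance's metrics stream.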
The config.properties file can be:
- placed on the classpath, such as in /WEB-INF/classes
- specified using a JVM property such as -Darchaius.configurationSource.additionalUrls=file:///path/to/config.properties
You can test that Turbine is correctly accessing instances and streaming metrics by issuing a command like this:
curl http://hostname:port/turbine/turbine.stream
If that command is successful you should see something like this:
$ curl http://ec2-23-20-84-255.compute-1.amazonaws.com:8080/turbine/turbine.stream
: ping
data: {"rollingCountFailure":0,"propertyValue_executionIsolationThreadInterruptOnTimeout":true,"rollingCountTimeout":0,"rollingCountExceptionsThrown":0,"rollingCountFallbackSuccess":0,"errorCount":0,"type":"HystrixCommand","propertyValue_circuitBreakerEnabled":true,"reportingHosts":1,"latencyTotal":{"0":0,"95":0,"99.5":0,"90":0,"25":0,"99":0,"75":0,"100":0,"50":0},"currentConcurrentExecutionCount":0,"rollingCountSemaphoreRejected":0,"rollingCountFallbackRejection":0,"rollingCountShortCircuited":0,"rollingCountResponsesFromCache":0,"propertyValue_circuitBreakerForceClosed":false,"name":"IdentityCookieAuthSwitchProfile","propertyValue_executionIsolationThreadPoolKeyOverride":"null","rollingCountSuccess":0,"propertyValue_requestLogEnabled":true,"requestCount":0,"rollingCountCollapsedRequests":0,"errorPercentage":0,"propertyValue_circuitBreakerSleepWindowInMilliseconds":5000,"latencyTotal_mean":0,"propertyValue_circuitBreakerForceOpen":false,"propertyValue_circuitBreakerRequestVolumeThreshold":20,"propertyValue_circuitBreakerErrorThresholdPercentage":50,"propertyValue_executionIsolationStrategy":"THREAD","rollingCountFallbackFailure":0,"isCircuitBreakerOpen":false,"propertyValue_executionIsolationSemaphoreMaxConcurrentRequests":20,"propertyValue_executionIsolationThreadTimeoutInMilliseconds":1000,"propertyValue_metricsRollingStatisticalWindowInMilliseconds":10000,"propertyValue_fallbackIsolationSemaphoreMaxConcurrentRequests":10,"latencyExecute":{"0":0,"95":0,"99.5":0,"90":0,"25":0,"99":0,"75":0,"100":0,"50":0},"group":"IDENTITY","latencyExecute_mean":0,"propertyValue_requestCacheEnabled":true,"rollingCountThreadPoolRejected":0}
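The payload is easier to read once a few fields are pulled out. The following sketch, with a hypothetical helper named summarizeCircuit, takes one data line and returns a small summary. The field names come from the output above, and the sample line is shortened to a handful of those fields for illustration.

```javascript
// Hypothetical helper: extract a few fields from one "data:" line.
function summarizeCircuit(dataLine) {
  // Strip the "data:" prefix, then parse the JSON payload.
  const payload = JSON.parse(dataLine.replace(/^data:\s*/, ""));
  return {
    name: payload.name,
    group: payload.group,
    type: payload.type,
    circuitOpen: payload.isCircuitBreakerOpen,
    errorPercentage: payload.errorPercentage,
    requestCount: payload.requestCount,
    reportingHosts: payload.reportingHosts,
  };
}

// Shortened version of the sample line shown above.
const sampleLine =
  'data: {"type":"HystrixCommand","name":"IdentityCookieAuthSwitchProfile",' +
  '"group":"IDENTITY","isCircuitBreakerOpen":false,"errorPercentage":0,' +
  '"requestCount":0,"reportingHosts":1}';

const summary = summarizeCircuit(sampleLine);
console.log(summary);
```

A summary like this is a quick way to check that Turbine is aggregating the expected commands before you open the dashboard.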
When you access the Hystrix Dashboard homepage you should see something like this:
To monitor a single server you would use a URL such as:
http://hostname:port/application/hystrix.stream
To monitor an aggregate stream via Turbine the URL would be like this:
http://hostname:port/turbine/turbine.stream
The landing page does nothing more than generate the /monitor/monitor.html URLs that you can then bookmark.
The delay parameter controls the latency that is injected between polling cycles on the server to slow down the stream. You can use this to reduce the network and CPU usage on the client.
The title parameter is used by the monitor.html page to display a nice title in the browser instead of the raw URL.
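If you prefer to build a bookmarkable monitor URL yourself, it might be assembled roughly as follows. This is a sketch under assumptions: the query parameter name stream, the helper name buildMonitorUrl, and the example delay and title values are illustrative and not confirmed by this page, so check the URL that the landing page generates for your version.

```javascript
// Hypothetical helper: build a /monitor/monitor.html URL.
// Assumption: the stream URL is passed in a query parameter named "stream".
function buildMonitorUrl(dashboardBase, streamUrl, options = {}) {
  let url =
    dashboardBase + "/monitor/monitor.html?stream=" + encodeURIComponent(streamUrl);
  if (options.delay !== undefined) {
    url += "&delay=" + encodeURIComponent(String(options.delay));
  }
  if (options.title !== undefined) {
    url += "&title=" + encodeURIComponent(options.title);
  }
  return url;
}

const monitorUrl = buildMonitorUrl(
  "http://localhost:7979/hystrix-dashboard",
  "http://hostname:port/turbine/turbine.stream",
  { delay: 2000, title: "My Cluster" } // illustrative values
);
console.log(monitorUrl);
```

Encoding each value with encodeURIComponent keeps the nested stream URL intact, including any query string of its own.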
Because you may want to embed the dashboard functionality into your own existing dashboard, the app is very simple: primarily just HTML, JavaScript, and CSS, in modules that can be dropped into any app.
The only server-side portion is a proxy servlet that proxies streams between the browser and the back end, since EventSource CORS support was a work in progress at development time.
To display HystrixCommand monitors on an existing page, simply import the JavaScript module, instantiate it with a div, and attach it to an EventSource, like this:
var hystrixMonitor = new HystrixCommandMonitor('dependencies', {includeDetailIcon:false});
// start the EventSource which will open a streaming connection to the server
var source = new EventSource("http://hostname:port/hystrix.stream");
// add the listener that will process incoming events
source.addEventListener('message', hystrixMonitor.eventSourceMessageListener, false);
If you have UI improvements that you feel would benefit everyone, please create a pull request and contribute them back to the project. Feel free to ask questions and file bugs as well.