feat: add insights instrumentation - events and metrics #539
This adds a `Honeybadger::Instrumentation` class to help gather metrics within the gem. It's used by calling class methods for the various types of metrics. The `Honeybadger::InstrumentationHelper` module can be included into any class for a cleaner metric DSL. For this round of implementation, the metrics are simply sent off as events to Insights for collection.
This adds new configuration options for customizing the embedded Insights experience in the gem.

- `insights.enabled` - Allows you to turn off the event worker. This is enabled by default.
- `insights.metrics` - Allows you to turn off any plugin Insights data gathering. This includes metrics and plugin events. This is disabled by default.

Plugin-specific configuration changes:

- `[plugin name].insights.enabled` - Toggles the plugin's Insights data gathering. This includes any metrics or events the plugin emits.
- `[plugin name].insights.cluster_configuration` - Allows you to toggle cluster collection metrics.
- `[plugin name].insights.collection_interval` - Adjusts how often collection runs for the specified plugin (defaults to once per second).
This adds the ability for a plugin to define a `collect` block, similar to the `execution` block. During load, these `collect` blocks are gathered, wrapped in a `CollectionExecution` class, and passed to a queue in a `CollectorWorker`. The `CollectorWorker` is set up to synchronously execute each collect block about once per second. A collect block may be configured to lengthen its interval by passing the `interval` option:

```
collect(interval: 10) do
end
```

The above collect block will only be executed every 10 intervals (about 10 seconds). The `CollectionExecution` class includes the `Honeybadger::InstrumentationHelper` module, so it has access to any of the metric methods. This allows for polling of resources and reporting them to Insights.
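As a rough illustration of the interval behaviour described above, here is a minimal, self-contained sketch. The class names `SketchCollectionExecution` and `SketchCollectorWorker` are hypothetical stand-ins, not the gem's classes, and the choice to fire on every `interval`-th tick (rather than on the first tick) is an assumption.

```ruby
# Hypothetical stand-in for a wrapped collect block; not the gem's class.
class SketchCollectionExecution
  attr_reader :name, :interval

  def initialize(name, interval: 1, &block)
    @name = name
    @interval = interval
    @block = block
    @ticks = 0
  end

  # Called once per worker tick. Runs the block on every `interval`-th tick
  # (assumption: the first run happens after `interval` ticks, not at tick 1).
  def tick
    @ticks += 1
    return false unless (@ticks % @interval).zero?

    @block.call
    true
  end
end

# Hypothetical worker that executes each registered block synchronously.
class SketchCollectorWorker
  def initialize
    @executions = []
  end

  def push(execution)
    @executions << execution
  end

  # One pass over all blocks. A real worker would do this about once per
  # second from a background thread; here it is driven by hand.
  def run_once
    @executions.each(&:tick)
  end
end

runs = Hash.new(0)
worker = SketchCollectorWorker.new
worker.push(SketchCollectionExecution.new(:every_tick) { runs[:every_tick] += 1 })
worker.push(SketchCollectionExecution.new(:every_ten, interval: 10) { runs[:every_ten] += 1 })

# Simulate 20 ticks (roughly 20 seconds at one tick per second).
20.times { worker.run_once }
puts runs.inspect
```

Driving the worker by hand instead of sleeping keeps the sketch deterministic; the block with `interval: 10` runs once for every ten runs of the default block.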
This adds subscribers to a few `ActiveSupport::Notifications` events and emits them to Insights.
This adds a few events and cluster metrics gathering for Sidekiq.
This adds a Puma plugin for Honeybadger. Metrics on the puma process are gathered in a background thread every 1 second and should work for both single threaded and clustered mode.
There have been some changes since the last approvals. Requesting a final review again.
Co-authored-by: Benjamin Curtis <ben@bencurtis.com>
This looks awesome @roelbondoc, great work!
I do have one quick question about the agent methods. If I were to instantiate a second agent and then call instrumentation methods on it, would the events report from that agent, or globally?
end

def parsed_uri_data(request_data)
  uri = request_data.uri || build_uri(request_data)
Is there a concern with sensitive data in the URI here?
That's a good question. There could be sensitive data here that a user may not know is being logged. Perhaps we need a mechanism to disable this by default.
I added a default so this is disabled by default and must be explicitly enabled:
net_http:
  insights:
    enabled: true
How about leaving it enabled but reporting just the host name?
That works for me. I'll make it configurable to allow the full uri if set to true.
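A minimal sketch of what "host name only unless opted in" could look like, using only Ruby's standard `URI` library. The helper name `reported_url` and the `full_url` keyword are hypothetical; the actual option name in the gem is not shown in this thread.

```ruby
require "uri"

# Hypothetical helper: returns only the host unless the caller opts in to
# the full URL. `full_url` stands in for whatever config flag is used.
def reported_url(raw_url, full_url: false)
  uri = URI.parse(raw_url)
  full_url ? uri.to_s : uri.host
rescue URI::InvalidURIError
  # Unparseable input is not reported at all in this sketch.
  nil
end

url = "https://api.example.com/users/42?token=secret"
puts reported_url(url)                 # host only; path and query are dropped
puts reported_url(url, full_url: true) # full URL, including the query string
```

Reporting the host by default keeps paths and query strings, where tokens and identifiers tend to appear, out of the event payload unless someone explicitly asks for them.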
That's a good question. Right now some of the internals rely on the global singleton agent, but with a few tweaks this shouldn't be too hard to fix.
Great work @roelbondoc. Can't wait to start using this.
},
:'net_http.insights.enabled' => {
  description: 'Allow automatic instrumentation of Net::HTTP requests.',
  default: true,
Was this supposed to be `false` by default?
There was a conversation about disabling it by default, but ultimately we decided to keep it enabled by default and log only the domain, with an optional configuration setting to log the full path.
This PR adds automatic Insights data collection to the gem for certain libraries.
By default, all automatic data collection is disabled. To enable collection, add the following to your `config/honeybadger.yml`:
The list below names the plugins that collect Insights data. Some plugins emit `events` that contain a `duration` attribute indicating how long that particular event took to run. Other plugins poll for metrics; polling occurs every second. As with regular plugins, they only load if the required gem is available, and they may be disabled through the `skipped_plugins` configuration setting.

- `rails`
  - events
    - `process_action.action_controller`
    - `send_file.action_controller`
    - `redirect_to.action_controller`
    - `halted_callback.action_controller`
    - `write_fragment.action_controller`
    - `read_fragment.action_controller`
    - `expire_fragment.action_controller`
    - `render_template.action_view`
    - `render_partial.action_view`
    - `render_collection.action_view`
    - `sql.active_record`
    - `process.action_mailer`
    - `service_upload.active_storage`
    - `service_download.active_storage`
- `active_job`
  - events
    - `enqueue_at.active_job`
    - `enqueue.active_job`
    - `enqueue_retry.active_job`
    - `enqueue_all.active_job`
    - `perform.active_job`
    - `retry_stopped.active_job`
    - `discard.active_job`
- `sidekiq`
  - events
    - `perform.sidekiq`
    - `enqueue.sidekiq`
  - metrics
    - `active_workers`
    - `active_processes`
    - `jobs_processed`
    - `jobs_failed`
    - `jobs_scheduled`
    - `jobs_enqueued`
    - `jobs_dead`
    - `jobs_retry`
    - `queue_latency`
    - `queue_depth`
    - `queue_busy`
    - `capacity`
    - `utilization`
- `karafka`
  - events
    - `consumer.consumed.karafka`
- `solid_queue` (new)
  - metrics
    - `jobs_in_progress`
    - `jobs_blocked`
    - `jobs_failed`
    - `jobs_scheduled`
    - `jobs_processed`
    - `active_workers`
    - `active_dispatchers`
    - `queue_depth`
- `net_http` (new, disabled by default)
  - events
    - `request.net_http`

  To enable automatic instrumentation of Net::HTTP requests, add the following to the config:
- `autotuner` (new)
  - events
    - `report.autotuner`
  - metrics
    - `diff.minor_gc_count`
    - `diff.major_gc_count`
    - `diff.time`
    - `request_time`
    - `heap_pages`
- `puma` (new)
  - metrics
    - `pool_capacity`
    - `max_threads`
    - `requests_count`
    - `backlog`
    - `running`
- `system` (new)
  - events
    - `report.system`
Further Configuration
You may also disable insights for a specific plugin by adding a setting with the plugin name:
Some plugins also collect metrics for a cluster. These metrics do not need to be collected from every running instance. To disable the collection of cluster metrics, add the following configuration:
This is available for the `sidekiq` and `solid_queue` plugins. It is recommended to enable this for a single instance. If you are running Sidekiq Enterprise and have this enabled, it will only run on the current leader instance.

By default, polled metrics collection occurs every second. This can be configured per plugin:
This will modify the collector so that sidekiq metrics are checked every 10 seconds.
Ignoring events:
Metrics and Registry
Components
- `Honeybadger::Registry` - This class stores all tracked metrics registered in the gem. The registry ensures that only one instance of each unique metric is stored at any time. A metric is defined as unique by its type, name, and set of attributes.
- `Honeybadger::RegistryExecution` - This class is injected into the `MetricsWorker` and is run on a configurable interval. When called, it iterates over all registered metrics, compiles a payload for each, and sends it as an event. The registry is then flushed.
- `Honeybadger::Metric` - A base metric class from which all other metric classes inherit. These metric classes record data and compute/track values as needed.

Shape of the Data
Each metric type's data is shaped differently. Below are examples of what that data looks like for each type.
counter
gauge
timer
Note: This may look the same as a gauge, however, the interface to use the metric provides a timing mechanism.
Example:
histogram
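To make the registry behaviour described under Components more concrete, here is a minimal sketch of a registry that stores one metric per unique (type, name, attributes) combination and is emptied on flush. `SketchCounter`, `SketchRegistry`, and the payload keys are hypothetical and chosen for illustration only; they are not the gem's classes or its event format.

```ruby
# Hypothetical counter metric, for illustration only.
class SketchCounter
  attr_reader :name, :attributes, :value

  def initialize(name, attributes)
    @name = name
    @attributes = attributes
    @value = 0
  end

  def count(by = 1)
    @value += by
  end

  # Payload keys here are assumptions, not the gem's real event shape.
  def payload
    { type: :counter, name: name, value: value }.merge(attributes)
  end
end

# Hypothetical registry: one metric instance per (type, name, attributes).
class SketchRegistry
  def initialize
    @metrics = {}
    @lock = Mutex.new
  end

  # Returns the existing metric for this key, or builds and stores a new one.
  # Hash equality ignores key order, so { a: 1, b: 2 } and { b: 2, a: 1 }
  # resolve to the same metric.
  def fetch(type, name, attributes = {})
    key = [type, name, attributes.dup.freeze]
    @lock.synchronize { @metrics[key] ||= SketchCounter.new(name, key.last) }
  end

  # Compiles one payload per registered metric, then empties the registry.
  def flush
    @lock.synchronize do
      payloads = @metrics.values.map(&:payload)
      @metrics.clear
      payloads
    end
  end

  def size
    @lock.synchronize { @metrics.size }
  end
end

registry = SketchRegistry.new
registry.fetch(:counter, "logins", { region: "eu" }).count
registry.fetch(:counter, "logins", { region: "eu" }).count     # same metric instance
registry.fetch(:counter, "logins", { region: "us" }).count(5)  # different attributes

payloads = registry.flush
puts payloads.length  # one payload per unique metric
puts registry.size    # registry is empty after the flush
```

Keying on the attribute set means the same metric name tagged differently is tracked separately, which matches the uniqueness rule stated above.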
Changes to events
The event payloads will now include the `hostname` by default. This can be changed by setting the `events.attach_hostname` configuration parameter to `false`.

The event worker now has its own config, `events.max_queue_size`, which defaults to `10_000`. This will allow the agent to backlog more events. The batch size (`events.batch_size`) default has also been increased to `1_000`.

Before submitting a pull request, please make sure the following is done:
- `rake spec` in the repository root.
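Returning to the `events.max_queue_size` and `events.batch_size` settings described under Changes to events, the interaction between a bounded backlog and batched sends can be sketched as follows. `SketchEventQueue` is a hypothetical class, and dropping new events once the backlog is full is an assumption about the behaviour, not something stated in this PR.

```ruby
# Hypothetical bounded event queue, for illustration only.
class SketchEventQueue
  attr_reader :dropped

  def initialize(max_queue_size: 10_000, batch_size: 1_000)
    @max_queue_size = max_queue_size
    @batch_size = batch_size
    @queue = []
    @dropped = 0
  end

  # Returns false and counts a drop when the backlog is already full
  # (assumption: new events are discarded rather than old ones evicted).
  def push(event)
    if @queue.size >= @max_queue_size
      @dropped += 1
      return false
    end

    @queue << event
    true
  end

  # Removes and returns up to `batch_size` events, oldest first.
  def next_batch
    @queue.shift(@batch_size)
  end

  def size
    @queue.size
  end
end

# Small limits so the behaviour is easy to follow by hand.
queue = SketchEventQueue.new(max_queue_size: 5, batch_size: 2)
7.times { |i| queue.push({ id: i }) }

first_batch = queue.next_batch
puts queue.dropped       # events rejected because the backlog was full
puts first_batch.length  # events taken in one batch
puts queue.size          # events still waiting
```

A larger `max_queue_size` lets the agent absorb bursts before anything is rejected, while `batch_size` caps how many events leave the queue per send.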