Postgres::Vacuum::Monitor provides queries that provide information about the number of dead tuples and long running queries. This information helps to diagnose and monitor two things:
- That the current auto vacuum settings are working and keeping up.
- That there are no long running transactions affecting the auto vacuuming daemon.
Add this line to your application's Gemfile:
gem 'postgres-vacuum-monitor'
And then execute:
$ bundle install
Or install it yourself as:
$ gem install postgres-vacuum-monitor
The job itself needs a class to report the information and can be configured by doing:
Postgres::Vacuum::Monitor.configure do |config|
config.monitor_reporter_class_name = 'MetricsReporter'
# Optionally change the default threshold of 5 minutes for reporting long running transactions
config.long_running_transaction_threshold_seconds = 10 * 60
# Optionally change `max_attempts` of the monitor job (default 1)
config.monitor_max_attempts = 3
# Optionally change `max_run_time` of the monitor job (default 60 seconds)
config.monitor_max_run_time_seconds = 5
# Optionally change the statement timeout of queries (default 10 seconds)
config.monitor_statement_timeout_seconds = 5
end
The class needs to follow this interface:
class MetricsReporter
def report_event(name, attributes)
end
end
For long running transactions, the event name is LongTransactions
and the attributes are:
{
database_name: # The name of the database.
start_time: # When the transaction started .
running_time: # How long has it been running in seconds.
application_name: # What's the application name that is running the query.
most_recent_query: # The last query started by the transaction
state: # The state of the transaction - either "active" or "idle in transaction"
wait_event_type: # The type of lock the transaction is waiting for if applicable
transaction_id: # The transaction_id which will be null for read-only transactions
min_transaction_id: # The mininum transaction id horizon
}
For auto vacuum the attributes are the following:
{
database_name: # The name of the database.
table: # Table name.
table_size: # How big is the table.
dead_tuples: # How many dead tuples are in the table.
tuples_over_limit: # How many dead tuples are over the auto vacuumer threshold.
}
To run the rspec tests, you need to create a DB called postgres_vacuum_monitor_test
.
I use New relic and use the following NRQL to create dashboards:
SELECT percentile(tuples_over_limit, 95) from AutoVacuumLagging facet table where appName = 'my-app' TIMESERIES 30 minutes since 1 day ago
SELECT percentile(dead_tuples) FROM AutoVacuumLagging facet table where appName = 'my-app' SINCE 1 DAY AGO TIMESERIES
SELECT application_name, state, most_recent_query, running_time, start_time FROM LongTransactions
SELECT uniques(table) FROM AutoVacuumLagging where appName = 'my-app' since 30 minutes ago
Bug reports and pull requests are welcome on GitHub at https://github.com/salsify/postgres-vacuum-monitor.