NAME
check_pgactivity - PostgreSQL plugin for Nagios
SYNOPSIS
check_pgactivity {-w|--warning THRESHOLD} {-c|--critical THRESHOLD} [-s|--service SERVICE] [-h|--host HOST] [-U|--username ROLE] [-p|--port PORT] [-d|--dbname DATABASE] [-S|--dbservice SERVICE_NAME] [-P|--psql PATH] [--debug] [--status-file FILE] [--path PATH] [-t|--timeout TIMEOUT]
check_pgactivity [-l|--list]
check_pgactivity [--help]
DESCRIPTION
check_pgactivity is designed to monitor PostgreSQL clusters from Nagios.
It offers many options to measure and monitor useful performance
metrics.
COMPATIBILITY
Each service is available starting from a given PostgreSQL version
(7.4 at the earliest), as documented below.
Ideally, the psql client should be 8.4 at least, but it can be used with
an older server. If your client is older, see "--psql" and "--old-psql"
for more information.
Please report any undocumented incompatibility.
-s, --service SERVICE
The Nagios service to run. See section SERVICES for a description of
available services or use "--list" for a short service and
description list.
-h, --host HOST
Database server host or socket directory (default: $PGHOST or
"localhost")
See section "CONNECTIONS" for more information.
-U, --username ROLE
Database user name (default: $PGUSER or "postgres").
See section "CONNECTIONS" for more information.
-p, --port PORT
Database server port (default: $PGPORT or "5432").
See section "CONNECTIONS" for more information.
-d, --dbname DATABASE
Database name to connect to (default: $PGDATABASE or "template1").
WARNING! This is not necessarily one of the databases that will be
checked. See "--dbinclude" and "--dbexclude".
See section "CONNECTIONS" for more information.
-S, --dbservice SERVICE_NAME
The connection service name from pg_service.conf to use.
See section "CONNECTIONS" for more information.
--dbexclude REGEXP
Some services automatically check all the databases of your cluster
(note: that does not mean they always need to connect on all of them
to check them though). "--dbexclude" excludes any database whose
name matches the given Perl regular expression. Repeat this option
as many times as needed.
See "--dbinclude" as well. If a database matches both dbexclude and
dbinclude arguments, it is excluded.
--dbinclude REGEXP
Some services automatically check all the databases of your cluster
(note: that does not imply that they always need to connect to all
of them though). Some always exclude the 'postgres' database and
templates. "--dbinclude" checks ONLY databases whose names match the
given Perl regular expression. Repeat this option as many times as
needed.
See "--dbexclude" as well. If a database matches both dbexclude and
dbinclude arguments, it is excluded.
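The interplay between "--dbinclude" and "--dbexclude" can be summarized
with a short sketch. It is written in Python purely for illustration:
check_pgactivity itself is a Perl script and uses Perl regular
expressions, so the function name "is_db_checked" and its arguments are
hypothetical and only mirror the rules described above.

```python
import re

def is_db_checked(dbname, dbinclude=(), dbexclude=()):
    """Return True if `dbname` is kept under the rules described above."""
    # Exclusion always wins, even if the database also matches --dbinclude.
    if any(re.search(rx, dbname) for rx in dbexclude):
        return False
    # With at least one --dbinclude, ONLY matching databases are kept.
    if dbinclude:
        return any(re.search(rx, dbname) for rx in dbinclude)
    # No include filter: every database that was not excluded is kept.
    return True

for db in ("proddb", "testdb", "devdb"):
    print(db, is_db_checked(db, dbinclude=["db$"], dbexclude=["^dev"]))
```

Here "devdb" matches both filters and is therefore dropped, while the
two other databases are kept.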
-w, --warning THRESHOLD
The Warning threshold.
-c, --critical THRESHOLD
The Critical threshold.
-F, --format OUTPUT_FORMAT
The output format. Supported formats are: "binary", "debug", "human",
"nagios", "nagios_strict", "json" and "json_strict".
Using the "binary" format, the results are written in a binary file
(using perl module "Storable") given in argument "--output". If no
output is given, defaults to file "check_pgactivity.out" in the same
directory as the script.
The "nagios_strict" and "json_strict" formats are equivalent to the
"nagios" and "json" formats respectively. The only difference is
that they enforce the units to follow the strict Nagios specs: B, c,
s or %. Any unit absent from this list is dropped (Bps, Tps, etc).
--tmpdir DIRECTORY
Path to a directory where the script can create temporary files. The
script relies on the system default temporary directory if possible.
-P, --psql FILE
Path to the "psql" executable (default: "psql").
Because check_pgactivity uses "psql -w", "psql" must be version 8.4
at least, but the server can be older. If your "psql" is older and
does not support "-w", look at argument "--old-psql".
--old-psql
Stay compatible with psql 8.3 and below.
Can also be enabled by setting the environment variable
"CHECK_PGA_OLD_PSQL".
This option removes the "-w" argument from the "psql" argument list.
This means that if your password-less authentication is not
correctly setup, "psql" might either wait indefinitely for a
password, effectively blocking "check_pgactivity", or fail
immediately with an explicit error message.
--status-file PATH
Path to the file where service status information is kept between
successive calls. Default is to save a file called
"check_pgactivity.data" in the same directory as the script.
Note that this file is protected from concurrent writes using a lock
file located in the same directory, having the same name as the
status file, but with the extension ".lock".
On some platforms, network filesystems may not be supported
correctly by the locking mechanism. See "perldoc -f flock" for more
information.
--dump-status-file
Dump the content of the status file and exit. This is useful for
debugging purposes.
--dump-bin-file [PATH]
Dump the content of the given binary file previously created using
"--format binary". If no path is given, defaults to file
"check_pgactivity.out" in the same directory as the script.
-t, --timeout TIMEOUT
Timeout (default: "30s"), as raw (in seconds) or as an interval.
This timeout will be used as "statement_timeout" for psql and URL
timeout for "minor_version" service.
-l, --list
List available services.
-V, --version
Print version and exit.
--debug
Print some debug messages.
-?, --help
Show this help page.
THRESHOLDS
THRESHOLDS provided as warning and critical values can be raw numbers,
percentages, intervals or sizes. Each available service supports one or
more formats (eg. a size and a percentage).
Percentage
If THRESHOLD is a percentage, the value should end with a '%' (no
space). For instance: 95%.
Interval
If THRESHOLD is an interval, the following units are accepted (not
case sensitive): s (second), m (minute), h (hour), d (day). You can
use more than one unit per given value. If not set, the last unit is
in seconds. For instance: "1h 55m 6" = "1h55m6s".
Size
If THRESHOLD is a size, the following units are accepted (not case
sensitive): b (Byte), k (KB), m (MB), g (GB), t (TB), p (PB), e (EB)
or Z (ZB). Only integers are accepted. Eg. "1.5MB" will be refused,
use "1500kB".
The factor between units is 1024 bytes. Eg. "1g = 1G =
1024*1024*1024."
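To make the interval and size formats concrete, here is a minimal
Python sketch of how such thresholds could be converted to seconds and
bytes. It is not the script's own (Perl) parser: the function names are
hypothetical, and treating any value without a unit as seconds is a
simplifying assumption.

```python
import re

INTERVAL_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
SIZE_UNITS = "bkmgtpez"  # each step is a factor of 1024

def parse_interval(text):
    """Convert an interval such as '1h 55m 6' to a number of seconds."""
    token = r"(\d+)\s*([smhd]?)"
    lowered = text.lower()  # units are not case sensitive
    parts = re.findall(token, lowered)
    # Anything left once the tokens are removed is not a valid interval.
    if not parts or re.sub(token, "", lowered).strip():
        raise ValueError(f"invalid interval: {text!r}")
    # Assumption: a value with no unit is counted as seconds.
    return sum(int(num) * INTERVAL_UNITS[unit or "s"] for num, unit in parts)

def parse_size(text):
    """Convert a size such as '1500kB' or '1g' to a number of bytes."""
    match = re.fullmatch(r"(\d+)\s*([bkmgtpez]?)b?", text.strip().lower())
    if match is None:  # e.g. '1.5MB': only integers are accepted
        raise ValueError(f"invalid size: {text!r}")
    number, unit = match.groups()
    return int(number) * 1024 ** SIZE_UNITS.index(unit or "b")

print(parse_interval("1h 55m 6"), parse_interval("1h55m6s"))
print(parse_size("1g"), parse_size("1500kB"))
```

Both interval spellings give the same number of seconds, and "1.5MB"
raises an error because of the integer-only rule.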
CONNECTIONS
check_pgactivity allows two different connection specifications: by
service or by specifying values for host, user, port, and database. Some
services can run on multiple hosts, or need to connect to multiple
hosts.
You might specify one of the parameters below to connect to your
PostgreSQL instance. If you don't, no connection parameters are given to
psql: connection relies on binary defaults and environment.
The format for connection parameters is:
Parameter "--dbservice SERVICE_NAME"
Define a new host using the given service. Multiple hosts can be
defined by listing multiple services separated by a comma. Eg.
--dbservice service1,service2
For more information about service definition, see:
<https://www.postgresql.org/docs/current/libpq-pgservice.html>
Parameters "--host HOST", "--port PORT", "--user ROLE" or "--dbname
DATABASE"
One parameter is enough to define a new host. Usual environment
variables (PGHOST, PGPORT, PGDATABASE, PGUSER, PGSERVICE,
PGPASSWORD) or default values are used for missing parameters.
As for usual PostgreSQL tools, there is no command line argument to
set the password, to avoid exposing it. Use PGPASSWORD, .pgpass or a
service file (recommended).
If multiple values are given, as many hosts are defined as the
largest number of values given for any parameter.
Values are associated by position. Eg.:
--host h1,h2 --port 5432,5433
Means "host=h1 port=5432" and "host=h2 port=5433".
If the number of values is different between parameters, any host
missing a parameter will use the first given value for this
parameter. Eg.:
--host h1,h2 --port 5433
Means: "host=h1 port=5433" and "host=h2 port=5433".
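The positional pairing described above can be sketched as follows. This
is illustrative Python, not the script's actual code; the function name
"expand_hosts" and its keyword arguments are hypothetical, and services
given with "--dbservice" are left out of the sketch.

```python
def expand_hosts(**params):
    """Pair comma-separated host/port/user/dbname values by position."""
    # Split every parameter that was actually given on the command line.
    values = {name: raw.split(",") for name, raw in params.items() if raw}
    if not values:
        return []  # nothing given: psql defaults and environment apply
    # As many hosts as the longest list of values.
    count = max(len(v) for v in values.values())
    hosts = []
    for i in range(count):
        # A host missing a value falls back to the FIRST value given.
        hosts.append({name: (v[i] if i < len(v) else v[0])
                      for name, v in values.items()})
    return hosts

print(expand_hosts(host="h1,h2", port="5432,5433"))
print(expand_hosts(host="h1,h2", port="5433"))
```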
Services are defined first
For instance:
--dbservice s1 --host h1 --port 5433
means: use "service=s1" and "host=h1 port=5433" in this order. If
the service supports only one host, the second host is ignored.
Mutual exclusion between both methods
You cannot override service connection variables with the parameters
"--host HOST", "--port PORT", "--user ROLE" or "--dbname DATABASE".
SERVICES
Descriptions and parameters of available services.
archive_folder
Check if all archived WALs exist between the oldest and the latest
WAL in the archive folder and make sure they are 16MB. The given
folder must have archived files from ONE cluster. The version of
PostgreSQL that created the archives is only checked on the last
one, for performance reasons.
This service requires the argument "--path" on the command line to
specify the archive folder path to check. Obviously, it must have
access to this folder at the filesystem level: you may have to
execute it on the archiving server rather than on the PostgreSQL
instance.
The optional argument "--suffix" defines the suffix of your archived
WALs; this is useful for compressed WALs (eg. .gz, .bz2, ...).
Default is no suffix.
This service needs to read the header of one of the archives to
determine how many segments a WAL has. check_pgactivity automatically
handles files with extensions .gz, .bz2, .xz, .zip or .7z using the
following commands:
gzip -dc
bzip2 -dc
xz -dc
unzip -qqp
7z x -so
If needed, provide your own command that writes the uncompressed
file to standard output with the "--unarchiver" argument.
Optional argument "--ignore-wal-size" skips the WAL size check. This
is useful if your archived WALs are compressed and check_pgactivity
is unable to guess the original size. Here are the commands
check_pgactivity uses to guess the original size of .gz, .xz or .zip
files:
gzip -ql
xz -ql
unzip -qql
Default behaviour is to check the WALs size.
Perfdata contains the number of archived WALs and the age of the
most recent one.
Critical and Warning define the max age of the latest archived WAL
as an interval (eg. 5m or 300s ).
Required privileges: unprivileged role; the system user needs read
access to archived WAL files.
Sample commands:
check_pgactivity -s archive_folder --path /path/to/archives -w 15m -c 30m
check_pgactivity -s archive_folder --path /path/to/archives --suffix .gz -w 15m -c 30m
check_pgactivity -s archive_folder --path /path/to/archives --ignore-wal-size --suffix .bz2 -w 15m -c 30m
check_pgactivity -s archive_folder --path /path/to/archives --unarchiver "unrar p" --ignore-wal-size --suffix .rar -w 15m -c 30m
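The continuity check performed by archive_folder can be pictured as a
gap search over WAL file names. The sketch below is illustrative Python,
not the script's own code. It assumes plain 24-character hexadecimal
segment names (no suffix), a single timeline, and 256 segments per log
file (16MB segments on PostgreSQL 9.3 or later), whereas the real
service reads a WAL header to determine this value. All helper names
are hypothetical.

```python
import re

SEGMENTS_PER_LOG = 256  # assumption: 16MB segments, PostgreSQL 9.3+

def wal_to_seq(name):
    """Split a WAL file name into (timeline, absolute segment number)."""
    if not re.fullmatch(r"[0-9A-Fa-f]{24}", name):
        raise ValueError(f"not a WAL segment name: {name!r}")
    timeline = int(name[0:8], 16)
    log = int(name[8:16], 16)
    seg = int(name[16:24], 16)
    return timeline, log * SEGMENTS_PER_LOG + seg

def seq_to_wal(timeline, seq):
    """Rebuild a WAL file name from its timeline and segment number."""
    return "%08X%08X%08X" % (timeline, seq // SEGMENTS_PER_LOG,
                             seq % SEGMENTS_PER_LOG)

def missing_wals(names):
    """List the segments absent between the oldest and the latest WAL."""
    parsed = [wal_to_seq(n) for n in names]
    timelines = {tl for tl, _ in parsed}
    if len(timelines) != 1:
        raise ValueError("this sketch only handles a single timeline")
    timeline = timelines.pop()
    present = {seq for _, seq in parsed}
    return [seq_to_wal(timeline, seq)
            for seq in range(min(present), max(present) + 1)
            if seq not in present]

print(missing_wals(["0000000100000000000000FE",
                    "000000010000000100000000"]))
```

In this example the segment ending in FF is missing between the two
archived files, so it is the only name reported.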
archiver (8.1+)
Check if the archiver is working properly and the number of WAL
files ready to archive.
Perfdata returns the number of WAL files waiting to be archived.
Critical and Warning thresholds are optional. They apply to the
number of files waiting to be archived. They only accept a raw
number of files.
Whatever the given threshold, a critical alert is raised if the
archiver process has not archived the oldest WAL waiting to be
archived since the last call.
Required privileges: superuser for versions before v10; on v10, either
superuser or a non-superuser role, in which case the output will lack
the perfdata oldest_ready_wal; on v11+, grant execute on function
pg_stat_file(text).
autovacuum (8.1+)
Check the autovacuum activity on the cluster.
Perfdata contains the age of oldest running autovacuum and the
number of workers by type (VACUUM, VACUUM ANALYZE, ANALYZE, VACUUM
FREEZE).
Thresholds, if any, are ignored.
Required privileges: unprivileged role.
backends (all)
Check the total number of connections in the PostgreSQL cluster.
Perfdata contains the number of connections per database.
Critical and Warning thresholds accept either a raw number or a
percentage (eg. 80%). When a threshold is a percentage, it is
compared to the difference between the cluster parameters
"max_connections" and "superuser_reserved_connections".
Required privileges: an unprivileged user only sees its own queries;
a pg_monitor (10+) or superuser (<10) role is required to see all
queries.
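As a worked example of the percentage rule for the backends service,
the following Python sketch turns a threshold into a number of
connections. The function name is hypothetical and the settings are
passed in by hand; the real service reads them from the cluster.

```python
def backends_limit(threshold, max_connections, superuser_reserved_connections):
    """Translate a raw or percentage threshold into a connection count."""
    if threshold.endswith("%"):
        # Percentages apply to the connections open to regular roles.
        available = max_connections - superuser_reserved_connections
        return available * float(threshold[:-1]) / 100
    return float(threshold)

# 80% of (100 - 3) connections
print(backends_limit("80%", max_connections=100,
                     superuser_reserved_connections=3))
```

With "max_connections = 100" and "superuser_reserved_connections = 3",
a threshold of 80% corresponds to 80% of 97 connections.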
backends_status (8.2+)
Check the status of all backends. Depending on your PostgreSQL
version, statuses are: "idle", "idle in transaction", "idle in
transaction (aborted)" (>=9.0 only), "fastpath function call",
"active", "waiting for lock", "undefined", "disabled" and
"insufficient privilege". insufficient privilege appears when you
are not allowed to see the statuses of other connections.
This service supports the argument "--exclude REGEX" to exclude
queries matching the given regular expression.
You can use multiple "--exclude REGEX" arguments.
Critical and Warning thresholds are optional. They accept a list of
'status_label=value' separated by a comma. Available labels are
"idle", "idle_xact", "aborted_xact", "fastpath", "active" and
"waiting". Values are raw numbers or time units and empty lists are
forbidden. Here is an example:
-w 'waiting=5,idle_xact=10' -c 'waiting=20,idle_xact=30,active=1d'
Perfdata contains the number of backends for each status and the
oldest one for each of them, for 8.2+.
Note that the number of backends reported in Nagios message includes
excluded backends.
Required privileges: an unprivileged user only sees its own queries;
a pg_monitor (10+) or superuser (<10) role is required to see all
queries.
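The 'status_label=value' threshold syntax of backends_status can be
illustrated with a small parser. This is a Python sketch under stated
assumptions, not the script's implementation: the function name is
hypothetical and values are kept as text rather than converted to
numbers or intervals.

```python
LABELS = {"idle", "idle_xact", "aborted_xact", "fastpath", "active", "waiting"}

def parse_status_thresholds(text):
    """Split 'waiting=5,idle_xact=10' into a {label: value} dict."""
    thresholds = {}
    for item in text.split(","):
        label, sep, value = item.strip().partition("=")
        # Empty lists, unknown labels and missing values are rejected.
        if not sep or not value or label not in LABELS:
            raise ValueError(f"invalid threshold item: {item!r}")
        thresholds[label] = value  # raw number or interval, kept as text
    return thresholds

print(parse_status_thresholds("waiting=20,idle_xact=30,active=1d"))
```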
checksum_errors (12+)
Check for data checksum errors, reported in pg_stat_database.
This service requires that data checksums are enabled on the target
instance. UNKNOWN will be returned if that's not the case.
Critical and Warning thresholds are optional. They only accept a raw
number of checksum errors per database. If the thresholds are not
provided, a default value of `1` will be used for both thresholds.
Checksum errors are CRITICAL issues, so it's highly recommended to
keep the default threshold, as immediate action should be taken as
soon as such a problem arises.
Perfdata contains the number of errors per database.
Required privileges: unprivileged user.
backup_label_age (8.1+)
Check the age of the backup label file.
Perfdata returns the age of the backup_label file, -1 if not
present.
Critical and Warning thresholds only accept an interval (eg.
1h30m25s).
Required privileges: grant execute on function pg_stat_file(text,
boolean) (pg12+); unprivileged role (9.3+); superuser (<9.3)
bgwriter (8.3+)
Check the percentage of pages written by backends since last check.
This service uses the status file (see "--status-file" parameter).
Perfdata contains the ratio per second for each "pg_stat_bgwriter"
counter since last execution. Units Nps for checkpoints, max written
clean and fsyncs are the number of "events" per second.
Critical and Warning thresholds are optional. If set, they *only*
accept a percentage.
Required privileges: unprivileged role.
btree_bloat
Estimate bloat on B-tree indexes.
Warning and critical thresholds accept a comma-separated list of
either raw number (for a size), size (eg. 125M) or percentage. The
thresholds apply to bloat size, not object size. If a percentage is
given, the threshold will apply to the bloat size compared to the
total index size. If multiple threshold values are passed,
check_pgactivity will choose the largest (bloat size) value.
This service supports both "--dbexclude" and "--dbinclude"
parameters. The 'postgres' database and templates are always
excluded.
It also supports a "--exclude REGEX" parameter to exclude relations
matching a regular expression. The regular expression applies to
"database.schema_name.relation_name". This enables you to filter
either on a relation name for all schemas and databases, on a
qualified named relation (schema + relation) for all databases or on
a qualified named relation in only one database.
You can use multiple "--exclude REGEX" parameters.
Perfdata will return the number of indexes of concern, by warning
and critical threshold per database.
A list of the bloated indexes will be returned after the perfdata.
This list contains the fully qualified bloated index name, the
estimated bloat size, the index size and the bloat percentage.
Required privileges: superuser (<10) able to log in all databases,
or at least those in "--dbinclude"; on PostgreSQL 10+, a user with
the role pg_monitor suffices, provided that you grant SELECT on the
system table pg_statistic to the pg_monitor role, in each database
of the cluster: "GRANT SELECT ON pg_statistic TO pg_monitor;"
session_stats (14+)
Gather miscellaneous session statistics.
This service uses the status file (see --status-file parameter).
Perfdata contains the session / active / idle-in-transaction times
for each database since last call, as well as the number of sessions
per second, and the number of sessions killed / abandoned /
terminated by fatal errors.
Required privileges: unprivileged role.
commit_ratio (all)
Check the commit and rollback rate per second since last call.
This service uses the status file (see --status-file parameter).
Perfdata contains the commit rate, rollback rate, transaction rate
and rollback ratio for each database since last call.
Critical and Warning thresholds are optional. They accept a list of
comma separated 'label=value'. Available labels are rollbacks,
rollback_rate and rollback_ratio, which will be compared to the
number of rollbacks, the rollback rate and the rollback ratio of
each database. Warning or critical will be raised if the reported
value is greater than rollbacks, rollback_rate or rollback_ratio.
Required privileges: unprivileged role.
configuration (8.0+)
Check the most important settings.
Warning and Critical thresholds are ignored.
Specific parameters are : "--work_mem", "--maintenance_work_mem",
"--shared_buffers","--wal_buffers", "--checkpoint_segments",
"--effective_cache_size", "--no_check_autovacuum",
"--no_check_fsync", "--no_check_enable", "--no_check_track_counts".
Required privileges: unprivileged role.
connection (all)
Perform a simple connection test.
No perfdata is returned.
This service ignores critical and warning arguments.
Required privileges: unprivileged role.
custom_query (all)
Perform the given user query.
Specify the query with "--query". The first column will be used to
perform the test for the status if warning and critical are
provided.
The warning and critical arguments are optional. They can be of
format integer (default), size or time depending on the "--type"
argument. Warning and Critical will be raised if the first column is
greater than the threshold, or less than it if the "--reverse" option
is used.
All other columns will be used to generate the perfdata. Each field
name is used as the name of the perfdata. The field value must
contain your perfdata value and its unit appended to it. You can add
as many fields as needed. Eg.:
SELECT pg_database_size('postgres'),
pg_database_size('postgres')||'B' AS db_size
Required privileges: unprivileged role (depends on the query).
database_size (8.1+)
Check the variation of database sizes, and return the size of every
database.
This service uses the status file (see "--status-file" parameter).
Perfdata contains the size of each database and their size delta
since last call.
Critical and Warning thresholds are optional. They are a list of
optional 'label=value' separated by a comma. This allows fine-tuning
the alert based on the absolute "size" and/or the "delta" size. Eg.:
-w 'size=500GB' -c 'size=600GB'
-w 'delta=1%' -c 'delta=10%'
-w 'size=500GB,delta=1%' -c 'size=600GB,delta=10GB'
The "size" label accepts either a raw number or a size and checks
the total database size. The "delta" label accepts either a raw
number, a percentage, or a size. The aim of the delta parameter is
to detect unexpected database size variations. Delta thresholds are
absolute values, and delta percentages are computed against the
previous database size. The same label must be set for both warning
and critical.
For backward compatibility, if a single raw number or percentage or
size is given with no label, it applies to the size difference for
each database since the last execution. Both thresholds below are
equivalent:
-w 'delta=1%' -c 'delta=10%'
-w '1%' -c '10%'
This service supports both "--dbexclude" and "--dbinclude"
parameters.
Required privileges: unprivileged role.
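The "delta" rule of database_size can be illustrated with a short
Python sketch. It rests on two assumptions that the text above does not
spell out: "absolute value" is read as meaning that shrinkage counts
like growth, and the comparison is strict. The function name is
hypothetical and sizes are plain byte counts.

```python
def delta_exceeds(previous_size, current_size, threshold):
    """Tell whether the size variation since the last call exceeds `threshold`.

    `threshold` is either a number of bytes (int) or a percentage string
    such as '10%'.
    """
    # Assumption: growth and shrinkage are treated alike.
    delta = abs(current_size - previous_size)
    if isinstance(threshold, str) and threshold.endswith("%"):
        # Percentages are relative to the PREVIOUS database size.
        limit = previous_size * float(threshold[:-1]) / 100
    else:
        limit = threshold
    # Assumption: strict comparison.
    return delta > limit

print(delta_exceeds(1000, 1150, "10%"))  # variation of 150 vs limit of 100
print(delta_exceeds(1000, 950, "10%"))   # variation of 50 vs limit of 100
```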
extensions_versions (9.1+)
Check all extensions installed in all databases (including
templates) and raise a critical alert if the current version is not
the default version available on the instance (according to
pg_available_extensions).
Typically, it is used to detect forgotten extension upgrades after
package upgrades or a pg_upgrade.
Perfdata returns the number of outdated extensions in each database.
This service supports both "--dbexclude" and "--dbinclude"
parameters. Schemas are ignored, as an extension cannot be installed
more than once in a database.
This service supports multiple "--exclude" arguments to exclude one
or more extensions from the check. To ignore an extension only in a
particular database, use 'dbname/extension_name' syntax.
Examples:
--dbexclude 'devdb' --exclude 'testdb/postgis' --exclude 'testdb/postgis_topology'
--dbinclude 'proddb' --dbinclude 'testdb' --exclude 'powa'
Required privileges: unprivileged role able to log in all databases.
hit_ratio (all)
Check the cache hit ratio on the cluster.
This service uses the status file (see "--status-file" parameter).
Perfdata returns the cache hit ratio per database. Template
databases and databases that do not allow connections will not be
checked, nor will the databases which have never been accessed.
Critical and Warning thresholds are optional. They only accept a
percentage.
This service supports both "--dbexclude" and "--dbinclude"
parameters.
Required privileges: unprivileged role.
hot_standby_delta (9.0)
Check the data delta between a cluster and its hot standbys.
You must give the connection parameters for two or more clusters.
Perfdata returns the data delta in bytes between the master and each
hot standby cluster listed.
Critical and Warning thresholds are optional. They can take one or
two values separated by a comma. If only one value is given, it applies
to both received and replayed data. If two values are given, the
first one applies to received data, the second one to replayed data.
These thresholds only accept a size (eg. 2.5G).
This service raises a Critical if it doesn't find exactly ONE valid
master cluster (ie. critical when 0 or 2 and more masters).
Required privileges: unprivileged role.
is_hot_standby (9.0+)
Checks if the cluster is in recovery and accepts read only queries.
This service ignores critical and warning arguments.
No perfdata is returned.
Required privileges: unprivileged role.
is_master (all)
Checks if the cluster accepts read and/or write queries. This state
is reported as "in production" by pg_controldata.
This service ignores critical and warning arguments.
No perfdata is returned.
Required privileges: unprivileged role.
invalid_indexes (8.2+)
Check if there are invalid indexes in a database.
A critical alert is raised if an invalid index is detected.
This service supports both "--dbexclude" and "--dbinclude"
parameters. The 'postgres' database and templates are always
excluded.
This service supports a "--exclude REGEX" parameter to exclude
indexes matching a regular expression. The regular expression
applies to "database.schema_name.index_name". This enables you to
filter either on a relation name for all schemas and databases, on a
qualified named index (schema + index) for all databases or on a
qualified named index in only one database.
You can use multiple "--exclude REGEX" parameters.
Perfdata will return the number of invalid indexes per database.
A list of invalid indexes will be returned after the perfdata. This
list contains the fully qualified index name. If exclusions are
set, the number of excluded indexes is returned.
Required privileges: unprivileged role able to log in all databases.
is_replay_paused (9.1+)
Checks if the replication is paused. The service will return UNKNOWN
if executed on a master server.
Thresholds are optional. They must be specified as an interval. OK will
always be returned if the standby is not paused, even if replication
delta time hits the thresholds.
Critical or warning are raised if last reported replayed timestamp
is greater than given threshold AND some data received from the
master are not applied yet. OK will always be returned if the
standby is paused, or if the standby has already replayed everything
from master and until some write activity happens on the master.
Perfdata returned:
* paused status (0 no, 1 yes, NaN if master)
* lag time (in seconds)
* data delta with master (0 no, 1 yes)
Required privileges: unprivileged role.
last_analyze (8.2+)
Check on each database that the oldest "analyze" (from autovacuum
or not) is not older than the given threshold.
This service uses the status file (see "--status-file" parameter)
with PostgreSQL 9.1+.
Perfdata returns oldest "analyze" per database in seconds. With
PostgreSQL 9.1+, the number of [auto]analyses per database since
last call is also returned.
Critical and Warning thresholds only accept an interval (eg.
1h30m25s) and apply to the oldest execution of analyze.
Tables that were never analyzed, or whose analyze date was lost due
to a crash, will raise a critical alert.
NOTE: this service does not raise alerts if the database had
strictly no writes since last call. In consequence, a read-only
database can have its oldest analyze reported in perfdata way after
your thresholds, but not raise any alerts.
This service supports both "--dbexclude" and "--dbinclude"
parameters. The 'postgres' database and templates are always
excluded.
Required privileges: unprivileged role able to log in all databases.
last_vacuum (8.2+)
Check that the oldest vacuum (from autovacuum or otherwise) in each
database in the cluster is not older than the given threshold.
This service uses the status file (see "--status-file" parameter)
with PostgreSQL 9.1+.
Perfdata returns oldest vacuum per database in seconds. With
PostgreSQL 9.1+, it also returns the number of [auto]vacuums per
database since last execution.
Critical and Warning thresholds only accept an interval (eg.
1h30m25s) and apply to the oldest vacuum.
Tables that were never vacuumed, or whose vacuum date was lost due
to a crash, will raise a critical alert.
NOTE: this service does not raise alerts if the database had
strictly no writes since last call. In consequence, a read-only
database can have its oldest vacuum reported in perfdata way after
your thresholds, but not raise any alerts.
This service supports both "--dbexclude" and "--dbinclude"
parameters. The 'postgres' database and templates are always
excluded.
Required privileges: unprivileged role able to log in all databases.
locks (all)
Check the number of locks on the hosts.
Perfdata returns the number of locks, by type.
Critical and Warning thresholds accept either a raw number of locks
or a percentage. For percentage, it is computed using the following
limits for 7.4 to 8.1:
max_locks_per_transaction * max_connections
for 8.2+:
max_locks_per_transaction * (max_connections + max_prepared_transactions)
for 9.1+, regarding lockmode :
max_locks_per_transaction * (max_connections + max_prepared_transactions)
or max_pred_locks_per_transaction * (max_connections + max_prepared_transactions)
Required privileges: unprivileged role.
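The version-dependent limits used by the locks service for percentage
thresholds can be written as one small function. This is a Python
sketch for illustration only: the function name, the "settings"
dictionary and the "predicate" flag are hypothetical, and the values
are supplied by hand instead of being read from the server.

```python
def lock_limit(version, settings, predicate=False):
    """Return the number of locks that corresponds to 100%.

    `version` is a (major, minor) tuple such as (8, 1) or (9, 1).
    `predicate` selects the limit for predicate lock modes (9.1+).
    """
    slots = settings["max_connections"]
    if version >= (8, 2):
        # From 8.2 on, prepared transactions also hold lock slots.
        slots += settings["max_prepared_transactions"]
    if predicate and version >= (9, 1):
        return settings["max_pred_locks_per_transaction"] * slots
    return settings["max_locks_per_transaction"] * slots

example = {
    "max_connections": 100,
    "max_prepared_transactions": 10,
    "max_locks_per_transaction": 64,
    "max_pred_locks_per_transaction": 32,
}
print(lock_limit((8, 1), example))
print(lock_limit((8, 4), example))
print(lock_limit((9, 1), example, predicate=True))
```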
longest_query (all)
Check the longest running query in the cluster.
Perfdata contains the max/avg/min running time and the number of
queries per database.
Critical and Warning thresholds only accept an interval.
This service supports both "--dbexclude" and "--dbinclude"
parameters.
It also supports argument "--exclude REGEX" to exclude queries
matching the given regular expression from the check.
Above 9.0, it also supports "--exclude REGEX" to filter out
application_name.
You can use multiple "--exclude REGEX" parameters.
Required privileges: an unprivileged role only checks its own
queries; a pg_monitor (10+) or superuser (<10) role is required to
check all queries.
max_freeze_age (all)
Check the oldest database by transaction age.
Critical and Warning thresholds are optional. They accept either a
raw number or, for PostgreSQL 8.2 and later, a percentage. If a
percentage is given, the thresholds are computed based on the
"autovacuum_freeze_max_age" parameter. 100% means that some table(s)
reached the maximum age and will trigger an autovacuum freeze.
Percentage thresholds should therefore be greater than 100%.
Even with no threshold, this service will raise a critical alert if
a database has a negative age.
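A minimal Python sketch of how a percentage threshold maps to a transaction age. The function names are hypothetical, the value 200000000 is the PostgreSQL default for "autovacuum_freeze_max_age", and the use of ">=" for comparisons is an assumption rather than the tool's documented behaviour.

```python
# PostgreSQL default for autovacuum_freeze_max_age.
DEFAULT_FREEZE_MAX_AGE = 200_000_000

def resolve_threshold(threshold, freeze_max_age=DEFAULT_FREEZE_MAX_AGE):
    """Turn '110%' or '220000000' into a raw transaction age."""
    if threshold.endswith("%"):
        return freeze_max_age * int(threshold[:-1]) // 100
    return int(threshold)

def status(oldest_age, warning=None, critical=None):
    """Return the status for the oldest database age."""
    if oldest_age < 0:
        # A negative age is critical even when no threshold is set.
        return "CRITICAL"
    if critical is not None and oldest_age >= critical:
        return "CRITICAL"
    if warning is not None and oldest_age >= warning:
        return "WARNING"
    return "OK"

print(resolve_threshold("110%"))  # 220000000
```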
Perfdata returns the age of each database.
This service supports both "--dbexclude" and "--dbinclude"
parameters.
Required privileges: unprivileged role.
minor_version (all)
Check if the cluster is running the most recent minor version of
PostgreSQL.
Latest versions of PostgreSQL can be fetched from PostgreSQL
official website if check_pgactivity has access to it, or must be
given as a parameter.
Without "--critical" or "--warning" parameters, this service
attempts to fetch the latest version numbers online. A critical
alert is raised if the minor version is not the most recent.
You can optionally set the path to your preferred retrieval tool
using the "--path" parameter (eg. "--path '/usr/bin/wget'").
Supported programs are: GET, wget, curl, fetch, lynx, links, links2.
If you do not want to (or cannot) query the PostgreSQL website,
provide the expected versions using either "--warning" OR
"--critical", depending on which return value you want to raise.
The given string must contain one or more MINOR versions separated
by anything but a '.'. For instance, the following parameters are
all equivalent:
--critical "10.1 9.6.6 9.5.10 9.4.15 9.3.20 9.2.24 9.1.24 9.0.23 8.4.22"
--critical "10.1, 9.6.6, 9.5.10, 9.4.15, 9.3.20, 9.2.24, 9.1.24, 9.0.23, 8.4.22"
--critical "10.1,9.6.6,9.5.10,9.4.15,9.3.20,9.2.24,9.1.24,9.0.23,8.4.22"
--critical "10.1/9.6.6/9.5.10/9.4.15/9.3.20/9.2.24/9.1.24/9.0.23/8.4.22"
Any value other than 3 numbers separated by dots (before version
10.x) or 2 numbers separated by dots (version 10 and above) is
ignored. If the running PostgreSQL major version is not found, the
service raises an unknown status.
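The parsing rules above can be sketched in Python as follows. This is an illustration only, not the tool's implementation: the function names are hypothetical, and "ALERT" stands for whichever of warning or critical was requested.

```python
import re

def parse_expected_versions(value):
    """Map each major version to its expected minor version."""
    expected = {}
    # Anything that is neither a digit nor a dot acts as a separator.
    for token in re.split(r"[^0-9.]+", value):
        parts = token.split(".")
        if not all(p.isdigit() for p in parts):
            continue  # empty or malformed token
        if int(parts[0]) >= 10 and len(parts) == 2:
            expected[parts[0]] = token                # e.g. "10" -> "10.1"
        elif int(parts[0]) < 10 and len(parts) == 3:
            expected[".".join(parts[:2])] = token     # e.g. "9.6" -> "9.6.6"
        # Any other shape is ignored.
    return expected

def major_of(version):
    """Return the major part of a running version string."""
    parts = version.split(".")
    return parts[0] if int(parts[0]) >= 10 else ".".join(parts[:2])

def check(running, expected):
    major = major_of(running)
    if major not in expected:
        return "UNKNOWN"
    return "OK" if expected[major] == running else "ALERT"

expected = parse_expected_versions("10.1, 9.6.6, 9.5.10")
print(check("9.6.5", expected))  # ALERT
```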
Perfdata returns the numerical version of PostgreSQL.
Required privileges: unprivileged role; access to
http://www.postgresql.org required to download version numbers.
oldest_2pc (8.1+)
Check the oldest *two-phase commit transaction* (aka. prepared
transaction) in the cluster.
Perfdata contains the max/avg age and the number of prepared
transactions per database.
Critical and Warning thresholds only accept an interval.
Required privileges: unprivileged role.
oldest_idlexact (8.3+)
Check the oldest *idle* transaction.
Perfdata contains the max/avg age and the number of idle
transactions per database.
Critical and Warning thresholds only accept an interval.
This service supports both "--dbexclude" and "--dbinclude"
parameters.
Above 9.2, it supports "--exclude" to filter out connections. Eg.,
to filter out pg_dump and pg_dumpall, set this to
'pg_dump,pg_dumpall'.
Before 9.2, this service checks idle transactions using their
start time. Thus, the service can mistakenly take into account
transactions that are only transiently idle. From 9.2 onward, the
service checks for transactions that really had no activity for at
least the given thresholds.
Required privileges: an unprivileged role checks only its own
queries; a pg_monitor (10+) or superuser (<10) role is required to
check all queries.
oldest_xmin (8.4+)
Check the xmin *horizon* from distinct sources of xmin retention.
By default, perfdata outputs the oldest known xmin age for each
database among running queries, open or idle transactions, pending
prepared transactions, replication slots and walsender. For versions
prior to 9.4, only the "2pc" source of xmin retention is checked.
Using "--detailed", perfdata contains the oldest xmin and maximum
age for the following sources of xmin retention: "query" (a running
query), "active_xact" (an open transaction currently executing a
query), "idle_xact" (an open idle transaction), "2pc" (a
pending prepared transaction), "repslot" (a replication slot) and
"walsender" (a WAL sender replication process), for each connectable
database. If a source doesn't retain any transaction for a database,
NaN is returned. For versions prior to 9.4, only the "2pc" source of
xmin retention is available, so other sources won't appear in the
perfdata. Note that xmin retention from walsender is only set if
"hot_standby_feedback" is enabled on the remote standby.
Critical and Warning thresholds are optional. They only accept a raw
number of transactions.
This service supports both "--dbexclude" and "--dbinclude"
parameters.
Required privileges: a pg_read_all_stats (10+) or superuser (<10)
role is required to check pg_stat_replication. 2PC,
pg_stat_activity, and replication slots don't require special
privileges.
pg_dump_backup
Check the age and size of backups.
This service uses the status file (see "--status-file" parameter).
The "--path" argument contains the location to the backup folder.
The supported format is a glob pattern matching every folder or file
that you need to check.
The "--pattern" is required, and must contain a regular expression
matching the backup file name, extracting the database name from the
first matching group.
Optionally, a "--global-pattern" option can be supplied to check for
an additional global file.
Examples:
To monitor backups like:
/var/lib/backups/mydb-20150803.dump
/var/lib/backups/otherdb-20150803.dump
/var/lib/backups/mydb-20150804.dump
/var/lib/backups/otherdb-20150804.dump
you must set:
--path '/var/lib/backups/*'
--pattern '(\w+)-\d+.dump'
If the path contains the date, like this:
/var/lib/backups/2015-08-03-daily/mydb.dump
/var/lib/backups/2015-08-03-daily/otherdb.dump
then you can set:
--path '/var/lib/backups/*/*.dump'
--pattern '/\d+-\d+-\d+-daily/(.*).dump'
For compatibility with pg_back (https://github.com/orgrim/pg_back),
you should use:
--path '/path/*{dump,sql}'
--pattern '(\w+)_[0-9-_]+.dump'
--global-pattern 'pg_global_[0-9-_]+.sql'
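To check a "--pattern" before deploying it, the extraction of the database name from the first matching group can be reproduced with a short Python sketch. The helper "group_by_database" is hypothetical, and Python's "re" module is used here only as a stand-in: the tool's own regular-expression engine may behave differently on more complex patterns.

```python
import re
from collections import defaultdict

def group_by_database(paths, pattern):
    """Group backup files by the database name captured in group 1."""
    regex = re.compile(pattern)
    backups = defaultdict(list)
    for path in paths:
        match = regex.search(path)
        if match:
            backups[match.group(1)].append(path)
    return dict(backups)

files = [
    "/var/lib/backups/mydb-20150803.dump",
    "/var/lib/backups/otherdb-20150803.dump",
    "/var/lib/backups/mydb-20150804.dump",
    "/var/lib/backups/otherdb-20150804.dump",
]
for database, dumps in group_by_database(files, r"(\w+)-\d+.dump").items():
    print(database, len(dumps))
```

With the four example files listed above, each of "mydb" and "otherdb" is found with two dumps.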
The "--critical" and "--warning" thresholds are optional. They
accept a list of 'metric=value' separated by a comma. Available
metrics are "oldest" and "newest", respectively the age of the
oldest and newest backups, and "size", which must be the maximum
variation of size since the last check, expressed as a size or a
percentage. "mindeltasize", expressed in B, is the minimum variation
of size needed to raise an alert.
This service supports the "--dbinclude" and "--dbexclude" arguments,
to respectively test for the presence of include or exclude files.
The argument "--exclude" enables you to exclude files younger than
an interval. This is useful to ignore files from a backup in
progress. Eg., if your backup process takes 2h, set this to '125m'.
Perfdata returns the age of the oldest and newest backups, as well
as the size of the newest backups.
Required privileges: unprivileged role; the system user needs read
access on the directory containing the dumps (but not on the dumps
themselves).
pga_version
Check if this script is running the given version of
check_pgactivity. You must provide the expected version using either
"--warning" OR "--critical".
No perfdata is returned.