-
Notifications
You must be signed in to change notification settings - Fork 1
A high performance SNMP poller (fork of RTG)
License
mprovost/rated
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Rated is a fork of the RTG project (http://rtg.sourceforge.net/). Specifically it's a fork of the rtgpoll daemon. Rated is a high performance threaded SNMP poller written in C which stores the time series data in a SQL database. It uses the net-snmp library (http://www.net-snmp.org/) for retrieving the SNMP data and has a modular backend for connecting to various databases. At the moment it has drivers for MySQL, PostgreSQL and Oracle (experimental). WARNING: currently only the postgres driver compiles. You may have to run configure with the --without-mysql argument. The database drivers are implemented as shared libraries (using libtool) and are loaded as needed on startup. See the file DBI.TXT for details. If you are running rated from the src directory where it compiles, you will need to add the .libs directory to your LD_LIBRARY_PATH so it can find the database libraries. Rated uses SNMP get-next calls to poll its targets. This is similar to net-snmp's snmpwalk or the bulk retrieval method outlined in RFC 1187. This is different from mrtg/rtg/spine which use the get operation. This has been done to reduce the size of the target configuration and to eliminate the need for a separate configuration generator (ie cfgmaker/targetmaker). For a given target in the configuration it will send get-next requests until it reaches a different part of the tree or the next OID is smaller than the previous one (it's looping). There is an option on some Cisco routers to stop virtual interfaces from registering with the SNMP engine: no virtual-template snmp This will prevent the interfaces list from filling up with virtual interface information if that data isn't required. At startup the target configuration file (-t option) is parsed and a linked list of hosts is created. For each host, another linked list of targets is built. When polling, each thread takes a single host from the list and goes through its targets in order. This means that a single host will never be polled by two threads simultaneously which could cause it to become overloaded. Each result of a get-next request is stored in memory along with a timestamp so that the delta and rate of change can be calculated between rounds. Since get-nexts return the results in order (in well-behaved SNMP agents) they can be stored in an efficient linked list. Rated can detect when new OIDS need to be inserted or deleted from the list but in the common case where the agent returns the same OIDS in the same order during each polling round, it can walk through the list (so it just follows a pointer every time instead of searching). The type of each response (gauge or counter) and size (32 or 64 bits) is detected automatically. For counters, after the second successful poll, the delta between the current value of the counter and the previous value is calculated as well as the rate of change (ie difference divided by the time between polls) per second. Gauges can be inserted after the first poll. The delta and rate (for counters) are then sent to the database driver. Rated does database inserts synchronously for each OID as a means of throttling the frequency of SNMP requests being sent to the target host which often has limited CPU. Polls resulting in a zero delta are not sent to the database unless the -z command-line option is used. The database driver will automatically create a table that maps OIDs to an iid which is an internal integer which is used to uniquely identify each OID. It will look up OIDs in that table and if they aren't found, will insert them. Once it has an iid, it will look for a data table with the name of the host (from the target config file). If one isn't found, it will create one before doing the first insert. This means that there are no scripts that need to be run to create database tables before running rated. Rated will detect and correct counter wraps, when the (32 bit) counter has overflowed and the new value is less than the previous value. This can also occur when the device is rebooted. To detect the latter case, an SNMP get query for the sysUpTime OID is done at the beginning of each polling round for each host. If it detects that the device has rebooted since the last poll, all of the counters are zeroed. Wrapping can be avoided by using 64 bit counters where available. If there is a problem inserting into the database, rated will keep the last value and timestamp the same. When the database comes back, it will do a single insert with the delta and rate since then. So it loses precision but maintains the correct values. Cacheing all of the intermediate results can quickly use up the available memory and flood the database with inserts when it reconnects. When rated is done doing getnexts for a target OID for a host, it will record a single entry in the host's table for the target OID with the number of getnexts and the rate (per second) at which they were polled (and inserted). This allows monitoring the performance of the poller itself and of the SNMP responses from the host. After the entire polling round is complete, the overall statistics for the poll are inserted into the 'rated' table in the database. Sending a HUP signal to the rated poller forces it to reread the target config file at the start of the next poll. Currently this regenerates the hosts/targets lists from scratch so any previous poll's data is discarded and it will treat every target as a first poll. Here's a sample rated.conf file (specified with -c on the command line): Interval 30 DB_Host localhost DB_Database rated DB_User rated DB_Pass rated DB_Driver libratedpgsql.so Threads 5 The interval specifies how often each target will be polled. For example, if the interval is 30 seconds and the poll takes 10 seconds, rated will sleep for 20 seconds before starting the next round. The sleep time is determined at the end of every round. Here's a sample target configuration file (specified with -t): template switch { # ifHCInOctets target .1.3.6.1.2.1.31.1.1.1.6 # ifHCOutOctets target .1.3.6.1.2.1.31.1.1.1.10 # ifHCInUcastPkts target .1.3.6.1.2.1.31.1.1.1.7 # ifHCOutUcastPkts target .1.3.6.1.2.1.31.1.1.1.11 community public { snmpver 2 { host myswitch1 address 192.168.1.1 host myswitch2 address 192.168.1.2 } } community private { snmpver 1 { host myfirewall1 address 10.1.1.1 } } template Foundry-RX { # enterprises.foundry.products.switch.snAgentSys.snAgentCpu.snAgentCpuUtilTable.snAgentCpuUtilEntry.snAgentCpuUtilValue target .1.3.6.1.4.1.1991.1.1.2.11.1.1.4 # enterprises.foundry.products.switch.snAgentSys.snAgentTemp.snAgentTempTable.snAgentTempEntry.snAgentTempValue target .1.3.6.1.4.1.1991.1.1.2.13.1.1.4 community public { snmpver 2 { host myrxswitch1 address 192.168.1.3 } } } } Comments start with a # (shell style) or // (C++ style) and go to the end of the line. Target OIDs are grouped into templates so that they apply to multiple hosts and to reduce the size of the configuration. The target OID is used by rated in the first getnext request, so the first result will be the next value in the OID tree. This is the same behaviour as the snmpwalk utility. Templates can be nested and inherit the targets from the parent template(s). At this time the template names are symbolic and not actually used by the program. Within each template are one or more communities. This is the SNMP community string to be used for the enclosed hosts. Then the hosts are grouped by SNMP version (1 or 2). Finally the hosts are listed, one per line.
About
A high performance SNMP poller (fork of RTG)
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published