Skip to content

mprovost/rated

Repository files navigation

Rated is a fork of the RTG project (http://rtg.sourceforge.net/).
Specifically it's a fork of the rtgpoll daemon.

Rated is a high performance threaded SNMP poller written in C which
stores the time series data in a SQL database. It uses the net-snmp
library (http://www.net-snmp.org/) for retrieving the SNMP data and
has a modular backend for connecting to various databases. At the
moment it has drivers for MySQL, PostgreSQL and Oracle (experimental).

WARNING: currently only the postgres driver compiles. You may have to
run configure with the --without-mysql argument.

The database drivers are implemented as shared libraries (using
libtool) and are loaded as needed on startup. See the file DBI.TXT for
details. If you are running rated from the src directory where it
compiles, you will need to add the .libs directory to your
LD_LIBRARY_PATH so it can find the database libraries.

Rated uses SNMP get-next calls to poll its targets. This is similar to
net-snmp's snmpwalk or the bulk retrieval method outlined in RFC 1187.
This is different from mrtg/rtg/spine which use the get operation.
This has been done to reduce the size of the target configuration and
to eliminate the need for a separate configuration generator (ie
cfgmaker/targetmaker). For a given target in the configuration it will
send get-next requests until it reaches a different part of the tree
or the next OID is smaller than the previous one (it's looping).

There is an option on some Cisco routers to stop virtual interfaces
from registering with the SNMP engine:

no virtual-template snmp

This will prevent the interfaces list from filling up with virtual
interface information if that data isn't required.

At startup the target configuration file (-t option) is parsed and a
linked list of hosts is created. For each host, another linked list of
targets is built. When polling, each thread takes a single host from
the list and goes through its targets in order. This means that a
single host will never be polled by two threads simultaneously which
could cause it to become overloaded. 

Each result of a get-next request is stored in memory along with a
timestamp so that the delta and rate of change can be calculated
between rounds. Since get-nexts return the results in order (in
well-behaved SNMP agents) they can be stored in an efficient linked
list. Rated can detect when new OIDS need to be inserted or deleted
from the list but in the common case where the agent returns the same
OIDS in the same order during each polling round, it can walk through
the list (so it just follows a pointer every time instead of
searching).

The type of each response (gauge or counter) and size (32 or 64 bits)
is detected automatically.

For counters, after the second successful poll, the delta between the
current value of the counter and the previous value is calculated as
well as the rate of change (ie difference divided by the time between
polls) per second. Gauges can be inserted after the first poll.

The delta and rate (for counters) are then sent to the database
driver. Rated does database inserts synchronously for each OID as a
means of throttling the frequency of SNMP requests being sent to the
target host which often has limited CPU.

Polls resulting in a zero delta are not sent to the database unless
the -z command-line option is used.

The database driver will automatically create a table that maps OIDs
to an iid which is an internal integer which is used to uniquely
identify each OID. It will look up OIDs in that table and if they
aren't found, will insert them.

Once it has an iid, it will look for a data table with the name of the
host (from the target config file). If one isn't found, it will create
one before doing the first insert.

This means that there are no scripts that need to be run to create
database tables before running rated.

Rated will detect and correct counter wraps, when the (32 bit) counter
has overflowed and the new value is less than the previous value. This
can also occur when the device is rebooted. To detect the latter case,
an SNMP get query for the sysUpTime OID is done at the beginning of
each polling round for each host. If it detects that the device has
rebooted since the last poll, all of the counters are zeroed.

Wrapping can be avoided by using 64 bit counters where available.

If there is a problem inserting into the database, rated will keep the
last value and timestamp the same. When the database comes back, it
will do a single insert with the delta and rate since then. So it
loses precision but maintains the correct values. Cacheing all of the
intermediate results can quickly use up the available memory and flood
the database with inserts when it reconnects.

When rated is done doing getnexts for a target OID for a host, it will
record a single entry in the host's table for the target OID with the
number of getnexts and the rate (per second) at which they were polled
(and inserted). This allows monitoring the performance of the poller
itself and of the SNMP responses from the host.

After the entire polling round is complete, the overall statistics for
the poll are inserted into the 'rated' table in the database.

Sending a HUP signal to the rated poller forces it to reread the
target config file at the start of the next poll. Currently this
regenerates the hosts/targets lists from scratch so any previous
poll's data is discarded and it will treat every target as a first poll.

Here's a sample rated.conf file (specified with -c on the command line):

Interval    30
DB_Host     localhost
DB_Database rated
DB_User     rated
DB_Pass     rated
DB_Driver   libratedpgsql.so
Threads     5

The interval specifies how often each target will be polled.
For example, if the interval is 30 seconds and the poll takes 10
seconds, rated will sleep for 20 seconds before starting the next
round. The sleep time is determined at the end of every round.

Here's a sample target configuration file (specified with -t):

template switch {
    # ifHCInOctets
    target .1.3.6.1.2.1.31.1.1.1.6
    # ifHCOutOctets
    target .1.3.6.1.2.1.31.1.1.1.10
    # ifHCInUcastPkts
    target .1.3.6.1.2.1.31.1.1.1.7
    # ifHCOutUcastPkts
    target .1.3.6.1.2.1.31.1.1.1.11
    community public {
        snmpver 2 {
            host myswitch1 address 192.168.1.1
            host myswitch2 address 192.168.1.2
        }
    }
    community private {
        snmpver 1 {
            host myfirewall1 address 10.1.1.1
        }
    }
    template Foundry-RX {
        # enterprises.foundry.products.switch.snAgentSys.snAgentCpu.snAgentCpuUtilTable.snAgentCpuUtilEntry.snAgentCpuUtilValue
        target .1.3.6.1.4.1.1991.1.1.2.11.1.1.4
        # enterprises.foundry.products.switch.snAgentSys.snAgentTemp.snAgentTempTable.snAgentTempEntry.snAgentTempValue
        target .1.3.6.1.4.1.1991.1.1.2.13.1.1.4
        community public {
            snmpver 2 {
                host myrxswitch1 address 192.168.1.3
            }
        }
    }
}

Comments start with a # (shell style) or // (C++ style) and go to the
end of the line.

Target OIDs are grouped into templates so that they apply to multiple
hosts and to reduce the size of the configuration. The target OID is used
by rated in the first getnext request, so the first result will be the
next value in the OID tree. This is the same behaviour as the snmpwalk
utility. Templates can be nested and inherit the targets from the
parent template(s). At this time the template names are symbolic and
not actually used by the program.

Within each template are one or more communities. This is the SNMP
community string to be used for the enclosed hosts. Then the hosts are
grouped by SNMP version (1 or 2). Finally the hosts are listed, one
per line.

About

A high performance SNMP poller (fork of RTG)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published