Skip to content

Logging Access Stats

Kelly McLaughlin edited this page Jun 11, 2012 · 13 revisions

Logging User Access Statistics

Access stats are tracked on a per-user basis, as rollups for slices of time. They are stored in the same Riak cluster as other MOSS data, in the moss.access bucket.

For information about querying access stats, please read Querying Access Stats.

High Level

The 10k ft. view of RCS access logging is:

  1. Each RCS Webmachine resource adds a note to its wrq structure once it has figured out who (if anyone) should be billed for the access.
  2. RCS uses a custom Webmachine logger module that watches for this note on the final wrq resource. When that logger finds this note, it picks out some stats (detailed below) about the request, and passes them to an aggegating gen_server.
  3. The aggregating gen_server periodically sends its accumulated log to an archival process.
  4. The archival process sums all recorded accesses for each user, and stores a record for each user for the time slice.

Log retrieval is then simply requesting from Riak all slice objects for a user in a time period.

Logging in a Resource

Setting the User

No access data will be logged unless the user for the access is known. A resource should record the user for the access by calling riak_moss_access_logger:set_user/2 with the #moss_user{} record for the user, and the webmachine request data:

forbidden(RD, Ctx) ->
   ...
   {ok, User} = Auth:authenticate(RD, Args),
   AccessRD = riak_moss_access_logger:set_user(User, RD),
   ...
   {false, AccessRD, Ctx}.

Note that a new request data structure is returned with the note added for the logger to pick up when the request finishes.

Tracked Stats

With nothing more than setting the user for the request, several stats are logged automatically:

  • Count -- the number of times this operation was used; each request counts as one (1)
  • BytesIn -- the number of bytes that were included in the request body
  • BytesOut -- the number of bytes that were sent in the response body

For successful requests, each of these stats is logged under the name given. For unsuccessful requests, they are logged under this name with a prefix of either SystemError, for requests that end in response codes 500+, or UserError, for requests that end in response codes 400-499. For example, if a user tries to download a non-existant file, it will be logged under UserErrorCount with the bytes of the message logged under UserErrorBytesOut.

These three metrics are logged for each operation separately. The access logger determines the operation type by comparing the method, resource module, and path to a known table. For example, it knows that a GET on the riak_moss_wm_key module with the acl query parameter in the path is a KeyReadACL operation. A PUT to the same resource without the acl query param is a KeyWrite operation. See Querying Access Stats for a list of all operation types.

Special note for BytesIn

The BytesIn stat is normally extracted from log information automatically, by using the value the client gave in the Content-Length header. This behavior can be overridden by returning the modified wrq structure returned by calling riak_moss_access_logger:set_bytes_in(Count, RD). This will cause the access logger to log Count as the number of BytesIn for the request, instead of the automatically extracted value. This is especially useful if the upload did not complete (i.e. the client said it would upload 5GB, but only made it through 3GB before disconnecting).

Special note for BytesOut

The BytesOut stat is normally extracted from the log information automatically, by using the value that Webmachine has counted for the response length. A resource may also hint to the logger how many bytes it expects to send in the response by returning the modified wrq structure returned by calling riak_moss_access_logger:expect_bytes_out(Count, RD). If this hint is present, then the number that Webmachine has counted will be compared to it. If the hint and Webmachine's count do not match, the bytes are logged under BytesOutIncomplete instead of BytesOut. This allows an RCS provider to decide whether or not to bill for partial downloads.

Log Accumulation

As resources finish their processing, the riak_moss_access_logger module is called by Webmachine to log the access. This module implements a gen_server that finds all of the access notes in the request's log data and stores them until the current interval ends.

Storage is currently a simple ETS duplicate_bag table. The keys for the table are user key IDs, such that all accesses for a user are stored under the same key. Values for the table are 2-tuples of {Operation, Proplist}, containing the stats logged for each access.

This should be fairly easy to convert to a simple append-only disk log if ETS proves to behave poorly under load, or if more durable storage for the current interval is desired.

When the current interval ends, riak_moss_access_logger transfers ownership of its accumlated ETS table to riak_moss_access_archiver. The logger module then opens a fresh ETS table for logging the next slice's accesses.

Interval Duration

The length of the log flushing interval is configured by the riak_moss application environment variable access_log_flush_factor. The value is expressed as an integer divisor of the access_archive_period setting. That is, if access_log_flush_factor is 5, and access_archive_period is 3600 (== 1 hour) seconds, the log will be flushed every 720 seconds (== 12 minutes), which is 5 times per archive period.

The value of access_log_flush_factor must be an integer factor of access_archive_period. If the factor does not divide the period evenly, an error will be printed in the log, and the Riak CS node will refuse to start.

The default value for access_log_flush_factor is 1 (once per archive period).

Log Size Trigger for Archival

Archival of the access log will also be triggered if the number of records waiting to be archived reaches a configurable level. When the threshold is reached, all accumulated records are transfered to the archiver, which writes out a sample with "now" as the end-time. Accumulation is then restarted with "now" as the start-time, and will continue until either the end of the time interval, or until the log threshold is reached again.

This level is configured by the riak_moss application environment variable access_log_flush_size. Its default value is 1000000 (one million).

Backlog Caveat

If the logger finds itself so far behind that it would need to schedule its next archival in the past (that is, after sending a log accumulation for interval N to the archiver, it finds that the end of interval N+1 has already passed), the logger will drop the backlog in its message box by exiting and allowing its supervisor to restart it. Just before exiting, it will print an error message describing how far behind it was:

09:56:02.584 [error] Access logger is running 302 seconds behind, skipping 0 log messages to catch up

With the default one-hour archive period, this case will only be encountered when the logger is an hour behind. This behavior is meant as a safety valve to prevent that hour lag from growing due to memory pressure from the logger's mailbox.

Manually Triggering Archival

When taking a machine out of service, it may be desirable to trigger log archival before the end of the interval. To do so, use the riak-cs-access script with the command flush.

By default, the script will wait up to 50 seconds for the logger to acknowledge that it has passed its accumulation to the archiver, and another 50 seconds for the archiver to acknowledge that it has finished archiving all accumulations it has received. To wait longer, use the -w parameter on the command line with an integer number of 5-second intervals to wait. That is, to wait for 100 seconds for each phase, use:

riak-cs-access flush -w 20

Log Archival

When riak_moss_access_archiver receives an ETS table, it iterates through the user keys stored there. For each user, the values for each operation in its associated proplists are summed. That is, if four accesses were recorded for a user, with operations like

{<<"KeyWrite">>,[{<<"Count">>,1},{<<"BytesIn">>,50}]}
{<<"KeyWrite">>,[{<<"Count">>,1},{<<"BytesIn">>,25}]}
{<<"KeyRead">>,[{<<"Count">>,1},{<<"BytesOut">>,42}]}
{<<"KeyRead">>,[{<<"SystemErrorCount">>,1},
                {<<"SystemErrorBytesOut">>,80}]}

the archiver will produce a summed operation list of

[{<<"KeyWrite">>,[{<<"Count">>,2},{<<"BytesIn">>,75}]},
 {<<"KeyRead">>,[{<<"Count">>,1},{<<"BytesOut">>,42},
                 {<<"SystemErrorCount">>,1},
                 {<<"SystemErrorBytesOut">>,80}]}]

The Erlang node name of the MOSS node is added to this summed proplist, for reference during retrieval.

Each user's summed proplist is then stored in a Riak object for the time slice. Proplists are converted to JSON objects for storage, for maximum portability. When the archiver has finished iterating through all users logged, it destroys the table.

Archive Key

The key for the archive object is of the form EndingTimestampISO8601.UserKeyID. The timestamp in that key is the start of the slice in which this data falls. The period used here is distinct from the period used for flushing the log accumulation. This is to allow a decoupling of durability and access efficiency. That is, it may be beneficial to flush accumulated logs more often (more durability), and yet keep the number of objects created in Riak low (more read-time efficiency).

This archive period configured by the riak_moss application environment variable access_archive_period. It is also expressed as an integer number of seconds, and the intervals also start at midnight and repeat through the day. The default is 3600 (one hour), matching the log flush interval.

Archive Retrieval

When a request is recieved for a user's access stats over some time period, the objects for all intervals in that time period must be retrieved.

It is important to note that the archival process does not attempt a read/modify/write cycle when writing a slice record. The moss.access bucket should have the allow_mult=true flag set, and so multiple moss nodes writing the same slice record for the same user will create siblings. MOSS attempts to check and set the allow_mult bucket property when it starts up, and will print a warning in the log about being "unable to configure" or "unable to verify" bucket settings if it fails.

Siblings should be handled at read time. Sibling resolution should be nothing more than a set union of all records. The HTTP resource serving the statistics expects to provide them on a node-accumulated basis, so it is important to set a unique Erlang node name for each MOSS node.

Clone this wiki locally