Install using Composer:
php composer.phar require kassner/log-parser:~1.0
Simply instantiate the class:
$parser = new \Kassner\LogParser\LogParser();
Then parse the lines of your access log file:
$lines = file('/var/log/apache2/access.log', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $line) {
    $entry = $parser->parse($line);
}
The $entry object holds all the parsed data:
stdClass Object
(
    [host] => 193.191.216.76
    [logname] => -
    [user] => www-data
    [stamp] => 1390794676
    [time] => 27/Jan/2014:04:51:16 +0100
    [request] => GET /wp-content/uploads/2013/11/whatever.jpg HTTP/1.1
    [status] => 200
    [responseBytes] => 58678
)
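The fields above are plain values, so they can be post-processed with ordinary PHP. The following is a minimal sketch, not part of the library: it builds a stand-in `$entry` by hand from the sample values shown above (so it runs without the library installed) and assumes the fields arrive as strings or integers, casting where needed.

```php
<?php
// Hand-built stand-in for the object returned by $parser->parse($line).
// Values are copied from the sample output above; field types are an assumption.
$entry = new stdClass();
$entry->host = '193.191.216.76';
$entry->stamp = 1390794676;
$entry->request = 'GET /wp-content/uploads/2013/11/whatever.jpg HTTP/1.1';
$entry->status = '200';
$entry->responseBytes = '58678';

// The request line has three space-separated parts: method, path, protocol.
list($method, $path, $protocol) = explode(' ', $entry->request, 3);

// "stamp" is a Unix timestamp; gmdate() renders it in UTC.
$when = gmdate('Y-m-d H:i:s', (int) $entry->stamp);

// With %b, a '-' means no bytes were sent, so treat it as zero.
$bytes = $entry->responseBytes === '-' ? 0 : (int) $entry->responseBytes;

printf("%s %s -> %s, %d bytes, %s UTC\n", $method, $path, $entry->status, $bytes, $when);
```

Inside the parsing loop, the same lines would operate on the real `$entry` instead of the stand-in.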
You may customize the log format (by default, it matches the Apache common log format):
# default Nginx format:
$parser->setFormat('%h %l %u %t "%r" %>s %O "%{Referer}i" "%{User-Agent}i"');
Here is the full list of log format strings supported by Apache, and whether each one is supported by the library:
Supported? | Format String | Property name | Description |
---|---|---|---|
Y | %% | percent | The percent sign |
Y | %>s | status | Status of the final request (after any internal redirect) |
Y | %A | localIp | Local IP-address |
Y | %a | remoteIp | Remote IP-address |
N | %B | - | Size of response in bytes, excluding HTTP headers. |
Y | %b | responseBytes | Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a '-' rather than a 0 when no bytes are sent. |
N | %D | - | The time taken to serve the request, in microseconds. |
N | %f | - | Filename |
Y | %h | host | Remote host |
N | %H | - | The request protocol |
Y | %I | receivedBytes | Bytes received, including request and headers, cannot be zero. You need to enable mod_logio to use this. |
N | %k | - | Number of keepalive requests handled on this connection. Interesting if KeepAlive is being used, so that, for example, a '1' means the first keepalive request after the initial one, '2' the second, etc.; otherwise this is always 0 (indicating the initial request). Available in versions 2.2.11 and later. |
Y | %l | logname | Remote logname (from identd, if supplied). This will return a dash unless mod_ident is present and IdentityCheck is set On. |
Y | %m | requestMethod | The request method |
Y | %O | sentBytes | Bytes sent, including headers, cannot be zero. You need to enable mod_logio to use this. |
Y | %p | port | The canonical port of the server serving the request |
N | %P | - | The process ID of the child that serviced the request. |
N | %q | - | The query string (prepended with a ? if a query string exists, otherwise an empty string) |
Y | %r | request | First line of request |
N | %R | - | The handler generating the response (if any). |
N | %s | - | Status. For requests that were internally redirected, this is the status of the original request; use %>s for the final one. |
X | %T | requestTime | The time taken to serve the request, in seconds. This option is inconsistent: Apache does not log the millisecond part. |
Y | %t | time | Time the request was received (standard English format) |
Y | %u | user | Remote user (from auth; may be bogus if return status (%s) is 401) |
Y | %U | URL | The URL path requested, not including any query string. |
Y | %v | serverName | The canonical ServerName of the server serving the request. |
Y | %V | canonicalServerName | The server name according to the UseCanonicalName setting. |
N | %X | - | Connection status when response is completed: X = connection aborted before the response completed. + = connection may be kept alive after the response is sent. - = connection will be closed after the response is sent. |
N | %{Foobar}C | - | The contents of cookie Foobar in the request sent to the server. Only version 0 cookies are fully supported. |
N | %{Foobar}e | - | The contents of the environment variable FOOBAR |
Y | %{Foobar}i | *Header | The contents of Foobar: header line(s) in the request sent to the server. Changes made by other modules (e.g. mod_headers) affect this. If you're interested in what the request header was prior to when most modules would have modified it, use mod_setenvif to copy the header into an internal environment variable and log that value with the %{VARNAME}e described above. |
N | %{Foobar}n | - | The contents of note Foobar from another module. |
N | %{Foobar}o | - | The contents of Foobar: header line(s) in the reply. |
N | %{format}p | - | The canonical port of the server serving the request or the server's actual port or the client's actual port. Valid formats are canonical, local, or remote. |
N | %{format}P | - | The process ID or thread id of the child that serviced the request. Valid formats are pid, tid, and hextid. hextid requires APR 1.2.0 or higher. |
N | %{format}t | - | The time, in the form given by format, which should be in strftime(3) format. (potentially localized) (This directive was %c in late versions of Apache 1.3, but this conflicted with the historical ssl %{var}c syntax.) |
Beware: read the notes carefully before using an option marked with an X in the Supported? column.
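To make the relationship between format strings and property names in the table above more concrete, here is a rough, self-contained sketch of the general idea: each format token becomes a named capture group, and the group names become the properties of the result. This is an illustration only, not the library's actual implementation. The `$tokens` map and the `sketchFormatToRegex()` and `sketchParse()` helpers are hypothetical names, the patterns are simplified guesses, and only the tokens of the common log format are covered. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical token-to-pattern map covering only the common log format.
// The real library supports more tokens and may use different patterns.
$tokens = [
    '%h'  => '(?P<host>\S+)',
    '%l'  => '(?P<logname>\S+)',
    '%u'  => '(?P<user>\S+)',
    '%t'  => '\[(?P<time>[^\]]+)\]',
    '%r'  => '(?P<request>[^"]*)',
    '%>s' => '(?P<status>\d{3})',
    '%b'  => '(?P<responseBytes>\d+|-)',
];

// Swap every known token for its pattern and anchor the result to the whole line.
function sketchFormatToRegex(string $format, array $tokens): string
{
    return '#^' . strtr($format, $tokens) . '$#';
}

// Return an object with one property per named group, or null when the line
// does not match (the library signals this case with an exception instead).
function sketchParse(string $line, string $regex): ?stdClass
{
    if (!preg_match($regex, $line, $matches)) {
        return null;
    }
    $entry = new stdClass();
    foreach ($matches as $key => $value) {
        if (is_string($key)) { // skip the numeric duplicates of each group
            $entry->$key = $value;
        }
    }
    return $entry;
}

$regex = sketchFormatToRegex('%h %l %u %t "%r" %>s %b', $tokens);
$line = '193.191.216.76 - www-data [27/Jan/2014:04:51:16 +0100] '
      . '"GET /wp-content/uploads/2013/11/whatever.jpg HTTP/1.1" 200 58678';
$entry = sketchParse($line, $regex);
print_r($entry);
```

The sample line is assembled from the values in the example output earlier in this document.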
If a line does not match the defined format, a \Kassner\LogParser\FormatException will be thrown.
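Because a single malformed line would otherwise abort the whole loop, you may want to catch the exception and keep going. The following is a minimal sketch of one way to do that. The helper name `parseAll()` is hypothetical and not part of the library; it takes the parse callable and the exception class name as parameters so that the example runs on its own with a stand-in parser. It assumes PHP 7.1 or later.

```php
<?php
/**
 * Parse every line, collecting the lines that fail instead of aborting.
 * Returns [parsed entries, rejected lines keyed by their original index].
 */
function parseAll(iterable $lines, callable $parse, string $exceptionClass): array
{
    $entries = [];
    $rejected = [];
    foreach ($lines as $number => $line) {
        try {
            $entries[] = $parse($line);
        } catch (\Throwable $e) {
            if (!($e instanceof $exceptionClass)) {
                throw $e; // unrelated failure: do not swallow it
            }
            $rejected[$number] = $line;
        }
    }
    return [$entries, $rejected];
}

// Stand-in parser so the sketch runs without the library installed:
// it rejects any line that contains no space.
$fakeParse = function (string $line) {
    if (strpos($line, ' ') === false) {
        throw new \InvalidArgumentException('line does not match');
    }
    return (object) ['raw' => $line];
};

list($entries, $rejected) = parseAll(
    ['a b', 'bad', 'c d'],
    $fakeParse,
    \InvalidArgumentException::class
);
printf("%d parsed, %d rejected\n", count($entries), count($rejected));
```

With the library, the equivalent call would presumably be `parseAll($lines, [$parser, 'parse'], \Kassner\LogParser\FormatException::class)`.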