We are using the Logs agent to collect all files matching a glob pattern; a new log file is created daily with the date as part of the filename. We have noticed that the very first items within the log are missing from the Datadog interface.
At the top of today's (20th February) log file are six lines, all with the timestamp [2018-02-20 00:00:00]; however, none of these can be found in the Logs explorer. The first item to turn up has the timestamp [2018-02-20 00:01:54]. We do not see this issue with the other log files collected from this host, which are rotated rather than going to a new file.
Our theory is that the scanner, which looks for new files every 10 seconds, sets up the tailer for the file from the end, so any records written before the scanner picks up the file are lost. In scanner.go the new tailer is set up with the tailFromBeginning argument set to false, which leads to the tailer being started from the last committed offset held by the auditor, and the auditor returns SEEK_END. I'm not entirely sure whether either of these is in fact the cause.
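To make the theory concrete, here is a rough sketch of the offset selection we believe is happening (illustrative Go only, with hypothetical names such as lastCommittedOffset and startingOffset; this is not the actual scanner.go/auditor code):

package main

import (
	"fmt"
	"io"
	"os"
)

// lastCommittedOffset stands in for the auditor's registry of file offsets.
// A file the agent has never tailed before has no entry here.
var lastCommittedOffset = map[string]int64{}

// startingOffset mirrors the behaviour described above: with
// tailFromBeginning=false, a brand-new file falls back to the end of the
// file, so any lines written before the scanner's next pass are never read.
func startingOffset(path string, tailFromBeginning bool) (offset int64, whence int) {
	if tailFromBeginning {
		return 0, io.SeekStart
	}
	if off, ok := lastCommittedOffset[path]; ok {
		return off, io.SeekStart // resume where the auditor left off
	}
	return 0, io.SeekEnd // unknown file: skip everything already written
}

func main() {
	// Path taken from our log config; substitute any file to try the sketch.
	f, err := os.Open("/srv/sites/shared/storage/logs/laravel-2018-02-20.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()

	off, whence := startingOffset(f.Name(), false)
	pos, _ := f.Seek(off, whence)
	fmt.Printf("tailing %s from byte %d\n", f.Name(), pos)
}

With a daily file created at midnight, the auditor has no offset for it when the scanner first sees it, so the seek falls through to io.SeekEnd and the lines written in the first ~two minutes are skipped.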
Agent Status:
===================
Agent (v6.0.0-rc.1)
===================
Status date: 2018-02-20 12:00:21.366025 UTC
Pid: 26061
Python Version: 2.7.12
Logs:
Check Runners: 10
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: -0.000686572 s
System UTC time: 2018-02-20 12:00:21.366025 UTC
Host Info
=========
bootTime: 2017-07-13 08:18:20.000000 UTC
kernelVersion: 3.13.0-77-generic
os: linux
platform: ubuntu
platformFamily: debian
platformVersion: 16.04
procs: 163
uptime: 1.9128963e+07
virtualizationRole: host
virtualizationSystem: kvm
Hostnames
=========
hostname: hostnamehere
socket-fqdn: localhost
socket-hostname: hostnamehere
=========
Collector
=========
Running Checks
==============
cpu
---
Total Runs: 4343
Metrics: 6, Total Metrics: 26052
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
disk
----
Total Runs: 4343
Metrics: 74, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
dns_check
---------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
file_handle
-----------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
io
--
Total Runs: 4343
Metrics: 26, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
load
----
Total Runs: 4343
Metrics: 6, Total Metrics: 26058
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
memory
------
Total Runs: 4343
Metrics: 14, Total Metrics: 60802
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
network
-------
Total Runs: 4343
Metrics: 26, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
ntp
---
Total Runs: 4343
Metrics: 1, Total Metrics: 4303
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
pgbouncer
---------
Total Runs: 4343
Metrics: 48, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
supervisord
-----------
Total Runs: 4343
Metrics: 14, Total Metrics: 60802
Events: 0, Total Events: 0
Service Checks: 12, Total Service Checks: 52116
uptime
------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
Loading Errors
==============
apm
---
Core Check Loader:
Check apm not found in Catalog
JMX Check Loader:
check is not a jmx check, or unable to determine if it's so
Python Check Loader:
No module named apm
========
JMXFetch
========
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
CheckRunsV1: 4343
IntakeV1: 331
RetryQueueSize: 0
Success: 9017
TimeseriesV1: 4343
API Keys status
===============
https://6-0-0-app.agent.datadoghq.com,*************************c4dce: API Key valid
==========
Logs-agent
==========
laravel
-------
Type: file
Path: /srv/sites/shared/storage/logs/laravel-*.log
Status: OK
Inputs: /srv/sites/shared/storage/logs/laravel-2018-02-19.log /srv/sites/shared/storage/logs/laravel-2018-02-20.log
Type: file
Path: /var/log/site-*.log
Status: OK
Inputs: /var/log/site-worker-default-00.log /var/log/site-worker-high-00.log /var/log/site-worker-low-00.log /var/log/site-worker-medium-00.log /var/log/site-export-worker-00.log /var/log/site-export-worker-01.log /var/log/site-other-worker-00.log
=========
DogStatsD
=========
Checks Metric Sample: 1.050941e+06
Event: 1
Events Flushed: 1
Number Of Flushes: 4343
Series Flushed: 818602
Service Check: 117261
Service Checks Flushed: 121577
Dogstatsd Metric Sample: 337
Thanks a lot for reporting this issue.
It does indeed seem that your theory is correct and that we have an issue with new files: we always tail from the end of the file, whereas that is not the expected behaviour once the agent is running.
Thanks again for reporting this issue and for your analysis.
Nils
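For reference, a minimal sketch of the behaviour described as expected (again with illustrative names, not the actual agent code): a file that already exists when the agent starts keeps being tailed from the end to avoid re-sending old data, while a file that appears after startup would be tailed from the beginning so its first lines are not dropped.

package main

import (
	"fmt"
	"io"
)

// seekModeFor sketches the expected behaviour: files present at agent
// startup are tailed from the end, files discovered later from the start.
func seekModeFor(discoveredAfterStartup bool) int {
	if discoveredAfterStartup {
		return io.SeekStart
	}
	return io.SeekEnd
}

func main() {
	fmt.Println(seekModeFor(false)) // existing file at agent startup -> io.SeekEnd
	fmt.Println(seekModeFor(true))  // daily file created at midnight -> io.SeekStart
}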