We are using the Logs agent to collect all files matching a glob pattern; a new log file is created daily with the date as part of the filename. We have noticed that the very first items within the log are missing from the Datadog interface.
At the top of today's (20th February) log file are six lines, all with the timestamp [2018-02-20 00:00:00]; however, none of these can be found in the Logs explorer. The first item to turn up has the timestamp [2018-02-20 00:01:54]. We do not see this issue with the other log files collected from this host, which are rotated rather than going to a new file.
Our theory is that the scanner, which looks for new files every 10 seconds, sets up the tailer for the file from the end, so any records written before the scanner picks up the file are lost. In scanner.go the new tailer is set up with the tailFromBeginning argument set to false, which leads to the tailer being started from the last committed offset held by the auditor, and the auditor returns SEEK_END. I'm not entirely sure whether either of these is in fact the cause.
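To make the theory concrete, here is a rough sketch of the offset selection we believe is happening (illustrative Go only, with hypothetical names such as lastCommittedOffset and startingOffset; this is not the actual scanner.go/auditor code):

package main

import (
	"fmt"
	"io"
	"os"
)

// lastCommittedOffset stands in for the auditor's registry of file offsets.
// A file the agent has never tailed before has no entry here.
var lastCommittedOffset = map[string]int64{}

// startingOffset mirrors the behaviour described above: with
// tailFromBeginning=false, a brand-new file falls back to the end of the
// file, so any lines written before the scanner's next pass are never read.
func startingOffset(path string, tailFromBeginning bool) (offset int64, whence int) {
	if tailFromBeginning {
		return 0, io.SeekStart
	}
	if off, ok := lastCommittedOffset[path]; ok {
		return off, io.SeekStart // resume where the auditor left off
	}
	return 0, io.SeekEnd // unknown file: skip everything already written
}

func main() {
	// Path taken from our log config; substitute any file to try the sketch.
	f, err := os.Open("/srv/sites/shared/storage/logs/laravel-2018-02-20.log")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer f.Close()

	off, whence := startingOffset(f.Name(), false)
	pos, _ := f.Seek(off, whence)
	fmt.Printf("tailing %s from byte %d\n", f.Name(), pos)
}

With a daily file created at midnight, the auditor has no offset for it when the scanner first sees it, so the seek falls through to io.SeekEnd and the lines written in the first ~two minutes are skipped.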
Agent Status:
===================
Agent (v6.0.0-rc.1)
===================
Status date: 2018-02-20 12:00:21.366025 UTC
Pid: 26061
Python Version: 2.7.12
Logs:
Check Runners: 10
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: -0.000686572 s
System UTC time: 2018-02-20 12:00:21.366025 UTC
Host Info
=========
bootTime: 2017-07-13 08:18:20.000000 UTC
kernelVersion: 3.13.0-77-generic
os: linux
platform: ubuntu
platformFamily: debian
platformVersion: 16.04
procs: 163
uptime: 1.9128963e+07
virtualizationRole: host
virtualizationSystem: kvm
Hostnames
=========
hostname: hostnamehere
socket-fqdn: localhost
socket-hostname: hostnamehere
=========
Collector
=========
Running Checks
==============
cpu
---
Total Runs: 4343
Metrics: 6, Total Metrics: 26052
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
disk
----
Total Runs: 4343
Metrics: 74, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
dns_check
---------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
file_handle
-----------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
io
--
Total Runs: 4343
Metrics: 26, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
load
----
Total Runs: 4343
Metrics: 6, Total Metrics: 26058
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
memory
------
Total Runs: 4343
Metrics: 14, Total Metrics: 60802
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
network
-------
Total Runs: 4343
Metrics: 26, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
ntp
---
Total Runs: 4343
Metrics: 1, Total Metrics: 4303
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
pgbouncer
---------
Total Runs: 4343
Metrics: 48, Total Metrics: over 100K
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 4343
supervisord
-----------
Total Runs: 4343
Metrics: 14, Total Metrics: 60802
Events: 0, Total Events: 0
Service Checks: 12, Total Service Checks: 52116
uptime
------
Total Runs: 4343
Metrics: 1, Total Metrics: 4343
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
Loading Errors
==============
apm
---
Core Check Loader:
Check apm not found in Catalog
JMX Check Loader:
check is not a jmx check, or unable to determine if it's so
Python Check Loader:
No module named apm
========
JMXFetch
========
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
CheckRunsV1: 4343
IntakeV1: 331
RetryQueueSize: 0
Success: 9017
TimeseriesV1: 4343
API Keys status
===============
https://6-0-0-app.agent.datadoghq.com,*************************c4dce: API Key valid
==========
Logs-agent
==========
laravel
-------
Type: file
Path: /srv/sites/shared/storage/logs/laravel-*.log
Status: OK
Inputs: /srv/sites/shared/storage/logs/laravel-2018-02-19.log /srv/sites/shared/storage/logs/laravel-2018-02-20.log
Type: file
Path: /var/log/site-*.log
Status: OK
Inputs: /var/log/site-worker-default-00.log /var/log/site-worker-high-00.log /var/log/site-worker-low-00.log /var/log/site-worker-medium-00.log /var/log/site-export-worker-00.log /var/log/site-export-worker-01.log /var/log/site-other-worker-00.log
=========
DogStatsD
=========
Checks Metric Sample: 1.050941e+06
Event: 1
Events Flushed: 1
Number Of Flushes: 4343
Series Flushed: 818602
Service Check: 117261
Service Checks Flushed: 121577
Dogstatsd Metric Sample: 337
Thanks a lot for reporting this issue.
It does indeed seem that your theory is correct and that we have an issue with new files: we always tail from the end of the file, whereas that is not the expected behaviour once the agent is running.
Thanks again for reporting this issue and for your analysis.
Nils
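For reference, a minimal sketch of the behaviour described as expected (again with illustrative names, not the actual agent code): a file that already exists when the agent starts keeps being tailed from the end to avoid re-sending old data, while a file that appears after startup would be tailed from the beginning so its first lines are not dropped.

package main

import (
	"fmt"
	"io"
)

// seekModeFor sketches the expected behaviour: files present at agent
// startup are tailed from the end, files discovered later from the start.
func seekModeFor(discoveredAfterStartup bool) int {
	if discoveredAfterStartup {
		return io.SeekStart
	}
	return io.SeekEnd
}

func main() {
	fmt.Println(seekModeFor(false)) // existing file at agent startup -> io.SeekEnd
	fmt.Println(seekModeFor(true))  // daily file created at midnight -> io.SeekStart
}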