Logs agent appears to miss start of new files #1302

Closed
pmcatominey opened this issue Feb 20, 2018 · 2 comments

@pmcatominey

Note: Agent status output is at the bottom.

We are using the Logs agent to collect all files matching a glob pattern. A new log file is created daily with the date as part of the filename. We have noticed that the very first entries in each new log file are missing from the Datadog interface.

At the top of today's (20th February) log file are six lines, all with the timestamp [2018-02-20 00:00:00], but none of them can be found in the Logs explorer. The first entry to show up has the timestamp [2018-02-20 00:01:54]. We do not see this issue with the other log files collected from this host, which are rotated rather than written to a new file each day.

Our theory is that the scanner, which looks for new files every 10 seconds, sets up the tailer to read from the end of the file, so any records written before the scanner picks up the file are lost. In scanner.go the new tailer is set up with the tailFromBeginning argument set to false. This leads to the tailer being set up with the last committed offset held by the auditor, and the auditor returns SEEK_END. I'm not entirely sure whether either of these is in fact the cause.
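
To make the theory concrete, here is a minimal Go sketch of the decision I would expect when a tailer is set up. It is not the agent's actual code: `startPosition` and its parameters are hypothetical names, and the assumption that files already present at agent startup should still be tailed from the end is mine.

```go
package main

import (
	"fmt"
	"io"
)

// startPosition is a hypothetical helper, not part of the agent's API.
// It decides where a new tailer should begin reading a file.
//
//	committed - last offset recorded for the file (ignored if hasOffset is false)
//	hasOffset - whether an offset was ever recorded for this file
//	firstScan - true only for the first scan after the agent starts
func startPosition(committed int64, hasOffset, firstScan bool) (offset int64, whence int) {
	switch {
	case hasOffset:
		// File was tailed before: resume from the recorded offset.
		return committed, io.SeekStart
	case firstScan:
		// Assumption: a file that already exists at startup is tailed
		// from the end so old history is not re-sent.
		return 0, io.SeekEnd
	default:
		// File appeared while the agent was running: read it from the
		// beginning so lines written before the scan are not lost.
		return 0, io.SeekStart
	}
}

func main() {
	// A file discovered by a later scan, with no recorded offset.
	offset, whence := startPosition(0, false, false)
	fmt.Println("offset:", offset, "from start:", whence == io.SeekStart)
}
```

With this shape, a daily file such as laravel-2018-02-20.log that is first seen up to 10 seconds after it was created would be read from offset 0, while the behaviour we observe matches the `io.SeekEnd` branch being taken for every file without a recorded offset.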

Agent Status:

===================
Agent (v6.0.0-rc.1)
===================

  Status date: 2018-02-20 12:00:21.366025 UTC
  Pid: 26061
  Python Version: 2.7.12
  Logs:
  Check Runners: 10
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -0.000686572 s
    System UTC time: 2018-02-20 12:00:21.366025 UTC

  Host Info
  =========
    bootTime: 2017-07-13 08:18:20.000000 UTC
    kernelVersion: 3.13.0-77-generic
    os: linux
    platform: ubuntu
    platformFamily: debian
    platformVersion: 16.04
    procs: 163
    uptime: 1.9128963e+07
    virtualizationRole: host
    virtualizationSystem: kvm

  Hostnames
  =========
    hostname: hostnamehere
    socket-fqdn: localhost
    socket-hostname: hostnamehere

=========
Collector
=========

  Running Checks
  ==============
    cpu
    ---
      Total Runs: 4343
      Metrics: 6, Total Metrics: 26052
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    disk
    ----
      Total Runs: 4343
      Metrics: 74, Total Metrics: over 100K
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    dns_check
    ---------
      Total Runs: 4343
      Metrics: 1, Total Metrics: 4343
      Events: 0, Total Events: 0
      Service Checks: 1, Total Service Checks: 4343

    file_handle
    -----------
      Total Runs: 4343
      Metrics: 1, Total Metrics: 4343
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    io
    --
      Total Runs: 4343
      Metrics: 26, Total Metrics: over 100K
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    load
    ----
      Total Runs: 4343
      Metrics: 6, Total Metrics: 26058
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    memory
    ------
      Total Runs: 4343
      Metrics: 14, Total Metrics: 60802
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    network
    -------
      Total Runs: 4343
      Metrics: 26, Total Metrics: over 100K
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

    ntp
    ---
      Total Runs: 4343
      Metrics: 1, Total Metrics: 4303
      Events: 0, Total Events: 0
      Service Checks: 1, Total Service Checks: 4343

    pgbouncer
    ---------
      Total Runs: 4343
      Metrics: 48, Total Metrics: over 100K
      Events: 0, Total Events: 0
      Service Checks: 1, Total Service Checks: 4343

    supervisord
    -----------
      Total Runs: 4343
      Metrics: 14, Total Metrics: 60802
      Events: 0, Total Events: 0
      Service Checks: 12, Total Service Checks: 52116

    uptime
    ------
      Total Runs: 4343
      Metrics: 1, Total Metrics: 4343
      Events: 0, Total Events: 0
      Service Checks: 0, Total Service Checks: 0

  Loading Errors
  ==============
    apm
    ---
      Core Check Loader:
        Check apm not found in Catalog

      JMX Check Loader:
        check is not a jmx check, or unable to determine if it's so

      Python Check Loader:
        No module named apm

========
JMXFetch
========

  Initialized checks
  ==================
    no checks

  Failed checks
  =============
    no checks

=========
Forwarder
=========

  CheckRunsV1: 4343
  IntakeV1: 331
  RetryQueueSize: 0
  Success: 9017
  TimeseriesV1: 4343

  API Keys status
  ===============
    https://6-0-0-app.agent.datadoghq.com,*************************c4dce: API Key valid

==========
Logs-agent
==========

  laravel
  -------
    Type: file
    Path: /srv/sites/shared/storage/logs/laravel-*.log
    Status: OK
    Inputs: /srv/sites/shared/storage/logs/laravel-2018-02-19.log /srv/sites/shared/storage/logs/laravel-2018-02-20.log

    Type: file
    Path: /var/log/site-*.log
    Status: OK
    Inputs: /var/log/site-worker-default-00.log /var/log/site-worker-high-00.log /var/log/site-worker-low-00.log /var/log/site-worker-medium-00.log /var/log/site-export-worker-00.log /var/log/site-export-worker-01.log /var/log/site-other-worker-00.log

=========
DogStatsD
=========

  Checks Metric Sample: 1.050941e+06
  Event: 1
  Events Flushed: 1
  Number Of Flushes: 4343
  Series Flushed: 818602
  Service Check: 117261
  Service Checks Flushed: 121577
  Dogstatsd Metric Sample: 337
@NBParis

NBParis commented Feb 20, 2018

Hello @pmcatominey.

Thanks a lot for reporting this issue.
It does seem that your theory is correct: we have an issue with new files, because we always tail from the end of a file, which is not the expected behaviour once the agent is running.

Thanks again for reporting this issue and for your analysis.
Nils

@ajacquemot
Contributor

Hello @pmcatominey,

Thanks for sharing, we fixed the issue here.
Alexandre
