Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cannot receive logs from docker #17747

Closed
aido93 opened this issue Jun 23, 2023 · 7 comments · Fixed by #18649
Closed

Cannot receive logs from docker #17747

aido93 opened this issue Jun 23, 2023 · 7 comments · Fixed by #18649
Labels
source: docker_logs Anything `docker_logs` source related type: bug A code related bug.

Comments

@aido93
Copy link

aido93 commented Jun 23, 2023

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

I'm trying to collect logs from docker containers, but I'm getting errors like:

ERROR source{component_kind="source" component_id=docker_logs component_type=docker_logs component_name=docker_logs}: vector::internal_events::docker_logs: Received out of order log message. error_type="condition_failed" stage="receiving" container_id="795327b7a459bb1bffb30d5e287c726c27c7d6282cddf3d1eb2a25abae1e83e5" timestamp="2023-06-23T21:03:12.255522961Z" internal_log_rate_limit=true
ERROR source{component_kind="source" component_id=docker_logs component_type=docker_logs component_name=docker_logs}: vector::internal_events::docker_logs: Internal log [Received out of order log message.] is being suppressed to avoid flooding.

I expect that docker logs can be collected successfully.

Configuration

[sources.docker_logs]
type = "docker_logs"
auto_partial_merge=false
exclude_containers = [
  "vector",
  "openobserve"
]

Version

0.30.0

Debug Output

https://gist.github.com/aido93/2aeb512d2e77c8796bda08d773967514

Example Data

You can run some services in docker and see what's happening. Here is example docker-compose.yml that runs openobserve behind traefik. So when openobserver webpage is reloaded traefik writes to accesslog. I expect to see this accesslog in vector at first and in openobserve afterwards.

version: '3.7'

services:
  traefik:
    image: traefik:v2.10
    container_name: traefik
    command:
      - "--log.level=DEBUG"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--api=true"
      - "--api.insecure=true"
      - "--accesslog=true"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=Host(`proxy.corp.com`)"
      - "traefik.http.routers.api.service=api@internal"
      - "traefik.http.services.traefik.loadbalancer.server.port=8080"
    network_mode: "host"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - openobserve
  openobserve:
    image: public.ecr.aws/zinclabs/openobserve:latest
    container_name: openobserve
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.openobserve.rule=Host(`logs.corp.com`)"
      - "traefik.http.routers.openobserve.entrypoints=web"
      - "traefik.http.services.openobserve.loadbalancer.server.port=5080"
    environment:
      ZO_ROOT_USER_EMAIL: "root@corp.com"
      ZO_ROOT_USER_PASSWORD: "Complexpass#123"
      ZO_LOCAL_MODE: "true"
      ZO_LOCAL_MODE_STORAGE: disk
    network_mode: "host"
    volumes:
      - openobserve-data:/data

  vector:
    image: timberio/vector:0.30.0-alpine
    container_name: vector
    command: -vvv
    environment:
      RUST_BACKTRACE: full
    volumes:
      - ./vector.toml:/etc/vector/vector.toml:ro
      - /var/run/docker.sock:/var/run/docker.sock
    network_mode: "host"
    depends_on:
      - openobserve

volumes:
  openobserve-data:

Additional Context

Error message is from src/sources/docker_logs/mod.rs:1013

References

No response

@aido93 aido93 added the type: bug A code related bug. label Jun 23, 2023
@dsmith3197
Copy link
Contributor

Hi @aido93,

Thank you for providing all this context!

As you have encountered, the Docker source intentionally drops logs that are received with out of order timestamps and emits the above error message when that happens. This also includes logs with a timestamp that occurred before Vector started. If you are seeing these error logs only shortly after Vector starts up, then this is likely the case.

Now, that still doesn't explain why you are not seeing the traefik access logs in openobserve. To help you with that, could you share the full Vector configuration you are using, including the sink? I took a look at the debug output you linked and Vector behaved as expected without any errors. It is consuming logs from the /traefik docker container and forwarding them to http://localhost:5080/api/corp/tcl/_json.

@dsmith3197 dsmith3197 added the source: docker_logs Anything `docker_logs` source related label Jul 10, 2023
@usermakename
Copy link

hello, i'v too having this problems, can you help please?
it's appease in logs not just after start, 1h-10h too

2023-09-22T14:32:40.493954Z ERROR source{component_kind="source" component_id=docker_local component_type=docker_logs component_name=docker_local}: vector::internal_events::docker_logs: Received out of order log message. error_type="condition_failed" stage="receiving" container_id="6c2b916a5fec6c01fde277bfa9395c17382720a26dcfaa3be710dc9c3fba78e7" timestamp="2023-09-22T14:32:39.038608453Z" internal_log_rate_limit=true
2023-09-22T14:32:40.494084Z ERROR source{component_kind="source" component_id=docker_local component_type=docker_logs component_name=docker_local}: vector::internal_events::docker_logs: Internal log [Received out of order log message.] is being suppressed to avoid flooding.

vector.toml:

#                                    __   __  __
#                                    \ \ / / / /
#                                     \ V / / /
#                                      \_/  \/
#
#                                    V E C T O R
#                                   Configuration
#
# ------------------------------------------------------------------------------
# Website: https://vector.dev
# Docs: https://vector.dev/docs
# Chat: https://chat.vector.dev
# ------------------------------------------------------------------------------

data_dir = "/etc/vector"

[api]
enabled = true
address = "10.128.0.17:8686"
playground = false # web interface for testing

[sources.internal_metrics]
  type = "internal_metrics"

[sinks.vector_metrics]
  type = "prometheus_exporter"
  inputs = [ "internal_metrics" ]
  address = "10.128.0.17:9598"

# Read vector logs

[sources.internal]
type = "internal_logs"
pid_key = "pid"

# Read logs from Docker API and Send to loki sink

[sources.docker_local]
  type = "docker_logs"
  docker_host = "/var/run/docker.sock"
  exclude_containers = []
  auto_partial_merge=false
  # Identify zero-width space as first line of a multiline block.
  multiline.condition_pattern = '^\x{200B}' # required
  multiline.mode = "halt_before" # required
  multiline.start_pattern = '^\x{200B}' # required
  multiline.timeout_ms = 1000 # required, milliseconds
  retry_backoff_secs = 10

@dsmith3197
Copy link
Contributor

I took a deeper look at the source code and this is not an actual error. I've opened #18649 that explains more and removes the error.

@ivantopo
Copy link

Hey @dsmith3197, just to confirm the expected behavior:

The docker_logs source will drop all log events with a timestamp before vector's init timestamp, meaning that if we temporarily stop or restart the vector service then all docker_logs generated while vector was down will be lost?

@dsmith3197
Copy link
Contributor

Hey @dsmith3197, just to confirm the expected behavior:

The docker_logs source will drop all log events with a timestamp before vector's init timestamp, meaning that if we temporarily stop or restart the vector service then all docker_logs generated while vector was down will be lost?

Correct, that is the behavior today.

@cdepillabout
Copy link

if we temporarily stop or restart the vector service then all docker_logs generated while vector was down will be lost

As a follow-up to this, is there some way to tell Vector to process all the Docker logs it can find that it hasn't already processed, even if those logs were created before Vector was started?

@dsmith3197
Copy link
Contributor

if we temporarily stop or restart the vector service then all docker_logs generated while vector was down will be lost

As a follow-up to this, is there some way to tell Vector to process all the Docker logs it can find that it hasn't already processed, even if those logs were created before Vector was started?

Not as of today. #7358 is an open feature request for checkpointing support.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
source: docker_logs Anything `docker_logs` source related type: bug A code related bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants