
[processor/resourcedetection] system detector sets host.id to an empty value on containerized setups #24230

Closed
mx-psi opened this issue Jul 12, 2023 · 7 comments
Labels
bug (Something isn't working) · priority:p2 (Medium) · processor/resourcedetection (Resource detection processor) · Stale

Comments

mx-psi (Member) commented Jul 12, 2023

Component(s)

processor/resourcedetection

What happened?

Description

The system detector sets the host.id resource attribute to an empty string when running in containerized setups.

Steps to Reproduce

Run the Collector contrib Docker image with the configuration provided below. This is also reproducible with custom builds that use certain base images (e.g. alpine:3.16).

Expected Result

The detector works in accordance with the specification, which states (emphasis mine):

Unique host ID. For Cloud, this must be the instance_id assigned by the cloud provider. For non-containerized systems, this should be the machine-id. See the table below for the sources to use to determine the machine-id based on operating system.

AIUI, an empty string is not valid since it is not unique, and in containerized environments this should not be the machine-id, since a container is not 'really' a host.

My expectation would be that in containerized environments the host.id resource attribute is either not set at all or set to a value that persists across container restarts.

Actual Result

On v0.80.0, host.id is set to an empty string, which is not a "unique host ID". From v0.72.0 up to v0.79.0, host.id was set to a random UUID that varied on each container restart.

Collector version

v0.80.0

Environment information

Environment

Running the docker image otel/opentelemetry-collector-contrib:0.80.0 with the configuration provided below reproduces this.

OpenTelemetry Collector configuration

receivers:
  # Put a dummy receiver just to generate some metrics
  hostmetrics:
    collection_interval: 10s
    scrapers:
      load:

processors:
  resourcedetection:
    detectors: [system]

exporters:
  logging:
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [hostmetrics]
      processors: [resourcedetection]
      exporters: [logging]
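
For reference, one way to run this (my own sketch, not part of the original report: it assumes the configuration above is saved as config.yaml and mounted over the contrib image's default config path, /etc/otelcol-contrib/config.yaml):

docker run --rm \
  -v "$(pwd)/config.yaml:/etc/otelcol-contrib/config.yaml" \
  otel/opentelemetry-collector-contrib:0.80.0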

Log output

2023-07-12T10:15:03.905Z	info	service/telemetry.go:81	Setting up own telemetry...
2023-07-12T10:15:03.907Z	info	service/telemetry.go:104	Serving Prometheus metrics	{"address": ":8888", "level": "Basic"}
2023-07-12T10:15:03.907Z	info	exporter@v0.80.0/exporter.go:275	Development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-07-12T10:15:03.910Z	info	service/service.go:131	Starting otelcol-contrib...	{"Version": "0.80.0", "NumCPU": 20}
2023-07-12T10:15:03.910Z	info	extensions/extensions.go:30	Starting extensions...
2023-07-12T10:15:03.910Z	info	internal/resourcedetection.go:125	began detecting resource information	{"kind": "processor", "name": "resourcedetection", "pipeline": "metrics"}
2023-07-12T10:15:03.912Z	info	internal/resourcedetection.go:139	detected resource information	{"kind": "processor", "name": "resourcedetection", "pipeline": "metrics", "resource": {"host.id":"","host.name":"d5ad29786527","os.type":"linux"}}

Additional context

Prior to v0.80.0 (specifically, between v0.72.0, which included #18618, and #18740), this would generate a random UUID on each container restart. An example run of the same configuration on v0.79.0:

2023-07-07T11:04:20.764Z	info	service/telemetry.go:104	Setting up own telemetry...
2023-07-07T11:04:20.764Z	info	service/telemetry.go:127	Serving Prometheus metrics	{"address": ":8888", "level": "Basic"}
2023-07-07T11:04:20.764Z	info	exporter@v0.79.0/exporter.go:275	Development component. May change in the future.	{"kind": "exporter", "data_type": "metrics", "name": "logging"}
2023-07-07T11:04:20.766Z	info	service/service.go:131	Starting otelcol-contrib...	{"Version": "0.79.0", "NumCPU": 20}
2023-07-07T11:04:20.766Z	info	extensions/extensions.go:30	Starting extensions...
2023-07-07T11:04:20.766Z	info	internal/resourcedetection.go:125	began detecting resource information	{"kind": "processor", "name": "resourcedetection", "pipeline": "metrics"}
2023-07-07T11:04:20.766Z	info	internal/resourcedetection.go:139	detected resource information	{"kind": "processor", "name": "resourcedetection", "pipeline": "metrics", "resource": {"host.id":"d8aeac44-921a-449b-a1d9-aa1621c50d3d","host.name":"43a4683d77f0","os.type":"linux"}}

This was reported in #18618 (comment) and happens because this code path is taken in the dependency that #18618 uses to perform the detection.

mx-psi added the bug (Something isn't working) and priority:p2 (Medium) labels on Jul 12, 2023
github-actions bot added the processor/resourcedetection (Resource detection processor) label on Jul 12, 2023
github-actions bot (Contributor) commented:

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

mx-psi (Member, Author) commented Jul 12, 2023

cc @sumo-drosiek, any ideas on what to do here? What behavior did you intend for containerized setups?

sumo-drosiek (Member) commented:
First of all, I think an empty host.id should be considered invalid behavior, and we shouldn't return host.id in that case. I need to do some research on containerized setups to see whether there is a way to get a unique ID from within a container. A random value is not valid in this case.
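
A minimal Go sketch of that kind of guard (the helper names are hypothetical illustrations, not the detector's actual code): treat an empty machine-id as "unavailable" so host.id is left unset rather than emitted as an empty string.

package main

import (
	"errors"
	"fmt"
)

// readMachineID is a hypothetical stand-in for the platform lookup
// (e.g. reading /etc/machine-id), which can yield an empty string in containers.
func readMachineID() (string, error) {
	return "", nil
}

// resolveHostID treats an empty ID as "unavailable" so the caller can
// leave host.id unset instead of setting it to an empty value.
func resolveHostID() (string, error) {
	id, err := readMachineID()
	if err != nil {
		return "", err
	}
	if id == "" {
		return "", errors.New("host id is empty, leaving host.id unset")
	}
	return id, nil
}

func main() {
	if id, err := resolveHostID(); err != nil {
		fmt.Println("host.id not set:", err)
	} else {
		fmt.Println("host.id =", id)
	}
}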

mx-psi added a commit that referenced this issue on Jul 14, 2023: …24239)

**Description:**

Do not return empty host ID

**Link to tracking Issue:** #24230

**Testing:** Unit tests

**Documentation:** N/A

---------

Signed-off-by: Dominik Rosiek <drosiek@sumologic.com>
Co-authored-by: Pablo Baeyens <pbaeyens31+github@gmail.com>
mx-psi (Member, Author) commented Jul 14, 2023

Also reported upstream: open-telemetry/opentelemetry-go/issues/4312

mwear (Member) commented Jul 14, 2023

I have a PR upstream with a fix, open-telemetry/opentelemetry-go#4317, but I am uncertain whether it is the fix we want. The description on the PR lays out the problem and some possible alternative solutions. Let me know if anyone has thoughts or opinions.

github-actions bot (Contributor) commented:
This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

github-actions bot added the Stale label on Sep 13, 2023
mx-psi (Member, Author) commented Sep 13, 2023

We can close this; it was fixed upstream, and we have a safeguard in the Collector as well.
