node_textfile_scrape_error not reported when failing to read metrics due to inconsistent HELP texts #2317

Closed
pieter-lautus opened this issue Mar 15, 2022 · 5 comments · Fixed by #2962

Comments

@pieter-lautus

Host operating system:

Linux tau-gfa-uat 5.4.0-62-generic #70-Ubuntu SMP Tue Jan 12 12:45:47 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

node_exporter version:

node_exporter, version 1.3.1 (branch: HEAD, revision: a2321e7)
build user: root@243aafa5525c
build date: 20211205-11:09:49
go version: go1.17.3
platform: linux/amd64

node_exporter command line flags

/usr/local/bin/node_exporter \
  --collector.disable-defaults \
  --collector.cpu \
  --collector.cpufreq \
  --collector.diskstats \
  --collector.edac \
  --collector.filefd \
  --collector.filesystem \
  --collector.filesystem.fs-types-exclude \
  '^(autofs|binfmt_misc|bpf|cgroup2?|configfs|debugfs|devpts|devtmpfs|tmpfs|fusectl|fuse.*|hugetlbfs|mqueue|nsfs|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|selinuxfs|squashfs|sysfs|tracefs)$' \
  --collector.hwmon \
  --collector.loadavg \
  --collector.mdadm \
  --collector.meminfo \
  --collector.netdev \
  --collector.netstat \
  --collector.pressure \
  --collector.processes \
  --collector.schedstat \
  --collector.sockstat \
  --collector.stat \
  --collector.textfile \
  --collector.textfile.directory \
  /var/lib/node_exporter_textfile \
  --collector.time \
  --collector.timex \
  --collector.vmstat \
  --web.disable-exporter-metrics \
  --web.listen-address \
  :9100

Are you running node_exporter in Docker?

No

What did you do that produced an error?

For technical reasons, different parts of our system write separate files for the same metric, each with different labels. Due to code divergence, the files ended up with inconsistent HELP messages.

For example:

# cat tau_infrastructure_performing_maintenance_task_foo.prom 
# HELP tau_infrastructure_performing_maintenance_task The server is performing some long-running maintenance task
# TYPE tau_infrastructure_performing_maintenance_task gauge
tau_infrastructure_performing_maintenance_task{main_task="deployment", sub_task="deployment_ansible", start_or_stop="start"} 1645624007.0

and

# cat tau_infrastructure_performing_maintenance_task_bar.prom 
# HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
# TYPE tau_infrastructure_performing_maintenance_task gauge
tau_infrastructure_performing_maintenance_task{main_task="nightly",sub_task="main",start_or_stop="start"} 1647280801.98446
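A mismatch like the one between these two files can be caught before node_exporter ever reads them. The following is a minimal sketch of such a pre-flight check, not part of node_exporter: the function name `find_help_conflicts` and the idea of passing file contents as a dict are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches "# HELP <metric> <text>"; the help text itself is optional.
HELP_RE = re.compile(r"^#\s*HELP\s+(\S+)(?:\s+(.*))?$")


def find_help_conflicts(files):
    """Return {metric: {help text: [file names]}} for metrics whose HELP
    text differs between files. `files` maps a file name to its content
    (hypothetical input shape, chosen to keep the sketch self-contained)."""
    seen = defaultdict(lambda: defaultdict(list))
    for name, content in files.items():
        for line in content.splitlines():
            match = HELP_RE.match(line.strip())
            if match:
                metric = match.group(1)
                help_text = (match.group(2) or "").strip()
                seen[metric][help_text].append(name)
    # Keep only metrics that were declared with more than one HELP text.
    return {m: dict(texts) for m, texts in seen.items() if len(texts) > 1}


if __name__ == "__main__":
    example = {
        "tau_infrastructure_performing_maintenance_task_foo.prom": (
            "# HELP tau_infrastructure_performing_maintenance_task "
            "The server is performing some long-running maintenance task\n"
            "# TYPE tau_infrastructure_performing_maintenance_task gauge\n"
        ),
        "tau_infrastructure_performing_maintenance_task_bar.prom": (
            "# HELP tau_infrastructure_performing_maintenance_task "
            "At what timestamp a given task started or stopped, "
            "the last time it was run.\n"
            "# TYPE tau_infrastructure_performing_maintenance_task gauge\n"
        ),
    }
    for metric, texts in find_help_conflicts(example).items():
        print(metric, "has", len(texts), "different HELP texts")
```

Running a check like this from the scripts that write the `.prom` files would flag the divergence independently of what the exporter reports.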

What did you expect to see?

I would expect either:

  1. To get all metrics, with node_exporter arbitrarily choosing one of the two HELP texts where there is a difference, or,
  2. if some metrics are dropped, for node_textfile_scrape_error to return non-zero for this job

What did you see instead?

node_exporter logs an error about the situation and not all of the metrics are scraped by Prometheus, yet node_textfile_scrape_error remains zero even though a problem is preventing metrics from being exported.

Mar 15 10:43:25 REDACTED node_exporter[733184]: ts=2022-03-15T08:43:25.153Z caller=stdlib.go:105 level=error msg="error gathering metrics: 4 error(s) occurred:\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"start\" > label:<name:\"sub_task\" value:\"main\" > gauge:<value:1.64728080198446e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"stop\" > label:<name:\"sub_task\" value:\"main\" > gauge:<value:1.64728240041946e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"start\" > label:<name:\"sub_task\" value:\"reporting\" > gauge:<value:1.64728080229161e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\"\n* [from Gatherer #2] collected metric tau_infrastructure_performing_maintenance_task label:<name:\"main_task\" value:\"nightly\" > label:<name:\"start_or_stop\" value:\"stop\" > label:<name:\"sub_task\" value:\"reporting\" > gauge:<value:1.64728239993993e+09 >  has help \"At what timestamp a given task started or stopped, the last time it was run.\" but should have \"The server is performing some long-running maintenance task\""
@pieter-lautus
Author

The impact of this bug is that a large part of our monitoring was "flying blind" without us knowing, despite us having taken care to monitor node_textfile_scrape_error to warn us about such situations.

@equinox0815

We have a similar issue. In our case a bug in one of our text file collector scripts introduces duplicate entries on some hosts. The log of the node_exporter is full of messages like this:

level=error ts=2022-03-21T10:55:22.451Z caller=stdlib.go:105 msg="error gathering metrics: 5 error(s) occurred:\n* [from Gatherer #2] collected metric \"smartmon_device_smart_available\" { label:<name:\"device\" value:\"/dev/bus/0\" > label:<name:\"disk\" value:\"0\" > gauge:<value:1 > } was collected before with the same name and label values\n* [from Gatherer #2] collected metric \"smartmon_device_smart_enabled\" { label:<name:\"device\" value:\"/dev/bus/0\" > label:<name:\"disk\" value:\"0\" > gauge:<value:1 > } was collected before with the same name and label values\n* [from Gatherer #2] collected metric \"smartmon_device_smart_healthy\" { label:<name:\"device\" value:\"/dev/bus/0\" > label:<name:\"disk\" value:\"0\" > gauge:<value:1 > } was collected before with the same name and label values\n* [from Gatherer #2] collected metric \"smartmon_smartctl_run\" { label:<name:\"device\" value:\"/dev/bus/0\" > label:<name:\"disk\" value:\"0\" > gauge:<value:1.647860106e+09 > } was collected before with the same name and label values\n* [from Gatherer #2] collected metric \"smartmon_device_active\" { label:<name:\"device\" value:\"/dev/bus/0\" > label:<name:\"disk\" value:\"0\" > gauge:<value:1 > } was collected before with the same name and label values"

Despite the log clearly stating that there have been errors while gathering metrics, node_textfile_scrape_error remains at 0.
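Duplicate series of this kind (same metric name and same label values) can also be detected in the script output itself. Below is a rough sketch under the assumption that the text uses the plain Prometheus exposition format; the helper names `series_key` and `find_duplicate_series` are made up for this example, and the parser is deliberately simplified rather than a full implementation of the format.

```python
import re
from collections import Counter

# Simplified patterns for "name{labels} value" sample lines.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)\s*(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"')


def series_key(line):
    """Return (metric name, sorted label pairs) for a sample line, or None."""
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, _value = match.groups()
    # Sort the labels so that their order in the file does not matter.
    labels = tuple(sorted(LABEL_RE.findall(raw_labels or "")))
    return name, labels


def find_duplicate_series(text):
    """List every series identity that appears more than once in `text`."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        key = series_key(line)
        if key is not None:
            counts[key] += 1
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    sample = (
        'smartmon_device_smart_available{device="/dev/bus/0",disk="0"} 1\n'
        'smartmon_device_smart_available{disk="0",device="/dev/bus/0"} 1\n'
    )
    for name, labels in find_duplicate_series(sample):
        print("duplicate series:", name, dict(labels))
```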

@discordianfish
Member

Yes, we should increase node_textfile_scrape_error in this case.

@rgroothuijsen

This looks like a tricky one to solve. For the most part, scraping and metric validation -- including the error message above -- is abstracted into the client_golang module, which has no awareness of anything happening in node_exporter. The latter can only detect errors such as file reading failures, not whether the data that was gathered consisted of valid metrics.

@discordianfish
Member

@rgroothuijsen Good point. We'd either need to wrap the registry in our own abstraction that does these checks or add an error counter to client_golang for such cases. The latter would require some discussion on the prometheus-dev mailing list, I guess.
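To make the first option concrete, here is a language-neutral model of the wrapping idea, written in Python for brevity. It is only a sketch: it does not use client_golang, the `CountingGatherer` class and the `gather_errors_total` metric name are hypothetical, and the inner gatherer is assumed to return a pair of (metric lines, error messages).

```python
class CountingGatherer:
    """Wraps a gather callable and counts the errors it reports.

    `inner` is assumed to be a zero-argument callable returning
    (list of metric lines, list of error messages). This mirrors the idea
    of wrapping the registry, not any real client_golang interface.
    """

    def __init__(self, inner):
        self._inner = inner
        self.errors_total = 0

    def gather(self):
        lines, errors = self._inner()
        # Every validation error from the inner gatherer bumps the counter,
        # so the problem becomes visible in the exposed metrics themselves.
        self.errors_total += len(errors)
        exposed = list(lines) + [f"gather_errors_total {self.errors_total}"]
        return exposed, errors


if __name__ == "__main__":
    def flaky_gather():
        # Stand-in for a registry that rejects one inconsistent metric.
        return ["some_metric 1"], ["inconsistent help text"]

    wrapped = CountingGatherer(flaky_gather)
    lines, errors = wrapped.gather()
    print("\n".join(lines))
```

The point of the wrapper is that validation failures which previously only reached the log would also be reflected in a scrapeable value.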

rexagod added a commit to rexagod/node_exporter that referenced this issue Mar 19, 2024
Avoid metrics with inconsistent help-texts. The earlier behaviour has
been preserved in the sense that the first encountered instance is still
used to generate metrics, whereas the subsequent inconsistent ones are
ignored along with a few peripheral changes.

```
 # HELP node_scrape_collector_duration_seconds node_exporter: Duration of a collector scrape.
 # TYPE node_scrape_collector_duration_seconds gauge
 node_scrape_collector_duration_seconds{collector="textfile"} 0.0004005
 # HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
 # TYPE node_scrape_collector_success gauge
 node_scrape_collector_success{collector="textfile"} 1
 # HELP node_textfile_mtime_seconds Unixtime mtime of textfiles successfully read.
 # TYPE node_textfile_mtime_seconds gauge
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-bar.prom"} 1.710812009e+09
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-foo.prom"} 1.710811982e+09
 # HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
 # TYPE node_textfile_scrape_error gauge
 node_textfile_scrape_error 1
 # HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
 # TYPE promhttp_metric_handler_errors_total counter
 promhttp_metric_handler_errors_total{cause="encoding"} 0
 promhttp_metric_handler_errors_total{cause="gathering"} 0
 # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
 # TYPE promhttp_metric_handler_requests_in_flight gauge
 promhttp_metric_handler_requests_in_flight 1
 # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
 # TYPE promhttp_metric_handler_requests_total counter
 promhttp_metric_handler_requests_total{code="200"} 0
 promhttp_metric_handler_requests_total{code="500"} 0
 promhttp_metric_handler_requests_total{code="503"} 0
 # HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
 # TYPE tau_infrastructure_performing_maintenance_task gauge
 tau_infrastructure_performing_maintenance_task{main_task="nightly",start_or_stop="start",sub_task="main"} 1.64728080198446e+09
```

Fixes: prometheus#2317
Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
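The behaviour described in this commit message (the first HELP text seen for a metric wins, later inconsistent ones are dropped, and the error gauge is set) can be modelled roughly as follows. This is an illustrative sketch based only on the description above, not the actual Go change; the function `merge_families` and its tuple-based input are assumptions.

```python
def merge_families(families):
    """Merge metric families read from several textfiles.

    `families` is a list of (file name, metric name, help text, samples)
    tuples in the order the files were read (hypothetical input shape).
    Returns (kept samples, scrape_error), where scrape_error is 1 if any
    family was dropped because its HELP text disagreed with the first one.
    """
    first_help = {}
    kept = []
    scrape_error = 0
    for _file_name, metric, help_text, samples in families:
        # Remember the first HELP text seen for each metric name.
        expected = first_help.setdefault(metric, help_text)
        if help_text != expected:
            # A later, inconsistent family is ignored and flagged.
            scrape_error = 1
            continue
        kept.extend(samples)
    return kept, scrape_error


if __name__ == "__main__":
    name = "tau_infrastructure_performing_maintenance_task"
    families = [
        ("ne-bar.prom", name,
         "At what timestamp a given task started or stopped, "
         "the last time it was run.",
         [name + '{main_task="nightly",start_or_stop="start",sub_task="main"} '
          "1.64728080198446e+09"]),
        ("ne-foo.prom", name,
         "The server is performing some long-running maintenance task",
         [name + '{main_task="deployment",start_or_stop="start",'
          'sub_task="deployment_ansible"} 1645624007.0']),
    ]
    kept, scrape_error = merge_families(families)
    print("\n".join(kept))
    print("node_textfile_scrape_error", scrape_error)
```

With the two example files read in this order, only the series from the first file survives and the error gauge is 1, which matches the shape of the output quoted in the commit message.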
SuperQ pushed a commit that referenced this issue Mar 24, 2024
gitperr pushed a commit to gitperr/node_exporter that referenced this issue Apr 30, 2024
gitperr pushed a commit to gitperr/node_exporter that referenced this issue Apr 30, 2024
Signed-off-by: David O'Rourke <david.orourke@gmail.com>

chore:remove constant from function (prometheus#2884)

Signed-off-by: tyltr <tylitianrui@126.com>

build(deps): bump github.com/jsimonetti/rtnetlink from 1.4.0 to 1.4.1 (prometheus#2909)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.4.0 to 1.4.1.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](jsimonetti/rtnetlink@v1.4.0...v1.4.1)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

fix hwmon nil ptr (prometheus#2873)

* fix hwmon nil ptr

syslink may be lost in some cases.

---------

Signed-off-by: TaoGe <6657718+yowenter@users.noreply.github.com>

Fix hwmon error capture (prometheus#2915)

Fix golangci-lint "ineffectual assignment" by correctly capturing any
errors within the hwmon gathering loop.

Signed-off-by: Ben Kochie <superq@gmail.com>

Update common Prometheus files (prometheus#2917)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Revert "Add ZFS freebsd per dataset stats (prometheus#2753)" (prometheus#2925)

This reverts commit f34aaa6.

Signed-off-by: Caleb Webber <caleb@codingthemsoftly.com>

filesystem: fix mountTimeout not working issue (prometheus#2903)

Signed-off-by: DongWei <jiangxuege@hotmail.com>

Fix description for NodeDiskIOSaturation alert (prometheus#2929)

NodeDiskIOSaturation description should say 30m per the "for" clause

Signed-off-by: Taylor Sly <slyt@users.noreply.github.com>

Enforce no subprocess policy (prometheus#2926)

Add depguard to golangci-lint to enforce the no-os/exec policy.

Signed-off-by: Ben Kochie <superq@gmail.com>

filesystem: surface device errors (prometheus#2923)

filesystem: surface filesystem device error

Fixes: prometheus#2918
---------

Signed-off-by: Pamela Mei i540369 <pamela.mei@sap.com>

Revert "filesystem: fix mountTimeout not working issue (prometheus#2903)" (prometheus#2932)

This reverts commit 9f1f791.

Signed-off-by: Ben Kochie <superq@gmail.com>

Update common Prometheus files (prometheus#2939)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update common Prometheus files (prometheus#2946)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update common Prometheus files (prometheus#2949)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Add multi-cluster support for Nodes dashboard (prometheus#2945)

Signed-off-by: Adrian Berger <adria.berger94@gmail.com>

disable selinux,fix end-to-end-test.sh error(prometheus#2934) (prometheus#2937)

Signed-off-by: heyitao <heyitao@uniontech.com>
Co-authored-by: heyitao <heyitao@uniontech.com>

Add new collector and metrics for watchdog (prometheus#2309) (prometheus#2880)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

Enable watchdog module by default; Add no data error (prometheus#2953)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

Update common Prometheus files (prometheus#2954)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

build(deps): bump google.golang.org/protobuf from 1.32.0 to 1.33.0 (prometheus#2955)

Bumps google.golang.org/protobuf from 1.32.0 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Update common Prometheus files (prometheus#2959)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Sanitize ethtool metric name keys

Apply the same metric name sanitization to the keys as to the metric
names. This avoids conflicting help strings in the metric registry.

Fixes: prometheus#2893

Signed-off-by: Ben Kochie <superq@gmail.com>

Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

chore: fix some typos (prometheus#2974)

Signed-off-by: occupyhabit <wangmengjiao@outlook.com>

collector/textfile: Avoid inconsistent help-texts (prometheus#2962)

Fixes: prometheus#2317

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>

Update common Prometheus files (prometheus#2973)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

zfs: Log mib when sysctl read fails on FreeBSD

When the zfs collector fails on FreeBSD it doesn't log which `mib` triggered the issue. This makes diagnostics hard.

Incompatibilities in the list of supported mibs are not uncommon with major OS updates. With this change, it will be easier for users to report the specific mib that is triggering the failure.
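
As a rough illustration of the idea (not the collector's actual code), the failing mib can be carried in the error itself so that whatever logs the error also names the mib. The `readAll` function, the `errSysctl` value and the fake reader below are hypothetical, and the sketch wraps the error rather than calling a logger:

```go
package main

import (
	"errors"
	"fmt"
)

// errSysctl is a hypothetical error standing in for a failed sysctl read.
var errSysctl = errors.New("sysctl read failed")

// readAll reads every mib with the supplied reader. On failure it returns
// an error that names the mib that triggered it.
func readAll(mibs []string, read func(string) (uint64, error)) (map[string]uint64, error) {
	out := make(map[string]uint64, len(mibs))
	for _, mib := range mibs {
		v, err := read(mib)
		if err != nil {
			return nil, fmt.Errorf("reading sysctl mib %q: %w", mib, err)
		}
		out[mib] = v
	}
	return out, nil
}

func main() {
	// The mib names here are examples only.
	mibs := []string{"example.zfs.hits", "example.zfs.unsupported"}
	_, err := readAll(mibs, func(mib string) (uint64, error) {
		if mib == "example.zfs.unsupported" {
			return 0, errSysctl
		}
		return 1, nil
	})
	fmt.Println(err)
}
```

Because the error is wrapped with `%w`, callers can still match the underlying cause with `errors.Is`.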

Related to prometheus#2847

Signed-off-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com>

chore: fix typo in comment

Signed-off-by: looklose <shishuaiqun@yeah.net>

fibre_channel: update procfs to take into account optional attributes (prometheus#2933)

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

refactor: Optimize code by using built-in constants in the standard library (prometheus#2989)

Signed-off-by: coderwander <770732124@qq.com>

os_release.go: Removed caching of modtime/filename of os-release file. (prometheus#2987)

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

fix: data race of NetClassCollector metrics initialization when multiple requests happen (prometheus#2995)

Signed-off-by: John Guo <john@johng.cn>

Update common Prometheus files (prometheus#2992)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

Update build (prometheus#3000)

* Update Go to 1.22.
* Update Go modules.
* Use new version collector.
* Use standard library slices package.

Signed-off-by: Ben Kochie <superq@gmail.com>

Fix watchdog_test lint and test failures on macos. (prometheus#3003)

Ensure identical build flags embedded in both files.

Signed-off-by: Chris Cleeland <chris.cleeland@gmail.com>

Release v1.8.0 (prometheus#3002)

* [CHANGE] exec_bsd: Fix labels for `vm.stats.sys.v_syscall` sysctl prometheus#2895
* [CHANGE] diskstats: Ignore zram devices on linux systems prometheus#2898
* [CHANGE] textfile: Avoid inconsistent help-texts prometheus#2962
* [CHANGE] os: Removed caching of modtime/filename of os-release file prometheus#2987
* [FEATURE] xfrm: Add new collector prometheus#2866
* [FEATURE] watchdog: Add new collector prometheus#2880
* [ENHANCEMENT] cpu_vulnerabilities: Add mitigation information label prometheus#2806
* [ENHANCEMENT] nfsd: Handle new `wdeleg_getattr` attribute prometheus#2810
* [ENHANCEMENT] netstat: Add TCPOFOQueue to default netstat metrics prometheus#2867
* [ENHANCEMENT] filesystem: surface device errors prometheus#2923
* [ENHANCEMENT] os: Add support end parsing prometheus#2982
* [ENHANCEMENT] zfs: Log mib when sysctl read fails on FreeBSD prometheus#2975
* [ENHANCEMENT] fibre_channel: update procfs to take into account optional attributes prometheus#2933
* [BUGFIX] cpu: Fix debug log in cpu collector prometheus#2857
* [BUGFIX] hwmon: Fix hwmon nil ptr prometheus#2873
* [BUGFIX] hwmon: Fix hwmon error capture prometheus#2915
* [BUGFIX] zfs: Revert "Add ZFS freebsd per dataset stats" prometheus#2925
* [BUGFIX] ethtool: Sanitize ethtool metric name keys prometheus#2940
* [BUGFIX] fix: data race of NetClassCollector metrics initialization prometheus#2995

Signed-off-by: Ben Kochie <superq@gmail.com>

Add logging for ethtool device include/exclude and metrics include flags (prometheus#2979)

Signed-off-by: Sam Leiken <sam.k.leiken@gmail.com>
v-zhuravlev pushed a commit to grafana/node_exporter that referenced this issue Nov 1, 2024
Avoid metrics with inconsistent help-texts. The earlier behaviour has
been preserved in the sense that the first encountered instance is still
used to generate metrics, whereas the subsequent inconsistent ones are
ignored along with a few peripheral changes.

```
 # HELP node_scrape_collector_duration_seconds node_exporter: Duration of a collector scrape.
 # TYPE node_scrape_collector_duration_seconds gauge
 node_scrape_collector_duration_seconds{collector="textfile"} 0.0004005
 # HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
 # TYPE node_scrape_collector_success gauge
 node_scrape_collector_success{collector="textfile"} 1
 # HELP node_textfile_mtime_seconds Unixtime mtime of textfiles successfully read.
 # TYPE node_textfile_mtime_seconds gauge
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-bar.prom"} 1.710812009e+09
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-foo.prom"} 1.710811982e+09
 # HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
 # TYPE node_textfile_scrape_error gauge
 node_textfile_scrape_error 1
 # HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
 # TYPE promhttp_metric_handler_errors_total counter
 promhttp_metric_handler_errors_total{cause="encoding"} 0
 promhttp_metric_handler_errors_total{cause="gathering"} 0
 # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
 # TYPE promhttp_metric_handler_requests_in_flight gauge
 promhttp_metric_handler_requests_in_flight 1
 # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
 # TYPE promhttp_metric_handler_requests_total counter
 promhttp_metric_handler_requests_total{code="200"} 0
 promhttp_metric_handler_requests_total{code="500"} 0
 promhttp_metric_handler_requests_total{code="503"} 0
 # HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
 # TYPE tau_infrastructure_performing_maintenance_task gauge
 tau_infrastructure_performing_maintenance_task{main_task="nightly",start_or_stop="start",sub_task="main"} 1.64728080198446e+09
```
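
The first-wins rule described in this commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the collector's implementation: the `family` type and the `dedupeHelp` function are hypothetical names, and the real collector operates on parsed Prometheus metric families rather than this simplified struct.

```go
package main

import "fmt"

// family is a hypothetical stand-in for one metric family parsed from a textfile.
type family struct {
	Name string
	Help string
	File string
}

// dedupeHelp keeps every family whose HELP text matches the first one seen
// for its metric name and returns the inconsistent ones separately.
func dedupeHelp(in []family) (kept, skipped []family) {
	firstHelp := make(map[string]string)
	for _, f := range in {
		help, seen := firstHelp[f.Name]
		if !seen {
			// First occurrence of this metric name: its HELP text wins.
			firstHelp[f.Name] = f.Help
			kept = append(kept, f)
			continue
		}
		if help != f.Help {
			skipped = append(skipped, f)
			continue
		}
		kept = append(kept, f)
	}
	return kept, skipped
}

func main() {
	in := []family{
		{Name: "task_timestamp", Help: "When the task ran.", File: "ne-bar.prom"},
		{Name: "task_timestamp", Help: "Timestamp of the task.", File: "ne-foo.prom"},
	}
	kept, skipped := dedupeHelp(in)
	// In this sketch, a non-empty skipped list is the condition under which
	// an error gauge such as node_textfile_scrape_error would be set to 1.
	fmt.Println(len(kept), len(skipped))
}
```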

Fixes: prometheus#2317

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
BupycHuk added a commit to percona/node_exporter that referenced this issue Nov 8, 2024
* Fix lint issues

Signed-off-by: jalev <qweet.ing@gmail.com>

* Bump perf-utils version to 0.6.0

This change updates the perf-utils library to 0.6.0 which has some fixes
for automatically detecting the correct tracefs mountpoint if available.

Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>

* Fix thermal_zone collector noise

Add a check for missing/unreadable thermal zone stats and ignore them if not
available.
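
A check of this kind might look like the sketch below. It is only an illustration: `readZone` is a hypothetical helper, the path in `main` is made up, and the set of errors the real collector treats as ignorable may differ.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"strings"
)

// readZone returns the trimmed contents of a thermal zone file.
// ok is false, with a nil error, when the file is missing or unreadable,
// so the caller can skip the zone quietly instead of reporting an error.
func readZone(path string) (value string, ok bool, err error) {
	data, err := os.ReadFile(path)
	if err != nil {
		if errors.Is(err, fs.ErrNotExist) || errors.Is(err, fs.ErrPermission) {
			return "", false, nil
		}
		return "", false, err
	}
	return strings.TrimSpace(string(data)), true, nil
}

func main() {
	// Hypothetical path; on most systems this zone will not exist.
	value, ok, err := readZone("/sys/class/thermal/thermal_zone99/temp")
	fmt.Println(value, ok, err)
}
```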

Fixes: https://github.com/prometheus/node_exporter/issues/2552

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Enable uname collector on NetBSD too

This collector works just fine without any further changes.

Signed-off-by: Benny Siegert <bsiegert@gmail.com>

* build(deps): bump github.com/mdlayher/netlink from 1.7.0 to 1.7.1

Bumps [github.com/mdlayher/netlink](https://github.com/mdlayher/netlink) from 1.7.0 to 1.7.1.
- [Release notes](https://github.com/mdlayher/netlink/releases)
- [Changelog](https://github.com/mdlayher/netlink/blob/main/CHANGELOG.md)
- [Commits](https://github.com/mdlayher/netlink/compare/v1.7.0...v1.7.1)

---
updated-dependencies:
- dependency-name: github.com/mdlayher/netlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/josharian/native from 1.0.0 to 1.1.0

Bumps [github.com/josharian/native](https://github.com/josharian/native) from 1.0.0 to 1.1.0.
- [Release notes](https://github.com/josharian/native/releases)
- [Commits](https://github.com/josharian/native/compare/v1.0.0...v1.1.0)

---
updated-dependencies:
- dependency-name: github.com/josharian/native
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix cpustat when some cpus are offline

Signed-off-by: Jia Xin <alexjx@gmail.com>

* build(deps): bump github.com/prometheus/common from 0.37.0 to 0.39.0

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.37.0 to 0.39.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](https://github.com/prometheus/common/compare/v0.37.0...v0.39.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update e2e output for new common version.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* NetBSD support for the meminfo collector

This depends on a recent change to golang.org/x/sys that adds a
unix.SysctlUvmexp function.

Signed-off-by: Benny Siegert <bsiegert@gmail.com>

* memory_bsd: Fix a problem fetching the user wire count on FreeBSD

Signed-off-by: David O'Rourke <david.orourke@gmail.com>

* Optimize cpufreq collector

Move metric descriptions to package vars to avoid allocating them every
time `NewCPUFreqCollector()` is called.

Signed-off-by: Ben Kochie <superq@gmail.com>

* build(deps): bump github.com/hodgesds/perf-utils from 0.6.0 to 0.7.0

Bumps [github.com/hodgesds/perf-utils](https://github.com/hodgesds/perf-utils) from 0.6.0 to 0.7.0.
- [Release notes](https://github.com/hodgesds/perf-utils/releases)
- [Commits](https://github.com/hodgesds/perf-utils/compare/v0.6.0...v0.7.0)

---
updated-dependencies:
- dependency-name: github.com/hodgesds/perf-utils
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Deprecate ntp collector

The ntp collector has always been a source of confusion and problems.
The data it produces is more of a blackbox probe against an NTP server.
The time sync / offset data produced is not what users expect.

Mark this collector as deprecated to be removed in v2.0.0

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump golang.org/x/net from 0.4.0 to 0.7.0

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.4.0 to 0.7.0.
- [Release notes](https://github.com/golang/net/releases)
- [Commits](https://github.com/golang/net/compare/v0.4.0...v0.7.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* Remove metrics of offline CPUs in CPU collector

Signed-off-by: Haoyu Sun <hasun@redhat.com>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.0 to 1.3.1

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.0...v1.3.1)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/opencontainers/selinux

Bumps [github.com/opencontainers/selinux](https://github.com/opencontainers/selinux) from 1.10.2 to 1.11.0.
- [Release notes](https://github.com/opencontainers/selinux/releases)
- [Commits](https://github.com/opencontainers/selinux/compare/v1.10.2...v1.11.0)

---
updated-dependencies:
- dependency-name: github.com/opencontainers/selinux
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump golang.org/x/sys from 0.5.0 to 0.6.0

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/golang/sys/releases)
- [Commits](https://github.com/golang/sys/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update exporter-toolkit

* Bump exporter-toolkit to the latest release.
* Use new toolkit landing page function.
* Update kingpin flags.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Bump exporter-toolkit

Pick up the fixes for 32-bit mode and updated HTML template.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update build

* Update Go to 1.20
* Update golangci-lint.
* Update CI orb.
* Fix staticcheck issue in perf collector.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Allow root path as metrics path. (#2590)

Signed-off-by: LamGC <lam827@lamgc.net>

* Fix spelling issues

Minor typo fixup.

Signed-off-by: Ben Kochie <superq@gmail.com>

* interrupts_linux: Fix fields on aarch64 (#2631)

* interrupts_linux: Fix fields on aarch64

Fixes #2557

---------

Signed-off-by: Daniël van Eeden <git@myname.nl>

* feat: add support for cpu freq governor metrics

Signed-off-by: Lukas Coppens <lukas.coppens@be-mobile.com>

* feat: add support for cpu freq governor metrics

Signed-off-by: Lukas Coppens <lukas.coppens@be-mobile.com>

* Reduce privileges needed for btrfs device stats

Signed-off-by: Marcus Cobden <leth@users.noreply.github.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump github.com/safchain/ethtool from 0.2.0 to 0.3.0

Bumps [github.com/safchain/ethtool](https://github.com/safchain/ethtool) from 0.2.0 to 0.3.0.
- [Release notes](https://github.com/safchain/ethtool/releases)
- [Commits](https://github.com/safchain/ethtool/compare/v0.2.0...v0.3.0)

---
updated-dependencies:
- dependency-name: github.com/safchain/ethtool
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/common from 0.41.0 to 0.42.0

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.41.0 to 0.42.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](https://github.com/prometheus/common/compare/v0.41.0...v0.42.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* NetBSD support for CPU collector (#2626)

* Added CPU collector for NetBSD to provide load and temperature statistics

---------

Signed-off-by: Matthias Petermann <mp@petermann-it.de>

* feat: added suspended as a node_zfs_zpool_state (#2449)

Signed-off-by: Pablo Caderno <kaderno@gmail.com>

* build(deps): bump github.com/mdlayher/netlink from 1.7.1 to 1.7.2

Bumps [github.com/mdlayher/netlink](https://github.com/mdlayher/netlink) from 1.7.1 to 1.7.2.
- [Release notes](https://github.com/mdlayher/netlink/releases)
- [Changelog](https://github.com/mdlayher/netlink/blob/main/CHANGELOG.md)
- [Commits](https://github.com/mdlayher/netlink/compare/v1.7.1...v1.7.2)

---
updated-dependencies:
- dependency-name: github.com/mdlayher/netlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/client_golang

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.14.0 to 1.15.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.14.0...v1.15.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* doc: added undocumented include and exclude flags (#2670)

* doc: added undocumented exclude flags


Signed-off-by: David Calvert <david@0xdc.me>

* Expose administrative state of network interfaces as 'adminstate'. (#2515)

Signed-off-by: Maximilian Wilhelm <max@sdn.clinic>

* Use go-runit fork, mark collector as deprecated

Signed-off-by: Johannes Ziemke <github@5pi.de>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.1 to 1.3.2 (#2673)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.1 to 1.3.2.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.1...v1.3.2)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* docs (node/mixin): fix annotation for Skew alert (#2671)

This updates the annotation for the NodeClockSkewDetected mixin alert to
match the new threshold set.

Original discussion was in this PR: https://github.com/prometheus/node_exporter/pull/1480

I spent an embarrassingly large amount of time trying to figure out how
the heck that alert would mean 300s of clock skew. Turns out the
annotation was just left the same after the threshold change.

Signed-off-by: Will Bollock <wbollock@linode.com>

* collector/netisr_freebsd.go: Added collector for netisr subsystem. (#2668)

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

* Do not hand define struct clockinfo here. Instead use the version from (#2663)

x/sys/unix. The clockinfo struct was altered beginning of 2021 and this
code was not adjusted.

Signed-off-by: Claudio Jeker <claudio@openbsd.org>

* Fix filesystem collector for OpenBSD to not print loads of zero bytes in name (#2637)

Use the filesystem collector for all OpenBSD archs, there is no reason to
only use it on amd64 systems.

Signed-off-by: Claudio Jeker <claudio@openbsd.org>

* collector: fix comment and remove redundant parentheses (#2691)

Signed-off-by: cui fliter <imcusg@gmail.com>

* PMM-12116 Sync with upstream and update dependencies

Sync with the latest version of upstream v1.5.0 and update
dependencies with reported vulnerabilities.

* bcache: remove cache_readaheads_totals metrics #2103 (#2583)

* bcache: remove cache_readaheads_totals metrics #2103

Signed-off-by: Saleh Sal <0xack13@gmail.com>

* Append bcacheReadaheadMetrics when CacheReadaheads value exists

Signed-off-by: Saleh Sal <0xack13@gmail.com>

* Update test cases for cachereadahead greater than zero

Signed-off-by: Saleh Sal <0xack13@gmail.com>

---------

Signed-off-by: Saleh Sal <0xack13@gmail.com>

* Fix CVE-2022-41723 by upgrading x/net to v0.10.0 (#2694)

Signed-off-by: Nitin Shelke <nshelke@cloudera.com>

* Update e2e output fixtures (#2696)

Fix up correct e2e output for node_power_supply_info.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update Go modules (#2695)

Update Prometheus modules to latest releases.
* Add missing fixtures for cpus online/offline.

Signed-off-by: Ben Kochie <superq@gmail.com>

* fix(zfs): add `memory_available_bytes`, fix `dbufstats` filename on Linux (#2687)

* Fix zfs memory_available_bytes collector
* Fix zfs dbufstats collector
---------

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* Update Go module for ema/qdisc (#2700)

* Update Go module for ema/qdisc

---------

Signed-off-by: jbradleynh <jbradley@fastly.com>

* Deprecate supervisord collector

Mark the `supervisord` collector as deprecated. This process
supervisor, like `runit`, is out of scope for the node_exporter.

Signed-off-by: Ben Kochie <superq@gmail.com>

* collector/diskstats: Use SCSI_IDENT_SERIAL as serial (#2612)

On most hard drives, `ID_SERIAL_SHORT` and `SCSI_IDENT_SERIAL` are identical,
but on some SAS drives they do differ. In that case, `SCSI_IDENT_SERIAL`
corresponds to the serial number printed on the drive label, and to the value
returned by `smartctl -i`.

So use that value by default for the `serial` label on the `node_disk_info`
metric, and fallback to `ID_SERIAL_SHORT` only if it's undefined.
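
The selection order can be sketched as below. This is an illustrative helper, not the collector's code: `diskSerial` and the plain `map[string]string` of udev properties are assumptions, and an empty value is treated the same as an undefined one.

```go
package main

import "fmt"

// diskSerial picks the value for the `serial` label from udev-style
// properties: SCSI_IDENT_SERIAL first, ID_SERIAL_SHORT as the fallback.
func diskSerial(props map[string]string) string {
	if s := props["SCSI_IDENT_SERIAL"]; s != "" {
		return s
	}
	return props["ID_SERIAL_SHORT"]
}

func main() {
	// Made-up property values for a SAS drive where the two serials differ.
	sas := map[string]string{
		"SCSI_IDENT_SERIAL": "LABEL123",
		"ID_SERIAL_SHORT":   "5000c500aaaa",
	}
	fmt.Println(diskSerial(sas))
}
```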

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>

* softnet: additional metrics from softnet_data (#2592)

* softnet: additional metrics from softnet_data, https://github.com/prometheus/procfs/pull/473
---------

Signed-off-by: remi <remijouannet@gmail.com>
Signed-off-by: Rémi Jouannet <remijouannet@gmail.com>

* exposing softirq metrics (#2294)

Signed-off-by: abbeywoodyear <abbey.woodyear@thehutgroup.com>

* netlink: read missing attributes from sysfs (#2669)

Read missing dev_id, name_assign_type, and addr_assign_type
from sysfs, since they only take a device-specific lock and
not the whole RTNL lock. This means reading them is much less
impactful on other system processes than many of the other
attributes in sysfs that do take the RTNL lock.

Signed-off-by: Dan Williams <dcbw@redhat.com>

* Update ansible role in README.md (#2702)

https://github.com/cloudalchemy/ansible-node-exporter has been deprecated

Signed-off-by: Johannes Dilli <jd1@users.noreply.github.com>

* Release v1.6.0 (#2701)

* [CHANGE] Fix cpustat when some cpus are offline #2318
* [CHANGE] Remove metrics of offline CPUs in CPU collector #2605
* [CHANGE] Deprecate ntp collector #2603
* [CHANGE] Remove bcache `cache_readaheads_totals` metrics #2583
* [CHANGE] Deprecate supervisord collector #2685
* [FEATURE] Enable uname collector on NetBSD #2559
* [FEATURE] NetBSD support for the meminfo collector #2570
* [FEATURE] NetBSD support for CPU collector #2626
* [FEATURE] Add FreeBSD collector for netisr subsystem #2668
* [FEATURE] Add softirqs collector #2669
* [ENHANCEMENT] Add suspended as a `node_zfs_zpool_state` #2449
* [ENHANCEMENT] Add administrative state of Linux network interfaces #2515
* [ENHANCEMENT] Log current value of GOMAXPROCS #2537
* [ENHANCEMENT] Add profiler options for perf collector #2542
* [ENHANCEMENT] Allow root path as metrics path #2590
* [ENHANCEMENT] Add cpu frequency governor metrics #2569
* [ENHANCEMENT] Add new landing page #2622
* [ENHANCEMENT] Reduce privileges needed for btrfs device stats #2634
* [ENHANCEMENT] Add ZFS `memory_available_bytes` #2687
* [ENHANCEMENT] Use `SCSI_IDENT_SERIAL` as serial in diskstats #2612
* [ENHANCEMENT] Read missing netlink netclass attributes from sysfs #2669
* [BUGFIX] perf: fixes for automatically detecting the correct tracefs mountpoints #2553
* [BUGFIX] Fix `thermal_zone` collector noise #2554
* [BUGFIX] Fix a problem fetching the user wire count on FreeBSD #2584
* [BUGFIX] interrupts: Fix fields on linux aarch64 #2631
* [BUGFIX] Remove metrics of offline CPUs in CPU collector #2605
* [BUGFIX] Fix OpenBSD filesystem collector string parsing #2637
* [BUGFIX] Fix bad reporting of `node_cpu_seconds_total` in OpenBSD #2663

Signed-off-by: Ben Kochie <superq@gmail.com>

* build(deps): bump github.com/beevik/ntp from 0.3.0 to 1.0.0

Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 0.3.0 to 1.0.0.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v0.3.0...v1.0.0)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/procfs from 0.10.0 to 0.10.1

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.10.0 to 0.10.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.10.0...v0.10.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.2 to 1.3.3

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.2 to 1.3.3.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.2...v1.3.3)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* Parallelize stat calls in Linux filesystem collector.

This change adds the ability to process multiple stat calls in parallel.
Processing is rate-limited based on the new flag
`collector.filesystem.stat-workers` (default 4).

Caveat: filesystem stats information is no longer in the same order as
returned by `/proc/1/mounts`.  This should not be an issue.

Caveat: This change currently uses unbuffered channels to prove
correctness without reliance on buffers.  Buffered channels will yield
superior performance.
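
The pattern described here, a fixed number of workers fed through unbuffered channels, can be sketched as follows. The names `statAll` and `statResult` are hypothetical and the `stat` callback stands in for the real stat call; this is an illustration of the technique, not the collector's code.

```go
package main

import (
	"fmt"
	"sync"
)

// statResult is a hypothetical result type for one mount point.
type statResult struct {
	Mount string
	Size  int
}

// statAll runs stat over all mounts using a fixed number of workers.
// Both channels are unbuffered, so every send waits for a receiver.
func statAll(mounts []string, workers int, stat func(string) int) []statResult {
	jobs := make(chan string)
	results := make(chan statResult)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range jobs {
				results <- statResult{Mount: m, Size: stat(m)}
			}
		}()
	}

	// Feed the workers, then signal that no more jobs are coming.
	go func() {
		for _, m := range mounts {
			jobs <- m
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []statResult
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	mounts := []string{"/", "/boot", "/home", "/var"}
	out := statAll(mounts, 4, func(m string) int { return len(m) })
	// Results arrive in completion order, not in the order of mounts.
	fmt.Println(len(out))
}
```

Giving the channels a buffer would let workers and the feeder run ahead of each other, which is the performance trade-off the caveat above refers to.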

Signed-off-by: Erica Mays <erica@emays.dev>

* Fix misspelling in CHANGELOG.md (#2717)

Signed-off-by: juzhao <juzhao@redhat.com>

* Bump wifi Go module (#2719)

Update github.com/mdlayher/wifi to the latest commit.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Bump ethtool library (#2720)

Update to latest release.

Signed-off-by: Ben Kochie <superq@gmail.com>

* add missing linkspeeds (#2711)

Signed-off-by: Cam Cope <ccope@crusoeenergy.com>

* Update common Prometheus files (#2723)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update golangci-lint config (#2722)

* Migrate from Python codespell to golangci-lint misspell.
* Inline errcheck exclude list in the golangci-lint config.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Add mountpoint to NodeFilesystem alerts

This helps to identify the alerting filesystem.

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Decrease NodeFilesystem pending time to 15m

30m is too long, and there is a risk of running out of disk space/inodes completely if something is filling up the disk very fast (like a log file).

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add CPU and memory alerts

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add failed systemd service alert

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Decrease NodeNetwork*Errs pending period

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Set 'at' everywhere as preposition for instance

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add NodeDiskIOSaturation alert

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add %(nodeExporterSelector)s to Network and conntrack alerts

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add diskDevice selector

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Fix NodeMemoryHighUtilization alert

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add NodeSystemSaturation and NodeMemoryMajorPagesFaults

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Decrease NodeSystemdServiceFailed severity to warning

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Extend alert description

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add comma after 'mounted on'

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add thresholds for memory alerts

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Add thresholds for memory, disk and system alerts

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Set severity of NodeCPUHighUsage to info

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Update NodeSystemSaturation severity

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Revert alerts pending durations

Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add cpu vulnerabilities reporting from sysfs (#2721)

* Add cpu vulnerabilities reporting from sysfs

---------

Signed-off-by: Michal Wasilewski <michal@mwasilewski.net>

* build(deps): bump github.com/beevik/ntp from 1.0.0 to 1.1.1

Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 1.0.0 to 1.1.1.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v1.0.0...v1.1.1)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/client_golang

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.15.1 to 1.16.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.15.1...v1.16.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Add include and exclude filter for hwmon collector (#2699)

* Add include and exclude chip name flags to hwmon collector, following example in systemd collector

---------

Signed-off-by: Conall O'Brien <conall@conall.net>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Add missing ethtool flag documentation (#2743)

Signed-off-by: Gabi Davar <grizzly.nyo@gmail.com>

* Update all Include and Exclude variables to use the systemdUnit naming (#2740)

prefix.

Leave an annotation about using regexps instead of device_filter.go, so
@SuperQ doesn't need to remember everything.

Signed-off-by: Conall O'Brien <conall@conall.net>

* Fixup hwmon chip include (#2739)

Use the correct include value to the device filter function.
* Add new bogus hwmon fixture.
* Update end-to-end test to use hwmon chip include flag.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Release v1.6.1 (#2747)

Rebuild with latest Go compiler bugfix release.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Synchronize common files from prometheus/prometheus (#2736)

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Fixup linting issues

* Disable unused-parameter check.
* Fixup minor linting issues.

Signed-off-by: Ben Kochie <superq@gmail.com>

---------

Signed-off-by: prombot <prometheus-team@googlegroups.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files (#2752)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Include drm collector in README

The DRM collector was missing in the README, this change includes it together with a short description.

Signed-off-by: L <3177243+LukeLR@users.noreply.github.com>

* collector/netdev_linux.go: Fallback to 32-bit stats (#2757)

On some platforms, `msg.Attributes.Stats64` is `nil` because the kernel doesn't
expose 64-bit stats. In that case, return `msg.Attributes.Stats` instead, which
are the 32-bit equivalent.

Note that `RXOtherhostDropped` isn't available in that case, so we hardcode it
to zero.
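
The fallback can be sketched as below. The struct types are simplified, hypothetical stand-ins for the netlink message attributes named above (only two counters are shown), and `receiveCounters` is an illustrative helper rather than the collector's function.

```go
package main

import "fmt"

// linkStats32 and linkStats64 are simplified stand-ins for the 32-bit and
// 64-bit link statistics; only the fields needed for the sketch are kept.
type linkStats32 struct {
	RXBytes uint32
}

type linkStats64 struct {
	RXBytes            uint64
	RXOtherhostDropped uint64
}

// linkAttributes mirrors the shape of msg.Attributes: either pointer may be nil.
type linkAttributes struct {
	Stats   *linkStats32
	Stats64 *linkStats64
}

// receiveCounters prefers the 64-bit stats and falls back to the 32-bit ones.
// RXOtherhostDropped does not exist in the 32-bit struct, so it is set to zero.
func receiveCounters(a linkAttributes) map[string]uint64 {
	switch {
	case a.Stats64 != nil:
		return map[string]uint64{
			"receive_bytes":     a.Stats64.RXBytes,
			"otherhost_dropped": a.Stats64.RXOtherhostDropped,
		}
	case a.Stats != nil:
		return map[string]uint64{
			"receive_bytes":     uint64(a.Stats.RXBytes),
			"otherhost_dropped": 0,
		}
	}
	return nil
}

func main() {
	// A platform that only exposes 32-bit stats.
	attrs := linkAttributes{Stats: &linkStats32{RXBytes: 1024}}
	fmt.Println(receiveCounters(attrs))
}
```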

Fixes #2756.

Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>

* build(deps): bump github.com/beevik/ntp from 1.1.1 to 1.3.0 (#2762)

Signed-off-by: Ben Kochie <superq@gmail.com>

* build(deps): bump github.com/prometheus/procfs from 0.11.0 to 0.11.1 (#2763)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.11.0 to 0.11.1.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.11.0...v0.11.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.3 to 1.3.4 (#2765)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.3 to 1.3.4.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.3...v1.3.4)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Drop redundant GOOS build tags if already in filename

Drop redundant GOOS build tags at start of file if the constraint is
already specified by the filename, e.g. foo_GOOS.go or
foo_GOOS_GOARCH.go, avoiding potential confusion in future.

cf. https://pkg.go.dev/cmd/go#hdr-Build_constraints

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>

* Sync build tags in *_test.go (#2767)

Ensure that unwanted tests are correctly excluded when various build
tags are specified, i.e. when the code that they test would be excluded
from compilation.

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>

* Upgrade github.com/ema/qdisc to v1.0.0 to improve qdisc collector performance (#2779)

Signed-off-by: Oliver Geiselhardt-Herms <ogh@deepl.com>
Co-authored-by: Oliver Geiselhardt-Herms <ogh@deepl.com>

* Add CPU MHz as the value for "node_cpu_info" metric

For CPUs which don't have an available (or insertable) cpufreq driver,
the /proc/cpuinfo file can sometimes have accurate CPU core frequency
measurements. This change replaces the constant value of "1" for the
"node_cpu_info" metric with the parsed CPU MHz value from
/proc/cpuinfo for each core.

Signed-off-by: John Kordich <jkordich@gmail.com>

* Update e2e-output.txt with new expected metric values

Changes the e2e-output.txt file to have the expected CPU MHz values
for the node_cpu_info metric.

Signed-off-by: John Kordich <jkordich@gmail.com>

* Add new node_cpu_frequency_hertz metric

Revert changes to node_cpu_info and add new node_cpu_frequency_hertz
metric for measuring CPU frequency from /proc/cpuinfo

Signed-off-by: John Kordich <jkordich@gmail.com>
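As a rough illustration of reading per-core frequencies from /proc/cpuinfo, the sketch below extracts every `cpu MHz` line and converts it to hertz. The function name `parseCPUFrequencies` and the conversion step are assumptions made for this example, not the collector's actual code.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseCPUFrequencies returns one value in hertz per "cpu MHz" line, in the
// order the lines appear in the input.
func parseCPUFrequencies(cpuinfo string) ([]float64, error) {
	var hz []float64
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		key, value, found := strings.Cut(sc.Text(), ":")
		if !found || strings.TrimSpace(key) != "cpu MHz" {
			continue
		}
		mhz, err := strconv.ParseFloat(strings.TrimSpace(value), 64)
		if err != nil {
			return nil, fmt.Errorf("parse cpu MHz %q: %w", value, err)
		}
		hz = append(hz, mhz*1e6) // MHz to Hz
	}
	return hz, sc.Err()
}

func main() {
	sample := "processor\t: 0\ncpu MHz\t\t: 2400.000\nprocessor\t: 1\ncpu MHz\t\t: 1800.500\n"
	hz, err := parseCPUFrequencies(sample)
	fmt.Println(hz, err)
}
```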

* Change log message from Warn to Debug

Signed-off-by: John Kordich <jkordich@gmail.com>

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: John Kordich <jkordich@gmail.com>

* fix(qdisc) flag naming corrected for consistency (#2782)

* fix collector qdisc flag naming for consistency

---------

Signed-off-by: jbradleynh <jbradley@fastly.com>

* btrfs: close btrfs.FS handle after use

Despite being quite hard to provoke (< 10% in my testing), the btrfs
collector would occasionally leave stale FDs relating to btrfs
mountpoints, making the filesystems unable to be unmounted.

Fixes: #2772.

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>

* Update to Go 1.21 (#2796)

* Update Go build to 1.21.
* Update machine images to Ubuntu 22.04 current.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Optionally fetch ARP stats via rtnetlink instead of procfs (#2777)

* Optionally fetch ARP stats via rtnetlink instead of procfs

Implement collection of ARP stats via rtnetlink to work around
shortcomings in the output of /proc/net/arp, which truncates InfiniBand
link-layer addresses.

Fixes: #2776

---------

Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* build(deps): bump golang.org/x/sys from 0.10.0 to 0.12.0 (#2797)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.10.0 to 0.12.0.
- [Commits](https://github.com/golang/sys/compare/v0.10.0...v0.12.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (#2798)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add ZFS freebsd per dataset stats (#2753)

* Rename parsePoolObjsetFile to parseLinuxPoolObjsetFile to better reflect
its scope
* Create a new parseFreeBSDPoolObjsetStats function, to generate a list
of per pool metrics to be queried via sysctl


---------

Signed-off-by: Conall O'Brien <conall@conall.net>

* Move RO status before error return

Signed-off-by: Metbog <metbog@gmail.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* fix(zfs) zfs `arcstats.p` on FreeBSD 14.0+ (#2754)

* dongjiang, fix zfs arcstats.p

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* dongjiang, fix gofmt -s

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* change warn log to debug log by code review

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

---------

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* Fix promhttp_metric_handler_errors_total metric not being disabled by flag

Signed-off-by: ToMe25 <ToMe25@gmx.de>

* build(deps): bump github.com/prometheus/client_golang (#2815)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix inconsistent variable name, to address compilation issue (#2820)

https://github.com/prometheus/node_exporter/issues/2819

Signed-off-by: Conall O'Brien <conall@conall.net>

* Update README.md: update the 'more details' url in the section 'TLS endpoint' (#2814)

* Update README.md: correct the wrong url(link to exporter-toolkit web-config) in the section 'TLS endpoint'

Signed-off-by: yang-stressfree <68363665+yang-stressfree@users.noreply.github.com>

* Update README.md

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: yang-stressfree <68363665+yang-stressfree@users.noreply.github.com>

---------

Signed-off-by: yang-stressfree <68363665+yang-stressfree@users.noreply.github.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump golang.org/x/net from 0.11.0 to 0.17.0

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.11.0 to 0.17.0.
- [Commits](https://github.com/golang/net/compare/v0.11.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

* build(deps): bump github.com/prometheus/procfs from 0.11.1 to 0.12.0

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.11.1 to 0.12.0.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.11.1...v0.12.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* Update e2e fixtures

Update for fixes in https://github.com/prometheus/procfs/pull/543

Signed-off-by: Ben Kochie <superq@gmail.com>

* NFSd: fix nfsd v4 index miss (#2824)

* fix nfsd v4 index miss

---------

Signed-off-by: dongjiang1989 <dongjiang1989@126.com>

* fix readme about expose memory statistics

Signed-off-by: joey <zchengjoey@gmail.com>

* Fix typo in CHANGELOG.md (#2836)

Use # consistently for PR number.

Signed-off-by: nemobis <federicoleva@tiscali.it>

* Update common Prometheus files (#2840)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump github.com/prometheus/common from 0.44.0 to 0.45.0 (#2837)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.44.0 to 0.45.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](https://github.com/prometheus/common/compare/v0.44.0...v0.45.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/prometheus/client_model (#2838)

Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.4.1-0.20230718164431-9a2bf3000d16 to 0.5.0.
- [Release notes](https://github.com/prometheus/client_model/releases)
- [Commits](https://github.com/prometheus/client_model/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_model
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Release 1.7.0 (#2845)

* [FEATURE] Add ZFS freebsd per dataset stats #2753
* [FEATURE] Add cpu vulnerabilities reporting from sysfs #2721
* [ENHANCEMENT] Parallelize stat calls in Linux filesystem collector #1772
* [ENHANCEMENT] Add missing linkspeeds to ethtool collector #2711
* [ENHANCEMENT] Add CPU MHz as the value for `node_cpu_info` metric #2778
* [ENHANCEMENT] Improve qdisc collector performance #2779
* [ENHANCEMENT] Add include and exclude filter for hwmon collector #2699
* [ENHANCEMENT] Optionally fetch ARP stats via rtnetlink instead of procfs #2777
* [BUGFIX] Fix ZFS arcstats on FreeBSD 14.0+ #2754
* [BUGFIX] Fallback to 32-bit stats in netdev #2757
* [BUGFIX] Close btrfs.FS handle after use #2780
* [BUGFIX] Move RO status before error return #2807
* [BUGFIX] Fix `promhttp_metric_handler_errors_total` being always active #2808
* [BUGFIX] Fix nfsd v4 index miss #2824

Signed-off-by: Ben Kochie <superq@gmail.com>

* Add NodeBondingDegraded alert (#2843)

Signed-off-by: Ayoub Nasr <ayoub.nasr@scality.com>

* Make filesystem space prediction window configurable (#2844)

Signed-off-by: fitz123 <alugovoi@ordercapital.com>

* NFSd: handle new wdeleg_getattr attribute in /proc/net/rpc/nfsd (#2810)

This attribute was introduced in v6.6-rc1.

The relevant changes in procfs were merged here:

https://github.com/prometheus/procfs/pull/574

and are part of procfs v0.11.2

I have also figured out that the stat should be part of the v4 ops
counters struct, but that will need changes to both procfs and this
code. Since people are already using 6.6-rc1, I think it's better to get
the code out there --- even if they don't care about wdeleg_getattr,
currently they get _no_ nfsd stats with 6.6-rc1.

I will make two follow-up PRs to clean this up in the next releases of
procfs and node-exporter.

Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de>

* Update common Prometheus files (#2851)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update containerization warnings (#2855)

Running node_exporter in containers is now a fairly well understood
problem. Replace the warnings with something less dire and more
prescriptive.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Fix debug log in cpu collector (#2857)

Signed-off-by: Simon Pasquier <spasquie@redhat.com>

* build(deps): bump github.com/alecthomas/kingpin/v2 from 2.3.2 to 2.4.0 (#2865)

Bumps [github.com/alecthomas/kingpin/v2](https://github.com/alecthomas/kingpin) from 2.3.2 to 2.4.0.
- [Release notes](https://github.com/alecthomas/kingpin/releases)
- [Commits](https://github.com/alecthomas/kingpin/compare/v2.3.2...v2.4.0)

---
updated-dependencies:
- dependency-name: github.com/alecthomas/kingpin/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump howett.net/plist from 1.0.0 to 1.0.1 (#2862)

Bumps [howett.net/plist](https://github.com/DHowett/go-plist) from 1.0.0 to 1.0.1.
- [Commits](https://github.com/DHowett/go-plist/compare/v1.0.0...v1.0.1)

---
updated-dependencies:
- dependency-name: howett.net/plist
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add new collector and metrics for XFRM (#2544) (#2866)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.3.5 to 1.4.0 (#2864)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.3.5 to 1.4.0.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.3.5...v1.4.0)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump golang.org/x/sys from 0.13.0 to 0.15.0 (#2863)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.13.0 to 0.15.0.
- [Commits](https://github.com/golang/sys/compare/v0.13.0...v0.15.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add TCPOFOQueue to default netstat metrics (#2867)

Adds a counter for TCP packets received out of order. This can be an
indication that there is packet loss on the path packets take towards
this server. In that case, the sender will retransmit (and we can
already monitor Tcp_RetransSegs there), but we have no way to
monitor the packet loss on the receiver side. When a packet is received
and the receiver detects that a previous one is missing, it will increase
the TCPOFOQueue counter and reply with a selective ACK to the sender, both
possible indications of packet loss. Confirmation of packet loss can be
achieved by taking packet captures, ignoring Wireshark analysis, and
carefully looking at data being retransmitted based on the TCP seq.

Just like RetransSegs, TCPOFOQueue should be interesting for any
deployment as a means to detect packet loss, so this change suggests
adding it to the default list.

Signed-off-by: François Rigault <frigo@amadeus.com>
Co-authored-by: François Rigault <frigo@amadeus.com>

* Update common Prometheus files (#2870)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Add mitigation information to the linux vulnerabilities collector (#2806)

While the CPU vulnerabilities collector was added in https://github.com/prometheus/node_exporter/pull/2721, it does not currently include information regarding the mitigation strategy used for a given vulnerability.

This information can be quite valuable, as different mitigation strategies often come with a different performance impact.

This commit adds a third label to the cpu_vulnerabilities_info metric, to include the "mitigation" used for a given vulnerability - if a given vulnerability is not affecting a node or the node is still vulnerable, the mitigation is expected to be empty.

Signed-off-by: João Lima <jlima@cloudflare.com>
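A minimal sketch of how a state and a mitigation could be derived from the contents of a file under /sys/devices/system/cpu/vulnerabilities/. The function `parseVulnerability` and the state strings it returns are hypothetical; only the rule that the mitigation stays empty for unaffected or still-vulnerable nodes comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// parseVulnerability splits the raw sysfs text into a state and, when the
// kernel reports one, the mitigation in use. The mitigation is empty for
// unaffected and still-vulnerable nodes.
func parseVulnerability(raw string) (state, mitigation string) {
	raw = strings.TrimSpace(raw)
	switch {
	case raw == "Not affected":
		return "not affected", ""
	case strings.HasPrefix(raw, "Mitigation:"):
		return "mitigation", strings.TrimSpace(strings.TrimPrefix(raw, "Mitigation:"))
	case strings.HasPrefix(raw, "Vulnerable"):
		return "vulnerable", ""
	default:
		return "unknown", ""
	}
}

func main() {
	for _, raw := range []string{"Mitigation: PTI\n", "Not affected\n", "Vulnerable\n"} {
		state, mitigation := parseVulnerability(raw)
		fmt.Printf("state=%q mitigation=%q\n", state, mitigation)
	}
}
```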

* Update common Prometheus files (#2872)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump golang.org/x/crypto from 0.14.0 to 0.17.0 (#2877)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.14.0 to 0.17.0.
- [Commits](https://github.com/golang/crypto/compare/v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (#2879)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump github.com/prometheus/exporter-toolkit (#2885)

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.10.0 to 0.11.0.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](https://github.com/prometheus/exporter-toolkit/compare/v0.10.0...v0.11.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/beevik/ntp from 1.3.0 to 1.3.1 (#2886)

Bumps [github.com/beevik/ntp](https://github.com/beevik/ntp) from 1.3.0 to 1.3.1.
- [Release notes](https://github.com/beevik/ntp/releases)
- [Changelog](https://github.com/beevik/ntp/blob/main/RELEASE_NOTES.md)
- [Commits](https://github.com/beevik/ntp/compare/v1.3.0...v1.3.1)

---
updated-dependencies:
- dependency-name: github.com/beevik/ntp
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/prometheus/client_golang (#2887)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.17.0...v1.18.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (#2897)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* diskstats: ignore zram devices on linux systems by default (#2898)

Signed-off-by: DBS-ST-VIT <dbs-st-vit@users.noreply.github.com>
Co-authored-by: DBS-ST-VIT <dbs-st-vit@users.noreply.github.com>

* Bump golang-builder version (#2908)

Signed-off-by: Alper Polat <gitperr@gmail.com>

* exec_bsd: Fix labels for vm.stats.sys.v_syscall sysctl (#2895)

Signed-off-by: David O'Rourke <david.orourke@gmail.com>

* chore:remove constant from function (#2884)

Signed-off-by: tyltr <tylitianrui@126.com>

* build(deps): bump github.com/prometheus/common from 0.45.0 to 0.46.0 (#2910)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.45.0 to 0.46.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](https://github.com/prometheus/common/compare/v0.45.0...v0.46.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/jsimonetti/rtnetlink from 1.4.0 to 1.4.1 (#2909)

Bumps [github.com/jsimonetti/rtnetlink](https://github.com/jsimonetti/rtnetlink) from 1.4.0 to 1.4.1.
- [Release notes](https://github.com/jsimonetti/rtnetlink/releases)
- [Commits](https://github.com/jsimonetti/rtnetlink/compare/v1.4.0...v1.4.1)

---
updated-dependencies:
- dependency-name: github.com/jsimonetti/rtnetlink
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix hwmon nil ptr (#2873)

* fix hwmon nil ptr

The symlink may be lost in some cases.

---------

Signed-off-by: TaoGe <6657718+yowenter@users.noreply.github.com>

* Fix hwmon error capture (#2915)

Fix golangci-lint "ineffectual assignment" by correctly capturing any
errors within the hwmon gathering loop.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files (#2917)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Revert "Add ZFS freebsd per dataset stats (#2753)" (#2925)

This reverts commit f34aaa61092fe7e3c6618fdb0b0d16a68a291ff7.

Signed-off-by: Caleb Webber <caleb@codingthemsoftly.com>

* filesystem: fix mountTimeout not working issue (#2903)

Signed-off-by: DongWei <jiangxuege@hotmail.com>

* Fix description for NodeDiskIOSaturation alert (#2929)

NodeDiskIOSaturation description should say 30m per the "for" clause

Signed-off-by: Taylor Sly <slyt@users.noreply.github.com>

* Enforce no subprocess policy (#2926)

Add depguard to golangci-lint to enforce the no-os/exec policy.

Signed-off-by: Ben Kochie <superq@gmail.com>

* filesystem: surface device errors (#2923)

filesystem: surface filesystem device error

Fixes: #2918
---------

Signed-off-by: Pamela Mei i540369 <pamela.mei@sap.com>

* Revert "filesystem: fix mountTimeout not working issue (#2903)" (#2932)

This reverts commit 9f1f791ac2e1377781c4f8807a23d86d92ad6499.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files (#2939)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (#2946)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump golang.org/x/sys from 0.16.0 to 0.17.0 (#2943)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.16.0 to 0.17.0.
- [Commits](https://github.com/golang/sys/compare/v0.16.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/prometheus/client_golang (#2942)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.18.0 to 1.19.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/v1.19.0/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.18.0...v1.19.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump github.com/prometheus/client_model from 0.5.0 to 0.6.0 (#2944)

Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.5.0 to 0.6.0.
- [Release notes](https://github.com/prometheus/client_model/releases)
- [Commits](https://github.com/prometheus/client_model/compare/v0.5.0...v0.6.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_model
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* build(deps): bump golang.org/x/sys from 0.17.0 to 0.18.0 (#2948)

Bumps [golang.org/x/sys](https://github.com/golang/sys) from 0.17.0 to 0.18.0.
- [Commits](https://github.com/golang/sys/compare/v0.17.0...v0.18.0)

---
updated-dependencies:
- dependency-name: golang.org/x/sys
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (#2949)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump github.com/prometheus/procfs from 0.12.0 to 0.13.0 (#2952)

Bumps [github.com/prometheus/procfs](https://github.com/prometheus/procfs) from 0.12.0 to 0.13.0.
- [Release notes](https://github.com/prometheus/procfs/releases)
- [Commits](https://github.com/prometheus/procfs/compare/v0.12.0...v0.13.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/procfs
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add multi-cluster support for Nodes dashboard (#2945)

Signed-off-by: Adrian Berger <adria.berger94@gmail.com>

* disable selinux, fix end-to-end-test.sh error (#2934) (#2937)

Signed-off-by: heyitao <heyitao@uniontech.com>
Co-authored-by: heyitao <heyitao@uniontech.com>

* Add new collector and metrics for watchdog (#2309) (#2880)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

* Enable watchdog module by default; Add no data error (#2953)

Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>

* Update common Prometheus files (#2954)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* build(deps): bump google.golang.org/protobuf from 1.32.0 to 1.33.0 (#2955)

Bumps google.golang.org/protobuf from 1.32.0 to 1.33.0.

---
updated-dependencies:
- dependency-name: google.golang.org/protobuf
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Update common Prometheus files (#2959)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update common Prometheus files (#2964)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Sanitize ethtool metric name keys

Apply the same metric name sanitization to the keys as to the metric
names. This avoids conflicting help strings in the metric registry.

Fixes: https://github.com/prometheus/node_exporter/issues/2893

Signed-off-by: Ben Kochie <superq@gmail.com>
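The idea of keying on the sanitized name can be sketched as below. The sanitization rule (replace every character outside `[a-zA-Z0-9_]` with `_`), the help-text format, and the `sanitize` and `describe` helpers are assumptions for illustration and may differ from what the ethtool collector really does.

```go
package main

import (
	"fmt"
	"regexp"
)

// invalidChars matches anything outside the assumed set of allowed
// metric-name characters.
var invalidChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// sanitize maps a raw stat name to a metric-name-safe key.
func sanitize(name string) string {
	return invalidChars.ReplaceAllString(name, "_")
}

// describe builds one help string per sanitized key. Because the map is
// keyed by the sanitized name, two raw names that collapse to the same
// metric name cannot register two different help strings.
func describe(statNames []string) map[string]string {
	help := make(map[string]string)
	for _, raw := range statNames {
		key := sanitize(raw)
		if _, seen := help[key]; seen {
			continue // first raw name wins
		}
		help[key] = "Network interface " + raw
	}
	return help
}

func main() {
	fmt.Println(describe([]string{"rx-errors", "rx.errors", "tx_bytes"}))
}
```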

* Update common Prometheus files

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* chore: fix some typos (#2974)

Signed-off-by: occupyhabit <wangmengjiao@outlook.com>

* collector/textfile: Avoid inconsistent help-texts (#2962)

Avoid metrics with inconsistent help-texts. The earlier behaviour is
preserved in the sense that the first encountered instance is still
used to generate metrics, whereas subsequent inconsistent ones are
ignored. A few peripheral changes are included as well.

```
 # HELP node_scrape_collector_duration_seconds node_exporter: Duration of a collector scrape.
 # TYPE node_scrape_collector_duration_seconds gauge
 node_scrape_collector_duration_seconds{collector="textfile"} 0.0004005
 # HELP node_scrape_collector_success node_exporter: Whether a collector succeeded.
 # TYPE node_scrape_collector_success gauge
 node_scrape_collector_success{collector="textfile"} 1
 # HELP node_textfile_mtime_seconds Unixtime mtime of textfiles successfully read.
 # TYPE node_textfile_mtime_seconds gauge
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-bar.prom"} 1.710812009e+09
 node_textfile_mtime_seconds{file="/Users/rexagod/repositories/misc/node_exporter/ne-foo.prom"} 1.710811982e+09
 # HELP node_textfile_scrape_error 1 if there was an error opening or reading a file, 0 otherwise
 # TYPE node_textfile_scrape_error gauge
 node_textfile_scrape_error 1
 # HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
 # TYPE promhttp_metric_handler_errors_total counter
 promhttp_metric_handler_errors_total{cause="encoding"} 0
 promhttp_metric_handler_errors_total{cause="gathering"} 0
 # HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
 # TYPE promhttp_metric_handler_requests_in_flight gauge
 promhttp_metric_handler_requests_in_flight 1
 # HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
 # TYPE promhttp_metric_handler_requests_total counter
 promhttp_metric_handler_requests_total{code="200"} 0
 promhttp_metric_handler_requests_total{code="500"} 0
 promhttp_metric_handler_requests_total{code="503"} 0
 # HELP tau_infrastructure_performing_maintenance_task At what timestamp a given task started or stopped, the last time it was run.
 # TYPE tau_infrastructure_performing_maintenance_task gauge
 tau_infrastructure_performing_maintenance_task{main_task="nightly",start_or_stop="start",sub_task="main"} 1.64728080198446e+09
```

Fixes: #2317

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>
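The behaviour described above (first HELP text wins, later conflicting ones are ignored, and an error is reported) can be sketched roughly as follows. The `family` type and `filterInconsistentHelp` function are hypothetical and greatly simplified compared with the textfile collector, which works on parsed metric families rather than plain strings.

```go
package main

import "fmt"

// family is a hypothetical, simplified view of one metric family as read
// from one textfile.
type family struct {
	File, Name, Help string
}

// filterInconsistentHelp keeps every family whose help text matches the
// first one seen for that metric name, drops later conflicting ones, and
// reports whether any conflict was found.
func filterInconsistentHelp(families []family) ([]family, bool) {
	firstHelp := make(map[string]string)
	var kept []family
	scrapeError := false
	for _, f := range families {
		help, seen := firstHelp[f.Name]
		if !seen {
			firstHelp[f.Name] = f.Help
			kept = append(kept, f)
			continue
		}
		if help != f.Help {
			scrapeError = true // would surface as node_textfile_scrape_error 1
			continue
		}
		kept = append(kept, f)
	}
	return kept, scrapeError
}

func main() {
	kept, scrapeError := filterInconsistentHelp([]family{
		{File: "ne-bar.prom", Name: "task_timestamp", Help: "When a task last ran."},
		{File: "ne-foo.prom", Name: "task_timestamp", Help: "Timestamp of the last run."},
	})
	fmt.Println(len(kept), scrapeError)
}
```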

* Update common Prometheus files (#2973)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* os_release.go: Added support end parsing support. (#2982)

* os_release.go: Added support end parsing support.

Fixes: #2977

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

* os_release_test.go: Added TestParseOSSupportEnd.

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

---------

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

* zfs: Log mib when sysctl read fails on FreeBSD

When the zfs collector fails on FreeBSD it doesn't log which `mib` triggered the issue. This makes diagnostics hard.

Incompatibilities in the list of supported mibs are not uncommon with major OS updates. With this change, it will be easier for users to report the specific mib that is triggering the failure.

Related to #2847

Signed-off-by: Daniel Kimsey <90741+dekimsey@users.noreply.github.com>

* chore: fix typo in comment

Signed-off-by: looklose <shishuaiqun@yeah.net>

* fibre_channel: update procfs to take into account optional attributes (#2933)

Signed-off-by: machine424 <ayoubmrini424@gmail.com>

* refactor: Optimize code by using built-in constants in the standard library (#2989)

Signed-off-by: coderwander <770732124@qq.com>

* os_release.go: Removed caching of modtime/filename of os-release file. (#2987)

Signed-off-by: Jonathan Davies <jpds@protonmail.com>

* build(deps): bump golang.org/x/net from 0.20.0 to 0.23.0 (#2996)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.20.0 to 0.23.0.
- [Commits](https://github.com/golang/net/compare/v0.20.0...v0.23.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix: data race of NetClassCollector metrics initialization when multiple requests happen (#2995)

Signed-off-by: John Guo <john@johng.cn>

* Update common Prometheus files (#2992)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Update build (#3000)

* Update Go to 1.22.
* Update Go modules.
* Use new version collector.
* Use standard library slices package.

Signed-off-by: Ben Kochie <superq@gmail.com>

* Fix watchdog_test lint and test failures on macos. (#3003)

Ensure identical build flags embedded in both files.

Signed-off-by: Chris Cleeland <chris.cleeland@gmail.com>

* Release v1.8.0 (#3002)

* [CHANGE] exec_bsd: Fix labels for `vm.stats.sys.v_syscall` sysctl #2895
* [CHANGE] diskstats: Ignore zram devices on linux systems #2898
* [CHANGE] textfile: Avoid inconsistent help-texts #2962
* [CHANGE] os: Removed caching of modtime/filename of os-release file #2987
* [FEATURE] xfrm: Add new collector #2866
* [FEATURE] watchdog: Add new collector #2880
* [ENHANCEMENT] cpu_vulnerabilities: Add mitigation information label #2806
* [ENHANCEMENT] nfsd: Handle new `wdeleg_getattr` attribute #2810
* [ENHANCEMENT] netstat: Add TCPOFOQueue to default netstat metrics #2867
* [ENHANCEMENT] filesystem: surface device errors #2923
* [ENHANCEMENT] os: Add support end parsing #2982
* [ENHANCEMENT] zfs: Log mib when sysctl read fails on FreeBSD #2975
* [ENHANCEMENT] fibre_channel: update procfs to take into account optional attributes #2933
* [BUGFIX] cpu: Fix debug log in cpu collector #2857
* [BUGFIX] hwmon: Fix hwmon nil ptr #2873
* [BUGFIX] hwmon: Fix hwmon error capture #2915
* [BUGFIX] zfs: Revert "Add ZFS freebsd per dataset stats" #2925
* [BUGFIX] ethtool: Sanitize ethtool metric name keys #2940
* [BUGFIX] fix: data race of NetClassCollector metrics initialization #2995

Signed-off-by: Ben Kochie <superq@gmail.com>

* Update common Prometheus files (#3009)

Signed-off-by: prombot <prometheus-team@googlegroups.com>

* Sign node exporter darwin binary with rcodesign (#3008)

* Sign node exporter darwin binary with rcodesign

Prevents SIGKILL issues on macs

Signed-off-by: Alper Polat <gitperr@gmail.com>

* Be explicit about checking for the binary

Co-authored-by: Ben Kochie <superq@gmail.com>
Signed-off-by: Alper Polat <101826653+gitperr@users.noreply.github.com>

* Also attempt to sign darwin-amd64

Signed-off-by: Alper Polat <gitperr@gmail.com>

---------

Signed-off-by: Alper Polat <gitperr@gmail.com>
Signed-off-by: Alper Polat <101826653+gitperr@users.noreply.github.com>
Co-authored-by: Ben Kochie <superq@gmail.com>

* collector/cpu: s/cpu_ticks*/cpu_nsec* for solaris (#2963)

Replace all cpu_ticks_* with cpu_nsec_*, since the former was off by a
magnitude of 10e6, and showed incorrect values for
node_cpu_seconds_total.

Fixes: #1837

Signed-off-by: Pranshu Srivastava <rexagod@gmail.com>

* Fix pressure collector nil reference (#3016)

Check that the PSI metrics are returned in order to avoid nil pointer
dereference.
* Update fixture to match real-world samples.

Fixes: https://github.com/prometheus/node_exporter/issues/3015

Signed-off-by: Ben Kochie <superq@gmail.com>

* Release v1.8.1 (#3018)

* [BUGFIX] Fix CPU seconds on Solaris #2963
* [BUGFIX] Sign Darwin/MacOS binaries #3008
* [BUGFIX] Fix pressure collector nil reference #3016

Signed-off-by: Ben Kochie <superq@gmail.com>

* Release v1.8.2 (#3055)

* Fix pressure metric collection failing on systems that do not expose a full CPU stat #3051 (#3054)

Signed-off-by: joey <zchengjoey@gmail.com>
Signed-off-by: Ben Kochie <superq@gmail.com>

* Release v1.8.2

* [BUGFIX] Fix CPU pressure metric collection #3054

Signed-off-by: Ben Kochie <superq@gmail.com>

---------

Signed-off-by: joey <zchengjoey@gmail.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Co-authored-by: chengjoey <30427474+chengjoey@users.noreply.github.com>

* PMM-7 update imports

* PMM-12116 update imports

* PMM-12116 fix build

* PMM-12116 fix CI

* PMM-12116 fix linter.

* PMM-12116 fix go mod.

---------

Signed-off-by: jalev <qweet.ing@gmail.com>
Signed-off-by: Daniel Hodges <hodges.daniel.scott@gmail.com>
Signed-off-by: Ben Kochie <superq@gmail.com>
Signed-off-by: prombot <prometheus-team@googlegroups.com>
Signed-off-by: Benny Siegert <bsiegert@gmail.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Jia Xin <alexjx@gmail.com>
Signed-off-by: David O'Rourke <david.orourke@gmail.com>
Signed-off-by: Haoyu Sun <hasun@redhat.com>
Signed-off-by: LamGC <lam827@lamgc.net>
Signed-off-by: Daniël van Eeden <git@myname.nl>
Signed-off-by: Lukas Coppens <lukas.coppens@be-mobile.com>
Signed-off-by: Marcus Cobden <leth@users.noreply.github.com>
Signed-off-by: Matthias Petermann <mp@petermann-it.de>
Signed-off-by: Pablo Caderno <kaderno@gmail.com>
Signed-off-by: David Calvert <david@0xdc.me>
Signed-off-by: Maximilian Wilhelm <max@sdn.clinic>
Signed-off-by: Johannes Ziemke <github@5pi.de>
Signed-off-by: Will Bollock <wbollock@linode.com>
Signed-off-by: Jonathan Davies <jpds@protonmail.com>
Signed-off-by: Claudio Jeker <claudio@openbsd.org>
Signed-off-by: cui fliter <imcusg@gmail.com>
Signed-off-by: Saleh Sal <0xack13@gmail.com>
Signed-off-by: Nitin Shelke <nshelke@cloudera.com>
Signed-off-by: dongjiang1989 <dongjiang1989@126.com>
Signed-off-by: jbradleynh <jbradley@fastly.com>
Signed-off-by: Benoît Knecht <bknecht@protonmail.ch>
Signed-off-by: remi <remijouannet@gmail.com>
Signed-off-by: Rémi Jouannet <remijouannet@gmail.com>
Signed-off-by: abbeywoodyear <abbey.woodyear@thehutgroup.com>
Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Johannes Dilli <jd1@users.noreply.github.com>
Signed-off-by: Erica Mays <erica@emays.dev>
Signed-off-by: juzhao <juzhao@redhat.com>
Signed-off-by: Cam Cope <ccope@crusoeenergy.com>
Signed-off-by: Vitaly Zhuravlev <v-zhuravlev@users.noreply.github.com>
Signed-off-by: Michal Wasilewski <michal@mwasilewski.net>
Signed-off-by: Conall O'Brien <conall@conall.net>
Signed-off-by: Gabi Davar <grizzly.nyo@gmail.com>
Signed-off-by: L <3177243+LukeLR@users.noreply.github.com>
Signed-off-by: Daniel Swarbrick <daniel.swarbrick@gmail.com>
Signed-off-by: Oliver Geiselhardt-Herms <ogh@deepl.com>
Signed-off-by: John Kordich <jkordich@gmail.com>
Signed-off-by: Metbog <metbog@gmail.com>
Signed-off-by: ToMe25 <ToMe25@gmx.de>
Signed-off-by: yang-stressfree <68363665+yang-stressfree@users.noreply.github.com>
Signed-off-by: joey <zchengjoey@gmail.com>
Signed-off-by: nemobis <federicoleva@tiscali.it>
Signed-off-by: Ayoub Nasr <ayoub.nasr@scality.com>
Signed-off-by: fitz123 <alugovoi@ordercapital.com>
Signed-off-by: Tobias Klausmann <klausman@schwarzvogel.de>
Signed-off-by: Simon Pasquier <spasquie@redhat.com>
Signed-off-by: Gavin Lam <gavin.oss@tutamail.com>
Signed-off-by: François Rigault <frigo@amadeus.com>
Signed-off-by: João Lima <jlima@cloudflare.com>
Signed-off-by: DBS-ST-VIT <dbs-st-vit@users.noreply.github.com>
Signed-off-by: Alper Polat <gitperr@gmail.com>
Signed-off-by: tyltr <tylitianrui@126.com>
Signed-off-by: TaoGe <6657718+yowenter@u…