cgroups stats not available on Debian 11 #76812

Closed
davidkyle opened this issue Aug 23, 2021 · 9 comments · Fixed by #76883 or #77128
Labels
>bug :Core/Infra/Core Core issues without another label Team:Core/Infra Meta label for core/infra team

Comments

@davidkyle
Member

Build scan:
https://gradle-enterprise.elastic.co/s/3hshdrpwm6ona/tests/:qa:os:destructiveDistroTest.default-docker/org.elasticsearch.packaging.test.DockerTests/test140CgroupOsStatsAreAvailable

Reproduction line:
null

Applicable branches:
master

Reproduces locally?:
Didn't try

Failure history:
https://gradle-enterprise.elastic.co/scans/tests?tests.container=org.elasticsearch.packaging.test.DockerTests&tests.test=test140CgroupOsStatsAreAvailable

Failure excerpt:

java.lang.AssertionError: Couldn't find /nodes/{nodeId}/os/cgroup in API response

  at __randomizedtesting.SeedInfo.seed([64ACE3C2900202C5:3161EBF2641C65AB]:0)
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.assertTrue(Assert.java:41)
  at org.junit.Assert.assertFalse(Assert.java:64)
  at org.elasticsearch.packaging.test.DockerTests.test140CgroupOsStatsAreAvailable(DockerTests.java:860)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:-2)
  at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:566)
  at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1758)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
  at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
  at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
  at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:375)
  at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:831)
  at java.lang.Thread.run(Thread.java:834)

@davidkyle davidkyle added :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts >test-failure Triaged test failures from CI labels Aug 23, 2021
@elasticmachine elasticmachine added the Team:Delivery Meta label for Delivery team label Aug 23, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-delivery (Team:Delivery)

@pugnascotia
Contributor

Well this is a surprising failure.

@pugnascotia pugnascotia self-assigned this Aug 23, 2021
@davidkyle
Member Author

> Well this is a surprising failure.

A pleasant surprise?

@pugnascotia
Contributor

All the recent failures are on Debian 11 🤔

@pugnascotia
Contributor

I wonder if this is related to this change:

> In bullseye, systemd defaults to using control groups v2 (cgroupv2), which provides a unified resource-control hierarchy. Kernel commandline parameters are available to re-enable the legacy cgroups if necessary; see the notes for OpenStack in Section 5.1.9, “OpenStack and cgroups v1” section.

@pugnascotia
Contributor

I muted the test on master and 7.x

I also spun up a debian-11 instance on GCP and there's no cgroups information in the payload for /_nodes/stats/os.
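For anyone reproducing this by hand, the check the failing test performs (looking for `/nodes/{nodeId}/os/cgroup` in the response) can be sketched roughly as below. This is only an illustration, not the code in `DockerTests`: it assumes the `/_nodes/stats/os` response has already been parsed into nested maps by whatever JSON library is at hand, and the class and method names are made up for this example.

```java
import java.util.Map;

final class CgroupStatsCheck {
    /**
     * Returns true only if every node entry in an already-parsed
     * /_nodes/stats/os response carries an "os.cgroup" object.
     * Hypothetical helper, not part of Elasticsearch.
     */
    @SuppressWarnings("unchecked")
    static boolean everyNodeHasCgroup(Map<String, Object> response) {
        Object nodes = response.get("nodes");
        if (!(nodes instanceof Map) || ((Map<?, ?>) nodes).isEmpty()) {
            return false; // no nodes section at all
        }
        for (Object node : ((Map<String, Object>) nodes).values()) {
            if (!(node instanceof Map)) {
                return false;
            }
            Object os = ((Map<?, ?>) node).get("os");
            // On the Debian 11 hosts described above, "os" is present but "cgroup" is missing.
            if (!(os instanceof Map) || !(((Map<?, ?>) os).get("cgroup") instanceof Map)) {
                return false;
            }
        }
        return true;
    }
}
```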

@pugnascotia pugnascotia added :Core/Infra/Core Core issues without another label and removed :Delivery/Packaging RPM and deb packaging, tar and zip archives, shell and batch scripts Team:Delivery Meta label for Delivery team labels Aug 23, 2021
@elasticmachine elasticmachine added the Team:Core/Infra Meta label for core/infra team label Aug 23, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@pugnascotia pugnascotia changed the title from "[CI] DockerTests test140CgroupOsStatsAreAvailable failing" to "cgroups stats not available on Debian 11" Aug 23, 2021
@pugnascotia pugnascotia removed the >test-failure Triaged test failures from CI label Aug 23, 2021
@pugnascotia
Contributor

I tested the 7.14.0 and 8.0.0-alpha1 Docker images, BTW.

@pugnascotia
Contributor

Looks like all the cgroup info is under /sys/fs/cgroup/ now.
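A quick way to tell which layout a host is using: on a cgroups v2 (unified) hierarchy the root of the mount contains a `cgroup.controllers` file, which a pure v1 layout does not have there. The sketch below relies on that heuristic. It is not necessarily how `OsProbe` decides, and the class and method names are invented for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;

final class CgroupVersionCheck {
    /**
     * Heuristic: a cgroups v2 unified hierarchy exposes cgroup.controllers
     * at the root of its mount point. Hypothetical helper for illustration.
     */
    static boolean isCgroupV2(Path cgroupRoot) {
        return Files.isRegularFile(cgroupRoot.resolve("cgroup.controllers"));
    }
}
```

On a real host this would be called as `CgroupVersionCheck.isCgroupV2(Path.of("/sys/fs/cgroup"))`.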

pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Aug 24, 2021
Closes elastic#76812.

`OsProbe` was only capable of handling cgroup data in the v1 format.
However, Debian 11 uses cgroups v2 by default, and Elasticsearch isn't
capable of reporting any cgroup information. Therefore, add support for
the v2 layout.
pugnascotia added a commit that referenced this issue Sep 1, 2021
pugnascotia added a commit that referenced this issue Sep 1, 2021
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Sep 1, 2021
pugnascotia added a commit that referenced this issue Sep 3, 2021
Closes #76812. Closes #77126.

`OsProbe` was only capable of handling cgroup data in the v1 format.
However, Debian 11 uses cgroups v2 by default, and Elasticsearch isn't
capable of reporting any cgroup information. Therefore, add support for
the v2 layout.

Note that we have to open access to all of /sys/fs/cgroup because with
cgroups v2, the files we need are in an unpredictable location.
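
To illustrate why the location is not fixed: with cgroups v2, `/proc/self/cgroup` contains a single entry of the form `0::/<path>`, and the process's control files live under that path inside the `/sys/fs/cgroup` mount. The following is a rough sketch of resolving that directory. It is not the code from the commit, and the class and method names are assumptions made for this example.

```java
import java.nio.file.Path;
import java.util.List;
import java.util.Optional;

final class CgroupV2Path {
    /**
     * Finds the v2 entry ("0::/some/path") among the lines of /proc/self/cgroup
     * and resolves it against the cgroup mount point. Hypothetical helper.
     */
    static Optional<Path> resolve(List<String> procSelfCgroupLines, Path cgroupRoot) {
        for (String line : procSelfCgroupLines) {
            // Format is hierarchy-ID:controller-list:path; v2 uses ID 0 and an empty list.
            String[] fields = line.split(":", 3);
            if (fields.length == 3 && fields[0].equals("0") && fields[1].isEmpty()) {
                String relative = fields[2].startsWith("/") ? fields[2].substring(1) : fields[2];
                return Optional.of(cgroupRoot.resolve(relative).normalize());
            }
        }
        return Optional.empty(); // only v1 entries were present
    }
}
```

Because the path component depends on how the process was started (systemd slice, container runtime, and so on), a fixed list of readable files is not enough, which matches the note above about opening access to all of `/sys/fs/cgroup`.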
pugnascotia added a commit to pugnascotia/elasticsearch that referenced this issue Sep 6, 2021
elasticsearchmachine pushed a commit that referenced this issue Sep 6, 2021
* Handle cgroups v2 in `OsProbe` (#77128)

Closes #76812. Closes #77126.

`OsProbe` was only capable of handling cgroup data in the v1 format.
However, Debian 11 uses cgroups v2 by default, and Elasticsearch isn't
capable of reporting any cgroup information. Therefore, add support for
the v2 layout.

Note that we have to open access to all of /sys/fs/cgroup because with
cgroups v2, the files we need are in an unpredictable location.

* Handle a max memory value of 'max' (#77289)

* Handle a max memory value of 'max'

* Update docs/changelog/77289.yaml

* Delete 77289.yaml

* Fixes to backport

* Fix
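
For context on the `'max'` follow-up: in cgroups v2 the `memory.max` file holds either a byte count or the literal string `max` when no limit is set, so a plain numeric parse fails on unlimited hosts. A minimal sketch of handling both cases is below. How Elasticsearch actually represents the unlimited case is not shown in this thread, so the `OptionalLong` return type and the names here are assumptions for illustration only.

```java
import java.util.OptionalLong;

final class CgroupMemoryLimit {
    /**
     * Parses the contents of a cgroups v2 memory.max file.
     * Returns empty when the value is the literal "max" (no limit).
     * Hypothetical helper, not the Elasticsearch implementation.
     */
    static OptionalLong parse(String rawFileContents) {
        String value = rawFileContents.trim();
        if (value.equals("max")) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(Long.parseLong(value));
    }
}
```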
3 participants