Failing to detect SSDs in copyDir should not be a fatal error. #3653

cmacknz · 2023-10-24T18:00:06Z

The migration of the agent CI system to buildkite has resulted in a new pipeline with new Mac workers. On those workers there is a new test failure: https://buildkite.com/elastic/elastic-agent/builds/4344#018b6184-da97-4130-8dcf-60f543c6c94a/103-588

2023-10-24 11:54:42 UTC | === FAIL: internal/pkg/agent/application/upgrade Test_CopyFile/Existing_but_open,_ignore_errors (0.13s)
-- | --
  | 2023-10-24 11:54:42 UTC | upgrade_test.go:102:
  | 2023-10-24 11:54:42 UTC | Error Trace:	/Users/admin/builds/bk-agent-prod-orka-1698148180680922232/elastic/elastic-agent/internal/pkg/agent/application/upgrade/upgrade_test.go:102
  | 2023-10-24 11:54:42 UTC | Error:      	Not equal:
  | 2023-10-24 11:54:42 UTC | expected: false
  | 2023-10-24 11:54:42 UTC | actual  : true
  | 2023-10-24 11:54:42 UTC | Test:       	Test_CopyFile/Existing_but_open,_ignore_errors
  | 2023-10-24 11:54:42 UTC | Messages:   	ghw.Block() returned error: ioreg unmarshal resulted in 2 I/O device tree nodes

The failing ghw.Block function comes from https://github.com/jaypipes/ghw, whose description states:

ghw is a Go library providing hardware inspection and discovery for Linux and Windows. There currently exists partial support for MacOSX.

My guess is that this is a limitation of that library: it fails to handle the hardware configuration of the Mac workers here (probably some network-attached storage is throwing it off).

Regardless of the root cause, this isn't actually a fatal error, since we can proceed with the default concurrency level of 1. Especially since this is part of the upgrade process, we should fall back to the default, attempt the copy, and let the copy itself fail if that is what is going to happen.
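The fallback described above could look roughly like the following sketch. It is not the code from this PR: `ssdDetector`, `failingDetector`, `copyConcurrency`, and the value of `ssdCopyConcurrency` are hypothetical names and numbers chosen for illustration, and the detector stands in for whatever check is built on top of ghw.Block. Only the default level of 1 comes from the description.

```go
package main

import "errors"

// defaultCopyConcurrency is the fallback level named in the description.
const defaultCopyConcurrency = 1

// ssdCopyConcurrency is an assumed value for illustration only.
const ssdCopyConcurrency = 4

// ssdDetector is a hypothetical stand-in for a ghw.Block-based check.
// It reports whether the host storage is SSD-backed.
type ssdDetector func() (bool, error)

// failingDetector simulates the failure seen on the Mac workers.
func failingDetector() (bool, error) {
	return false, errors.New("ioreg unmarshal resulted in 2 I/O device tree nodes")
}

// copyConcurrency picks a concurrency level for the copy. A detection
// error is logged and then ignored: the copy proceeds with the default
// level, and any real problem surfaces when the copy itself fails.
func copyConcurrency(detect ssdDetector, logf func(format string, args ...interface{})) int {
	hasSSD, err := detect()
	if err != nil {
		logf("block device detection failed, using concurrency %d: %v", defaultCopyConcurrency, err)
		return defaultCopyConcurrency
	}
	if hasSSD {
		return ssdCopyConcurrency
	}
	return defaultCopyConcurrency
}
```

A caller would pass its own logger, for example `copyConcurrency(failingDetector, log.Printf)`, and continue with the returned value in either case.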

elasticmachine · 2023-10-24T18:00:09Z

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

cmacknz · 2023-10-24T18:02:35Z

Not backporting since the original change (#3212) isn't in 8.11.

mergify · 2023-10-24T18:03:00Z

This pull request does not have a backport label. Could you fix it @cmacknz? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-v./d./d./d is the label to automatically backport to the 8./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

elasticmachine · 2023-10-24T21:02:42Z

💚 Build Succeeded

Build stats

Start Time: 2023-10-26T17:00:14.076+0000
Duration: 29 min 31 sec

Test stats 🧪

Test	Results
Failed	0
Passed	6553
Skipped	59
Total	6612

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.
/package : Generate the packages.
run integration tests : Run the Elastic Agent Integration tests.
run end-to-end tests : Generate the packages and run the E2E Tests.
run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

elasticmachine · 2023-10-24T21:02:51Z

🌐 Coverage report

Name	Metrics % (`covered/total`)	Diff
Packages	98.824% (`84/85`)	👍
Files	66.885% (`204/305`)	👍
Classes	65.95% (`368/558`)	👍
Methods	53.016% (`1160/2188`)	👎 -0.024
Lines	39.363% (`13665/34715`)	👍 0.004
Conditionals	100.0% (`0/0`)	💚

cmacknz · 2023-10-25T19:25:22Z

buildkite test it

cmacknz · 2023-10-25T21:12:05Z

--
  | 2023-10-25 20:54:04 UTC | /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/parsers/treeparser.rb:96:in `rescue in parse': #<RuntimeError: Illegal character "\\u0000" in raw string "\\u0000\\u0000\\u0000\\u0000 [long run of \\u0000 omitted]
...
--
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/text.rb:136:in `each'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/text.rb:136:in `check'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/text.rb:122:in `initialize'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/parsers/treeparser.rb:47:in `new'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/parsers/treeparser.rb:47:in `parse'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/document.rb:448:in `build'
  | 2023-10-25 20:54:05 UTC | from /usr/local/lib/ruby/gems/3.1.0/gems/rexml-3.2.5/lib/rexml/document.rb:101:in `initialize'
  | 2023-10-25 20:54:05 UTC | from /src/bin/annotate:59:in `new'
  | 2023-10-25 20:54:05 UTC | from /src/bin/annotate:59:in `block in <main>'
  | 2023-10-25 20:54:05 UTC | from /src/bin/annotate:53:in `each'
  | 2023-10-25 20:54:05 UTC | from /src/bin/annotate:53:in `<main>'
  | 2023-10-25 20:54:05 UTC | 💥 Error when processing JUnit tests

That's a new one

cmacknz · 2023-10-25T21:12:35Z

buildkite test it

pchila

LGTM

pchila · 2023-10-26T12:34:58Z

/test

cmacknz · 2023-10-26T13:57:25Z

(linux-amd64-ubuntu-2204) Failed for instance linux-amd64-ubuntu-2204 (@ 34.16.59.95): ogc-linux-amd64-ubuntu-2204-00df unable to continue because stack never became ready: failed to check for cloud 8.12.0-SNAPSHOT to be ready: context deadline exceeded

cmacknz · 2023-10-26T13:57:31Z

buildkite test it

mergify · 2023-10-26T15:09:47Z

This pull request is now in conflicts. Could you fix it? 🙏
To fixup this pull request, you can check out it locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b block-device-detection-nonfatal upstream/block-device-detection-nonfatal
git merge upstream/main
git push upstream block-device-detection-nonfatal

cmacknz · 2023-10-26T16:06:07Z

This also affects the install command which is now fixed.

Document that errors are not fatal.

cmacknz · 2023-10-26T17:01:04Z

I moved all uses of ghw.Block into one place so that this problem is less likely to return.
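As a rough illustration of what centralizing the detection can look like, the sketch below keeps hardware inspection behind one function and documents on that function that its error is informational. This is an assumption-laden sketch, not the PR's code: `diskInfo`, `listDisks`, and `hasAllSSDs` are hypothetical names, and `listDisks` is a placeholder for the single call site that would wrap ghw.Block.

```go
package main

import (
	"errors"
	"fmt"
)

// diskInfo is a hypothetical, simplified view of one block device.
type diskInfo struct {
	Name string
	SSD  bool
}

// listDisks is the only place that inspects hardware in this sketch.
// In real code it would wrap ghw.Block; here it is a placeholder that
// always fails, and a variable so tests can replace it.
var listDisks = func() ([]diskInfo, error) {
	return nil, errors.New("block device inspection not available in this sketch")
}

// hasAllSSDs reports whether every detected disk is an SSD.
//
// Errors from this function are not fatal. They are returned only so
// the caller can log them; on error the result is false and callers
// should continue with their default behavior.
func hasAllSSDs() (bool, error) {
	disks, err := listDisks()
	if err != nil {
		return false, fmt.Errorf("detecting block devices: %w", err)
	}
	if len(disks) == 0 {
		// Nothing detected: be conservative and assume no SSDs.
		return false, nil
	}
	for _, d := range disks {
		if !d.SSD {
			return false, nil
		}
	}
	return true, nil
}
```

With a single helper like this, both the install and upgrade paths can share one documented contract instead of each deciding separately how to treat a detection failure.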

elastic-sonarqube · 2023-10-26T17:11:52Z

SonarQube Quality Gate

20.0% Coverage on New Code (is less than 40%)

See analysis details on SonarQube

cmacknz · 2023-10-27T14:42:55Z

I am going to force merge this for the same reason as #3623 (comment): the install code is covered by integration tests.

Failing to detect SSDs should not be a fatal error.

476c876

cmacknz added the Team:Elastic-Agent and backport-v8.11.0 labels Oct 24, 2023

cmacknz self-assigned this Oct 24, 2023

cmacknz requested a review from a team as a code owner October 24, 2023 18:00

cmacknz requested review from blakerouse and pchila October 24, 2023 18:00

cmacknz added the skip-changelog label and removed the backport-v8.11.0 label Oct 24, 2023

mergify bot added the backport-skip label Oct 24, 2023

cmacknz requested review from ycombinator and AndersonQ October 24, 2023 18:03

pchila approved these changes Oct 26, 2023

ghw.Block errors should not be fatal during install.

43439b4

cmacknz added 3 commits October 26, 2023 11:56

Log block HW errors when they occur.

fc800be

Improve error messages.

5c9d6a6

Merge branch 'main' into block-device-detection-nonfatal

ef15d85

Centralize uses of ghw.Block to prevent misuse.

599e267

Document that errors are not fatal.

cmacknz mentioned this pull request Oct 27, 2023

[CI] Pull requests migration to buildkite #3573

Merged

cmacknz merged commit 3d8bdf4 into elastic:main Oct 30, 2023
7 of 8 checks passed

cmacknz deleted the block-device-detection-nonfatal branch October 30, 2023 13:46
