This repository has been archived by the owner on Sep 17, 2024. It is now read-only.

Remove the agent config file parameters for stand alone #983

Merged
merged 2 commits into from
Apr 8, 2021

Conversation

adam-stokes
Contributor

@adam-stokes adam-stokes commented Apr 1, 2021

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

What does this PR do?

This removes the volume mount of elastic-agent.yml for Fleet Server in stand-alone mode. As a side effect, we will have to think about how we want to approach fleet mode outside of that. In stand-alone mode that config file is currently not useful, and mounting it inside Docker causes "device or resource busy" errors when operations within the elastic-agent occur.

I think for the other use cases we can test integrations etc. more dynamically than by having elastic-agent.yml mounted as a volume. Thoughts?

Why is it important?

Testing standalone Fleet Server will fail if this configuration and volume exist.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have run the Unit tests for the CLI, and they are passing locally
  • I have run the End-2-End tests for the suite I'm working on, and they are passing locally
  • I have noticed new Go dependencies (run make notice in the proper directory)

Related issues

Follow-ups

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
@adam-stokes adam-stokes requested a review from a team April 1, 2021 20:47
@elasticmachine
Contributor

elasticmachine commented Apr 1, 2021

💚 Build Succeeded


Build stats

  • Build Cause: Pull request #983 updated

  • Start Time: 2021-04-08T08:57:34.804+0000

  • Duration: 25 min 40 sec

  • Commit: 2a36f11

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 138
  • Skipped: 0
  • Total: 138


💚 Flaky test report

Tests succeeded.


@cachedout
Contributor

I'm not totally sure I follow the explanation of the original problem. Perhaps it would be good to file an issue with a more detailed explanation of the problem so that it's easier to put this PR in the proper context?

@mdelapenya
Contributor

I'd say it's not OK yet to remove the file from the e2e tests. If the original use case is to use what's bundled into the image, we could move to an approach where we dockerCopy the file into the container only for PRs. Wdyt?

@adam-stokes
Contributor Author

I'd say it's not OK yet to remove the file from the e2e tests. If the original use case is to use what's bundled into the image, we could move to an approach where we dockerCopy the file into the container only for PRs. Wdyt?

I think this only affects running fleet in stand-alone mode. For all other scenarios we can use the volumes mount directive. I still don't know why Docker gives us a resource busy error in stand-alone mode when the elastic-agent.yml is modified, though. Not using the volumes and doing a docker copy instead would work around this issue; I just don't know if this is the right approach to take.

Another idea would be to have 2 separate profiles, one for fleet and another for fleet in standalone mode.
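
For reference, the docker copy workaround discussed above could look roughly like the sketch below. It only builds a `docker cp` invocation with Go's `os/exec`; the helper name `dockerCopyCmd`, the container name, and both paths are hypothetical and are not taken from this repository.

```go
package main

import (
	"fmt"
	"os/exec"
)

// dockerCopyCmd builds (but does not run) a `docker cp` invocation that copies
// a local file into a container, instead of bind-mounting it as a volume.
// The helper and its arguments are hypothetical, for illustration only.
func dockerCopyCmd(localPath, container, containerPath string) *exec.Cmd {
	// `docker cp SRC_PATH CONTAINER:DEST_PATH`
	return exec.Command("docker", "cp", localPath, container+":"+containerPath)
}

func main() {
	// Assumed container name and destination path.
	cmd := dockerCopyCmd("./elastic-agent.yml", "elastic-agent", "/usr/share/elastic-agent/elastic-agent.yml")
	fmt.Println(cmd.Args)
	// Calling cmd.Run() would perform the copy against a running container.
}
```

Because the file is copied rather than bind-mounted, the agent would then be modifying its own copy inside the container. Whether that actually avoids the resource busy error is something this sketch does not verify.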

@jalvz
Contributor

jalvz commented Apr 7, 2021

I think for the other use cases we can test integrations etc. more dynamically than by having elastic-agent.yml mounted as a volume. Thoughts?

I like that. I generally think passing arguments is more flexible than mounting config files, plus we don't need it for standalone.

@jalvz
Contributor

jalvz commented Apr 7, 2021

fyi I think you need to merge master to make tests pass.

@mdelapenya
Contributor

I've tested this PR locally and I think it's the right thing to do, as explained in #981 (comment).

I'm going to merge it. Thanks for your work here!

@mdelapenya mdelapenya merged commit 9aeb82c into master Apr 8, 2021
@adam-stokes adam-stokes deleted the remove-volume-stand-alone branch April 8, 2021 12:11
mdelapenya added a commit to mdelapenya/e2e-testing that referenced this pull request Apr 12, 2021
* master:
  Remove the agent config file parameters for stand alone (elastic#983)
  Uniquify the stand-alone step for checking agent status (elastic#993)
adam-stokes added a commit that referenced this pull request Apr 12, 2021
Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
mdelapenya added a commit to mdelapenya/e2e-testing that referenced this pull request Apr 15, 2021
* master:
  chore: add debug info for the payload (elastic#1044)
  chore: add debug traces for the webhook payload (elastic#1043)
  fix: wrong interpolation (elastic#1042)
  Update Elastic Agent to not use Kibana (elastic#1036)
  fix: apply X version for non-master branches (elastic#1037)
  fix: add NodeJS to PATH (elastic#1035)
  fix: use an agent when building kibana (elastic#1030)
  fix(jjb): use a branch that exists (elastic#1029)
  remove uninstall step (elastic#1017)
  fix: delay checking stale agent version until it's used (elastic#1016)
  fix: use same JJB than in custom kibana (elastic#1010)
  chore: simplify PR template (elastic#1011)
  feat: support passing KIBANA_VERSION (elastic#905)
  [mergify] assign the original author (elastic#1009)
  Remove the agent config file parameters for stand alone (elastic#983)
  Uniquify the stand-alone step for checking agent status (elastic#993)
adam-stokes added a commit that referenced this pull request Apr 21, 2021
* Move kibana into internals, update fleet test suite
* migrate docker-compose related code to internal layout
* move docker related code to internal layout
* move git related code to internal layout
* move common attributes into internal common file system layout
* move elasticsearch specifics into its own filesystem layout
* move installer based code to internal layout
* move shell related code to internal layout
* move sanitizer code to internal layout
* move io related code to internal layout
* move utils into internal layout
* Update package integration querying/altering
* move curl to internal layout
* move helm to internal layout
* move kubectl into internal layout
* move state internal filesystem
* cleanup config in stand-alone
* remove unused files
* Uniquify the stand-alone step for checking agent status (#993)

There were 2 steps identical in both the stand-alone and fleet test suites.
Running the stand-alone test suite was picking up the step from the fleet test
suite and trying to reference the FleetTestSuite structure which did not hold
any of the agent information (like the hostname) for the stand alone tests.

This fixes it so that the standalone test step is being referenced in the
correct test suite.

* Remove the agent config file parameters for stand alone (#983)
* Update helm/metricbeat tests to use new layout
* Fix policy endpoint update
* fix panic on helm init
* Fix step reference as this is being merged separately
* Update function call to correct standalone step
* Fix merge conflict
* update ProfileEnv query/set for KibanaVersion
* More fixes to agent endpoint security checks
* update backend feature to call out endpoint in step
* use common.TimeoutFactor in docker checkprocess state
* Update adding endpoint integration
* enable features for fleet server
* not necessary to enroll after install
* wait for filebeat/metricbeat before restarts
* clear out fts.CurrentToken during beforeScenario
* attach system integration on deploy
* enroll if rpm
* dont store fleet policy
* update kibana config for latest fleet server
* Update e2e/_suites/fleet/fleet.go
* Update e2e/_suites/fleet/fleet.go
* Update e2e/_suites/fleet/fleet.go
* Update .pre-commit-config.yaml
* Update e2e/Makefile
* rename apt -> deb for installer type
* execute docker start/stop with timeout between
* fixes fleet_server scenario
* Utilize fleet server in all tests
* Fix enrollment url for fleet server
* Query elasticsearch logs for endpoint security event changes
* Increase search result size for ES
* Fix issue with fleet server restarting continuously
* unpin kibana pr now that most major breakage is resolved
* force unenroll
* for new fleet bootstrap on re-enrollment
* Fix unenrollment
* Add timeout safeguard to elastic-agent execution

In some cases, such as attempting to re-enroll with a revoked token, the
elastic-agent will retry indefinitely. This fix adds a safeguard that prepends
the 'timeout' command to the elastic-agent command so that it will time out
after TimeoutFactor.

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
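
The timeout safeguard described at the end of the commit message above can be sketched as follows. This is a minimal illustration, not the code from the commit: the helper `withTimeout`, the `timeoutFactor` value, and the conversion of that factor into seconds are all assumptions.

```go
package main

import (
	"fmt"
	"strconv"
)

// withTimeout prepends the coreutils `timeout` command to args, so the wrapped
// process is terminated if it runs longer than the given number of seconds.
// Hypothetical helper, for illustration only.
func withTimeout(seconds int, args []string) []string {
	wrapped := make([]string, 0, len(args)+2)
	wrapped = append(wrapped, "timeout", strconv.Itoa(seconds))
	return append(wrapped, args...)
}

func main() {
	// Assumption: a TimeoutFactor-like multiplier applied to a base of 60 seconds.
	timeoutFactor := 3
	cmd := withTimeout(timeoutFactor*60, []string{"elastic-agent", "enroll"})
	fmt.Println(cmd)
}
```

Wrapping the command this way keeps a retry loop inside the agent from blocking the test run indefinitely, at the cost of also cutting off a slow but otherwise healthy enrollment.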
mergify bot pushed a commit that referenced this pull request Apr 21, 2021
(cherry picked from commit 5f59670)

# Conflicts:
#	e2e/Makefile
#	e2e/_suites/fleet/ingest_manager_test.go
#	e2e/_suites/fleet/stand-alone.go
#	e2e/_suites/fleet/world.go
adam-stokes added a commit that referenced this pull request Apr 21, 2021
(cherry picked from commit 5f59670)

# Conflicts:
#	e2e/Makefile
#	e2e/_suites/fleet/ingest_manager_test.go
#	e2e/_suites/fleet/stand-alone.go
#	e2e/_suites/fleet/world.go

Co-authored-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
adam-stokes added a commit that referenced this pull request Apr 21, 2021
Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
mdelapenya added a commit to mdelapenya/e2e-testing that referenced this pull request Apr 22, 2021
Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
@mdelapenya mdelapenya mentioned this pull request Apr 22, 2021
9 tasks
mdelapenya added a commit that referenced this pull request Apr 22, 2021
* cli: enable loading default profiles turnkey (#943)

* cli: enable loading default profiles turnkey

Fixes #933
* update NOTICE
* Fix additional lint issues in ingest_manager_test
* Cleanup comment and trace log in GetComposeFile
* Provide better trace feedback if missing docker-compose
* Update cli/config/config.go
* chore: add back traces when extracting the files from the box (#946)
* fix: use a more comprehensive initialisation method for configs
   As Go's init() method is not deterministic, I found that the logger init was
   not called at the right time. With this change we ensure that the Init is:
   1) called first
   2) exiting if the config was already populated
* chore: add back traces when extracting the files from the box

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>

* Add basic check on correct service is defined for profile runs (#957)

Fixes #944

This adds a length check on the string split for verifying that the
<service/image name>:<tag> is defined when adding additional services to a
profile deployment.

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

* Update NOTICE (#969)

Adds additional overrides to pulling in the proper licenses

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

* Build binaries via goreleaser (#977)

This handles building for all supported architectures including running packr
for embedding the binary files.

This allows us to easily extend our release process for tagging official cli
releases, building in various package formats and publishing to different
package registries

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

* chore: simplify release process on Jenkins (#980)

* chore: archive releases in Jenkins UI

* chore: simplify release process on Jenkins

* chore: remove garbage

* chore: support retrying fetching the goreleaser script

It will also retry in the case the release command fails

* chore: set GITHUB_TOKEN

* chore: ensure workspace is clean in the worker

* chore: add release information for goreleaser

* Remove the agent config file parameters for stand alone (#983)

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>

* fix: run unit tests after refactor (#1067)

* chore: remove unused files after refactor

* chore: run unit tests with new layout

* fix: run unit tests on CI

* chore: include unit tests for the e2e dir

* fix: move unit tests resources for installer tests

* fix: move more test resources for unit tests

* fix: abstract path calculation from OS

Co-authored-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

device or resource busy when attempting to load elastic-agent config in docker
5 participants