Remove the agent config file parameters for stand alone #983

adam-stokes · 2021-04-01T20:47:37Z

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

What does this PR do?

This removes the volume mount of elastic-agent.yml for the stand-alone Fleet Server scenario. The side effect is that we will have to think about how we want to approach fleet mode outside of that. Currently, in stand-alone mode that config file is not useful, and mounting it inside Docker causes "device or resource busy" errors when operations within the elastic-agent modify it.

I think for the other use cases we can test integrations etc. more dynamically than by having elastic-agent.yml mounted as a volume. Thoughts?

Why is it important?

Testing standalone fleet server will fail if this configuration and volume exist.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding changes to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have run the Unit tests for the CLI, and they are passing locally
I have run the End-2-End tests for the suite I'm working on, and they are passing locally
I have noticed new Go dependencies (run make notice in the proper directory)

Related issues

Closes #981 (device or resource busy when attempting to load elastic-agent config in docker)

Follow-ups

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

elasticmachine · 2021-04-01T21:15:09Z

💚 Build Succeeded


Build stats

Build Cause: Pull request #983 updated
Start Time: 2021-04-08T08:57:34.804+0000
Duration: 25 min 40 sec
Commit: 2a36f11

Test stats 🧪

Test	Results
Failed	0
Passed	138
Skipped	0
Total	138


💚 Flaky test report

Tests succeeded.


cachedout · 2021-04-02T09:51:17Z

I'm not totally sure I follow the explanation of the original problem. Perhaps it would be good to file an issue with a more detailed explanation of the problem so that it's easier to put this PR in the proper context?

e2e/_suites/fleet/stand-alone.go

mdelapenya · 2021-04-05T11:47:24Z

I'd say it's not ok yet to remove the file from the e2e tests. If the original use case is to use what's bundled into the image, we could move to an approach where we dockerCopy the file into the container only for PRs. Wdyt?
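A minimal sketch of the copy-instead-of-mount idea, assuming the plain `docker cp SRC_PATH CONTAINER:DEST_PATH` CLI is available on the test host. The helper names, the container name, and the destination path are illustrative assumptions, not identifiers from the e2e suite. A copied file should be a regular file in the container's filesystem rather than a mount point, which is presumably why this avoids the busy error.

```go
package main

import (
	"fmt"
	"os/exec"
)

// dockerCopyArgs builds the arguments for `docker cp SRC_PATH CONTAINER:DEST_PATH`.
// The function name is hypothetical; it is not part of the e2e suite.
func dockerCopyArgs(srcPath, containerName, destPath string) []string {
	return []string{"cp", srcPath, fmt.Sprintf("%s:%s", containerName, destPath)}
}

// copyConfigIntoContainer runs `docker cp` and includes the CLI output in the
// returned error so a failed copy is easy to diagnose.
func copyConfigIntoContainer(srcPath, containerName, destPath string) error {
	cmd := exec.Command("docker", dockerCopyArgs(srcPath, containerName, destPath)...)
	if out, err := cmd.CombinedOutput(); err != nil {
		return fmt.Errorf("docker cp failed: %w: %s", err, out)
	}
	return nil
}

func main() {
	// Container name and destination path are assumptions for illustration only.
	args := dockerCopyArgs("elastic-agent.yml", "elastic-agent", "/usr/share/elastic-agent/elastic-agent.yml")
	fmt.Println("docker", args)
}
```

Keeping the argument construction separate from the execution makes the command testable without a running Docker daemon.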

adam-stokes · 2021-04-06T01:47:41Z

> I'd say it's not ok yet to remove the file from the e2e tests. If the original use case is to use what's bundled into the image, we could move to an approach where we dockerCopy the file into the container only for PRs. Wdyt?

I think this only affects running fleet in stand-alone mode. For all other scenarios we can use the volumes mount directive. I still don't know why docker gives us a resource busy error in stand-alone mode when the elastic-agent.yml is modified, though. Not using the volumes and doing a docker copy instead would work around this issue; I just don't know if this is the right approach to take.

Another idea would be to have 2 separate profiles, one for fleet and another for fleet in standalone mode.
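One way the two-profile idea could look, as a rough sketch only. The type, the profile names `fleet` and `fleet-standalone`, and the in-container config path are assumptions made for illustration; the project may model profiles quite differently.

```go
package main

import "fmt"

// agentService is a hypothetical, simplified view of what a profile would
// declare for the elastic-agent service.
type agentService struct {
	Profile string
	Volumes []string
}

// agentServiceFor returns a service definition without any volume in
// stand-alone mode, and with the config file mounted otherwise.
func agentServiceFor(standalone bool, hostConfigPath string) agentService {
	if standalone {
		// No elastic-agent.yml mount: this is the case that hit the busy error.
		return agentService{Profile: "fleet-standalone"}
	}
	return agentService{
		Profile: "fleet",
		Volumes: []string{hostConfigPath + ":/usr/share/elastic-agent/elastic-agent.yml"},
	}
}

func main() {
	fmt.Printf("%+v\n", agentServiceFor(true, "/tmp/elastic-agent.yml"))
	fmt.Printf("%+v\n", agentServiceFor(false, "/tmp/elastic-agent.yml"))
}
```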

jalvz · 2021-04-07T11:50:40Z

> I think for the other use cases we can test integrations etc. more dynamically than by having elastic-agent.yml mounted as a volume. Thoughts?

I like that. I generally think passing arguments is more flexible than mounting config files, and we don't need it for standalone.

jalvz · 2021-04-07T11:51:17Z

FYI, I think you need to merge master to make the tests pass.

mdelapenya · 2021-04-08T08:56:15Z

I've tested this PR locally and I think it's the right thing to do, as explained in #981 (comment).

I'm going to merge it. Thanks for your work here!

* master:
  Remove the agent config file parameters for stand alone (elastic#983)
  Uniquify the stand-alone step for checking agent status (elastic#993)

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com> Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com> Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

* master:
  chore: add debug info for the payload (elastic#1044)
  chore: add debug traces for the webhook payload (elastic#1043)
  fix: wrong interpolation (elastic#1042)
  Update Elastic Agent to not use Kibana (elastic#1036)
  fix: apply X version for non-master branches (elastic#1037)
  fix: add NodeJS to PATH (elastic#1035)
  fix: use an agent when building kibana (elastic#1030)
  fix(jjb): use a branch that exists (elastic#1029)
  remove uninstall step (elastic#1017)
  fix: delay checking stale agent version until it's used (elastic#1016)
  fix: use same JJB than in custom kibana (elastic#1010)
  chore: simplify PR template (elastic#1011)
  feat: support passing KIBANA_VERSION (elastic#905)
  [mergify] assign the original author (elastic#1009)
  Remove the agent config file parameters for stand alone (elastic#983)
  Uniquify the stand-alone step for checking agent status (elastic#993)

* Move kibana into internals, update fleet test suite
* migrate docker-compose related code to internal layout
* move docker related code to internal layout
* move git related code to internal layout
* move common attributes into internal common file system layout
* move elasticsearch specifics into its own filesystem layout
* move installer based code to internal layout
* move shell related code to internal layout
* move sanitizer code to internal layout
* move io related code to internal layout
* move utils into internal layout
* Update package integration querying/altering
* move curl to internal layout
* move helm to internal layout
* move kubectl into internal layout
* move state internal filesystem
* cleanup config in stand-alone
* remove unused files
* Uniquify the stand-alone step for checking agent status (#993)
  There were 2 steps identical in both the stand-alone and fleet test suites. Running the stand-alone test suite was picking up the step from the fleet test suite and trying to reference the FleetTestSuite structure which did not hold any of the agent information (like the hostname) for the stand alone tests. This fixes it so that the standalone test step is being referenced in the correct test suite.
* Remove the agent config file parameters for stand alone (#983)
* Update helm/metricbeat tests to use new layout
* Fix policy endpoint update
* fix panic on helm init
* Fix step reference as this being merged seperately
* Update function call to correct standalone step
* Fix merge conflict
* update ProfileEnv query/set for KibanaVersion
* More fixes to agent endpoint security checks
* update backend feature to call out endpoint in step
* use common.TimeoutFactor in docker checkprocess state
* Update adding endpoint integration
* enable features for fleet server
* not necessary to enroll after install
* wait for filebeat/metricbeat before restarts
* clear out fts.CurrentToken during beforeScenario
* attach system integration on deploy
* enroll if rpm
* dont store fleet policy
* update kibana config for latest fleet server
* Update e2e/_suites/fleet/fleet.go
* Update e2e/_suites/fleet/fleet.go
* Update e2e/_suites/fleet/fleet.go
* Update .pre-commit-config.yaml
* Update e2e/Makefile
* rename apt -> deb for installer type
* execute docker start/stop with timeout between
* fixes fleet_server scenario
* Utilize fleet server in all tests
* Fix enrollment url for fleet server
* Query elasticsearch logs for endpoint security event changes
* Increase search result size for ES
* Fix issue with fleet server restarting continuously
* unpin kibana pr now that most major breakage is resolved
* force unenroll
* for new fleet bootstrap on re-enrollment
* Fix unenrollment
* Add timeout safeguard to elastic-agent execution
  In some cases such as attempting to re-enroll with a revoked token, the elastic-agent will retry indefinitely. This fix adds a safeguard utilizing 'timeout' command prepended to the elastic-agent command so that it will timeout after TimeoutFactor

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
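The timeout safeguard mentioned in the commit message above could be sketched roughly as follows, assuming the coreutils `timeout DURATION COMMAND [ARG]...` utility exists inside the container. The helper name and the way the limit is passed in are assumptions; the actual change derives its limit from TimeoutFactor, and that conversion is not shown here.

```go
package main

import (
	"fmt"
	"strconv"
)

// withTimeout prepends the coreutils `timeout` command to cmd so that the
// wrapped process is stopped after the given number of seconds.
// The name and signature are hypothetical, not taken from the e2e suite.
func withTimeout(seconds int, cmd []string) []string {
	// `timeout 0` disables the limit in coreutils, so enforce at least 1 second.
	if seconds < 1 {
		seconds = 1
	}
	wrapped := []string{"timeout", strconv.Itoa(seconds)}
	return append(wrapped, cmd...)
}

func main() {
	// Flags for the enroll subcommand are deliberately omitted in this sketch.
	fmt.Println(withTimeout(180, []string{"elastic-agent", "enroll"}))
}
```

Wrapping the command rather than adding a timer in the test code means a command that retries forever is ended inside the container itself.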


* cli: enable loading default profiles turnkey (#943)
* cli: enable loading default profiles turnkey
  Fixes #933
* update NOTICE
* Fix additional lint issues in ingest_manager_test
* Cleanup comment and trace log in GetComposeFile
* Provide better trace feedback if missing docker-compose
* Update cli/config/config.go
* chore: add back traces when extracting the files from the box (#946)
* fix: use a more comprehensive initialisation method for configs
  As go init() method is not deterministic, I found that the logger init was not called at the right time. With change we ensure that the Init is: 1) called first 2) existing it the config was already populated
* chore: add back traces when extracting the files from the box
  Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
  Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
* Add basic check on correct service is defined for profile runs (#957)
  Fixes #944
  This adds a length check on the string split for verifying that the <service/image name>:<tag> is defined when adding additional services to a profile deployment.
  Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
* Update NOTICE (#969)
  Adds additional overrides to pulling in the proper licenses
  Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
* Build binaries via goreleaser (#977)
  This handles building for all supported architectures including running packr for embedding the binary files. This allows us to easily extend our release process for tagging official cli releases, building in various package formats and publishing to different package registries
  Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
* chore: simplify release process on Jenkins (#980)
* chore: archive releases in Jenkins UI
* chore: simplify release process on Jenkins
* chore: remove garbage
* chore: support retrying fetching the goreleaser script
  It will also retry in the case the release command fails
* chore: set GITHUB_TOKEN
* chore: ensure workspace is clean in the worker
* chore: add release information for goreleaser
* Remove the agent config file parameters for stand alone (#983)
  Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
  Co-authored-by: Manuel de la Peña <mdelapenya@gmail.com>
* fix: run unit tests after refactor (#1067)
* chore: remove unused files after refactor
* chore: run unit tests with new layout
* fix: run unit tests on CI
* chore: include unit tests for the e2e dir
* fix: move unit tests resources for installer tests
* fix: move more test resources for unit tests
* fix: abstract path calculation from OS

Co-authored-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>
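The length check on the string split described for #957 above might look something like this sketch. The function name and error text are assumptions, and the real check in the CLI may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// parseServiceTag splits "<service>:<tag>" and rejects input that is missing
// either part. Hypothetical helper; not the CLI's actual function.
func parseServiceTag(s string) (service, tag string, err error) {
	parts := strings.Split(s, ":")
	// The length check guards against indexing parts[1] when no tag was given.
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("expected <service>:<tag>, got %q", s)
	}
	return parts[0], parts[1], nil
}

func main() {
	for _, in := range []string{"metricbeat:7.12.0", "metricbeat"} {
		service, tag, err := parseServiceTag(in)
		fmt.Println(service, tag, err)
	}
}
```

A split on every colon is the simplest form of the check; image references that include a registry port would contain more than one colon and would need different handling.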

Remove the agent config file parameters for stand alone

b39bd1a

Signed-off-by: Adam Stokes <51892+adam-stokes@users.noreply.github.com>

adam-stokes requested a review from a team April 1, 2021 20:47

mdelapenya reviewed Apr 5, 2021

e2e/_suites/fleet/stand-alone.go (resolved)

Merge branch 'master' into remove-volume-stand-alone

2a36f11

mdelapenya mentioned this pull request Apr 8, 2021

device or resource busy when attempting to load elastic-agent config in docker #981

Closed

mdelapenya merged commit 9aeb82c into master Apr 8, 2021

adam-stokes deleted the remove-volume-stand-alone branch April 8, 2021 12:11

mdelapenya mentioned this pull request Apr 22, 2021

chore: update backports to 7.x #1077

Merged

9 tasks
