feat: Harden collector shutdown while updating #597
// Stop the service before backing up the install directory;
// We want to stop as early as possible so that we don't hit the collector's timeout
// while it waits to be shutdown.
service := service.NewService(logger, installDir)
I'm not going to complain too much, but maybe this is something to think about when we give this whole thing a once-over: it would have been nice if we could have just had a method to call on the installer to do the stopping, instead of copying "a bunch" (I know it's really not that much) of logic from it to here.
Yeah, good callout. This should definitely be part of the larger refactor.
* change service timeouts
* update non-windows with new timeout
* fix windows test
* stop the service before rollback
* fix install tests
* feat: Add windows service definition to archive (#515)

* feat: Create Updater artifact (#529)
* add new binary to everything but windows
* add to windows msi
* add version flag to updater
* build using combined target
* fix manual build
* add license header
* break updater into separate module
* add new module to dependabot
* copy version pkg to new updater internal pkg
  To not have a dependency on the root module
* Workaround for securego/gosec#501

* feat: Add tarball download + unarchiving to updater (#538)
* Add download and content-hash verification to updater
* add a couple more tests for some edge cases
* lint
* gosec errors
* fix defer f.close properly
* fix tests on windows
* more windows specific testing
* fix final test failure on windows
* more line-ending test fixes

* feat: Added OpAMP PackageStatuses functionality & basic response to PackagesAvailable (#550)
* Added OpAMP PackageStatuses functionality & basic response to PackagesAvailable
* Add new data model for marshal/unmarshaling OpAMP package statuses.

* feat: Add ability to install unpacked artifacts in updater (#562)
* start artifact install logic
* fix uninstall service step
* add tests for windows service manager
* remove kardiano/service dependency
* check filepath with spaces
* more tests, hook up to main
* naming
* add licenses
* gosec fixes
* linux gosec + some lint issues
* linter
* fix formatting of windows service test
* actually fix formatting
* guard linux/win service tests behind tag
* run tests as sudo on linux
* fix inverted conditional
* split updater integration tests into separate target
* refactor package for better encapsulation
* update darwin service to load/unload for start/stop
* fix installDir for windows after rename
* test replaceInstallDir
* add license to service_test.go
* fix make target phony
* add some comments
* add start of readme
* add a (very basic) readme
* use switch instead of multiple ifs
* Add comments to moveFiles
* fix failing darwin test

* Moved code to download, verify, and extract OpAMP package file from updater to collector (#565)

* OpAmp Package Update Glue (#567)
  Switched PackageStatuses yaml to a JSON file to prevent partial reads by Updater.
  Removed excess fields for package status. We should be able to communicate with available status and error message.
  If just started an install, will prevent another PackagesAvailable message from starting another install.
  If OpAMP client errors out at any point, sets the status to failed with an error message (if possible) in the JSON file. This will allow the updater to quickly shut down the collector and start up the rollback one (which will then send the message to BindPlane).
  On BindPlane connect, will check if the status is installing. If so, will check if Server version matches current version. Based on this will either set status to success or fail and write to JSON file for BindPlane to notice. It should only try to send a message immediately to BindPlane if it was a success.
* Moved package install function to goroutine

* Add mutex for updatingClient flag in client (#570)

* Created packagestate module (#579)
* Broke package status objects into their own file
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Updated main module to reference packagestate module
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Fixed licsense check for new module
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Created interface and mocks for package state manager
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Changed PackageStateProvider to use interface of StateManager
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Fixed up linux test for package state manager
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>

* feat: Updater rollback (#584)
* start rollback
* more wip
* more tests
* add licenses, more testing
* split out action stuff to separate package, more testing
  Needed to do this due to circular deps in mocks
* move service test data
* fix up darwin tests
* Fix linux service to fit new service interface
* fix windows service (todo: tests)
* add windows backup test
* fix service action pointin to wrong file

* Logic for Updater to monitor Collector Status (#581)
* Added default file name into package state to be accessed by updater
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Added logic to monitor status of collector from updater
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Added tests and fixuped some ci-checks issues
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Ran make add-license
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Added mocks for updater state monitor
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Pre-PR fixups
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Modified monitor state logic to be more flexible on errors
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* fix revive linting errors
* fix windows gosec error
* update gosec to ignore test program
* refactor CopyFile to allow failure on overwrite
* refactor file action to take relative dir
* add interface enforcement to actions
* add nosec to open func
* split windows service backup function into a few functions
  Co-authored-by: Corbin Phelps <corbin.phelps@bluemedora.com>

* feat: Updater logging (#589)
* add zap logging
* add log level flag
* add license headers
* lint fixes
* remove unimplemented comment
* skip NewLogger test on windows
* remove ability to specify level
* remove rotation
* remove copyFiles receiver
* remove stringer implementation
* remove previous log file on logger creation
* tidy go mod
* re-add stringer for copy file action

* feat: Collector starts up Updater (#590)
* Adds ability to start Updater and monitor it for failure
* Fixes new collector erroring on execution after it is copied
* Added KillMode=process to the linux service file in order to orphan the updater
* Added disconnection flag to avoid failure messages in graceful shutdown
* Added linux service file to tarball
  Co-authored-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* Fixed go.sum
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>

* feat: Remove tmpdir from updater (#591)
* starting on changing installDir
* fix and add tests
* fix gosec issues
* add license
* fix formatting
* remove command line option from collector
* remove redundant parameters, rename copyFiles functions

* Fixed name of package updater looks at (#592)
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>

* feat: Copy updater executable to CWD of collector before executing (#594)
* move updater to CWD before running
* fix darwin, add test
* fix windows + windows tests
* make tests parallel for updater manager
* gosec

* fix: Windows updater log fix (#595)
* Added os specific log path
  Signed-off-by: Corbin Phelps <corbin.phelps@bluemedora.com>
* make tests run on windows
* make fmt, fix function redefinitions
* reduce diff
* add license
  Co-authored-by: Corbin Phelps <corbin.phelps@bluemedora.com>

* feat: Updater cleans up temporary directory (#596)
* remove tmpdir on rollback or update success
* remove temp directory in failure scenarios
* comment why we use a noop logger for failure
* move installer creation to where it's actually used
* fix redundant calls to removeTmpDir

* fix: Pass install dir into service (#598)
* pass install dir into service
* pass install dir to service update action service

* Updater properly installs and rollbacks JMX Jar. (#600)
  Also making sure that we use the backed up file permissions when rolling back a file that no longer exists in the install directory

* feat: Harden collector shutdown while updating (#597)
* change service timeouts
* update non-windows with new timeout
* fix windows test
* stop the service before rollback
* fix install tests

* fix: If the collector detects an error updating, clean temporary directory (#601)
* Have the collector clean artifacts if update fails early
* fix client tests

* Updated Makefile & GitHub Action workflow so Updater binary has license scans (#604)

* Fixes tmp dir for update to have 0700 permissions (#609)

* fix(updater): Do Update in place for windows service (#605)
* do Update in place for windows service
* add a few comments

* feat: Refactor updater main (#608)
* refactor main; tests WIP
* add tests for Updater
* fix lint
* add license
* rename installer and rollbacker to avoid confusion w/ interface
* final debug log to info log
* empty commit for testing

* fix(updater): Enable debug logs (#613)

* feat: Refactor Updater's file package (#611)
* break CopyFile into separate functions
* break overwrite flag into two functions
* fix comment for CopyFileOverwrite
* small tweaks
* tests for file package
* fix linux build
* remove todo
* explain why we continue even on error.
* empty commit for testing

* Added better logging/messaging around collector package updating (#614)

Co-authored-by: Brandon Johnson <brandon.johnson@bluemedora.com>
Co-authored-by: Brandon Johnson <binaryfissiongames@gmail.com>
Co-authored-by: Corbin Phelps <corbin.phelps@bluemedora.com>