Skip to content

Commit

Permalink
consul/connect: fixed a bug where restarting proxy tasks failed.
Browse files Browse the repository at this point in the history
The first start of a Consul Connect proxy sidecar triggers a run
of the envoy_version hook which modifies the task config image
entry. The modification takes into account a number of factors to
correctly populate this. Importantly, once the hook has run, it
marks itself as done so the taskrunner will not execute it again.

When the client receives a non-destructive update for the
allocation which the proxy sidecar is a member of, it will update
and overwrite the task definition within the taskerunner. In
doing so it overwrite the modification performed by the hook. If
the allocation is restarted, the envoy_version hook will be
skipped as it previously marked itself as done, and therefore the
sidecar config image is incorrect and causes a driver error.

The fix removes the hook in marking itself as done to the view of
the taskrunner, and instead tracks this internally. The hook also
implements the TaskPreKillHook interface, so that it's state can
be correctly modified when required by task events.
  • Loading branch information
jrasell committed Apr 6, 2023
1 parent 37d1bfb commit c9c4539
Show file tree
Hide file tree
Showing 2 changed files with 129 additions and 48 deletions.
30 changes: 27 additions & 3 deletions client/allocrunner/taskrunner/envoy_version_hook.go
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,12 @@ type envoyVersionHook struct {

// logger is used to log things.
logger hclog.Logger

// done marks whether this hook has run successfully and therefore does not
// need to run again until the task is killed or restarted. The hook cannot
// use ifs.TaskPrestartRequest.Done to mark this, as it will always be
// skipped and has no way of resetting itself.
done bool
}

func newEnvoyVersionHook(c *envoyVersionHookConfig) *envoyVersionHook {
Expand All @@ -62,7 +68,14 @@ func (envoyVersionHook) Name() string {
return envoyVersionHookName
}

func (h *envoyVersionHook) Prestart(_ context.Context, request *ifs.TaskPrestartRequest, response *ifs.TaskPrestartResponse) error {
func (h *envoyVersionHook) Prestart(_ context.Context, request *ifs.TaskPrestartRequest, _ *ifs.TaskPrestartResponse) error {

// If we are being called, but the hook has been run and has no need to run
// again, exit early.
if h.done {
return nil
}

// First interpolation of the task image. Typically this turns the default
// ${meta.connect.sidecar_task} into envoyproxy/envoy:v${NOMAD_envoy_version}
// but could be a no-op or some other value if so configured.
Expand All @@ -73,7 +86,7 @@ func (h *envoyVersionHook) Prestart(_ context.Context, request *ifs.TaskPrestart
// - task is a connect sidecar or gateway
// - task image needs ${NOMAD_envoy_version} resolved
if h.skip(request) {
response.Done = true
h.done = true
return nil
}

Expand All @@ -95,7 +108,18 @@ func (h *envoyVersionHook) Prestart(_ context.Context, request *ifs.TaskPrestart
// Set the resulting image.
h.logger.Trace("setting task envoy image", "image", image)
request.Task.Config["image"] = image
response.Done = true
h.done = true
return nil
}

// PreKilling implements the ifs.TaskPreKillHook interface.
//
// When a sidecar task is restarting, it is possible a previous non-destructive
// allocation update has overwritten the image modification which takes place
// within Prestart. It is necessary that the hook is triggered again to replace
// the image tag with a value that can be downloaded.
func (h *envoyVersionHook) PreKilling(_ context.Context, _ *ifs.TaskPreKillRequest, _ *ifs.TaskPreKillResponse) error {
h.done = false
return nil
}

Expand Down
Loading

0 comments on commit c9c4539

Please sign in to comment.