Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(plugin): fix plugin family runner race condition #1824

Merged
merged 2 commits into from
Jun 21, 2024

Conversation

zsoltkacsandi
Copy link
Contributor

@zsoltkacsandi zsoltkacsandi commented Jun 20, 2024

Description

If an error occurs, we send the result back to the family manager on a channel and return from the function:

if err := rr.WaitReady(ctx); err != nil {
  s.sendResults(retResults, fmt.Errorf("failed to wait for plugin scanner to be ready: %w", err))
  return
}

As part of the return, the deferred Stop and Remove lifecycle functions begin to run:

defer func() {
	if err := rr.Stop(ctx); err != nil {
		s.logger.WithError(err).Errorf("failed to stop runner")
	}

	// TODO: add short wait before removing to respect container shutdown procedure

	if err := rr.Remove(ctx); err != nil {
		s.logger.WithError(err).Errorf("failed to remove runner")
	}
}() //nolint:errcheck

Once the family manager gets the result in the result channel, it does not really care if the underlying goroutines are finished or not:

case r := <-result:

This results in a race condition, once the family manager gets all the results it expects, the program will terminate regardless of whether the plugins' Stop() and Destroy() functions have finished or not.

We need to call the Exit and Destroy lifecycle functions before sending back the result.

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change
  • Refactor
  • Documentation
  • Other (please describe)

Checklist

  • I have read the contributing guidelines
  • Existing issues have been referenced (where applicable)
  • I have verified this change is not present in other open pull requests
  • Functionality is documented
  • All code style checks pass
  • New code contribution is covered by automated tests
  • All new and existing tests pass

@zsoltkacsandi zsoltkacsandi requested a review from a team as a code owner June 20, 2024 09:11
@zsoltkacsandi zsoltkacsandi self-assigned this Jun 20, 2024
@zsoltkacsandi zsoltkacsandi changed the title fix(plugin): fix race condition fix(plugin): fix plugin family runner race condition Jun 20, 2024

This comment has been minimized.

@akijakya
Copy link
Contributor

Nice catch! This behavior can be found in other families and should be fixed there as well...

Just a thought: what if the cleanup func is added to the Scanner struct, and the sendResults is refactored in a way that it calls s.cleanup first, it can even be renamed to finish() or something?

@zsoltkacsandi
Copy link
Contributor Author

Just a thought: what if the cleanup func is added to the Scanner struct, and the sendResults

That can work too, the reason why I chose this approach, because it's more expressive.
If someone reads the code, you can read as a sentence, first we are cleaning up, then we are sending back the result, and then returning from the function. The intent is clear.

To make part of the sendResults feels like creating a side effect. The name of the function says that I am sending back a result. There would be a hidden behaviour that you wouldn't expect without taking a look into the sendResults function.

it can even be renamed to finish() or something?

Totally agree, we can use a more descriptive name here. finishRunner or something like that?

@akijakya
Copy link
Contributor

Totally agree, we can use a more descriptive name here. finishRunner or something like that?

finishScannerRun?

Copy link

Hey!

Your images are ready:

  • ghcr.io/openclarity/vmclarity-apiserver-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-cli-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-cr-discovery-server-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-orchestrator-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-plugin-kics-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-ui-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358
  • ghcr.io/openclarity/vmclarity-ui-backend-dev:pr1824-88aa0e94dec793aea0f1096051641489d2a69358

@zsoltkacsandi zsoltkacsandi added this pull request to the merge queue Jun 21, 2024
Merged via the queue into main with commit 90f87ca Jun 21, 2024
40 checks passed
@zsoltkacsandi zsoltkacsandi deleted the fix-plugin-family-runner-race-condition branch June 21, 2024 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants