Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Watchdog: use abort action as a default #13208

Closed
wants to merge 2 commits into from

Conversation

KBaichoo
Copy link
Contributor

Signed-off-by: Kevin Baichoo kbaichoo@google.com

Commit Message: Watchdog: use abort action as a default if killing is enabled and we're on a supported platform.
Additional Description:
Risk Level: low
Testing: unit tests
Docs Changes: N/A
Release Notes: TBD
See #12860 (review) for comment chain about making this a default

…re on a supported platform.

Signed-off-by: Kevin Baichoo <kbaichoo@google.com>
// Add abort_action if it's available on the given platform and killing is
// enabled since it's more helpful than the default kill action.
#ifndef WIN32
envoy::extensions::watchdog::abort_action::v3alpha::AbortActionConfig abort_config;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question for the reviewers: There's some tension here given that the action AbortAction extension and folks can choose to omit it, but if they do this might break and they'd need to patch it.

Thoughts?


// Add abort_action if it's available on the given platform and killing is
// enabled since it's more helpful than the default kill action.
#ifndef WIN32
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would think this part should be platform agnostic. And it will be up to the implementation of this events to handle cross platform differences.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, yes, but the Abort Action implementation is currently marked to be excluded from Windows since it doesn't support Windows yet.

See the prior PR: #12860 (comment)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for missing the review on the initial PR, but did we consider on platforms like Windows to enable the extension but fall back to the basic watchdog full process kill/panic behavior (and log the thread id to be killed and maybe more diagnostic info)?

We could use Bazel platform selection instead of ifdefs to build a small platform specific "thread kill" library and that would be localized to the abort action extension (and allow us to iterate further if we are able to support the extension fully on Windows).

Meta question: Not sure if the philosophy of the project is to just disable a feature/extension rather than offer a degraded experience on a different platform

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunjayBhatia yes, if you look at this intermediate commit I was originally just having windows print out unimplemented, and the default panic behavior would kill the process.

Regarding the philosophy, I've seen some other extensions that are also just disabled for windows -- some of which might just arise from lack of resources, but ultimately across platforms there are some differences in capabilities (one example being signals).

Any thoughts @envoyproxy/api-shepherds ?

Copy link
Member

@sunjayBhatia sunjayBhatia Sep 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen some other extensions that are also just disabled for windows

Yeah, the difference there is that the other currently disabled extensions have external dependencies that require a decent amount of work to get compiling on Windows, we haven't gotten around to those yet so we disabled them, ultimately we will get these working with parity to other platforms

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK sorry I didn't realize that this is the direction we are going to go. Here is what I would do:

  1. Per my other comment I would move the code back to core, since it's not really an extension anymore.
  2. I would have a platform function to kill a thread.
  3. On Windows just abort/panic the process as it did before. On Linux do what you have now. This should allow the extension to be compiled on all platforms and do something sane.

Does that work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Will close this PR while making those changes,

@yanavlasov
Copy link
Contributor

@envoyproxy/windows-dev

Signed-off-by: Kevin Baichoo <kbaichoo@google.com>
@@ -55,6 +55,7 @@ envoy_cc_library(
"@envoy_api//envoy/config/bootstrap/v3:pkg_cc_proto",
"@envoy_api//envoy/config/metrics/v3:pkg_cc_proto",
"@envoy_api//envoy/config/trace/v3:pkg_cc_proto",
"@envoy_api//envoy/extensions/watchdog/abort_action/v3alpha:pkg_cc_proto",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is going to be built-in, it shouldn't actually be an extension. Can you move it back into the core code? (Even if it is still registered as an extension.)

Copy link
Contributor Author

@KBaichoo KBaichoo Sep 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you clarify a bit more on what you mean by "move it back into the core core"? For example, would I just have this as an extension that'll register on all platforms?

Thanks

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I just mean it shouldn't be in the extensions directory IMO as it's no longer an extension. It's always linked. It's not a big deal and you can do as a follow up if you like.


// Add abort_action if it's available on the given platform and killing is
// enabled since it's more helpful than the default kill action.
#ifndef WIN32
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK sorry I didn't realize that this is the direction we are going to go. Here is what I would do:

  1. Per my other comment I would move the code back to core, since it's not really an extension anymore.
  2. I would have a platform function to kill a thread.
  3. On Windows just abort/panic the process as it did before. On Linux do what you have now. This should allow the extension to be compiled on all platforms and do something sane.

Does that work?

@KBaichoo KBaichoo closed this Sep 24, 2020
htuch pushed a commit that referenced this pull request Oct 19, 2020
Use abort action as a default if killing is enabled and we're on a supported platform.

Risk Level: low
Testing: unit tests
Docs Changes: Included
Release Notes: Included

See PR #13208 for context as the reason it's part of core envoy and not an extension.

Signed-off-by: Kevin Baichoo <kbaichoo@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants