Upgrade rules_go to v0.47.0 #6436
// require.Error(t, res.Error)
// assert.Contains(t, res.Error.Error(), "signal: terminated")
// assert.Equal(t, -1, res.ExitCode)
// }
@bduffany FYI.
I thought this test only killed the `sh` shell, but for some reason SIGTERM is being ignored here.
You can restore the original behavior with `signal.Reset(syscall.SIGTERM)`, but it's slightly suspicious that this is needed here.
Let me try bisecting rules_go tomorrow to verify the root cause.
Bisect confirmed the culprit PR.
`signal.Reset` did not fix the test, but adding our own `signal.Notify` handler worked.
@@ -140,7 +140,7 @@ func TestExecStreamed_Crash(t *testing.T) {
Arguments: []string{"bash", "-c", `
echo foo-stdout >&1
echo bar-stderr >&2
kill 0 -KILL
any idea why this change is needed? I'd be surprised if upgrading rules_go broke this
https://buildbuddy.buildbuddy.io/invocation/d1ddf6d1-cf88-4de1-89bd-bf39c97ca43d here is an example of this failure.
From https://man7.org/linux/man-pages/man1/kill.1.html, we should use `kill -KILL -- 0` to be sure.
It's unclear to me why bazel-contrib/rules_go#3920 triggered this though.
// Since https://github.com/bazelbuild/rules_go/pull/3920,
// we need to make sure that our go test binary properly captures and handles SIGTERM.
_, stop := signal.NotifyContext(ctx, syscall.SIGTERM)

res := runSh(ctx, "kill -TERM $$")
the SIGTERM signal here should only be getting sent to the `sh` process running the `kill -TERM` command so I'm a bit confused why this change is needed
Tbh, I am not sure either... I tried dumping the process tree along with the process group IDs, but I could not tell for certain how SIGTERM could propagate from the child process (the `/bin/sh` shell) to the parent process (the go test binary), or whether something else (e.g. Bazel) is sending SIGTERM to the go binary.
A simpler solution for me here is to patch rules_go to restore the old `signal.Notify` behavior, which, as I discussed with the rules_go PR author, seems like it would work.
I raised a rules_go PR for this and, in the meantime, patched our own rules_go with that fix.
what was the error that was happening without the `_, stop := signal.NotifyContext(ctx, syscall.SIGTERM)` workaround? (before applying the revert patch to rules_go)
No, the error was pretty much caused by the change in rules_go. I have bisected rules_go to that exact PR.
I think that PR made it so that the test itself could not use SIGTERM.
I mean, what was the test failure you were seeing? I understand it was caused by rules_go but trying to understand why
Ah I misunderstood your question 🤦
Here is an example failure with the revert of patching rules_go https://buildbuddy.buildbuddy.io/invocation/f016c53f-2073-476b-a338-939a56ef720d?target=%2F%2Fenterprise%2Fserver%2Fremote_execution%2Fcommandutil%3Acommandutil_test&targetStatus=6
Here it is with a small patch to get slightly more verbose output:
--- a/enterprise/server/remote_execution/commandutil/commandutil_unix_test.go
+++ b/enterprise/server/remote_execution/commandutil/commandutil_unix_test.go
@@ -123,6 +123,7 @@ func TestRun_Killed_ErrorResult(t *testing.T) {
{
res := runSh(ctx, "kill -TERM $$")
+ t.Logf("ExitCode: %d\nStdout: %s \n---\nStderr: %s\n", res.ExitCode, string(res.Stdout), string(res.Stderr))
require.Error(t, res.Error)
assert.Contains(t, res.Error.Error(), "signal: terminated")
assert.Equal(t, -1, res.ExitCode)
We were expecting an error, but the command simply executed successfully.
huh, TIL about `sigaction`. IIUC, the go runtime is ignoring SIGTERM by ultimately doing a syscall like `sigaction(SIGTERM, &sigaction{sa_handler: SIG_IGN}, NULL)`. The ignored disposition is then inherited by child processes (SIG_IGN survives both fork and exec), which means that child processes will ignore the signal too.
So it does seem like rules_go should not be doing `signal.Ignore`, as that will likely break more people than just us.
No description provided.