os/exec: Current requirement for all reads to be completed before calling cmd.Wait can block indefinitely #60309

mitar · 2023-05-19T18:00:41Z

What version of Go are you using (`go version`)?

$ go version
go version go1.20.4 linux/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (`go env`)?

go env Output

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/mitar/.cache/go-build"
GOENV="/home/mitar/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/mitar/.go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/mitar/.go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.20.4"
GCCGO="/usr/bin/gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/tmp/git/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build4256193346=/tmp/go-build -gno-record-gcc-switches"

What did you do?

I tried to run the following program:

package main

import (
	"io"
	"log"
	"os"
	"os/exec"
	"sync"
	"syscall"
	"time"
)

func execute() error {
	cmd := exec.Command("bash", "-c", "(sleep infinity)")

	stdout, _ := cmd.StdoutPipe()
	stderr, _ := cmd.StderrPipe()

	if err := cmd.Start(); err != nil {
		log.Printf("Error executing command: %s......\n", err.Error())
		return err
	}

	go func() {
		<-time.After(10 * time.Second)
		log.Printf("sending SIGTERM")
		cmd.Process.Signal(syscall.SIGTERM)
	}()

	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		io.Copy(os.Stdout, stdout)
	}()

	wg.Add(1)
	go func() {
		defer wg.Done()
		io.Copy(os.Stderr, stderr)
	}()

	wg.Wait()

	if err := cmd.Wait(); err != nil {
		log.Printf("Error waiting for command execution: %s......\n", err.Error())
		return err
	}

	log.Printf("end")

	return nil
}

func main() {
	execute()
}

This is a variation of an example from #38268. I constructed this example but in my real app I do not just copy subprocess stdout to main stdout, but I do a transformation on it. Anyway, the issue is that it seems when bash subprocess gets terminated, reader does not get EOF and so io.Copy does not end, so wg.Wait never finishes, so we never get to cmd.Wait. If I comment out wg.Wait, then cmd.Wait correctly reaps the subprocess (but of course it could mean not everything gets copied).

Currently, documentation of StdoutPipe and StderrPipe requires:

It is thus incorrect to call Wait before all reads from the pipe have completed.

But it seems it is not possible to always know when all reads from the pipe have completed?

What did you expect to see?

2023/05/19 19:44:58 sending SIGTERM
2023/05/19 19:44:58 Error waiting for command execution: signal: terminated......

And programs terminates.

What did you see instead?

2023/05/19 19:44:58 sending SIGTERM

And program never terminates.

The text was updated successfully, but these errors were encountered:

heschi · 2023-05-19T18:35:54Z

See #23019, perhaps. cc @ianlancetaylor @bcmills

bcmills · 2023-05-19T18:56:50Z

Here is a Playground variation on that program that exhibits the same behavior:
https://go.dev/play/p/34d4n3sTlDq

bcmills · 2023-05-19T19:13:11Z

Actually, the problem on the Playground seems to come from the Playground-specific faketime hook somehow interfering with the network poller. Adding a time.Sleep loop to it seems to fix it to give the expected behavior:
https://go.dev/play/p/OodMdJxLEJf

bcmills · 2023-05-19T19:23:06Z

it seems when bash subprocess gets terminated, reader does not get EOF and so io.Copy does not end

@mitar, I think the problem you observe comes from sending the signal to the bash process rather than its sleep subprocess, which causes both processes to remain blocked. (In particular, the sleep subprocess holds open stdout.)

Adjusting the program to create a process group and send the signal to the whole process group also gives the expected behavior.

package main

import (
	"io"
	"log"
	"os"
	"os/exec"
	"sync"
	"syscall"
	"time"
)

func execute() error {
	cmd := exec.Command("bash", "-c", "(sleep infinity)")
	cmd.SysProcAttr = &syscall.SysProcAttr{
		Setpgid: true,
	}

	stdout, _ := cmd.StdoutPipe()
	stderr, _ := cmd.StderrPipe()

	if err := cmd.Start(); err != nil {
		log.Printf("Error executing command: %s......\n", err.Error())
		return err
	}

	go func() {
		<-time.After(1 * time.Second)
		log.Printf("sending SIGTERM")
		syscall.Kill(-cmd.Process.Pid, syscall.SIGTERM)
	}()

	var wg sync.WaitGroup

	wg.Add(1)
	go func() {
		defer wg.Done()
		io.Copy(os.Stdout, stdout)
	}()

	wg.Add(1)
	go func() {
		defer wg.Done()
		io.Copy(os.Stderr, stderr)
	}()

	wg.Wait()

	if err := cmd.Wait(); err != nil {
		log.Printf("Error waiting for command execution: %s......\n", err.Error())
		return err
	}

	log.Printf("end")

	return nil
}

func main() {
	execute()
}

bcmills · 2023-05-19T19:23:30Z

As far as I can tell, this is more of a bash usability problem than a Go os/exec problem.

mitar · 2023-05-19T21:43:26Z

Thanks. A process group approach is nice and is a workaround when you know that your child process is misbehaving like bash. But the issue is that in my particular case I cannot know that in advance. I am writing an init process in Go and some services might behave well and expect only the top-level process to get the signal and some might need the process group approach. For example, this could be a case where process does double fork to get to daemonize. In that case I want to reap the direct child and if it happens that stdin/stdout got inherited by the second daemonized process should not prevent me from doing this.

To me the surprising thing is Go API requirement here. Without this API requirement, I could wait for the process, see that it terminated, decide to end processing of stdout and stderr (if that has not yet finished), and leave to the stray process to get reparented to PID 1. Even more, I could even decide that I want to continue processing stdout and stderr, but that I also want to reap the now dead children.

bcmills · 2023-05-19T21:47:36Z

In that case, don't use StdoutPipe and StderrPipe. Instead, you can call os.Pipe to obtain pipes that won't be closed when Wait returns. (I took such an approach in https://go.dev/cl/484741.)

It's kind of an unfortunate workaround, but it avoids the close-on-Wait race without having to make a backward-incompatible change to the interaction between StdoutPipe and Wait.

ianlancetaylor · 2023-05-19T23:29:15Z

I don't think there is anything to be done here, so closing. Please comment if you disagree.

mitar · 2023-05-20T07:40:20Z

I am confused by looking at https://go.dev/cl/484741, because from what I understand from it, it creates a pipe, calls run, calls wait, and when the overall function returns, it closes the pipe without waiting for everything to be read from it. Isn't this exactly the same as what would happen if I just used StdoutPipe and called Wait without waiting for everything to be read from it? It would first wait on the process and then close parentIOPipes.

So my question is: could documentation for StdoutPipe be relaxed and instead of saying:

It is thus incorrect to call Wait before all reads from the pipe have completed.

To something like (paraphrasing):

If you call Wait before all reads from the pipe have completed, you might miss some of the data wrote to the pipe and if the pipe got inherited by some other process, that process might be killed if it continues to write into a closed pipe?

Am I missing something here? Maybe I do not see how is your "unfortunate workaround" doing anything different with the close-on-Wait race?

ianlancetaylor · 2023-05-20T21:37:09Z

It seems to me that the current documentation is reasonable for almost all users: it says what happens and makes a recommendation. You clearly understand in detail what is happening. If StdoutPipe works for you, go ahead and use it.

bcmills · 2023-06-21T13:50:22Z

when the overall function returns, it closes the pipe without waiting for everything to be read from it. Isn't this exactly the same as what would happen if I just used StdoutPipe and called Wait without waiting for everything to be read from it?

The difference is that with your own pipe, you can continue to read from the pipe (and wait for that read to complete) even after Wait has returned. (In the linked CL, the defer closure waits until ctx.Done() is closed, which may be an arbitrarily long time after Wait.)

In contrast, with StdoutPipe the pipe is always closed (and the read canceled) as soon as Wait unblocks.

mitar · 2023-06-21T20:29:12Z

Could StdoutPipe be made so that if it is used, Wait only returns after everything from StdoutPipe has been read?

bcmills · 2023-06-21T20:36:21Z

No. That is exactly the opposite of the current documented behavior, and could introduce deadlocks in existing programs that (accidentally or intentionally) rely on that behavior.

mitar · 2023-06-21T20:47:22Z

I am just thinking that this is a common use case. One wanting to wait for all data and for program to finish. Maybe then another Wait-like method should be introduced? Like PipeWait?

bcmills · 2023-06-21T20:58:53Z

Again, you can fairly easily use os.Pipe and start your own copying goroutine. Just substituting os.Pipe for Cmd.StdoutPipe is really not that much extra code.

(Most of the extra complexity in the linked CL is because it handles arbitrary writers for stdout and stderr, not just pipes.)

heschi added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label May 19, 2023

heschi added this to the Backlog milestone May 19, 2023

bcmills added the WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. label May 19, 2023

bcmills added WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. and removed WaitingForInfo Issue is not actionable because of missing required information, which needs to be provided. labels May 19, 2023

ianlancetaylor closed this as not planned Won't fix, can't repro, duplicate, stale May 19, 2023

bcmills mentioned this issue Jun 21, 2023

os/exec: detect incorrect usages of Cmd.StdoutPipe #60908

Open

acabarbaye mentioned this issue Mar 21, 2024

[invoker] set the WaitDelay to avoid getting indefinitely blocking wait() shirou/gopsutil#1614

Closed

5 tasks

golang locked and limited conversation to collaborators Jun 20, 2024

gopherbot added the FrozenDueToAge label Jun 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

os/exec: Current requirement for all reads to be completed before calling cmd.Wait can block indefinitely #60309

os/exec: Current requirement for all reads to be completed before calling cmd.Wait can block indefinitely #60309

mitar commented May 19, 2023

heschi commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

mitar commented May 19, 2023

bcmills commented May 19, 2023

ianlancetaylor commented May 19, 2023

mitar commented May 20, 2023

ianlancetaylor commented May 20, 2023

bcmills commented Jun 21, 2023

mitar commented Jun 21, 2023

bcmills commented Jun 21, 2023

mitar commented Jun 21, 2023

bcmills commented Jun 21, 2023

os/exec: Current requirement for all reads to be completed before calling cmd.Wait can block indefinitely #60309

os/exec: Current requirement for all reads to be completed before calling cmd.Wait can block indefinitely #60309

Comments

mitar commented May 19, 2023

What version of Go are you using (go version)?

Does this issue reproduce with the latest release?

What operating system and processor architecture are you using (go env)?

What did you do?

What did you expect to see?

What did you see instead?

heschi commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

bcmills commented May 19, 2023

mitar commented May 19, 2023

bcmills commented May 19, 2023

ianlancetaylor commented May 19, 2023

mitar commented May 20, 2023

ianlancetaylor commented May 20, 2023

bcmills commented Jun 21, 2023

mitar commented Jun 21, 2023

bcmills commented Jun 21, 2023

mitar commented Jun 21, 2023

bcmills commented Jun 21, 2023

What version of Go are you using (`go version`)?

What operating system and processor architecture are you using (`go env`)?