Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using socketpair on Unix for Process.Start redirected stdin/out/err #34158

Closed
stephentoub opened this issue Mar 26, 2020 · 8 comments · Fixed by #34861
Closed

Consider using socketpair on Unix for Process.Start redirected stdin/out/err #34158

stephentoub opened this issue Mar 26, 2020 · 8 comments · Fixed by #34861

Comments

@stephentoub
Copy link
Member

stephentoub commented Mar 26, 2020

On Linux and macOS today when we fork a process as part of Process.Start, if RedirectStandardInput/Output/Error are set, we create anonymous pipes and use those for the new process' stdin/out/err. The resulting file descriptors are then wrapped in FileStreams, which are in turn wrapped in StreamReader/Writers.

A downside of this, however, is that async operations performed on those readers/writers end up doing async-over-sync as part of the FileStream implementation, which just queues work items that do Read/Write as part of ReadAsync/WriteAsync, and that can be problematic for systems that spawn lots of processes and monitor their stdout asynchronously.

There are a few ways to address that performance issue. One is to enable the epoll-based architecture we have underlying System.Net.Sockets to be used by any type with any relevant kind of file descriptor, but that is a significant effort. In the meantime, we might be able to achieve just as good a result, maybe better, specifically for Process, by using socketpair instead of pipe, and constructing NetworkStreams with Sockets instead of FileStreams. This would end up implicitly using the right support under sockets, it would end up providing more efficient implementations of APIs like ReadAsync (which has a non-allocating implementation in NetworkStream), etc.

@tmds, @wfurt, any significant roadblocks to this spring to mind?

@stephentoub stephentoub added this to the 5.0 milestone Mar 26, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Mar 26, 2020
@fbrosseau
Copy link

The identical effect (with very different code) holds true for Windows, where it eventually uses CreatePipe to get anonymous pipe pairs, which do not support overlapped operations. Here:

bool ret = Interop.Kernel32.CreatePipe(out hReadPipe, out hWritePipe, ref lpPipeAttributes, nSize);

Could that be changed to a pair of named pipes for example, which would allow the full power of overlapped/IOCP? I guess it would merit its own issue because the two pieces of code are unrelated.

@stephentoub
Copy link
Member Author

Maybe, but see the large comment just below that.

@fbrosseau
Copy link

fbrosseau commented Mar 26, 2020

Ah yes... it would be a breaking change to processes that eventually WriteFile without an overlapped pointer. Could this be opt-in for cases where the subprocess is known by the caller to be compatible?

But the identical synchronous behavior could be kept if the subprocess-side of the named pipe is non-overlapped, no?

(Sorry for kind of hijacking the linux/mac topic!)

@tmds
Copy link
Member

tmds commented Mar 27, 2020

It would be great if we improve this. Next to potential ThreadPool exhaustion, there is also no way to abort the pending read that is blocking the ThreadPool thread.

I think using socketpair is an interesting idea. socket/pipe are different, but for read/write usage they should be virtually the same.
Some differences, which we may, or may not care about:

  • Sockets are bi-directional.
  • They also allow sending file descriptors.
  • Buffer sizes may be different, and have different APIs for control.

We should check how read/write behave when the .NET end has closed, and verify if it is similar to pipes. I expect this to be the case.

One is to enable the epoll-based architecture we have underlying System.Net.Sockets to be used by any type with any relevant kind of file descriptor, but that is a significant effort.

fyi, for the Socket perf experiments I've written an AsyncEngine that is less tied to Sockets. It deals with an abstract AsyncOperation.

@stephentoub
Copy link
Member Author

Doing this correctly will likely require #862.

@stephentoub
Copy link
Member Author

stephentoub commented Apr 3, 2020

@davidfowl, could you try this out and see if it addresses the issue you hit in:
dotnet/tye#81
?

Here's a branch:
stephentoub@ffa06e1
It requires building several native/managed components, so best bet would just do a full build of dotnet/runtime (./build.sh -subsetcategory coreclr-libraries -c release) and using the resulting shared framework bits in the test host directory (e.g. ~/repos/runtime/artifacts/bin/testhost/netcoreapp5.0-Linux-Release-x64/shared/Microsoft.NETCore.App/5.0.0/).

Perf of sync operations on the Process.StandardOutput/Input/Error is about the same. There's more CPU overhead now when using the async operations than there was before, by a non-trivial amount, but that's also pretty common: there's more set up to be done, more syscalls to be made when data isn't available, etc. However, the async operations also now allocate significantly less than they did before, don't block threads, and are cancelable. On the balance, I think it's a reasonable change to make, assuming it unblocks the scenarios we care about.

@tmds, @wfurt, obviously please feel free to take it for a spin as well.

I'm not putting up a PR yet, as getting it in is blocking on approving #862.

@davidfowl
Copy link
Member

How much of the clr do I have to build to get this.

@stephentoub
Copy link
Member Author

System.Net.Sockets.dll, System.Diagnostics.Process.dll, and libSystem.Native.so. You could build them individually if you want, but it's probably easier just to use the command I shared to build everything. Up to you :)

@jeffhandley jeffhandley removed the untriaged New issue has not been triaged by the area owner label Sep 17, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 10, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants