Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nomad exec part 3: executor based drivers #5634

Merged
merged 5 commits into from
May 11, 2019

Conversation

notnoop
Copy link
Contributor

@notnoop notnoop commented Apr 30, 2019

Sequel to #5632 and #5633 .

Extend driver implementation for executor based drivers: raw_exec, exec, and java.

Here, the drivers use the low-level interface drivers.ExecTaskStreamingRawDriver, so drivers can delegate to the executor. Executors handle the exec stream, and construct the tty session.

We used kr/pty for terminal creation in the universal executor, because I found issues on macOS when using http://github.com/containerd/console (the pty library used by docker and imported here).

@notnoop notnoop requested a review from schmichael April 30, 2019 18:38
@notnoop notnoop force-pushed the f-nomad-exec-parts-02-cli branch from f6969b3 to 96ac203 Compare May 9, 2019 23:42
@notnoop notnoop changed the base branch from f-nomad-exec-parts-02-cli to master May 9, 2019 23:43
@notnoop notnoop force-pushed the f-nomad-exec-parts-03-executors branch from 6592b7f to c302f15 Compare May 9, 2019 23:44
Copy link
Member

@schmichael schmichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving because I don't think any of my comments are blockers. Do the conformance tests cover a good set or error conditions or would it be possible to add some unit tests to internal funcs? I just want to make sure we don't have deadlocks or other unintended behavior when things stray off the happy path in production.

}

if err := execStream.Send(m); err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do all returns need grpcutils.HandleGrpcErr? I'm a little fuzzy on where that needs to be called.

If all returns do need that check maybe just wrap the whole method in a shim to do the check?

func (d *grpcExecutorClient) ExecStreaming(...) error {
  if err := d.execStreaming(...); err != nil {
    return grpcutils.HandleGrpcErr(err, d.doneCtx)
  }
}

// actual impl  in execStreaming method

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't call grpcutils.HandleGrpcErr anywhere here. It seems mostly for detecting shutdown and being able to handle it (by restarting, etc), in this case, we don't want to handle executor shutdown any specially.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! execStream is "this" (the driver) side whereas the HandleGrpcErr calls above are for remote calls to executors? I guess I'm still confused why Recv wouldn't call HandleGrpcErr as its the executor side...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see - I missed it - i'll add it.


// newTerminal function creates a tty appropriate for the command
// The returned master end of tty function is to be called after process start.
newTerminal func() (master func() (*os.File, error), slave *os.File, err error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit strange of a pattern as well. What if you made all of these funcs methods on an interface like:

type terminal interface{
  newTerminal() (...)
  setTTY(...) error
  ...
}

...and then created structs that implemented them? Might be a little more code, but I think it might be more idiomatic and unittestable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit confusing - libcontainer uses Parent/child for the socket that's controlling the tty - but the tty documentation uses master/slave e.g. https://linux.die.net/man/4/pts .

libcontainer avoids using master/slave by using RecvFd[1] and SendFd[2] but they are going against pretty much all tty and tty libraries documentation. Recognize the history of terminology, but I would rather not confuse future devs who need to consult publicly available documentations for tty.

[1] https://godoc.org/github.com/opencontainers/runc/libcontainer/utils#RecvFd
[2] https://godoc.org/github.com/opencontainers/runc/libcontainer/utils#SendFd

Copy link
Member

@schmichael schmichael May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's break from the historical terminology on this one and use libcontainer's terminology. The kr/pty library recently fixed their terminology as well: kr/pty@7dc38fb

Copy link
Contributor Author

@notnoop notnoop May 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm.. actually looks like libcontainer still calls them master/slave in the docs[1][2] and in its implementation [3], and so does containerd/console[4] (the library used by runc to create console and created by ~the same maintainers) when referring to the console fields. I'll use kr/pty terminology instead.

[1] https://github.com/opencontainers/runc/blob/eb4aeed24ffbf8e2d740fafea39d91faa0ee84d0/docs/terminals.md#-new-terminal
[2] https://godoc.org/github.com/opencontainers/runc/libcontainer#Process
[3] https://github.com/opencontainers/runc/blob/master/libcontainer/console_linux.go#L9-L41
[4] https://github.com/containerd/console/blob/0650fd9eeb50bab4fc99dceb9f2e14cf58f36e7f/console_unix.go#L27-L30

}

func handleStdin(logger hclog.Logger, stdin io.WriteCloser, stream drivers.ExecTaskStream, errCh chan<- error) {
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general for funcs which are intended to run concurrently with the caller I prefer to let the caller call go. This makes it explicit when reading the caller's code that this func does not block and it can ease unit testing.

wg.Add(1)

go func() {
defer wg.Done()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, and I think the caller should manage the waitgroup as well.

wg.Add(1)

go func() {
defer wg.Done()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

defer wg.Done()

for {
buf := make([]byte, 4096)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought in #5632 you often reused buffers because you were careful to make send() not hold onto the buffer after returning? Is that not the case in this code? It would be nice to try to keep the approaches consistent or at least comment carefully when they diverge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch - I missed these during the refactor - will make buf outside the for loop

func (e *UniversalExecutor) ExecStreaming(ctx context.Context, command []string, tty bool,
stream drivers.ExecTaskStream) error {

cmd := exec.CommandContext(ctx, command[0], command[1:]...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a len check here to make sure we don't panic if command is empty?

return fmt.Errorf("attempted to resize a non-tty session")
}

pty.Setsize(f, &pty.Winsize{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment as to why we ignore Setsize errors.

}

func setTTYSize(w io.Writer, height, width int32) error {
return fmt.Errorf("unsupported")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an error or a noop?

return fmt.Errorf("first message should always be setup")
}

return s.impl.ExecStreaming(server.Context(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually is there where we could wrap errors in that HandleGrpcErr?

@notnoop notnoop force-pushed the f-nomad-exec-parts-03-executors branch from e6bd183 to 90386e4 Compare May 10, 2019 22:45
Mahmood Ali added 5 commits May 10, 2019 19:17
Prepare executor to handle streaming exec API calls that reuse drivers protobuf
structs.
Implements streamign exec handling in both executors (i.e. universal and
libcontainer).

For creation of TTY, some incidental complexity leaked in.  The universal
executor uses github.com/kr/pty for creation of TTYs.

On the other hand, libcontainer expects a console socket and for libcontainer to
create the underlying console object on process start.  The caller can then use
`libcontainer.utils.RecvFd()` to get tty master end.

I chose github.com/kr/pty for managing TTYs here.  I tried
`github.com/containerd/console` package (which is already imported), but the
package did not work as expected on macOS.
These simply delegate call to backend executor.
@notnoop notnoop force-pushed the f-nomad-exec-parts-03-executors branch from fc4f92b to 7f76aed Compare May 10, 2019 23:17
@notnoop notnoop merged commit c25b247 into master May 11, 2019
@notnoop notnoop deleted the f-nomad-exec-parts-03-executors branch May 11, 2019 01:24
@github-actions
Copy link

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 10, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants