Implement long running interactive session and sending build context incrementally #32677
Conversation
As discussed while working on credential pulls, the actual session ID should not be part of the docker remote context identifier: in the future we will want to leverage this session for other things than the build context, and possibly we will have cases where context streaming is off but other features like credential pulls or copy-out are still required. Additionally, client-side services seem a bit clunky both to declare and to discover. I would really like an API on the client like the sketch below:
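(The original snippet here was lost; the following is a purely hypothetical illustration of the kind of session-scoped client API being asked for. None of these names are from the PR.)

```go
package sketch

import "context"

// All names here are invented for illustration; this is not the PR's API.
type Feature interface {
	Name() string
}

type Session interface {
	Expose(f Feature) // declare a client-side service on the session
	Close() error
}

type Client interface {
	NewSession(ctx context.Context) (Session, error)
}

// run shows the desired flow: features hang off the session itself, and
// the build references the session rather than a context identifier.
func run(ctx context.Context, c Client, auth Feature) error {
	s, err := c.NewSession(ctx)
	if err != nil {
		return err
	}
	defer s.Close()
	s.Expose(auth) // e.g. a credential-pull delegate
	return nil
}
```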
builder/dockerfile/builder.go
```go
@@ -63,52 +70,94 @@ type Builder struct {
	cacheBusted bool
	buildArgs   *buildArgs
	escapeToken rune
	dockerfile  *parser.Result
```
I think maybe there was a bad merge here? This was just removed.
@simonferquel @dnephin @cpuguy83 @dmcgowan What are your thoughts on the transport interface for this? I only made the generic interface https://github.com/moby/moby/pull/32677/files#diff-8d234f28fa68cff9ee839315dbeaeb49R51 because I couldn't decide in favor of a single transport (grpc callbacks, websocket, single hijacked grpc stream, ssh etc.). If we only support grpc then this could be described more with proto files. I still wouldn't prefer to use grpc generation directly because I think the callbacks should follow the […]. I also made another PoC implementation using this, for exposing ssh signers to the builder.
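(For orientation, here is a sketch of what that generic transport abstraction looks like, inferred from the Call signature in the diff below; the names are illustrative, not verbatim from the PR.)

```go
package sketch

import "context"

// Stream is inferred from the Call signature below; note that nothing in
// the interface itself enforces a serialization format.
type Stream interface {
	Context() context.Context
	SendMsg(m interface{}) error
	RecvMsg(m interface{}) error
}

// Caller lets daemon-side code invoke a method the client exposed for
// this session, independent of the wire transport.
type Caller interface {
	Call(ctx context.Context, id, method string, opt map[string][]string) (Stream, error)
}
```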
client/session/grpctransport/grpc.go
```go
	return gc, nil
}

func (gc *grpcCaller) Call(ctx context.Context, id, method string, opt map[string][]string) (session.Stream, error) {
```
This is looking fairly low-level. Depending on grpc this tightly is a major risk and will couple us to its internal implementation.
Is there any reason that interceptors won't work here?
On the API level this doesn't depend on gRPC at all atm (but that could change per the comment above). It only depends on the transport interface. The API exposes what method/transport combination is supported for the negotiation. If that is successful, it is in gRPC's control that both sides use the same method type (currently it is fixed), and the only configuration aspect passed in by the API is the URL that needs to match on both sides. I don't think there is a higher chance of breaking this, because a mismatch would break generated proto code as well. If we switched to gRPC only, then this would go away and equivalent code would be generated from proto. Then the user needs to make sure that these proto files don't go out of sync.

Interceptors would let us avoid predefining the handlers and handle them lazily when a request comes in. That's not really an issue atm, as the client would send the methods it supports anyway, so that daemon components can make correct decisions.
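(For readers unfamiliar with the interceptor suggestion, here is a minimal sketch of a standard grpc-go unary server interceptor; this is stock library usage, not code from this PR.)

```go
package sketch

import (
	"context"
	"log"

	"google.golang.org/grpc"
)

// loggingInterceptor runs around every unary RPC on the server, which is
// the kind of hook interceptors provide without predefining handlers.
func loggingInterceptor(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
	log.Printf("call: %s", info.FullMethod)
	return handler(ctx, req)
}

func newServer() *grpc.Server {
	return grpc.NewServer(grpc.UnaryInterceptor(loggingInterceptor))
}
```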
@tonistiigi Sounds like I need a little more context here. Are we abstracting RPC calls to a central service somewhere?
@stevvooe The abstraction is just so that you can write isolated features (currently rsync-style context sending, client-side token queries and auth requests, ssh signing) that can be exposed through the remote API, without the features knowing how they will be accessed on the wire. The remote API starts by making a POST request with the exposed method names and then hijacks the TCP connection for a transport. The abstraction was added because no transport method seemed clearly better than the rest. Currently, there are no plans to ship support for multiple transport implementations. A rough sketch of the handshake follows.
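(A simplified sketch of that handshake under assumed names; the header key and function shapes are illustrative, not the PR's exact code.)

```go
package sketch

import (
	"net"
	"net/http"
)

// sessionHandler implements the POST /session handshake: read the
// advertised callback methods, then take over the raw TCP connection.
func sessionHandler(serve func(net.Conn, []string)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		methods := r.Header["X-Docker-Expose-Session-Grpc-Method"] // header name assumed
		hj, ok := w.(http.Hijacker)
		if !ok {
			http.Error(w, "hijacking not supported", http.StatusInternalServerError)
			return
		}
		conn, _, err := hj.Hijack()
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		go serve(conn, methods) // run the transport over the hijacked conn
	}
}
```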
I am not sure we should abstract too much from gRPC. Actually, we already have a hidden coupling to protocol buffers here (which seems worse than an explicit dependency), as the Stream interface implementation only works with proto messages (even if the interface does not hint at that). IMO, if we want a real transport abstraction, we should not have this restriction.

Additionally, we already have the proto sync issue at the message level, so I don't think it would be a problem to describe the services in the proto files as well.

Alternatively, we could abstract away from grpc but still enforce protocol buffers as the message serialization mechanism. If we go this way, though, we should not abstract the object stream, but only the way []byte payloads go across the wire.
From a user-code perspective, the transport API feels a bit too complex anyway. On the exposing side, the handler registration code seems overly complex (having to define a method that takes a callback, which you must call for each method you want to expose, feels so 80's in a world of introspection-enabled languages).

On the consuming side, having to call methods by putting their name in a string, and passing arguments as a combination of a map of string slices and object streaming, is much lower-level than I would expect from a modern transport stack.

Ultimately what I want is a way to expose a simple request/response interface implementation like:
```go
type AuthDelegate interface {
	GetBasicCredentials(registry string) (username, password string, err error)
	GetOAuthAccessToken(registry string, oauthParams OAuthParams) (accessToken string, err error)
}
```
and for streaming-based services like diffcopy, interfaces like:
```go
type DiffCopyService interface {
	Proceed(outputFilePackets chan<- FilePacket, inputFileContentRequests <-chan ContentRequest) error
}
```
I have added 2 PRs directly on Tonis' branch to fix issues on Windows.
Looks like this needs a rebase.
I've removed the WIP indicator, rebased after the cli split, and added the persistent cache and garbage collection. I removed the transport interface and replaced it with grpc-only logic similar to what @simonferquel suggested. The discovery part is an optional, independent layer. It is generated from proto, but the grpc layer itself does not rely on it; gRPC only uses the generated proto code to connect a client with the handler.

I've separated out the implementation of the client session itself from file syncing, which is one example feature on top of it; they can be reviewed separately, and the order in which we merge the actual features doesn't matter. I've also included the CLI update to ease testing; I will remove that before merge. Base: master...tonistiigi:client-session-base-nocli File syncing: […]
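(A sketch of the grpc-only shape this describes; the interface name is assumed, not necessarily the PR's exact one. Each feature registers its proto-generated service on the session's gRPC server.)

```go
package sketch

import "google.golang.org/grpc"

// Attachable is an assumed name for a feature that can be exposed on a
// session: it registers its proto-generated service on the gRPC server.
type Attachable interface {
	Register(server *grpc.Server)
}

func newSessionServer(features []Attachable) *grpc.Server {
	server := grpc.NewServer()
	for _, f := range features {
		f.Register(server) // e.g. file syncing, auth, ssh signing
	}
	return server
}
```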
I am still not sure about the Supports method at the method level. If we want to keep it, we should at least encapsulate it in the fssession package (version negotiation of client-side services should happen at the protocol level). So from fscache we should just have a function call like the sketch below:
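(The original snippet was elided here; this is a hypothetical shape for such a call, with all names invented.)

```go
package sketch

import "context"

// Session and SyncedDir are invented placeholders, not the PR's types.
type Session interface{ Supported(methodURL string) bool }
type SyncedDir struct{ Root string }

// SyncSource is the kind of single call meant here: fscache hands over a
// session and a source identifier, and protocol selection (tar vs
// diffcopy) plus version negotiation stay inside the fssession package.
func SyncSource(ctx context.Context, s Session, id string) (*SyncedDir, error) {
	return nil, nil // sketch only: negotiate, transfer, return local dir
}
```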
This way client code does not have to know about the notion of method URLs etc. In terms of versioning, I propose that we experiment a little with the following scenario:
The fs cache case is quite easy, as both protocol versions provide the same value (with more or less efficiency); but if we take the example of authentication, we could have a v1 that exposes raw AuthConfig credentials and a v2 that exposes OAuth access tokens.
In the case where the client CLI only implements v1, a v2 daemon should implement v2 on top of the v1 client (getting the OAuth access token using the AuthConfig provided by the v1 client) in a transparent manner.
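(A sketch of that transparent fallback, with all names invented for illustration.)

```go
package sketch

// AuthConfig mirrors the basic credentials a v1 client can provide.
type AuthConfig struct {
	Username, Password string
}

type AuthDelegateV1 interface {
	GetAuthConfig(registry string) (AuthConfig, error)
}

type AuthDelegateV2 interface {
	GetOAuthAccessToken(registry string) (string, error)
}

// adaptV1 shows the idea: a v2 daemon wraps a v1-only client and performs
// the token exchange itself, so callers only ever see the v2 interface.
func adaptV1(v1 AuthDelegateV1, exchange func(AuthConfig) (string, error)) AuthDelegateV2 {
	return v2Adapter{v1: v1, exchange: exchange}
}

type v2Adapter struct {
	v1       AuthDelegateV1
	exchange func(AuthConfig) (string, error)
}

func (a v2Adapter) GetOAuthAccessToken(registry string) (string, error) {
	cfg, err := a.v1.GetAuthConfig(registry)
	if err != nil {
		return "", err
	}
	return a.exchange(cfg) // daemon exchanges AuthConfig for a token
}
```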
LGTM
Testing nits can be handled later
```go
	defer fscache.Close()

	err = fscache.RegisterTransport("test", &testTransport{})
	assert.Nil(t, err)
```
Very minor nit: I generally use require.Nil(t, err) for errors. The difference is that assert is not fatal (it keeps running the rest of the function), whereas require is fatal. In the case of errors I generally think the check should end the test, because the state is probably going to make the rest of the assertions fail. Using assert here will make some test failures a bit harder to read; sometimes you end up with a nil panic after the assertion error.
I think it's fine for now; a small illustration of the difference follows.
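(A minimal, self-contained illustration of the assert/require distinction, assuming the stretchr/testify packages; not code from this PR.)

```go
package sketch

import (
	"errors"
	"testing"

	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

func doWork() (string, error) { return "", errors.New("boom") }

func TestRequireVsAssert(t *testing.T) {
	v, err := doWork()

	// require is fatal: the test stops here if err != nil, so later
	// assertions never run against broken state.
	require.Nil(t, err)

	// assert is non-fatal: on failure the test keeps going, which can
	// produce confusing follow-on failures or nil panics.
	assert.Equal(t, "data", v)
}
```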
```go
	assert.Nil(t, err)
	assert.Equal(t, string(dt), "data")

	// same id doesn't recalculate anything
```
It would be nice to have these as different cases so that the cause of a failure is more obvious.
These do use the same state that is generated by previous cases.
Also exposes shared cache and garbage collection/prune for the source data. Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
LGTM, let's take care of the nits if needed in a follow-up PR.
Depends on #31984. Fixes #31829.
This PR adds a new POST /session endpoint to the daemon that can be used for running interactive long-running protocols between the client and the daemon. Not ready for code review yet, but open for feedback on the design. Currently, this is missing cache handling features and validations for the fs change stream.

The client calls this endpoint and exposes features that the daemon can call back to. The first use case, also included in this PR, is the implementation for sending incremental context changes for the builder. There is work going on to use this for accessing pull credentials. Later it could be used for accessing secrets from the client, transferring build artifacts back to the client, etc.
When the client calls the /session endpoint, it requests an upgrade to a different transport protocol and exposes the possible callback features in the request headers. The only transport method currently supported is gRPC.

The client API package adds only one new method, client.DialSession, that is used for requesting a hijacked connection over the HTTP API. The other logic is in separate packages so that the api package doesn't add any extra dependencies.

client/session (location tbd) contains the actual code for creating the interactive sessions. This is called by the cli and takes client.DialSession as a dependency. This package also defines the interfaces that need to be implemented to expose new features or new transports on top of the session.
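(A sketch of how these pieces fit together from the CLI side; everything beyond the client.DialSession name described above is assumed, including the exact signature and the "h2c" proto value.)

```go
package sketch

import (
	"context"
	"net"
)

// DialSession mirrors the one new client method described above: it
// requests a hijacked connection over the HTTP API. The exact signature
// is assumed here.
type DialSession func(ctx context.Context, proto string, meta map[string][]string) (net.Conn, error)

// runSession shows the layering: the cli creates the session via the
// client/session package and passes the dialer in as a dependency.
func runSession(ctx context.Context, dial DialSession) error {
	conn, err := dial(ctx, "h2c", nil) // proto value assumed
	if err != nil {
		return err
	}
	defer conn.Close()
	// client/session would now serve the session's exposed features
	// (e.g. fssession's incremental context sync) over this conn.
	return nil
}
```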
The implementation for incrementally sending filesystem changes is in client/session/fssession. This uses the fs changes stream algorithm from containerd, modified to compare files from the client with the files from the previous build instead of comparing an rwlayer to an rolayer. It currently uses the helper package tonistiigi/fsutil for the file checks; this package would likely go away in the future, with the responsibility split between the containerd and continuity repos.

The fscache component implements the access point for filesystem data that has been transferred from external sources. After a build, the build context remains there so it can be used by the next invocations. The GC routine of docker prune (tbd) can delete this data at any time, which would cause the client to resend the full context next time.

Testing out:
The feature is behind the experimental flag, so to test it, the daemon needs to be started with --experimental. To invoke a build using the session stream instead of sending the tar context, set the --stream flag.

Currently, for testing individual components, this will use the session but still transfer the files with a tar archive (so there shouldn't be a meaningful performance gain). To use the protocol that can do rsync-like incremental updates, set the env variable BUILD_STREAM_PROTOCOL to diffcopy.