Allow replacing ingress network #2028
Conversation
This will now need a rebase, but I wanted the review process to start.
api/objects.proto
Outdated
// this information, so that on reboot or leader re-election, new leader
// will not automatically create the routing mesh if it cannot find one
// in store.
bool routingMeshOff = 10;
A few comments here:
- protobuf names should use underscores, for example routing_mesh_off. This gets converted to CamelCase by the protobuf compiler for Go code.
- Maybe routing_mesh_disabled is better?
- It's too bad we don't have a message inside Cluster to store networking options, to organize this and stuff like network_bootstrap_keys and encryption_key_lamport_clock. Maybe it's not too late to start one, though.
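For reference, a minimal sketch of the naming mapping being described; the struct below is illustrative, not the full generated Cluster message:

// proto: bool routing_mesh_disabled = 10;
// protoc-gen-go turns the snake_case name into a CamelCase Go field:
type Cluster struct {
	RoutingMeshDisabled bool // plus the other generated fields
}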
I did not opt for enabled/disabled because this variable indicates a state, which could be just temporary until the user creates a custom ingress network. I thought on/off was more appropriate for a state, while enabled/disabled is more for a config (which we do not have).
But not being a native speaker, routing_mesh_disabled works just as well for me. Will change to that.
I think it may be possible to avoid keeping this state. Right now we create the default cluster object here:
What if we created the ingress network at the same time, but only if the CreateCluster call succeeded? This would ensure that a swarm cluster always gets an ingress network by default, but if it is deleted later on, it would not be recreated automatically.
I'm not sure if this would be sort of hacky. It seems like it would simplify things a bit, but I don't have a strong opinion.
I am happy to avoid the extra state, but the other options I had before were also sort of hacky, like adding a mock ingress network object to the store to denote that the ingress network need not be created.
IIUC, the allocator would attempt to create the default cluster object, and the failure would indicate this is a leader re-election, or a cluster start after a shutdown. In that case I can do:
if ErrNoIngress && cannot create cluster object {
    routingMeshDisabled = true
}
But what does it mean that the allocator succeeds in creating the default cluster object? Once the allocator gets a chance to run, isn't the default cluster object already present in the store, as created by the manager? Or is the allocator's cluster object a mock one not seen by anybody else?
I rejected adding the mock ingress network because, if the user downgrades docker, the previous version would not handle the mock network properly.
I was suggesting moving ingress network creation out of the allocator, to where the initial cluster and node objects are created (see link). It kind of makes sense to do that, since there would be a single place where initial resources are created.
Every time the manager enters the leader state, it tries to create a default cluster object, because this might be a fresh swarm that has never had a leader before. We could also create an ingress network at this time, if creating the default cluster object succeeded (which means this is indeed a fresh swarm).
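For illustration, a sketch of that flow; defaultCluster and newIngressNetwork() are placeholders for the existing default cluster object and the predefined ingress template, and the wiring is not the merged code:

// Create the initial objects in one transaction; the ingress network is
// only created when CreateCluster succeeds, i.e. on a fresh swarm.
if err := s.store.Update(func(tx store.Tx) error {
	if err := store.CreateCluster(tx, defaultCluster); err != nil {
		// The cluster object already exists: re-election or restart,
		// not a fresh swarm. Aborting the transaction also skips the
		// ingress network, so a deliberate deletion by the user sticks.
		return err
	}
	// Fresh swarm: seed the default ingress network from the template.
	return store.CreateNetwork(tx, newIngressNetwork())
}); err != nil {
	// Expected on an existing swarm: nothing to seed.
}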
Interesting.
Ideally I would prefer to allocate all the networking resources in one place, in allocator/network.go.
But what you are suggesting would mean only pushing the default ingress network object to the store, without allocating it yet.
Let me give it a try.
Tested it, it works. Thanks.
@@ -321,6 +321,9 @@ message NetworkSpec {
// enabled(default case) no manual attachment to this network
// can happen.
bool attachable = 6;

// Ingress indicates this network will provide the routing-mesh.
Let's mention in the comment that legacy ingress networks won't have this flag set, and instead have the com.docker.swarm.internal label and the name ingress.
Ok
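A sketch of a helper that could treat both forms as the ingress network; the function name is illustrative, while the label key and legacy name are the ones mentioned in the comment above:

// IsIngressNetwork reports whether n is the routing-mesh network, either
// via the new Ingress flag or via the legacy marker (the
// com.docker.swarm.internal label plus the reserved name "ingress").
func IsIngressNetwork(n *api.Network) bool {
	if n.Spec.Ingress {
		return true
	}
	// Legacy ingress networks created before the flag existed.
	_, ok := n.Spec.Annotations.Labels["com.docker.swarm.internal"]
	return ok && n.Spec.Annotations.Name == "ingress"
}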
manager/allocator/network.go
Outdated
clusters, err = store.FindClusters(readTx, store.ByName(store.DefaultClusterName))
}); err != nil {
return errors.Wrapf(err,
"failed to retrieve cluster object to check routing mesh state during init: %v", err)
Remove the %v with the err argument. This will include err twice in the string.
thanks, I will change it.
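For illustration, the redundancy in question (github.com/pkg/errors appends the wrapped error's message when the error is printed):

// errors.Wrapf already appends err's message, so the extra %v argument
// renders it twice, e.g. "...: not found: not found".
// Plain errors.Wrap is sufficient:
return errors.Wrap(err, "failed to retrieve cluster object to check routing mesh state during init")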
manager/allocator/network.go
Outdated
// If ingress network is not found, create one right away
// using the predefined template.
if len(networks) == 0 {
nc.clusterID = clusters[0].ID
You should check that clusters has a nonzero length. FindClusters will not return an error if it doesn't find any matching clusters.
There should always be a default cluster, so it would be a bug if one didn't exist. But I think it's better to return an error than crash in that case.
Good to know, I was in fact not sure about this.
I will follow the length-check logic I saw in other parts of the code.
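A sketch of the guard being discussed, reusing the identifiers from the hunk above (the error wording is illustrative):

clusters, err := store.FindClusters(readTx, store.ByName(store.DefaultClusterName))
if err != nil {
	return errors.Wrap(err, "failed to retrieve cluster object during init")
}
// FindClusters returns an empty slice, not an error, when nothing matches,
// so guard the index: returning an error beats panicking on clusters[0].
if len(clusters) == 0 {
	return errors.New("default cluster object not found in store")
}
nc.clusterID = clusters[0].ID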
manager/allocator/network.go
Outdated
@@ -992,3 +1090,40 @@ func updateTaskStatus(t *api.Task, newStatus api.TaskState, message string) {
t.Status.Message = message
t.Status.Timestamp = ptypes.MustTimestampProto(time.Now())
}

func GetIngressNetwork(s *store.MemoryStore) (*api.Network, error) {
You'll need a comment to pass lint.
yep
manager/controlapi/network.go
Outdated
// if you change this function, you have to change createInternalNetwork in
// the tests to match it (except the part where we check the label).
if err := validateNetworkSpec(request.Spec, s.pg); err != nil {
if err := s.validateNetworkRequest(request.Spec, s.pg); err != nil {
This should be called inside the s.store.Update call below, to make sure multiple CreateNetwork invocations can't race to create multiple ingress networks.
You might consider changing GetIngressNetwork to take a store.ReadTx instead of a store.
Ah, good suggestion. I was under the impression we only run a best-effort check, and a concurrent ingress network creation would anyway fail to be allocated by the allocator.
If we can protect against the race, that is better.
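A sketch of the suggested shape: the lookup takes a transaction, and the check runs inside the same store.Update that creates the network, so two concurrent requests cannot both observe "no ingress network yet". ErrNoIngressNetwork is the exported error discussed further down; the exact wiring is illustrative:

// GetIngressNetwork looks up the ingress network within a transaction, so
// the caller can combine the check and the create atomically.
func GetIngressNetwork(tx store.ReadTx) (*api.Network, error) {
	networks, err := store.FindNetworks(tx, store.All)
	if err != nil {
		return nil, err
	}
	for _, n := range networks {
		if n.Spec.Ingress {
			return n, nil
		}
	}
	return nil, ErrNoIngressNetwork
}

// In CreateNetwork, the check and the creation share one update:
err := s.store.Update(func(tx store.Tx) error {
	if request.Spec.Ingress {
		if existing, err := GetIngressNetwork(tx); err == nil {
			return grpc.Errorf(codes.AlreadyExists, "ingress network (%s) is already present", existing.ID)
		}
	}
	return store.CreateNetwork(tx, n)
})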
manager/allocator/network.go
Outdated
nc.clusterID = clusters[0].ID

// Check if we have the ingress network. If not found create
// it before reading all network objects for allocation.
All of the code to check for the ingress network and create it if necessary should be inside a store.Update transaction. Otherwise, something else could create an ingress network through the API at the same time.
Thanks for the advice. I will see if I can refactor it.
Actually, this code is only executed on the elected leader, once the allocator is started (run()).
The allocator is the only component which will check and create the ingress network if needed.
Also, I do not think the system can receive a user request to create an ingress network at this time.
Hopefully we do not need to worry about concurrent creations here.
manager/allocator/network.go
Outdated
if err != nil {
return errors.Wrap(err, "failed to find ingress network after creating it")
return errors.Wrapf(err, "failed to find ingress network after creating it: %v", err)
The %v with the err argument is redundant.
ok
manager/controlapi/service.go
Outdated
if doesServiceNeedIngress(service) {
if _, err := allocator.GetIngressNetwork(s.store); err != nil {
if grpc.Code(err) == codes.NotFound {
return grpc.Errorf(codes.PermissionDenied, "service needs ingress network, but ingress network is not present")
I think this should use codes.FailedPrecondition.
ok
manager/controlapi/service.go
Outdated
if doesServiceNeedIngress(service) {
if _, err := allocator.GetIngressNetwork(s.store); err != nil {
if grpc.Code(err) == codes.NotFound {
return nil, grpc.Errorf(codes.PermissionDenied, "service needs ingress network, but ingress network is not present")
I think this should use codes.FailedPrecondition.
Makes sense. Will change it.
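A sketch of the resulting check; FailedPrecondition signals a well-formed request that the system's current state (no ingress network) cannot satisfy, which fits better than PermissionDenied. The fallthrough error return is an assumption, not the quoted code:

if doesServiceNeedIngress(service) {
	if _, err := allocator.GetIngressNetwork(s.store); err != nil {
		if grpc.Code(err) == codes.NotFound {
			// Valid request, but the cluster lacks an ingress network.
			return nil, grpc.Errorf(codes.FailedPrecondition, "service needs ingress network, but ingress network is not present")
		}
		return nil, err
	}
}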
manager/controlapi/service.go
Outdated
@@ -424,6 +425,33 @@ func (s *Server) checkSecretExistence(tx store.Tx, spec *api.ServiceSpec) error
return nil
}

func doesServiceNeedIngress(srv *api.Service) bool {
if srv.Spec.Endpoint.Mode != api.ResolutionModeVirtualIP {
Endpoint will be nil if unspecified in the API request, so be sure to check before dereferencing it.
I have not hit this problem during testing. Probably the cli client fills in a default endpoint object, as it defaults to VIP mode. The right thing to do here is to check for nil. Thanks.
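A nil-safe sketch of the check; hasIngressPublishedPorts is a stand-in for the rest of the existing logic and is not an actual function in the tree:

func doesServiceNeedIngress(srv *api.Service) bool {
	ep := srv.Spec.Endpoint
	if ep == nil {
		// Endpoint is optional in the API request; without it the service
		// cannot be publishing ports through the routing mesh.
		return false
	}
	if ep.Mode != api.ResolutionModeVirtualIP {
		return false
	}
	return hasIngressPublishedPorts(ep)
}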
manager/allocator/network.go
Outdated
@@ -691,6 +786,9 @@ func (a *Allocator) allocateService(ctx context.Context, s *api.Service) error {
// world. Automatically attach the service to the ingress
// network only if it is not already done.
if isIngressNetworkNeeded(s) {
if nc.ingressNetwork == nil {
return fmt.Errorf("Ingress network is missing")
lowercase ingress
manager/allocator/network.go
Outdated
return n, nil
}
}
return nil, grpc.Errorf(codes.NotFound, "no ingress network found")
I think it's better not to return a gRPC error here, because the allocator doesn't use gRPC at all. Maybe define an exported error and return that.
var ErrNoIngressNetwork = errors.New("no ingress network found")
Then callers can easily check if this error was returned, and translate to a gRPC error if appropriate.
It's because I have the controlapi functions which validate the service and network specs call this method.
Got it, I will export an error for that.
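A sketch of that split: the sentinel lives in the allocator (no gRPC dependency), and controlapi translates it where a status code is needed; the surrounding wiring is illustrative:

// In package allocator:
var ErrNoIngressNetwork = errors.New("no ingress network found")

// In controlapi, when validating a service that needs the routing mesh:
if _, err := allocator.GetIngressNetwork(s.store); err != nil {
	if err == allocator.ErrNoIngressNetwork {
		return nil, grpc.Errorf(codes.FailedPrecondition, "service needs ingress network, but ingress network is not present")
	}
	return nil, err
}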
Codecov Report

@@            Coverage Diff             @@
##           master    #2028      +/-   ##
==========================================
- Coverage   53.93%   53.82%   -0.12%
==========================================
  Files         109      109
  Lines       19100    19187      +87
==========================================
+ Hits        10302    10327      +25
- Misses       7561     7608      +47
- Partials     1237     1252      +15

Continue to review full report at Codecov.
@aaronlehmann Updated. Tested re-election with the new logic you suggested; it works fine.
manager/controlapi/network.go
Outdated
@@ -112,6 +114,13 @@ func (s *Server) CreateNetwork(ctx context.Context, request *api.CreateNetworkRe
}

err := s.store.Update(func(tx store.Tx) error {
if request.Spec.Ingress {
if n, err := allocator.GetIngressNetwork(s.store); err == nil {
return grpc.Errorf(codes.PermissionDenied, "ingress network (%s) is already present", n.ID)
codes.AlreadyExists
manager/controlapi/network.go
Outdated
}
for _, srv := range services {
if doesServiceNeedIngress(srv) {
return grpc.Errorf(codes.PermissionDenied, "ingress network cannot be removed because service %s depends on it", srv.ID)
codes.FailedPrecondition
Design LGTM
manager/allocator/network.go
Outdated
@@ -163,8 +111,11 @@ func (a *Allocator) doNetworkInit(ctx context.Context) (err error) {
}
}

skipIngressNetworkAllocation:
I think this label is confusing because the flow from line 89 (nc.ingressNetwork = ingressNetwork) could also reach here.
Yes, the code from line 89 is expected to reach here.
Let me see if turning the if blocks into a switch makes it clearer, so that I can avoid using the goto.
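An illustrative shape of that rework (not the merged code): the outcomes of the ingress lookup are handled in one switch, so no label is needed to jump over the allocation step.

ingressNetwork, err := GetIngressNetwork(a.store)
switch err {
case nil:
	// The ingress network exists in the store (pre-existing or just seeded
	// by the leader); record it and allocate it with the others below.
	nc.ingressNetwork = ingressNetwork
case ErrNoIngressNetwork:
	// No ingress network: the user removed it or runs without a routing
	// mesh, so there is nothing to allocate for it.
default:
	return errors.Wrap(err, "failure while looking up ingress network during init")
}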
Signed-off-by: Alessandro Boch <aboch@docker.com>
Thanks @dongluochen. I reworked the code to avoid the goto, via a switch, and added some comments.
LGTM
@@ -112,6 +114,13 @@ func (s *Server) CreateNetwork(ctx context.Context, request *api.CreateNetworkRe
}

err := s.store.Update(func(tx store.Tx) error {
if request.Spec.Ingress {
nit: I think it is better to move this check into store.CreateNetwork(tx, n), where it already checks for duplicate names.
Not sure if it is appropriate to percolate the ingress notion down into store.CreateNetwork().
It is true that it now does a check on duplicate network names, but that is a more generic check on the top-level resource name.
If you guys think it makes sense, I can move it there now.
I think either is fine.
LGTM
}
}
return false
}
@aboch: Sorry for the post-merge comment. I was just looking at this again, and I noticed that this function is similar to (but not the same as) isIngressNetworkNeeded in the allocator. Are the differences correct?
Would it be possible to combine these functions as an exported function in the allocator? I think it's better for controlapi not to implement this logic directly.
Would it be possible to combine these functions as an exported function in the allocator?
Agree.
Regarding the differences, I want to double-check. Probably some are redundant given the spec validation that happens first.
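A sketch of a single exported helper in the allocator that controlapi could call instead of keeping its own copy; the name is illustrative, and whether the extra VIP-mode condition belongs here is exactly the difference to double-check:

// ServiceNeedsIngress reports whether the service publishes ports through
// the routing mesh and therefore requires the ingress network.
func ServiceNeedsIngress(s *api.Service) bool {
	if s == nil || s.Spec.Endpoint == nil {
		return false
	}
	// The controlapi variant also required VIP resolution mode; keep or
	// drop this condition depending on the outcome of the double-check.
	if s.Spec.Endpoint.Mode != api.ResolutionModeVirtualIP {
		return false
	}
	for _, p := range s.Spec.Endpoint.Ports {
		if p.PublishMode == api.PublishModeIngress {
			return true
		}
	}
	return false
}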
Was this intended to prevent service connections to any "internal" network at all, or merely the ingress network? From the documentation, I assumed internal was a generic network type that application developers could use to place sensitive services in.
@archisgore I am not sure I follow, so I am adding some more background to make sure we are on the same page for further clarification. This change allows the user to remove and/or replace the swarm ingress network, giving interested users control over the ingress network configuration. The ingress network is the infrastructure network which provides the routing mesh for traffic ingressing the host. In the swarmkit code this network was previously marked with the com.docker.swarm.internal label and the reserved name ingress; with this change it is identified by the Ingress flag in the network spec. In the docker networking model there is also an internal network option, which only restricts a network's external connectivity and is unrelated to the ingress network. This change concerns only the ingress network, not connections to internal networks.
Oh okay. I see. I'm seeing some strange behavior with the latest builds where:
Followed by a RemoteAPI call to start a service connected to that network returns:
But if I do:
The same call works. I searched for that error string and found this commit.
Thank you @archisgore. What you found is a bug. Will take care of it.
Please see moby/moby/pull/31714 for details.
Signed-off-by: Alessandro Boch <aboch@docker.com>