-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix container package #820
Conversation
Why removing the part to listen when the network is actually deleted ? We need to ensure that the network is deleted when we remove the service otherwise we will have some problems again. Also adding 1s a bit everywhere in the code is not a good solution, sometime it might take more that 1s to delete the network/container/service |
Did you see the explanation about this in the PR description? Do you want more explanation?
|
Sorry i missed the recursive call. It's better but I feel having a polling or any kind of solutions like that is not so good when we can have events. Why not just using a |
That's what I tried first and it doesn't solve the issue. A Docker event system will make sense to sync in background the service status in realtime between Docker and the Service database. Otherwise, I feel it's too complicated. |
3d71b21
to
f01fded
Compare
@mesg-foundation/core Modification done, PR description updated, please review |
func (c *DockerContainer) deletePendingContainer(namespace []string) error { | ||
ctx, cancel := context.WithTimeout(context.Background(), c.callTimeout) | ||
defer cancel() | ||
func (c *DockerContainer) deletePendingContainer(namespace []string, maxGraceTime time.Time) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify time logic by using <-time.After(stopGracePeriod).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how it will simplify.
The goal here is to "return" as soon as the container is not found.
The maxGraceTime is kind of a timeout.
I think using <-time.After(stopGracePeriod)
will always to call ContainerRemove
or to prevent this, another chan done need to be created to tell the go routine to stop. So actually it seems more complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, i guess current implementation is fine because after we remove the service and check if the container exists right after, Docker still might be in the deleting process of container. So it's good that we don't block for the whole timeout and check the status of container again in short periods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
manual tests are good
This PR fixes and simplifies a few stuff on the package container:
Introduce the option StopGracePeriod.
This option tells docker the time to wait between the service is removed and killing the associated container. Now it's possible to set a custom on when starting a docker service.
Upgrade function
deletePendingContainer
to wait for a grace period before removing the container.It will either use the grace period is set in the docker service data or use the default one (10sec) if the first is not set.
The previous logic of
deletePendingContainer
was killing and removing the container after only a few second, without anyway to change the time (timeout of containerStop was somehow not taking into account, maybe because of docker service).Simplifying / improving function
DeleteNetwork
.The
DeleteNetwork
was using Docker.Events function to listen for network deletion event with a context with timeout. The problem is, sometime, the network is deleted either too early or too late (after the context timeout), so theDeleteNetwork
was falling in such case.I update the code to use a similar logic that the function
deletePendingContainer
, a recursive call that check if the network is actually deleted first.Change default
container.callTimeout
to 60sec.