
docker: stop network pause container of lost alloc after node restart #17455

Merged · 1 commit · Jun 9, 2023

Conversation

shoenig (Member) commented Jun 7, 2023

This PR fixes a bug where the docker network pause container would not be
stopped and removed in the case where a node is restarted, the alloc is
moved to another node, and the node comes back up. See the issue below for
full repro conditions.

Basically, in the DestroyNetwork PostRun hook we would depend on the
NetworkIsolationSpec field not being nil - which is only the case
if the Client stays alive all the way from network creation to network
teardown. If the node is rebooted we lose that state and previously
would not be able to find the pause container to remove. Now we manually
find the pause container by scanning all containers and looking for the
associated allocID.

Fixes #17299
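
For readers who want to see the shape of that scan, here is a minimal sketch, not this PR's code: it finds and removes the pause container for an alloc by name, assuming a v23-era github.com/docker/docker client (Nomad's driver uses its own client plumbing). removePauseContainer and its signature are hypothetical; the one grounded detail is the nomad_init_<allocID> naming, visible in the docker ps output in the spot check below.

// Sketch: locate the pause container for an alloc by its well-known
// name and force-remove it. Assumes the official Docker Engine Go SDK.
package main

import (
	"context"
	"fmt"
	"strings"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func removePauseContainer(ctx context.Context, cli *client.Client, allocID string) error {
	// Include stopped containers; after a reboot the pause container
	// may be in any state.
	containers, err := cli.ContainerList(ctx, types.ContainerListOptions{All: true})
	if err != nil {
		return err
	}
	want := "nomad_init_" + allocID
	for _, c := range containers {
		for _, name := range c.Names {
			// The SDK reports names with a leading slash.
			if strings.TrimPrefix(name, "/") != want {
				continue
			}
			// Force stops the container first if it is still running.
			return cli.ContainerRemove(ctx, c.ID, types.ContainerRemoveOptions{Force: true})
		}
	}
	return fmt.Errorf("no pause container found for alloc %s", allocID)
}

func main() {
	cli, err := client.NewClientWithOpts(client.FromEnv, client.WithAPIVersionNegotiation())
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	if err := removePauseContainer(context.Background(), cli, "99a82eee-3965-28cf-b21d-43ea8db1b03f"); err != nil {
		fmt.Println(err)
	}
}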

shoenig (Member Author) commented Jun 7, 2023

Spot check (all we have since this requires reboots)

Setup: 3 nodes (1 server + 2 clients; nomad node status lists only the client nodes)

ubuntu@ip-172-31-25-137:~$ nomad node status
ID        Node Pool  DC   Name              Class   Drain  Eligibility  Status
ded51f46  default    dc1  ip-172-31-19-192  <none>  false  eligible     ready
faf9c60b  default    dc1  ip-172-31-24-55   <none>  false  eligible     ready

Basic docker redis job with bridge network mode

job "redis" {

  group "cache" {
    network {
      mode = "bridge"
      port "db" {
        to = 6379
      }
    }

    task "redis" {
      driver = "docker"

      config {
        image          = "redis:7"
        ports          = ["db"]
        auth_soft_fail = true
      }

      resources {
        cpu    = 500
        memory = 256
      }
    }
  }
}

Run job. Show docker ps -a on both nodes.

ubuntu@ip-172-31-19-192:~$ sudo docker ps -a
CONTAINER ID   IMAGE                                      COMMAND                  CREATED          STATUS          PORTS     NAMES
c785e8e57553   redis:7                                    "docker-entrypoint.s…"   50 seconds ago   Up 49 seconds             redis-99a82eee-3965-28cf-b21d-43ea8db1b03f
16ce1901172c   gcr.io/google_containers/pause-amd64:3.1   "/pause"                 50 seconds ago   Up 49 seconds             nomad_init_99a82eee-3965-28cf-b21d-43ea8db1b03f
ubuntu@ip-172-31-24-55:~$ sudo docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
<empty>

Reboot the node where the alloc is running.

ubuntu@ip-172-31-19-192:~$ sudo reboot
Connection to ec2-54-91-84-184.compute-1.amazonaws.com closed by remote host.

Wait for alloc to be replaced on the other node

ubuntu@ip-172-31-25-137:~$ nomad job status redis | tail -n4
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
b41b9b72  faf9c60b  cache       0        run      running  38s ago    9s ago
99a82eee  ded51f46  cache       0        stop     lost     2m27s ago  10s ago

See that docker ps -a on the old node is unclean (nomad agent not yet started)

ubuntu@ip-172-31-19-192:~$ sudo docker ps -a
CONTAINER ID   IMAGE                                      COMMAND                  CREATED         STATUS                          PORTS     NAMES
c785e8e57553   redis:7                                    "docker-entrypoint.s…"   2 minutes ago   Exited (0) About a minute ago             redis-99a82eee-3965-28cf-b21d-43ea8db1b03f
16ce1901172c   gcr.io/google_containers/pause-amd64:3.1   "/pause"                 2 minutes ago   Up 39 seconds                             nomad_init_99a82eee-3965-28cf-b21d-43ea8db1b03f

Start nomad agent

ubuntu@ip-172-31-19-192:~$ sudo service nomad start

Now in docker ps -a we see the containers are cleaned up

ubuntu@ip-172-31-19-192:~$ sudo docker ps -a
CONTAINER ID   IMAGE     COMMAND   CREATED   STATUS    PORTS     NAMES
<empty>

@shoenig shoenig added backport/1.3.x backport to 1.3.x release line backport/1.4.x backport to 1.4.x release line backport/1.5.x backport to 1.5.x release line labels Jun 7, 2023
@shoenig shoenig added this to the 1.5.x milestone Jun 7, 2023
Comment on lines +493 to +495
if spec == nil {
return nil
}
tgross (Member) commented:

I think I'm having trouble following this. If we return here, how does the Docker driver ever get the DestroyNetwork call so that it can run the findPauseContainer?

shoenig (Member Author) replied Jun 9, 2023:

This path only applies to the other kind of docker-managed [driver] network, where the create/destroy happen over RPC. In the group network case (where we use a pause container), the create/destroy are invoked directly* by the client on the docker driver implementation.

  • where "directly" means through the interface handle, which was itself created via the RPC dispatch.

This code is really difficult to follow.
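
To make the two paths easier to follow, here is a self-contained sketch with stand-in types and function names (not Nomad's actual signatures): the nil guard quoted above lives on the RPC-dispatched driver-network path, while the group-network path reaches the driver's teardown directly, where this PR adds the fallback scan.

// Sketch of the two teardown paths discussed above.
package main

import "fmt"

// networkIsolationSpec stands in for the per-alloc state the client
// loses when the node reboots.
type networkIsolationSpec struct {
	sandboxContainerID string
}

// rpcDestroyNetwork models the RPC-dispatched driver-network path,
// where the nil guard (lines +493 to +495) lives: with no spec there
// is nothing to tear down over RPC, so it returns early.
func rpcDestroyNetwork(spec *networkIsolationSpec) error {
	if spec == nil {
		return nil
	}
	// ... dispatch DestroyNetwork over the plugin RPC ...
	return nil
}

// driverDestroyNetwork models the group-network path, invoked directly
// on the docker driver, where this PR adds the fallback scan.
func driverDestroyNetwork(allocID string, spec *networkIsolationSpec) error {
	id := ""
	if spec != nil {
		id = spec.sandboxContainerID
	}
	if id == "" {
		// Spec lost to a reboot: find the pause container by its
		// well-known name, "nomad_init_"+allocID.
		id = findPauseContainer(allocID)
	}
	if id == "" {
		return fmt.Errorf("no pause container for alloc %s", allocID)
	}
	fmt.Println("stop + remove pause container", id)
	return nil
}

// findPauseContainer stands in for the scan added by this PR.
func findPauseContainer(allocID string) string {
	return "16ce1901172c" // pretend the name scan matched
}

func main() {
	// Post-reboot case: spec is nil on both paths, yet the group
	// network teardown still finds the pause container by name.
	_ = rpcDestroyNetwork(nil)
	_ = driverDestroyNetwork("99a82eee-3965-28cf-b21d-43ea8db1b03f", nil)
}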

tgross (Member) left a comment:

LGTM!

Successfully merging this pull request may close these issues:

nomad left pause-amd64 containers alive if drain_on_shutdown is used (#17299)