This repository has been archived by the owner on Feb 1, 2021. It is now read-only.

Add support for pull on all hosts (no change in docker run) #349

Merged
merged 3 commits into from
Mar 6, 2015

Conversation

vieux
Contributor

@vieux vieux commented Feb 2, 2015

Right now, we have an issue: when we do a docker run, the scheduler selects a node.
Then, if that node doesn't already have the image, swarm pulls it automatically.
We do this because there is no way to tell the client to pull the missing image on a particular node.
One issue is: it doesn't work with private images, because swarm doesn't have the client's credentials.

With this PR, you can do a docker pull manually to pull an image on all the nodes; this works for private images.
This PR also adds a basic implementation of docker rmi and docker inspect for images.

There is no change in the way docker run works.

docker pull rethinkdb
ubuntu-2: Pulling downloaded... 
fedora-1: Pulling downloaded... 
ubuntu-1: Pulling downloaded... 

then

docker pull rethinkdb
ubuntu-2: Pulling downloaded...
fedora-1: Pulling downloaded...
ubuntu-1: Pulling downloaded... : rethinkdb:latest

then

docker pull rethinkdb
ubuntu-2: Pulling downloaded... : rethinkdb:latest
fedora-1: Pulling downloaded... : rethinkdb:latest
ubuntu-1: Pulling downloaded... : rethinkdb:latest

@chanwit
Contributor

chanwit commented Feb 3, 2015

I think it's the best solution for the time being.

@aluzzardi
Contributor

A few points:

  • Given that DeployContainer doesn't trigger a pull automatically anymore, re-scheduling will fail. Perhaps the pull flag should be exposed there and set to false by the API (and we'll let the scheduler set it to true when re-scheduling).
  • What happens when the cluster grows? Let's say you did a docker run redis. It will pull redis on every host. Now a new host comes up and we do a docker run redis again, which happens to end up on this new machine. The CLI will issue a pull, resulting in the redis image being updated on every single host, which may not be what the user expected.
  • What happens at scale? We have our target of ~100 nodes. Making docker pull pull on every node makes sense; the user explicitly asked for that. However, pulling a hundred times on every docker run might not be something we want. Or maybe it is: perhaps this is great, since we can re-schedule containers on the fly without having to pull images, resulting in less downtime. Thoughts?

@chanwit
Contributor

chanwit commented Feb 4, 2015

What happens at scale?

Should we intercept docker run and explicitly check for and pull missing images?
With this, pulling could happen only on the node selected to run.

@aluzzardi
Contributor

Unfortunately, docker run is implemented entirely on the CLI side: it issues a create (which gets a 404 since the image doesn't exist), then a pull, then a create, then a start.
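The create / 404 / pull / create / start sequence described here can be sketched as a small Go program. Note that `create`, `pull`, and `start` are hypothetical stand-ins for the Docker remote API calls the CLI makes, not actual Swarm or Docker code:

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the 404 the daemon returns when an image is missing.
var errNotFound = errors.New("404: image not found")

// pulled simulates local image presence on the selected node (hypothetical).
var pulled = map[string]bool{}

func create(image string) error {
	if !pulled[image] {
		return errNotFound
	}
	return nil
}

func pull(image string) { pulled[image] = true }

func start(image string) { fmt.Printf("started container from %s\n", image) }

// run mimics the client-side sequence: create -> (404) -> pull -> create -> start.
func run(image string) error {
	if err := create(image); errors.Is(err, errNotFound) {
		pull(image)
		if err := create(image); err != nil {
			return err
		}
	} else if err != nil {
		return err
	}
	start(image)
	return nil
}

func main() {
	if err := run("redis"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Because the whole loop lives in the CLI, Swarm only sees the individual API calls, which is why it cannot add a flag to the sequence.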

@vieux
Contributor Author

vieux commented Feb 4, 2015

  • I agree about the flag; it should be set by the API.
  • What happens when the cluster grows?
    Here is an idea that might be crazy, instead of "remembering the pulls":
    When a new node joins the cluster, the master could have a rule: scan every node it already knew about, and for each image that is present on every node, pull it onto the new one.
  • What happens at scale?
    I don't think it would be an issue; as you said, it would speed up rebalancing.

@vieux vieux mentioned this pull request Feb 9, 2015
@bfirsh
Contributor

bfirsh commented Feb 10, 2015

+1 we need this! yay!

My concerns with the implementation are similar to @aluzzardi's. Also – what if you commit an image? How does that image get propagated to other nodes in the cluster?

Perhaps instead of each node having all of the images, a swarm could be attached to a registry which represented the images in a swarm. When a container is created, the assigned node could pull the image from the registry. When you commit, the image gets pushed to the registry.

Also – another crazy idea: what if Engines could push and pull to each other? Perhaps the images in a swarm could be an aggregate of the images on each of the nodes. When a node in the cluster needed an image that was on another node, it could pull from that node. /cc @stevvooe @dmp42 who have had similar ideas

@stevvooe

@bfirsh We can break this down using the following rules:

  1. The "swarm" has a global image store.
  2. The current location of any image should never matter to the operator.
  3. Push, commit and pull client commands from swarm operate on this store.

From this, it follows that using the run command may result in an "internal pull". If an image cannot be resolved in the local cluster (read "swarm registry"), the run command would fail. I'm not sure if runtime image resolution exists currently, but a pluggable registry for internal clusters would make this possible.
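Rule 3 and the "internal pull" could be sketched as follows. The `imageStore` interface, `memStore`, and all function names here are hypothetical, illustrating the shape an internal cluster registry might take rather than any existing Swarm API:

```go
package main

import (
	"errors"
	"fmt"
)

// imageStore models the cluster-global store from rule 1; which node actually
// holds an image (rule 2) is hidden behind this interface.
type imageStore interface {
	Has(image string) bool
	PullTo(node, image string) error
}

// memStore is an in-memory stand-in for a swarm registry.
type memStore struct{ images map[string]bool }

func (s *memStore) Has(image string) bool { return s.images[image] }

func (s *memStore) PullTo(node, image string) error {
	if !s.images[image] {
		return errors.New("image not in swarm registry: " + image)
	}
	return nil // real code would stream layers to the node
}

// run schedules image on node, performing the "internal pull" described above;
// it fails if the image cannot be resolved in the local cluster.
func run(store imageStore, node, image string) error {
	if !store.Has(image) {
		return fmt.Errorf("run %s: image not resolvable in cluster", image)
	}
	return store.PullTo(node, image)
}

func main() {
	store := &memStore{images: map[string]bool{"redis": true}}
	fmt.Println("redis:", run(store, "ubuntu-1", "redis"))
	fmt.Println("mysql:", run(store, "ubuntu-1", "mysql"))
}
```

The point of the interface is that the operator never names a node: resolution and placement stay internal to the cluster.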

As a side note, pulling all images to all machines is a huge problem for large deployments. Imagine the read amplification for a small deployment on a large cluster, where the application is deployed to a small percentage of nodes.

@vieux vieux mentioned this pull request Feb 20, 2015
@allisonvoll

As a side note, pulling all images to all machines is a huge problem for large deployments. Imagine the read amplification for a small deployment on a large cluster, where the application is deployed to a small percentage of nodes.

Maybe add the option to apply a filter in docker pull (as is possible via the scheduler in docker run)?

@vieux
Contributor Author

vieux commented Feb 20, 2015

The thing is, we are limited by the docker CLI, and right now the CLI doesn't accept any flag.

We also tried making docker pull redis pull on every machine and docker pull redis@node1 pull only on node1, but we cannot use symbols.

See #337 for reference.


@kgoudeaux

What about maintaining a mapping of images to filters? On a create call that requests image X with filter constraint=ssd store the X->[filter1, filter2] relationship. Any subsequent pull for X (e.g. from the client that just got a 404 on create) will trigger the pull on all hosts that match the cached filters for image X. If there are no associated filters, pull everywhere. For new nodes, walk the data in reverse to find images that match the node.
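A minimal sketch of this image-to-filters index in Go, assuming nodes carry simple key=value labels (all type and field names here are hypothetical, not Swarm code):

```go
package main

import (
	"fmt"
	"strings"
)

// node is a minimal stand-in for a Swarm node with scheduling labels.
type node struct {
	name   string
	labels map[string]string
}

// filterIndex maps an image to the constraints it was created with,
// e.g. "X" -> ["storage=ssd"], as suggested above.
type filterIndex map[string][]string

// record remembers the constraint seen on a create call for an image.
func (f filterIndex) record(image, constraint string) {
	f[image] = append(f[image], constraint)
}

// targets returns the nodes a subsequent pull of image should hit: nodes
// matching any cached constraint, or every node when none are cached.
func (f filterIndex) targets(image string, nodes []node) []node {
	constraints, ok := f[image]
	if !ok {
		return nodes
	}
	var out []node
	for _, n := range nodes {
		for _, c := range constraints {
			kv := strings.SplitN(c, "=", 2)
			if len(kv) == 2 && n.labels[kv[0]] == kv[1] {
				out = append(out, n)
				break
			}
		}
	}
	return out
}

func main() {
	nodes := []node{
		{"ubuntu-1", map[string]string{"storage": "ssd"}},
		{"ubuntu-2", map[string]string{"storage": "disk"}},
	}
	idx := filterIndex{}
	idx.record("redis", "storage=ssd")
	for _, n := range idx.targets("redis", nodes) {
		fmt.Println(n.name) // prints "ubuntu-1"
	}
}
```

For new nodes, the same index can be walked in reverse: for each cached image, check whether the new node matches its filters.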

@bacongobbler
Contributor

+1 for the ability to pre-fetch images onto every swarm host!

What happens at scale? We have our target of ~100 nodes. Making docker pull pull on every node makes sense; the user explicitly asked for that. However, pulling a hundred times on every docker run might not be something we want. Or maybe it is: perhaps this is great, since we can re-schedule containers on the fly without having to pull images, resulting in less downtime. Thoughts?

I think this is actually a great thing that the global pull occurs on a docker run. For our use case, we would like a speedy recovery in the instance that a host dies rather than having to wait however long it takes to pull the image onto the new host.

Here is an idea that might be crazy, instead of "remembering the pulls":
When a new node joins the cluster, the master could have a rule: scan every node it already knew about, and for each image that is present on every node, pull it onto the new one.

👍 for "smart" operations on node joins. I can see that we could add "smart" node re-balancing scheduler strategies with this workflow in the future, too :)

@smothiki
Contributor

Victor has announced that there is a plan for the Swarm manager to be HA, with leader election through Raft. If the Swarm manager could store or cache the images being pulled on each node, then when a container is created on a node other than the one that has the image, that node could simply pull from the Swarm manager; this just requires editing the image or JSON template for the pull request from manager to node.

@stevvooe

@vieux I don't have a solution for the CLI aspect, other than tying image fetch to a resource offer framework.

I would respectfully request that you avoid the docker pull redis@node1 since it will collide with the proposal in moby/moby#10740.

@vieux
Contributor Author

vieux commented Feb 24, 2015

@stevvooe thanks for pointing that out. In any case, we are avoiding it, because it is not possible with today's CLI.

@gabrtv @aluzzardi I updated the PR to remove the pull on docker run. To use private images, you'll have to docker pull (on every node) manually.

@vieux vieux added this to the 0.2.0 milestone Feb 27, 2015
@vieux vieux changed the title Proposal: add support for inspect (images), pull and rmi on all hosts Add support for inspect (images), pull and rmi on all hosts (no change in docker run) Mar 2, 2015
@vieux
Contributor Author

vieux commented Mar 2, 2015

I updated the title and description to make things more accurate.

/cc @aluzzardi

@vieux vieux changed the title Add support for inspect (images), pull and rmi on all hosts (no change in docker run) Add support for pull on all hosts (no change in docker run) Mar 5, 2015
@@ -186,6 +186,14 @@ func (c *Cluster) Image(IdOrName string) *cluster.Image {
return nil
}

func (c *Cluster) Pull(name string, begin, end func(string)) {
for _, node := range c.nodes {
Contributor

This should probably happen in parallel

Contributor Author

I'll try to find a way; last time I checked, it was messing up the docker CLI UI completely.

Contributor Author

Ah, since I changed the output, it works!

@vieux
Contributor Author

vieux commented Mar 6, 2015

@aluzzardi please take another look; I updated the PR with your suggestions.

Signed-off-by: Victor Vieux <vieux@docker.com>
Signed-off-by: Victor Vieux <vieux@docker.com>
done := make(chan bool, size)
for _, n := range c.nodes {
go func(nn *node) {
callback(nn.Name(), "")
Contributor

Could you ensure the callback is not null before invoking it?

Contributor Author

I will, even though I don't think it is necessary. The function isn't called by the cluster driver implementation, but by the swarm API, so we won't call it with nil.

Signed-off-by: Victor Vieux <vieux@docker.com>
@vieux
Contributor Author

vieux commented Mar 6, 2015

@aluzzardi PTAL

aluzzardi added a commit that referenced this pull request Mar 6, 2015
Add support for pull on all hosts (no change in docker run)
@aluzzardi aluzzardi merged commit 466242c into docker-archive:master Mar 6, 2015
@aluzzardi aluzzardi deleted the pull_rmi branch March 6, 2015 19:56
@huslage

huslage commented Apr 7, 2015

Wouldn't adding support for docker login on each node solve the credentials issue and allow docker run to pull images again?

@jippi

jippi commented Dec 20, 2015

👍
