This repository has been archived by the owner on Oct 11, 2023. It is now read-only.

Timing issue when using image preload #2297

Closed
igormrbean opened this issue Mar 19, 2018 · 6 comments

Comments

@igormrbean

RancherOS Version: 1.2.0

Where are you running RancherOS? vmware

I'm preloading container images into a RancherOS disk image that I build myself. To do this, I use a script similar to "scripts/run-install" to install onto a virtual disk, then mount the disk and copy my custom tar.gz images into /var/lib/rancher/preload/docker/. My cloud-config.yml defines a custom service that uses one of the preloaded images.
However, when I boot the disk (after converting it to a VMware image), the preload works but takes some time to complete, and ros-sysinit fails to start my container because the preload has not finished yet. After a while, "docker images" shows that the image has been loaded, but the container does not start until I reboot (by which point the image is already available in Docker).

Any idea how to make ros-sysinit wait for "preload-user-images"?

@niusmallnan added this to the v1.4.0 milestone Mar 23, 2018
@niusmallnan
Contributor

We load images through the Docker API SDK (not the Docker CLI). The load call should be asynchronous, so it does not block other processes. I think this may be the cause of the timing issue.

A possible workaround:
Use runcmd instead of your custom service, and run a script that checks whether the image is loaded before it starts your container. (This may not be ideal.)
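
For illustration, the "check if the image is loaded" step could be sketched roughly as below. This is only a sketch, not something RancherOS ships: the function name wait_for_image, the DOCKER override variable, and the default timeout and interval are all assumptions made for the example. It polls `docker image inspect` until the image appears or the timeout expires.

```shell
#!/bin/sh
# Sketch: block until an image is present in the local Docker image store.
# DOCKER is a hypothetical override so the function can target another binary or a stub.
DOCKER="${DOCKER:-docker}"

# wait_for_image IMAGE [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 once `docker image inspect IMAGE` succeeds,
# or 1 if the timeout is reached first.
wait_for_image() {
    image="$1"
    timeout="${2:-300}"   # assumed default: give up after 5 minutes
    interval="${3:-2}"    # assumed default: poll every 2 seconds
    waited=0
    until "$DOCKER" image inspect "$image" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for image $image" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

A runcmd script could then call something like `wait_for_image my-image:latest 600 && docker run -d my-image:latest`, where my-image:latest is a placeholder for your own image.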

Or skip preload-user-images and load the image directly from runcmd:

docker image load -i xxx
docker run xxxx
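
As a rough sketch of this second option: unlike the asynchronous preload, the `docker load` CLI command returns only after the archive has been imported, and it accepts gzip-compressed tarballs, so a script run from runcmd can load the archives first and start the container afterwards. The function name load_images, the DOCKER override variable, and the directory used in the usage note are assumptions for illustration only.

```shell
#!/bin/sh
# Sketch: load image archives one at a time with the Docker CLI.
# DOCKER is a hypothetical override so the function can target another binary or a stub.
DOCKER="${DOCKER:-docker}"

# load_images DIR
# Loads every *.tar and *.tar.gz archive found directly in DIR.
# Stops and returns 1 on the first archive that fails to load.
load_images() {
    dir="$1"
    for archive in "$dir"/*.tar "$dir"/*.tar.gz; do
        [ -f "$archive" ] || continue   # skip glob patterns that matched nothing
        "$DOCKER" load -i "$archive" || return 1
    done
}
```

For example, `load_images /opt/images && docker run -d my-image:latest`, where /opt/images and my-image:latest are placeholders. Keeping the archives outside /var/lib/rancher/preload/docker would presumably avoid the preload service loading them a second time.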

@igormrbean
Author

Hi,

Thank you for your prompt answer. Personally, I think asynchronous loading makes sense for some use cases and synchronous loading for others.

For instance, when you deploy thousands of RancherOS instances and provision them later using Rancher server or something similar, timing isn't really important, and you probably want a faster boot. In my case, however, I want to boot a VM with everything functional on the first boot (container images loaded and containers running).

I'll see if I can use your runcmd workaround. Right now, the only way I have found is to use Packer and mount a RANCHER_STATE drive separately. The downside is that it increases the image size, since the images are stored unpacked under /var/lib/rancher instead of as compressed archives on disk. Also, while this workaround could work for regular containers, it is likely to fail for system-docker, as seen in issue #1698.

Great project, keep up the good work guys.

@niusmallnan
Contributor

@Jason-ZW
We can introduce some flags, such as rancher.user_preload.wait and rancher.system_preload.wait.
With these flags, we can ensure that user-specified images are loaded synchronously.

Please track this issue.

@niusmallnan
Contributor

niusmallnan commented Apr 2, 2018

On boot, RancherOS scans /var/lib/rancher/preload/docker and /var/lib/rancher/preload/system-docker directories and tries to load container image archives it finds there.

For system-docker:
Image loading is asynchronous by default, so it may not have completed when the user starts a service based on the image. This is the root cause of the timing issue.
We will introduce a flag called rancher.preload.wait. If users set it to true, system-docker will wait for the image to be loaded before running the system service.

For user-docker:
The timing issue is more complicated. Unlike the system-docker preload, user-docker uses a compose service to run the preload, and it is very difficult to fully control the startup order in libcompose.
Even Docker does not provide a better way: https://docs.docker.com/compose/startup-order/
So we may redesign the user-docker part in the future.

For now, we will make sure the timing issue is solved for system-docker and provide a workaround for user-docker.

@lucymhdavies

rancher.preload.wait sounds perfect 👍

@kingsd041
Contributor

Fixed in RancherOS v1.4.0-rc1; tested and passed.
