RancherOS becomes unstable when cni config doesn't match what system-docker is running #1903
Comments
so with debug off:
|
and with debug on:
so it's not just downgrading any ros - @janeczku we need more specific details. |
@SvenDowideit This is reproducible 3/5 when trying to upgrade our preconfigured VMware vSphere hosts from v1.0.0 to v1.0.1. The upgrade fails with "oci runtime error". Manually forcing a reboot then fails with the same error, making it impossible to reboot and complete the upgrade. To recover we have to reset the host via the vSphere console.
Steps to reproduce
Configuration
system-docker
user-docker
dmesg snippet
Logs |
So this seems to be caused by the custom We have the following configuration in the cloud-config passed to write_files:
- content: |+
    {
      "name": "bridge",
      "type": "bridge",
      "bridge": "docker-sys",
      "isDefaultGateway": true,
      "ipMasq": true,
      "hairpinMode": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.0.0.0/16"
      }
    }
  owner: root
  path: /etc/docker/cni/bridge.d/bridge.conf
  permissions: "0644"

On the hosts where the upgrade to v1.0.1 fails with
even though the custom bridge.config has been written as expected:
On the hosts where the upgrade succeeds the bridge is configured correctly:
|
TLDR: the workaround we use to configure a custom bridge IP by overwriting the http://rancher.com/docs/os/configuration/docker/#system-docker-settings |
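A minimal sketch of how the mismatch described above could be checked by hand, assuming you have the contents of `/etc/docker/cni/bridge.d/bridge.conf` and the address currently assigned to the `docker-sys` bridge (for example, read from `ip addr show docker-sys`). The function name `bridge_matches_cni` and the example addresses are hypothetical and are not taken from the logs in this issue.

```python
import ipaddress
import json


def bridge_matches_cni(cni_conf_text, bridge_cidr):
    """Return True if the bridge address lies inside the CNI ipam subnet.

    cni_conf_text: contents of the CNI bridge config (JSON text).
    bridge_cidr:   address on the bridge in CIDR form, e.g. "10.0.0.1/16".
    """
    conf = json.loads(cni_conf_text)
    subnet = ipaddress.ip_network(conf["ipam"]["subnet"])
    bridge_ip = ipaddress.ip_interface(bridge_cidr).ip
    return bridge_ip in subnet


# The CNI config from the cloud-config above.
CNI_CONF = """
{
  "name": "bridge",
  "type": "bridge",
  "bridge": "docker-sys",
  "isDefaultGateway": true,
  "ipMasq": true,
  "hairpinMode": true,
  "ipam": {"type": "host-local", "subnet": "10.0.0.0/16"}
}
"""

# Hypothetical bridge addresses, for illustration only.
print(bridge_matches_cni(CNI_CONF, "10.0.0.1/16"))     # bridge inside the CNI subnet
print(bridge_matches_cni(CNI_CONF, "172.18.42.2/16"))  # bridge outside the CNI subnet
```

If the check returns False, the file on disk and the bridge that system-docker actually brought up disagree, which is the condition in the issue title.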
merci for the info. looking into it. |
@janeczku I would think that as the
not good, but that way the config is definitely going to be written before you boot into the persistence partition. In my testing I also noticed that after you reboot after the first boot from persistent disk - though you do need to use the vmware power button to do it. In v1.1.0, there's a given http://rancher.com/docs/os/configuration/write-files/#writing-files-in-specific-system-services says that the default is to write files in the console container, I don't really see how the write-files above would work reliably. And yup, this bumps #1870 up as something I may have to get going in 1.2.0 and then think about backporting, depending on how intrusive a change it is. |
I want to support a kernel parameter to modify docker-sys subnet. Tested with my own image:
|
Tested with RancherOS v1.2.0-rc2.
|
Docker 17.03
resolution plan: (updated by Sven)
for v1.0.x and v1.1.0, it's best to move the customisation of the docker-sys bridge into the installation phase, and then reboot - that way it'll be ready when the system docker is started (using cloud-init write-file shouldn't really work)
for 1.2.0, 1.1.x and possibly 1.0.x I'll work on adding a cloud-init setting for this (and hopefully other settings)
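For the first item of the plan, a rough sketch of generating the bridge config for a chosen subnet so it can be written during installation rather than through cloud-init write_files. This is not the planned cloud-init setting; `render_bridge_conf` is a hypothetical helper, and the fields simply mirror the bridge.conf quoted earlier in this thread.

```python
import ipaddress
import json


def render_bridge_conf(subnet):
    """Render a CNI bridge config (JSON text) for the docker-sys bridge.

    Raises ValueError if `subnet` is not a valid network address,
    e.g. "10.0.0.1/16" (host bits set) is rejected.
    """
    net = ipaddress.ip_network(subnet)
    conf = {
        "name": "bridge",
        "type": "bridge",
        "bridge": "docker-sys",
        "isDefaultGateway": True,
        "ipMasq": True,
        "hairpinMode": True,
        "ipam": {"type": "host-local", "subnet": str(net)},
    }
    return json.dumps(conf, indent=2)


# The output would be written to /etc/docker/cni/bridge.d/bridge.conf
# during installation, before the first boot from the persistent disk.
print(render_bridge_conf("10.0.0.0/16"))
```

Validating the subnet up front means a typo is caught at install time instead of surfacing later as a failed system-docker start.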