RKE will not finish deployment if certain network mounts exist on the target node. #964
Comments
To add some additional context to this,
@emm-dee @skaven81 I wasn't able to reproduce the issue on my setup with the latest rke; here are my steps:
rke finished building the cluster successfully and kubelet started correctly. Can you try it again with the latest rke and see if you can still reproduce it?
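(The steps themselves are not preserved in this excerpt. A minimal sketch of that kind of reproduction attempt, assuming a plain NFS mount at /nfs with placeholder server and export names, could look like:)

```sh
# Assumed reproduction attempt, not the reporter's actual commands:
# mount an NFS export on the node, then provision with rke.
sudo mount -t nfs nfsserver:/export /nfs
./rke up --config cluster.yml
```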
@galal-hussein did you mount the nfs filesystem at /usr/local/doc? Just having an NFS mount at /nfs isn't going to reproduce the issue. The problem is caused when the installer attempts to umount /usr/local/doc, which fails because the filesystem is in use.
@skaven81 I tried that. On the NFS server, /etc/exports:
On the rke node, /etc/auto.master:
And in /etc/auto.nfs:
I was able to verify that /usr/local/doc was automounted from the NFS server.
After that I added the rke node to cluster.yml and ran rke up.
Can you share your configuration file? I am not sure what is not working correctly on your setup.
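(The exports and autofs snippets referenced above were not captured here. A sketch of an indirect-map setup that automounts /usr/local/doc, with placeholder server and export paths, might look like:)

```sh
# /etc/exports on the NFS server (export path is a placeholder)
/export/doc  *(rw,sync,no_subtree_check)

# /etc/auto.master on the rke node: an indirect map under /usr/local
/usr/local  /etc/auto.nfs

# /etc/auto.nfs: the "doc" key mounts the export at /usr/local/doc on demand
doc  -rw  nfsserver:/export/doc
```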
@sangeethah can you give it a try too?
Your NFS configuration is not using direct maps. I suspect the issue arises through the use of direct maps, because direct maps appear in /proc/mounts even when they're not mounted. Your /etc/auto.master should contain something like this:
And then
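(The two snippets above are missing from this excerpt. A typical direct-map configuration, with a placeholder map file name and server path, looks roughly like the following; the relevant behavior is that the /usr/local/doc entry appears in /proc/mounts as an autofs mount even before anything triggers it:)

```sh
# /etc/auto.master: "/-" declares a direct map (map file name is a placeholder)
/-  /etc/auto.direct

# /etc/auto.direct: mounts /usr/local/doc directly from the NFS server
/usr/local/doc  -rw  nfsserver:/export/doc
```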
Have we figured out how to reproduce this yet?
Seems to be the same problem with a nonexistent glusterfs network mount. I removed the glusterfs server and forgot to remove the /etc/fstab entry:
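(The fstab line itself was not captured. A stale entry of that kind, with placeholder names, would look something like:)

```sh
# Leftover /etc/fstab entry pointing at a glusterfs server that no longer exists
glusterserver:/gv0  /mnt/gluster  glusterfs  defaults,_netdev  0 0
```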
~~I'm experiencing this problem on one cluster, but not another (an identical cluster in a different datacenter), and I have no NFS mounts.~~ Argh, nevermind, typo on my part. (don't indent
@moelsayed can you help to validate?
I just tried to reproduce this again. Here is my configuration:
Related mounts on the node:
Using latest master, cluster provisioning completed successfully.
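(The configuration and mount listing were not preserved in this excerpt. One way to compare setups, assuming the autofs theory above, is to look at what /proc/mounts reports; the output below is illustrative, not taken from the commenter's node:)

```sh
# List autofs and NFS entries as the installer would see them
grep -E 'autofs|nfs' /proc/mounts
# Illustrative output for an untriggered direct map:
#   /etc/auto.direct /usr/local/doc autofs rw,relatime,... 0 0
```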
@emm-dee Are you still having this problem? Is there any additional configuration you can provide to reproduce it?
The key for reproduction might be that the filesystem has to be in use so that it can't be unmounted. Perhaps try opening a separate shell and
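(The rest of that suggestion is cut off above; presumably it meant keeping the mount busy from the second shell so that umount fails, for example:)

```sh
# From a separate shell, hold the automounted filesystem open
# (the file name is a placeholder)
cd /usr/local/doc
tail -f /usr/local/doc/somefile
# While this runs, `umount /usr/local/doc` should fail with "target is busy"
```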
@skaven81 I vimed a file in the same location and also cd'ed into the directory. I was still able to provision the cluster successfully.
We worked around the issue by dropping the two noted automount points off of our RKE systems a long time ago, so for all I know the issue is fixed in modern versions of RKE. I don't have a way of reproducing it anymore without a lot of work to undo our workaround in a test environment.
Based on the above, closing the issue. Please reopen if you see it again.
RKE version: v0.1.10
Docker version (docker version, docker info preferred): 17.0.3
Operating system and kernel (cat /etc/os-release, uname -r preferred): Ubuntu 16.04
Type/provider of hosts (VirtualBox/Bare-metal/AWS/GCE/DO): Bare metal
cluster.yml file:
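(The actual cluster.yml was not included in this excerpt; a minimal single-node file, with a placeholder address and user, would look roughly like:)

```yaml
# Hypothetical minimal cluster.yml, not the reporter's actual configuration
nodes:
  - address: 10.0.0.10
    user: ubuntu
    role: [controlplane, etcd, worker]
```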
Steps to Reproduce:
rke up
Results:
Failed deployment with the following error:
Findings
The end result was that the issue was caused by having a network mount at /usr/local/doc on the target nodes. When the user removed the network mount, rke up completed successfully. Filing this issue because the customer requires various network mounts across their systems, and it would be ideal if the deployment could ignore any paths that are externally mounted. (More details and chatter are in internal Slack comms on 10/15/2018.)
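(Illustrative only, not RKE's actual code: the requested behavior amounts to checking whether a path is a network or autofs mount before trying to clean it up, roughly:)

```sh
# Sketch of "ignore externally mounted paths" during node cleanup
path=/usr/local/doc
fstype=$(findmnt -n -o FSTYPE "$path")
case "$fstype" in
    nfs|nfs4|autofs|fuse.glusterfs)
        echo "skipping $path: externally mounted ($fstype)" ;;
    *)
        umount "$path" 2>/dev/null || true ;;
esac
```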