Latest rc can't start xrd containers #2324

Closed
edvgui opened this issue Dec 2, 2024 · 15 comments · Fixed by #2326
Labels
bug Something isn't working

Comments

edvgui commented Dec 2, 2024

When deploying some xrd containers with the latest rc, they fail to start, showing the following error in the container logs:

The boot script specified in XR_EVERY_BOOT_SCRIPT is not executable: /etc/xrd/mgmt_intf_v6_addr.sh
[ERROR  ] Invalid environment variable(s)
[ERROR  ] XRd hit a critical error during initialization and has aborted launch

Containerlab version is

  ____ ___  _   _ _____  _    ___ _   _ _____ ____  _       _     
 / ___/ _ \| \ | |_   _|/ \  |_ _| \ | | ____|  _ \| | __ _| |__  
| |  | | | |  \| | | | / _ \  | ||  \| |  _| | |_) | |/ _` | '_ \ 
| |__| |_| | |\  | | |/ ___ \ | || |\  | |___|  _ <| | (_| | |_) |
 \____\___/|_| \_| |_/_/   \_\___|_| \_|_____|_| \_\_|\__,_|_.__/ 

    version: 0.60.0-rc1
     commit: 5468b6f2
       date: 2024-11-28T15:28:39Z
     source: https://github.com/srl-labs/containerlab
 rel. notes: https://containerlab.dev/rn/0.60/

If I try the same topology with the latest release (0.59.0) it works just fine.

Any idea what might have changed related to this?

Member

hellt commented Dec 2, 2024

what xrd version is this?

Author

edvgui commented Dec 2, 2024

This is 7.9.2. I haven't tested more recent versions yet; should I?

Author

edvgui commented Dec 2, 2024

I guess it has to be related to this: c06dd25

Author

edvgui commented Dec 2, 2024

Changing the permissions on the script to make it executable (on the host where containerlab is installed) fixed it for the first container of the lab to start, but the next one failed with the above-mentioned error. This is confusing.

Update: something is resetting the permissions of the script and making it non-executable; no idea what it is.
Update 2: never mind, I didn't realize the file was in the state dir of the container, and therefore different for each container (so it had to be fixed for each of them).

Member

hellt commented Dec 2, 2024

this is likely for @kaelemc who did the original thing
it worked in my test bench that has one container with 7.8.1 version. I think it might be permissions indeed, since this is what I see

❯ ls -la clab-xrd/xrd/mgmt_intf_v6_addr.sh
-rw-rwxr--+ 1 root root 146 Dec  2 11:43 clab-xrd/xrd/mgmt_intf_v6_addr.sh

setting it +x for the world should likely fix it

Author

edvgui commented Dec 2, 2024

> setting it +x for the world should likely fix it

Indeed, I can do this for each "state dir" used by my containers and on the next deploy they will work. But I guess this should be addressed in containerlab before the next release?

hellt added the bug label Dec 2, 2024

Contributor

kaelemc commented Dec 2, 2024

Yes, the perms issue is clearly the case here.

What's confusing me is I tested this across 2 machines when I made the PR, and after this rc release was published I installed it on 2 fresh installs of clab where XRd worked exactly as intended with that script.

utils.AdjustFileACLs() should be setting the +x. In development I got this exact error without utils.AdjustFileACLs(). I wonder what's going on.

Contributor

kaelemc commented Dec 2, 2024

by chance are you running as the root user (not using sudo)?

Author

edvgui commented Dec 2, 2024

I am indeed, I tried it both with clab installed on the host (as root) and using clab in docker, where I believe the user is root as well.

Member

hellt commented Dec 2, 2024

@kaelemc when you create a file in Go, make it world executable

Contributor

kaelemc commented Dec 2, 2024

Yes, sorry, that was totally an oversight on my end. I've opened #2326

@edvgui @hellt kindly requesting if you guys could test it out :).

Author

edvgui commented Dec 2, 2024

I am sorry, but I couldn't manage to build the binary. I used make build on your branch, but the command always ends up hanging while downloading some dependency (this is the first time I have ever built a Go binary, so it really does download a lot, as the cache is completely empty). I have no idea what is going on; I will try again tomorrow.

Contributor

kaelemc commented Dec 2, 2024

@edvgui That's OK. Are you on x86 architecture? I can try to build you a binary, and I think you should be able to use it.

EDIT: Of course you are, else you couldn't run XRd 😅

Contributor

kaelemc commented Dec 2, 2024

@edvgui Hopefully this works; this is my first time trying this.

sudo docker run --rm -v $(pwd):/workspace ghcr.io/oras-project/oras:v1.1.0 pull ghcr.io/kaelemc/clab-oci:79a545116

This will download a containerlab binary into the directory where you execute that command (you might have to chmod +x the binary :) ).

You should be able to just ./containerlab deploy hopefully.

Author

edvgui commented Dec 3, 2024

@kaelemc thanks, I appreciate the effort. I managed to get some help and got the build working. I tested it and it is looking good on my end. 😄

hellt closed this as completed Dec 3, 2024