Unable to start workspaces due to Syntax error in che-plugin-registry #14704

Closed
5 tasks done
davidwindell opened this issue Sep 28, 2019 · 18 comments
Labels
area/install Issues related to installation, including offline/air gap and initial setup kind/bug Outline of a bug - must adhere to the bug report template. severity/P2 Has a minor but important impact to the usage or development of the system.

Comments

@davidwindell
Contributor

Describe the bug

Unable to start workspaces as the che-plugin-registry is failing to start.

=> sourcing 10-set-mpm.sh ...
=> sourcing 20-copy-config.sh ...
=> sourcing 40-ssl-certs.sh ...
AH00526: Syntax error on line 39 of /opt/rh/httpd24/root/etc/httpd/conf.d/mod_security.conf:
ModSecurity: Failed to open debug log file: /var/log/httpd24/modsec_debug.log

Che version

  • nightly

Steps to reproduce

Try to start any new or existing workspace.

Runtime

  • kubernetes

Installation method

  • helm

Environment

  • Cloud
    • other (Rancher/AWS)
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Sep 28, 2019
@tolusha
Contributor

tolusha commented Sep 30, 2019

I successfully deployed Che on minikube using the following command:
chectl server:start --platform=minikube --installer=helm --k8spodreadytimeout=500000

@davidwindell
Have you tried with chectl?

@tolusha tolusha added status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. area/install Issues related to installation, including offline/air gap and initial setup and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Sep 30, 2019
@davidwindell
Contributor Author

@tolusha no, I'm using Rancher/Vanilla K8s so no chectl.

I've just forced the helm chart to use 7.2.0 instead of nightly and it works. As far as I can tell, there is definitely a bug in quay.io/eclipse/che-plugin-registry:nightly.

@tolusha tolusha added severity/P2 Has a minor but important impact to the usage or development of the system. kind/bug Outline of a bug - must adhere to the bug report template. and removed status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. labels Oct 3, 2019
@amisevsk
Contributor

amisevsk commented Oct 8, 2019

I haven't tested on k8s but quay.io/eclipse/che-plugin-registry:nightly starts without issue on OpenShift.

@davidwindell
Contributor Author

@amisevsk any ideas on how I could provide helpful debug for this? Where might the syntax error be coming from?

@amisevsk
Contributor

amisevsk commented Oct 9, 2019

@davidwindell I'm kind of at a loss -- we don't do any modification to the Apache config from the upstream registry.centos.org/centos/httpd-24-centos7 image. Are you able to start a plain registry.centos.org/centos/httpd-24-centos7 container on your cluster (i.e. update the image in the plugin registry deploy to the httpd container we're basing off of)?

In my working container, the referenced line is

$ sed '39!d' /opt/rh/httpd24/root/etc/httpd/conf.d/mod_security.conf
    SecDebugLog /var/log/httpd24/modsec_debug.log

and the /var/log/httpd24/ directory has permissions

$ ls -al /var/log/httpd24
total 0
drwxrwx---. 1 default    root 54 Oct  9 13:09 .
drwxr-xr-x. 1 root       root 21 Sep 11 18:24 ..
-rw-r-----. 1 1000450000 root  0 Oct  9 13:09 modsec_audit.log
-rw-r-----. 1 1000450000 root  0 Oct  9 13:09 modsec_debug.log

Could you check whether your deploy looks similar? Perhaps it's a permissions issue related to the UID the container runs as.
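For anyone repeating this check, here is a minimal sketch to run from a shell inside the failing container. The helper name check_writable is hypothetical (it is not part of Che or the httpd image); the directory is the one named in the error message. It prints the numeric UID/GIDs of the current process, the numeric owner and mode of the directory, and whether the process can write there.

```shell
#!/bin/sh
# Hypothetical helper (not part of Che or the httpd image): report who the
# current process runs as and whether it can write to the given directory.
check_writable() {
    dir="$1"
    # Numeric UID and all numeric GIDs of the current process.
    echo "uid=$(id -u) gids=$(id -G)"
    # Numeric owner, group and mode of the directory itself.
    ls -ldn "$dir" 2>/dev/null || echo "cannot stat: $dir"
    if [ -w "$dir" ]; then
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
        return 1
    fi
}

# Usage inside the container (path taken from the error message):
# check_writable /var/log/httpd24
```

Comparing the printed UID/GIDs with the owner and group shown by ls should indicate whether the failure to open modsec_debug.log is plausibly a permissions problem.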

@davidwindell
Contributor Author

I've just pulled today's nightly image and it randomly seems to be resolved. I'm going to close for now, thank you for looking into it.

@amisevsk
Contributor

@davidwindell Let us know if it crops up again -- I don't like when issues mysteriously go away.

@davidwindell
Contributor Author

Neither do I, will do 👍

@davidwindell davidwindell reopened this Oct 10, 2019
@davidwindell
Contributor Author

This came back (perhaps it wasn't running nightly when I thought it was before). I've confirmed and I can't run the upstream image...

[Screenshot: screenshot-rancher edge-servers com-2019 10 10-16_30_51]

@davidwindell
Contributor Author

There's something about my node(s) that doesn't want to run it, but I've no idea why. It runs fine on my local docker and anything non-k8s.

@davidwindell
Contributor Author

davidwindell commented Oct 10, 2019

Ah, I've just tried to force the container to run as UID 0 (root) and that worked:

docker run -u 0 registry.centos.org/centos/httpd-24-centos7

TL;DR: setting the below on the plugin registry in our helm chart fixes this for me:

securityContext:
  runAsUser: 0

I'd really like to understand why a simple docker run on our K8s cluster throws the error but locally it doesn't!

@amisevsk
Contributor

Thanks for the analysis @davidwindell -- looks like it's still a bug somewhere, since running as root should not be necessary.

@davidwindell
Contributor Author

Just FYI this is also now happening on the latest devfile-registry images (not just plugin-registry). Something must have changed upstream that broke running as non-root.

@davidwindell
Contributor Author

This is still an issue with 7.5.1

@davidwindell
Contributor Author

Also reproduced in 7.6.0. I think it could be something to do with the permissions of the /var/log/httpd24 folder, which is currently drwxrwx--- 2 default root 4096 Jan 2 11:09 httpd24.

The helm chart doesn't specify a runAsUser by default.

@amisevsk
Contributor

amisevsk commented Jan 7, 2020

@davidwindell In my experience, the user that runs in a k8s/OpenShift container is a member of the root group (and so should have write permissions to httpd24). Can you confirm this is the case in your cluster (i.e. open a terminal into the container and check the group number)?
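One way to do that check is sketched below, assuming a POSIX shell with the id command is available in the container. The helper name has_gid is hypothetical; it only tests whether a target GID appears in a list of GIDs such as the output of id -G.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the target GID (first argument) appears
# among the remaining arguments, e.g. the numeric GIDs printed by `id -G`.
has_gid() {
    target="$1"
    shift
    for gid in "$@"; do
        if [ "$gid" = "$target" ]; then
            return 0
        fi
    done
    return 1
}

# Usage inside the container: is the current process in group 0 (root)?
# if has_gid 0 $(id -G); then
#     echo "member of the root group"
# else
#     echo "not a member of the root group"
# fi
```

If the process is not in group 0, the drwxrwx--- mode on /var/log/httpd24 (owner default, group root) would leave it without write access, which would be consistent with the log file failing to open.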

@davidwindell
Contributor Author

@amisevsk here's the output when I force the image to run with a custom entrypoint:

[Screenshot: screenshot-rancher edge-servers com-2020 01 (1)]

@davidwindell
Contributor Author

It looks like this is now fixed somehow, closing.
