502 bad gateway for workspace health check on Minikube when using UBI9 #23179
Comments
This bug somewhat resembles #23103 (comment), which occurs on a Kubernetes (K3s) cluster. However, for that bug, the issue occurs with the empty workspace sample which uses the UDI - so they might be separate issues. |
Linking this comment from another issue, as it seems to be a similar issue to my description of the postStart event failing. My description:
Comment findings:
@azatsarynnyy @vitaliy-guliy @RomanNikitenko It's possible that the che code entrypoint postStart event might be failing for UBI9 images (see above comment). Please let me know if you have any thoughts on this. Note that the entrypoint might be failing only on Minikube but succeeding on OpenShift. On Apple Silicon, though, the entrypoint fails for both Minikube and OpenShift Local (this may be a different, Apple Silicon-specific issue).
I tested installing Che on Minikube using chectl. In the ingress-nginx-controller logs, I see an HTTP 500 result, followed by redirection to the Dashboard:
If I curl the workspace URL, I see an HTTP 302 response. I assume this is because the dashboard is redirecting me from the workspace URL to the dashboard URL.
The DevWorkspace Operator logs also show the 502 error that the health check is failing:
So I can confirm that this issue occurs not only with the Che Operator install-on-minikube script, but also with chectl alone.
@AObuchow
In general, could you take a look at the entrypoint's logs again? Is there something like:
I'm trying to fix problems with starting Che locally on my machine to investigate the problem... |
@RomanNikitenko Thank you for the follow-up. I checked the I haven't looked into this thoroughly, but microsoft/vscode#204178 might be relevant. My current theory: It seems like CheCode calls This makes sense, since the UID on the ubi9-python image is 1234 and calling
This might be tricky to resolve... how do we ensure an arbitrary image used for the tooling container has a user entry in /etc/passwd? I think the only way we could modify /etc/passwd (since it's owned by root) is by mounting an updated version as a Kubernetes volume. Worst case, we have to document that the UID in the container image must have an entry in /etc/passwd to be used in Che. I believe the reason why we don't have this issue on OpenShift is that OpenShift will automatically set the UID and add an entry to /etc/passwd for us. Maybe there's a Kubernetes alternative to this feature that could be added to the chectl install process? Here are the full entrypoint logs:
Actually, the UBI9 python image in question sets USER to 1001, so I'm not sure where the UID 1234 is coming from yet.
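The theory that the entrypoint fails because the running UID has no passwd entry can be checked directly from inside the tooling container. The following is only a minimal sketch: the function name `uid_in_passwd` is hypothetical, and it reads the passwd file itself rather than going through NSS, so it would miss users provided by any other source.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the numeric UID ($1) has an entry
# in the passwd file ($2, defaulting to /etc/passwd).
uid_in_passwd() {
    uid="$1"
    passwd_file="${2:-/etc/passwd}"
    # The third colon-separated field of each passwd line is the numeric UID.
    awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$passwd_file"
}

# Report on the UID of the current process.
if uid_in_passwd "$(id -u)"; then
    echo "UID $(id -u) has a passwd entry"
else
    echo "UID $(id -u) has no passwd entry"
fi
```

Running this in a shell inside the workspace container (for example through `kubectl exec`) would show whether the UID the pod actually runs as is known to the image.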
Some new findings & a temporary workaround (for this specific devfile) below. The UID 1234 was coming from the default pod security context used by DevWorkspace Operator on Kubernetes:

    defaultKubernetesPodSecurityContext = &corev1.PodSecurityContext{
        RunAsUser:    pointer.Int64(1234),
        RunAsGroup:   pointer.Int64(0),
        RunAsNonRoot: pointer.Bool(true),
        FSGroup:      pointer.Int64(1234),
    }
    defaultKubernetesContainerSecurityContext = &corev1.SecurityContext{}

Setting the pod & container security context through the Che Cluster CR results in the workspace starting up (the che code entrypoint succeeds). I'm not sure yet if this is the minimal configuration required to get the workspace starting:

      kind: CheCluster
      metadata:
        name: eclipse-che
        namespace: eclipse-che
      spec:
        components:
          cheServer:
            debug: false
            logLevel: INFO
          dashboard:
            logLevel: ERROR
          devWorkspace: {}
          devfileRegistry:
            disableInternalRegistry: true
            externalDevfileRegistries:
              - url: https://registry.devfile.io
          imagePuller:
            enable: false
            spec: {}
          metrics:
            enable: true
          pluginRegistry:
            disableInternalRegistry: true
        containerRegistry: {}
        devEnvironments:
          containerBuildConfiguration:
            openShiftSecurityContextConstraint: container-build
          defaultNamespace:
            autoProvision: true
            template: <username>-che
          disableContainerBuildCapabilities: true
          ignoredUnrecoverableEvents:
            - FailedScheduling
          maxNumberOfWorkspacesPerUser: -1
          secondsOfInactivityBeforeIdling: 1800
          secondsOfRunBeforeIdling: -1
    +     security:
    +       containerSecurityContext:
    +         allowPrivilegeEscalation: true
    +         readOnlyRootFilesystem: false
    +         runAsNonRoot: true
    +       podSecurityContext:
    +         fsGroup: 1001
    +         runAsUser: 1001
          startTimeoutSeconds: 3000
          storage:
            pvcStrategy: per-user
        gitServices: {}
        networking:
          auth:
            gateway:
              configLabels:
                app: che
                component: che-gateway-config
            identityProviderURL: http://dex.dex:5556
            oAuthClientName: eclipse-che
            oAuthSecret: EclipseChe
          domain: 192.168.49.2.nip.io
          tlsSecretName: che-tls

@RomanNikitenko I would argue this is either a DevWorkspace Operator bug (since it's responsible for setting the default pod & container security context on Kubernetes) or a chectl bug (you could argue the pod & container security context configured in the CheCluster CR could use the above values, but this probably wouldn't work for images that don't use USER 1001). Investigation needs to be done on the DWO side to see if removing the default pod security context used on Kubernetes will resolve this issue (though I have my doubts). The default 1234 UID value seems like it was an arbitrary choice to ensure the value was set. Here's the original PR where this was introduced: devfile/devworkspace-operator#748
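For an existing installation, the security settings from the workaround could in principle be applied as a merge patch instead of editing the whole CR. This is a sketch under assumptions: the function names are hypothetical, the CheCluster is assumed to be named `eclipse-che` in the `eclipse-che` namespace as in the CR from this comment, and the UID only suits images whose USER matches it.

```shell
#!/bin/sh
# Hypothetical helper: print a JSON merge patch carrying the same
# devEnvironments.security values as the workaround, with the given UID ($1)
# used for both fsGroup and runAsUser.
security_patch() {
    uid="$1"
    printf '{"spec":{"devEnvironments":{"security":{"containerSecurityContext":{"allowPrivilegeEscalation":true,"readOnlyRootFilesystem":false,"runAsNonRoot":true},"podSecurityContext":{"fsGroup":%s,"runAsUser":%s}}}}}\n' "$uid" "$uid"
}

# Hypothetical wrapper: apply the patch to the CheCluster (name and namespace
# are assumptions taken from the CR in this comment).
apply_security_patch() {
    kubectl patch checluster/eclipse-che -n eclipse-che --type=merge -p "$(security_patch "$1")"
}
```

Calling `apply_security_patch 1001` would mirror the values used for the UBI9 python image; as noted, it is not yet known whether all of these fields are required.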
Hello @AObuchow I tend to think it is a DWO bug rather than a chectl one.
@tolusha I agree, I think the default container and pod security context used in DWO on Kubernetes should perhaps be updated to match the requirements of the 1001 user used in the UBI (since this same user is also used in the UDI, and both the UDI and DWO are devfile projects). Documenting the requirements related to setting the correct UID (through the pod and container security configuration) for the images used when DWO is deployed on Kubernetes should also be done.
Unfortunately, I haven't had a chance to try this out, but I think this would work? Further testing needs to be done. |
I have the same issue with Minikube and Che 7.94. Whenever I start a workspace, it keeps starting the IDE forever. "kubectl logs deploy/ingress-nginx-controller -n ingress-nginx" shows me:
@ronaldor1968 as a workaround, I suggest either using the quay.io/devfile/universal-developer-image:ubi8-latest (you can specify a custom image in the Dashboard's UI when creating a workspace) or configuring the Che Cluster CR in the way I mentioned here (which should prevent this problem for all workspaces created). |
Describe the bug
When Eclipse Che is deployed on Minikube, certain non-UDI based devfile samples don't properly start up. Their health check keeps returning a 502 bad gateway. The same devfiles work when Eclipse Che is deployed on OpenShift. Oddly enough, if you change the workspace so that it uses the UDI
quay.io/devfile/universal-developer-image:ubi8-latest
as the tooling image on Minikube, it seems to start up successfully.
I'm not sure yet if:
Che version
7.92@latest
Steps to reproduce
minikube start --disk-size 50000mb
./build/scripts/minikube-tests/test-operator-from-sources.sh
from the Che Operator repo.
Note:
quay.io/devfile/universal-developer-image:ubi8-latest
as the container image (from the Dashboard) results in the workspace starting up successfully
registry.access.redhat.com/ubi9/python-39:1-197.1726664308
) causes the workspace startup to fail with a failed postStart event. Investigation needs to be done to see which postStart event from the Che Code editor devfile is failing.
Expected behavior
The workspace should start up successfully and the devworkspace's mainURL should give a 200 response when curl'ing it
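The expected behavior can be checked from the command line. The sketch below is illustrative only: both function names are hypothetical, and the `-k` flag is an assumption for a local Minikube install that uses a self-signed certificate.

```shell
#!/bin/sh
# Hypothetical helper: map an HTTP status code ($1) to a short verdict.
classify_status() {
    case "$1" in
        200) echo "healthy" ;;      # the expected result for a started workspace
        302) echo "redirected" ;;   # e.g. bounced to the Dashboard
        502) echo "bad-gateway" ;;  # the failure described in this issue
        *)   echo "unexpected" ;;
    esac
}

# Hypothetical helper: print only the HTTP status code returned for URL ($1).
# -k skips TLS verification (assumption: self-signed certificate on Minikube).
http_status() {
    curl -k -s -o /dev/null -w '%{http_code}' "$1"
}
```

With the devworkspace's mainURL stored in a variable such as `MAIN_URL`, `classify_status "$(http_status "$MAIN_URL")"` would print `healthy` once the workspace is serving requests.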
Runtime
minikube
Screenshots
No response
Installation method
chectl/next, other (please specify in additional context)
Environment
Linux
Eclipse Che Logs
Additional context
I installed Che using
./build/scripts/minikube-tests/test-operator-from-sources.sh
from the Che Operator repo. Verification to ensure this also happens with chectl needs to be done.