Unexpected workspace restart due to scale down of a replicaset #22750

Closed
akurinnoy opened this issue Jan 5, 2024 · 11 comments · Fixed by eclipse-che/che-dashboard#1048
Labels
area/devworkspace-operator area/dogfooding Using Eclipse Che to code, test and build Eclipse Che kind/bug Outline of a bug - must adhere to the bug report template. severity/P1 Has a major impact to usage or development of the system. sprint/next

Comments

@akurinnoy
Contributor

akurinnoy commented Jan 5, 2024

Describe the bug

Using Che on the dogfooding cluster, I noticed my workspace being restarted. It seems to happen when I pause typing in the editor for a few minutes.

Che version

7.79@latest

Steps to reproduce

(updated)

  1. Open the User Dashboard and create a workspace using this Git repository: https://github.com/eclipse-che/che-dashboard
  2. Once the project is cloned, find the devfile in the root of the project and replace its content with this gist (https://gist.github.com/akurinnoy/a96ac53d5df6354c95339670cf665560)
  3. Reload the workspace using the local devfile.
  4. Run tasks:
    • devfile: watch frontend
    • devfile: dogfooding start
  5. Wait for the workspace to be restarted.

Expected behavior

The workspace does not restart on its own.

Runtime

OpenShift

Screenshots

Screenshot 2024-01-05 at 11 10 23

Installation method

other (please specify in additional context)

Environment

other (please specify in additional context)

Eclipse Che Logs

No response

Additional context

No response

@akurinnoy akurinnoy added the kind/bug Outline of a bug - must adhere to the bug report template. label Jan 5, 2024
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Jan 5, 2024
@ibuziuk
Member

ibuziuk commented Jan 5, 2024

@amisevsk do you happen to have any idea why the deployment could be scaled down seemingly at random?

@ibuziuk ibuziuk added sprint/next severity/P1 Has a major impact to usage or development of the system. area/devworkspace-operator and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Jan 5, 2024
@ibuziuk
Member

ibuziuk commented Jan 8, 2024

Adding the devworkspace-operator label for initial investigation; however, it might be completely unrelated to DWO.

@ibuziuk ibuziuk changed the title [dogfooding] Unexpected workspace restart [dogfooding] Unexpected workspace restart due to scale down of a replicaset Jan 8, 2024
@ibuziuk ibuziuk changed the title [dogfooding] Unexpected workspace restart due to scale down of a replicaset Unexpected workspace restart due to scale down of a replicaset Jan 8, 2024
@ibuziuk ibuziuk added the area/dogfooding Using Eclipse Che to code, test and build Eclipse Che label Jan 8, 2024
@amisevsk
Contributor

amisevsk commented Jan 8, 2024

I'm not sure the steps to reproduce are complete -- when I start a workspace from the provided devfile, the dashboard is not cloned and I don't see any commands (maybe a bug in the che-tasks-extension?). I cloned the che-dashboard repository and opened it in code, but the devfile tasks failed to run at that point.

@akurinnoy
Contributor Author

Hi @amisevsk ,

That's my fault, I missed some steps in the Steps to reproduce section. I've added the missing steps, so hopefully it works now.

@amisevsk
Contributor

amisevsk commented Jan 9, 2024

Hi @akurinnoy -- looks like your edits didn't get saved. The steps look the same as before to me :)

@akurinnoy
Contributor Author

@amisevsk I've finally updated the description, sorry about that :)

@amisevsk
Contributor

amisevsk commented Jan 9, 2024

Thanks! I was able to reproduce the issue now (after running update dependencies as well, in case anyone runs into issues).

From the DWO logs, it is indeed updating the workspace deployment which is triggering a rollout (i.e. a new pod since the replicaset is updated). This seems to be due to something updating the DevWorkspace to have a new environment variable:

  		Name: \"CHE_DASHBOARD_URL\",
  		Value: strings.Join({
  			\"http\",
- 			\"://amisevsk-che-dashboard-local-server\",
+ 			\"s://che-dogfooding\",
  			\".<redacted>.com\",
  		}, \"\"),

which is coming from the che-code editor's DevWorkspaceTemplate.

@akurinnoy Is there a chance the dashboard, while running in dogfooding mode, is updating this DevWorkspaceTemplate to use its current dashboard URL?

@akurinnoy
Contributor Author

@amisevsk let me check it.

@akurinnoy
Contributor Author

Hi @amisevsk ,
I got two reloads in a row, and there were no requests that may update the DevWorkspaceTemplate...

@amisevsk
Contributor

I'll take another look.

@amisevsk
Contributor

I've tested it again on dogfooding and what I've found is that as soon as I load the dashboard in my browser, the URL for CHE_DASHBOARD_URL is updated with the current dashboard's URL (which is different in the running-within-a-workspace case).

Since DWO does not automatically trigger reconciles when DevWorkspaceTemplates are updated, this change is not initially noticed by the DevWorkspace Operator. However, when something does trigger a reconcile on the workspace, DWO will notice that the resolved workspace has a different CHE_DASHBOARD_URL and update the deployment, triggering a restart of the workspace. You can test this by manually triggering a reconcile, e.g. by adding a nonsense annotation to the DevWorkspace:

oc annotate --overwrite dw $DEVWORKSPACE_NAME -n $DEVWORKSPACE_NAMESPACE "reconcile=$(date +%s)"

(Run this in the workspace terminal)
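Before forcing a reconcile, it may help to check whether a restart is already pending: if the URL stored in the DevWorkspaceTemplate differs from the one the running container was started with, the next reconcile should roll out a new pod. The following is a minimal sketch, not a confirmed procedure. It assumes the template is named che-code-che-dashboard (the name used in the tracking loop in this comment), that the env entry is serialized as a `name:` line directly followed by a `value:` line, and that CHE_DASHBOARD_URL is visible in the workspace terminal. The helper names are hypothetical.

```shell
# Hypothetical helper: read DevWorkspaceTemplate YAML on stdin and print the
# value of the CHE_DASHBOARD_URL env entry.
# Assumption: the "value:" line directly follows the "name:" line.
extract_dashboard_url() {
  awk '
    found && $1 == "value:" { gsub(/"/, "", $2); print $2; exit }
    { found = ($0 ~ /name: CHE_DASHBOARD_URL/) }
  '
}

# Succeeds (exit 0) when the template URL differs from the URL the pod has.
url_changed() {
  template_url=$1
  running_url=$2
  [ "$template_url" != "$running_url" ]
}

# Cluster usage; only defined here, not executed.
# Assumptions: `oc` is logged in, and $CHE_DASHBOARD_URL in the terminal
# reflects the value the current pod was started with.
check_pending_restart() {
  template_url=$(oc get dwt che-code-che-dashboard -o yaml | extract_dashboard_url)
  if url_changed "$template_url" "$CHE_DASHBOARD_URL"; then
    echo "template: $template_url, pod: $CHE_DASHBOARD_URL (next reconcile should restart)"
  else
    echo "URLs match; a reconcile should not restart the workspace for this reason"
  fi
}
```

Running `check_pending_restart` in the workspace terminal right after loading a dashboard page should show whether the template has drifted from the running pod.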

To track changes to the CHE_DASHBOARD_URL env var, I used the following bash snippet:

while true; do
  date
  oc get dwt che-code-che-dashboard -o yaml | grep -A 1 CHE_DASHBOARD_URL
  sleep 1
done

While this is running, reload either the usual Che dashboard or the one running within your workspace to see the effect.
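The one-second loop prints the same value on every poll, so the moment of change is easy to miss in the scrollback. A possible refinement, sketched here with a hypothetical `print_on_change` filter, logs a timestamped line only when the output differs from the previous poll. The filter behaves like `uniq`, but is written as a shell loop so each line is emitted as soon as it is read. The template name is the one from the loop above; everything else is an assumption.

```shell
# Hypothetical filter: pass a line through only when it differs from the
# previous line (the first line is always printed).
print_on_change() {
  prev=
  seen=0
  while IFS= read -r line; do
    if [ "$seen" -eq 0 ] || [ "$line" != "$prev" ]; then
      printf '%s\n' "$line"
    fi
    prev=$line
    seen=1
  done
}

# Cluster usage; only defined here, not executed (assumes `oc` is logged in).
watch_dashboard_url() {
  while true; do
    # Join the two grep lines so that each poll produces a single line.
    oc get dwt che-code-che-dashboard -o yaml | grep -A 1 CHE_DASHBOARD_URL | tr '\n' ' '
    echo
    sleep 1
  done | print_on_change | while IFS= read -r line; do
    # Timestamp each change as it is observed.
    printf '%s %s\n' "$(date +%T)" "$line"
  done
}
```

With `watch_dashboard_url` running, each dashboard reload that rewrites the template should produce exactly one new line.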

I'm still not sure where the env var is getting set, but it seems to be on first load of the dashboard page.
