Websocket error when running Filet #22
Comments
I've been running jobs on a bigger machine (n2-highmem-48 with 384GB of RAM instead of 128GB). All was working fine, but after 2 days some Lily daemons started to die. 😿 Same error as always, Sentinel Archiver says That said, after manually 😅 checking lots of machines, I discovered this error:
cc @rvagg: the above looks like the area you worked around some time back
gee, we're digging back into history on this one
My best guess is that we're dealing with a |
Small update. Got the error again when running a |
@davidgasquez are you sure the only task running was the |
It was only running Need to dig a bit more on this one though. As I mentioned, the end error is related to |
From time to time, `filet` jobs get stuck in Google Cloud Batch. The `lily` daemon gets killed and `sentinel-archiver` hangs waiting for it to come back.

This is what the resources look like:

The log produced by `lily` reports `no route found for ::` and `websocket: close 1000 (normal)`.

The issue might be related to the job lacking resources.
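For illustration only, here is a rough sketch of how a waiting process could give up after a deadline instead of hanging forever when the daemon it depends on dies. This is not how `sentinel-archiver` is implemented; `wait_for_daemon`, `is_alive`, and the timeout values are hypothetical names and numbers chosen for the example.

```python
import time
from typing import Callable


def wait_for_daemon(
    is_alive: Callable[[], bool],
    timeout_s: float,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_alive() until it returns True or timeout_s elapses.

    Returns True if the daemon came back in time, False otherwise.
    is_alive is a hypothetical health check supplied by the caller,
    for example a probe against the daemon's API endpoint.
    """
    deadline = clock() + timeout_s
    while True:
        if is_alive():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # Deadline reached: report failure so the caller can exit
            # with a non-zero status instead of blocking the batch job.
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_s, remaining))
```

With something like this, the job could fail fast (and be retried by the scheduler) when the daemon does not return, rather than staying stuck until someone checks the machine by hand.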