how to recreate cluster? #120
Hi, All! This is probably a dumb question, but I don't understand enough about what's happening behind the scenes to tell how or why it's dumb... which I guess makes it non-dumb?

I am trying to prototype a hopeful analysis, but am currently stuck fiddling to figure out how to optimize the pipeline structure, the Dask cluster memory, and other configuration. As part of this process, I run a trial, watch the Dask dashboard, watch workers fail, and then need to revisit my setup (e.g., increase the RAM per worker).

When I do this, then launch and connect to a new cluster and rerun the code I'm stuck on, I typically find that the dashboard never populates with workers and work never starts. I imagine it's just hanging, waiting for resources to be provisioned. I thought I was fine iterating this way, as long as I was calling

Thanks for any insight!
Replies: 2 comments 1 reply
@erthward if you're having memory issues with Dask, you might also want to try this: dask/distributed#7128
`cluster.close()` should be releasing the resources, so I think you're doing things correctly. You can check `gateway.list_clusters()` to see if you have any other clusters still running. It's possible that there's some delay between calling `cluster.close()` and the resources actually being freed by Kubernetes (and you don't have access to the Kubernetes API, so you don't have any direct visibility into when the resources are freed).
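
In case it helps, here is a rough sketch of how that check could be folded into the iteration loop. It assumes you are using `dask_gateway.Gateway`. The helper names `stop_leftover_clusters` and `recreate_cluster` are made up for illustration, and the `worker_memory` option in the usage comment is an assumption, since the available cluster options depend on how your gateway is configured. The functions only rely on `list_clusters()`, `stop_cluster()`, `new_cluster()`, `scale()`, `get_client()`, and `Client.wait_for_workers()`.

```python
def stop_leftover_clusters(gateway):
    """Stop every cluster the gateway still reports for the current user.

    `gateway` is assumed to be a dask_gateway.Gateway instance.
    Returns the names of the clusters that were stopped.
    """
    stopped = []
    for report in gateway.list_clusters():
        gateway.stop_cluster(report.name)
        stopped.append(report.name)
    return stopped


def recreate_cluster(gateway, n_workers, timeout=300, **options):
    """Clear leftover clusters, start a fresh one, and wait for its workers.

    `options` are passed straight to gateway.new_cluster(); which option
    names are valid depends on the deployment (hypothetical here).
    """
    stop_leftover_clusters(gateway)
    cluster = gateway.new_cluster(**options)
    cluster.scale(n_workers)
    client = cluster.get_client()
    # Fail with a timeout error rather than hanging if workers never arrive.
    client.wait_for_workers(n_workers, timeout=timeout)
    return cluster, client


# Usage (needs dask-gateway installed and a reachable gateway server):
#
#   from dask_gateway import Gateway
#   gateway = Gateway()
#   cluster, client = recreate_cluster(gateway, n_workers=4, worker_memory=8)
```

If the delay on the Kubernetes side is the culprit, the timeout at least turns a silent hang into an explicit error, so you know to wait a bit and try again.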