Refinery pods using LiveReload become unresponsive and eventually consume all CPU
#836
Labels
type: bug
Versions
Steps to reproduce
LiveReload enabled
Additional context
Important preface: I'm going to give some context around an outage here, but I want to be clear: I'm not creating this issue to complain about free software! I 💝 refinery, I'm so happy we have it, and I'm very grateful for the work y'all put in to creating and maintaining it! As with all outages we learned some good lessons, and ultimately I just want to make sure we understand what went wrong and how we and others can avoid it in the future!
We had an outage yesterday and while I'm still trying to piece together exactly what happened, I'm relatively certain that LiveReload'ing refinery pods were the main driver. Here's what I know:
refinery instances using kubectl. After I did this, two nodes which had not been reporting for the past hour-ish suddenly reported that they had been under near 100% load for the entire duration of the outage, and they recovered as soon as I restarted the refinery pods on them.
LiveReload might cause infinite CPU usage. This seems like it would be related to fix: live reload deadlock #810 maybe? But we should have that patch already since we are on 2.1.2?