
Worker can get caught in infinite loop when redis connection is closed unexpectedly #389

Closed
airhorns opened this issue Feb 4, 2021 · 4 comments

airhorns commented Feb 4, 2021

Because of https://github.com/taskforcesh/bullmq/blob/master/src/classes/worker.ts#L192-L195, I am seeing workers get caught in an infinite loop: they try to get the next job, error, and then immediately try to get the next job again.

The loop is this (a simplified sketch follows the list):

  • Redis connection is closed somehow (I assume my own code is doing this, but I don't have a smoking gun yet)
  • The worker is not closed explicitly, so this.closing inside the worker is undefined
  • The worker run loop calls getNextJob asynchronously, which calls waitForJob, which calls BRPOPLPUSH
  • The Redis client throws synchronously, which interrupts execution of waitForJob; the error is caught in getNextJob and swallowed
  • The getNextJob call wins the Promise.race in the worker run loop
  • The getNextJob call returns nothing, so the worker doesn't work a job
  • The worker run loop repeats the process
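
To make the cycle concrete, here is a minimal TypeScript sketch of it. The names mirror the Worker internals mentioned above (run, getNextJob, waitForJob), but the bodies are paraphrased illustrations, not the real implementation, and the BlockingClient type is just what the sketch needs:

```ts
// Hypothetical, heavily simplified paraphrase of the cycle described above.
type BlockingClient = {
  brpoplpush(source: string, destination: string, timeout: number): Promise<string | null>;
};

async function waitForJob(client: BlockingClient) {
  // The blocking call; on a closed connection the client fails here
  // with "Connection is closed."
  return client.brpoplpush('wait', 'active', 5);
}

async function getNextJob(client: BlockingClient): Promise<string | null> {
  try {
    return await waitForJob(client);
  } catch (err) {
    // The swallow described above: the error never propagates,
    // so the caller just sees "no job".
    return null;
  }
}

async function run(client: BlockingClient, closing?: Promise<void>) {
  while (!closing) {
    // Settles almost immediately with null when the connection is closed.
    const job = await getNextJob(client);
    if (job) {
      // ...process the job
    }
    // Nothing blocks or backs off, so the loop spins as fast as the
    // microtask queue allows.
  }
}
```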

I am not sure why that error is being swallowed, or why the "Connection is closed." message is treated specially. If I had to guess, special care has to be taken on blocking calls like that to handle connection closes that we do expect. In this case though, the worker has not been explicitly closed, and it's being asked to use a closed Redis connection, which I think should be an error that at least gets emitted and maybe takes down the process if unhandled.
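
To illustrate the semantics I'm suggesting (only a sketch of the suggestion, not whatever fix ends up landing): swallow the connection error only while the worker is actually closing, and emit/rethrow otherwise. This reuses the hypothetical waitForJob and BlockingClient from the sketch above:

```ts
import { EventEmitter } from 'events';

// Hypothetical error handling for getNextJob, sketching the
// "emit unless we are closing" semantics suggested above.
class SketchWorker extends EventEmitter {
  closing?: Promise<void>;

  async getNextJob(client: BlockingClient): Promise<string | null> {
    try {
      return await waitForJob(client);
    } catch (err) {
      if (this.closing) {
        // Expected: we asked the connection to go away, so a
        // closed-connection error is fine to ignore.
        return null;
      }
      // Unexpected: surface it so an 'error' listener (or the process,
      // if unhandled) can react instead of looping silently.
      this.emit('error', err);
      throw err;
    }
  }
}
```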

Happy to do up a PR if someone can tell me what the semantics should be!

airhorns commented Feb 4, 2021

Also, I think this is actually the root cause of #359, not pausing! The event-loop starvation shows up when my test timeouts (or whatever else is on a timer) never fire, because this async-but-infinite loop runs at a higher priority in Node's tick order, I think (see the sketch below).
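
Here's a standalone Node illustration of that starvation (not BullMQ code): awaiting an already-settled promise only queues a microtask, and the microtask queue is drained before the event loop ever reaches the timers phase, so a loop like this keeps setTimeout callbacks from firing:

```ts
// The timeout below never fires: the async loop keeps the microtask
// queue non-empty, and timers only run once that queue is drained.
async function spin() {
  while (true) {
    await Promise.resolve(); // re-queues a microtask on every iteration
  }
}

setTimeout(() => console.log('this never runs'), 100);
void spin();
```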

manast commented Feb 4, 2021

I am working on a fix for this. This issue exists in bull v3 and I expect to have a fix by tomorrow.

manast closed this as completed in d05566e on Feb 7, 2021
github-actions bot pushed a commit that referenced this issue Feb 7, 2021
## [1.14.3](v1.14.2...v1.14.3) (2021-02-07)

### Bug Fixes

* **worker:** avoid possible infinite loop fixes [#389](#389) ([d05566e](d05566e))
github-actions bot commented Feb 7, 2021

🎉 This issue has been resolved in version 1.14.3 🎉

The release is available on:

Your semantic-release bot 📦🚀

airhorns commented Feb 7, 2021

Thanks @manast !
