Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some jobs stay in Pending Status. More info given to help re-create #577

Closed
jonnywilliamson opened this issue Mar 5, 2019 · 54 comments
Closed

Comments

@jonnywilliamson
Copy link

jonnywilliamson commented Mar 5, 2019

@themsaid

So I have seen the previous issues related to this
#507
#503
#411

And I'm suffering the same problem. Some of my jobs are reporting as PENDING even though I know they have completed successfully, and also HORIZON shows the same job as being done correctly.

I've spent a lot of time trying to recreate this using as minimal amount of code as possible, and I hope I've succeeded in providing you with an example that will replicate on your system.

Very quickly as an overall picture.

  • I have a LONG running job that gets dispatched.
  • Inside that job other events (jobs) get dispatched too, in this case its a broadcast event to update user with some info.
  • The LONG running job will always show correct status.
  • The broadcast events/jobs do NOT show correct status on TELESCOPE, but DO on horizon.
  • If queues are NOT used (ie sync) everything is OK.
  • A different queue is used between Long runing job and Broadcast jobs.

Here's some code.

// web.php
Route::get('/test', function () {
    dispatch(new \App\Jobs\LongJob());
    return "done";
});
// \App\Jobs\LongJob.php
<?php

namespace App\Jobs;

use App\Events\SendUpdate;
use Illuminate\Bus\Queueable;
use Illuminate\Queue\SerializesModels;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;

class LongJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $tries = 1;
    public $timeout = 120;

    public function handle()
    {
        sleep(5);
        event(new SendUpdate("The first time is:" . time()));
        sleep(5);
        event(new SendUpdate("The second time is:" . time()));
        sleep(10);
    }
}
// App\Events\SendUpdate.php
<?php

namespace App\Events;

use Illuminate\Broadcasting\Channel;
use Illuminate\Queue\SerializesModels;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;

class SendUpdate implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets, SerializesModels;

    private $message;
    public $broadcastQueue = 'broadcast';

    public function __construct($message)
    {
        $this->message = $message;
    }

    public function broadcastOn()
    {
        return new PrivateChannel('App.User.1');
    }
}

I have started horizon, and ensured that it is processing jobs from "default" and "broadcast" queues.

Now when I start the whole process job (hitting "/test" in the browser) I'll get the following:

image

But in horizon it's perfect:
image

The next post is some info on my debugging attempts.

@jonnywilliamson
Copy link
Author

I've spend a few hours on this trying to debug.

I found out that for every job I fired, telescope DID pick it and assign it a UUID.
I found out that the recordProcessedJob method in JobWatcher.php WAS being hit for EVERY job that was processed (but still didn't change from PENDING).

Where everything broke down was here in the DatabaseEntriesRepository.php update function.

https://github.com/laravel/telescope/blob/2.0/src/Storage/DatabaseEntriesRepository.php#L183-L191

When the entry was attempted to be found, IF it was a sub job (ie in my case SendUpdate) it COULD NOT be found, so the $entry variable returned null/false and the method ended.

If the job was the LongJob etc, it was ALWAYS found and so the status was changed successfully.

That's as far as I could go because I don't understand when or how the 'telescope_entries' table gets populated in the entire Telescope app cycle.

It would appear this bug appears when the status update happens before the original entry has had a chance to get saved to the database.

Hope that helps somewhat.

@jonnywilliamson
Copy link
Author

Can I provide any more information or has anyone been able to duplicate this at all?

@jonnywilliamson
Copy link
Author

Is anyone else still suffering from this or is it just my own setup that is unique?

@johnpaulmedina
Copy link

I am also experiencing this issue. Not too sure what may be causing it.

@johnpaulmedina
Copy link

I added a few seconds of delay, solved it for me

@johnpaulmedina
Copy link

It seems like if something were to modify the record prior to is what is causing the issue for me. The delay solved it.

@jonnywilliamson
Copy link
Author

I added a few seconds of delay, solved it for me

What do you mean you add a few seconds of delay? Can you elaborate? Thanks.

@johnpaulmedina
Copy link

Yeah sorry. Ok so from what I gathered and reading what you wrote about the update it seems that if an eloquent model is pushed to a job and prior to that job executing if the model was modified it seems to be what was causing telescope to show the job pending even after completion. Possibly the way telescope is storing the information on the telescope tables. When I added a small delay on the execution of the job then it completed and showed that the job was processed correctly.

dispatch(new SomeJob($model))->delay(now()->addSeconds(5))

@jonnywilliamson
Copy link
Author

Ok, Its not ideal but I can try that out and see.

@spielfigur
Copy link

spielfigur commented May 8, 2019

Having the same phenomenon, but it happens very rarely.
in horizon everything is fine, telescope just says: pending
it would be nice to see/debug more information about this 'pending' jobs.

Edit:
well, it is even rarer now. 2 from 24 jobs in the last hour.
in telescope pending, in horizon all checks are green.
confusing

@amosmos
Copy link

amosmos commented Jul 9, 2019

Started happening to me too.
And I don't have neither a job in a job nor an eloquent model passed to it.

What can it be?

@damms005
Copy link

damms005 commented Jul 20, 2019

Whao! several straight hours wasted on this! Was more perplexed when I went to clear the jobs table straight from mysql

#edit: more context: default queues work fine but any job pushed to a custom queue (a queue that is not named default) just gets stuck in status pending

@driesvints driesvints added the bug label Dec 9, 2019
@sadhakbj
Copy link

sadhakbj commented Jan 7, 2020

Facing this issue heavily.

@stotes
Copy link

stotes commented Jan 17, 2020

Also experiencing this issue.

@johnpaulmedina
Copy link

Have you tried to add a delay to the job before processing?

On the queues that have a custom name set a sleep value in between jobs and see if it still happens?

Just curious - let me know why the outcome is. Thanks.

@spielfigur
Copy link

Well, I´m dispatching the jobs since I've had the problem last year with ->addSeconds(rand(5, 40)) and it looks like the issue has gone away.
running > 10.000 jobs per hour
is there a mysql query for these pending jobs to search for?

@johnpaulmedina
Copy link

Yeah I haven’t had it since I added a few seconds of delay processing on the jobs. If I had to guess there is some asynchronous code execution happening that doesn’t let it update the telescope column status in the database or the pending status is being applied after the record is saved/updated/created in the database.

@sadhakbj @stotes can you confirm if you dispatch with a delay if they are left in pending?

@jonnywilliamson
Copy link
Author

@johnpaulmedina The trouble I have with using a delay in dispatching the jobs onto the custom queue is that those jobs are status messages being sent out as broadcast events.

Deliberately adding a delay to those jobs is frustrating as I want those messages processed and sent to the user ASAP.

@drfraker
Copy link

drfraker commented Apr 5, 2020

I'm seeing the same issue. I'm using Laravel Vapor with AWS SQS queues.

@amosmos
Copy link

amosmos commented Apr 5, 2020

It happens to me sometimes and come back to this post and the tips here always help, but one thing I noticed and not written here is that sometimes my fail function also fails, and that keeps that job pending.

@baykier
Copy link

baykier commented Apr 12, 2020

i have the same issue today , and when i installed redis extention , everything works well! check if your project have installed phpredis extention or predis package , laravel queue console command does't warning any message for this error, suprised

@driesvints
Copy link
Member

Hey there,

This indeed looks like a valid bug! However, we are not able to to allocate any time to fixing it ourselves in the near future. But, don’t lose hope! We would be open to receiving a pull request to fix this bug and will leave this issue open so that other possible contributors may easily find it.

Thanks!

@amosmos
Copy link

amosmos commented Apr 20, 2020

Hi @driesvints,

Can you be more specific regarding where the bug is?
It seems like there are several different cases when jobs are "stuck" - for example if there's a guzzle request without timeout that takes forever, or if the fail function fail itself, and more.

So I just curious if you found that there's a real bug that is not mentioned here.

Thanks!
Amos

@driesvints
Copy link
Member

@amosmos I have no idea. Lots of people reporting this but we're not planning on investigating this ourselves. Anyone's free to help out here.

@SilvioDoMine
Copy link

Can confirm this issue

@rodrigopedra
Copy link

rodrigopedra commented Apr 27, 2020

I was facing this issue today. All the jobs sending to a queue were moved as pending instead of being processes.

On my app I fixed on always dispatching form the Bus, for convenience using the DispatchesJob trait.

use Illuminate\Foundation\Bus\DispatchesJobs;
class MyClass {
  use DispatchesJobs;

  public function myMethod() {
    // using this instead of MyJob::dispatch() worked for me
    $this->dispatch(new MyJob());
  }
}

Using the methods added by the Dispatchable trait (default to Jobs on creation) or the dispatch(...) helper seems to handle queuing through a PendingJob wrapper class, whereas the implementation from the Bus behaves differently.

That might indicate where the problem lies, but unfortunately I don't have the time now to investigate it further.

Just for the note I found this issue when looking for jobs going straight to the Pending Jobs filter on Horizon and never getting processed. I am not using Telescope in this particular project.

@karlshea
Copy link

I'm seeing this in Laravel 8, I'm dispatching a job in the deleting function in a model observer. The job completes but is listed as pending in Telescope.

@bram-pkg
Copy link

Still happening in Laravel 8.

@ejntaylor
Copy link

@bramceulemans Same here - any ideas on manually clearing them? Where is the telescope cache these must be saved in?

@bram-pkg
Copy link

@bramceulemans Same here - any ideas on manually clearing them? Where is the telescope cache these must be saved in?

I have clue to be honest. All the logs are in the telescope table.

@ejntaylor
Copy link

Looks like php artisan telescope:clear works

@legreco
Copy link

legreco commented Jun 28, 2022

Still happening, Laravel 8 ....

@hose1021
Copy link

hose1021 commented Jul 1, 2022

Same
image
Without Horizon

@redhedded1
Copy link

still happening in Laravel 9 trying to run queable actions from a job with Spatie package Queable Actions

@grenaria
Copy link

I had a similar problem. I found that Telescope puts the job history into a Redis cache record. Creating a job, however, creates a Redis queue record. They have no relationship with each other.

In my case, I was running two Laravel applications on one Redis instance. All of the keys for session and cache data are prefixed (e.g. scoped) using the app.name config. However, the queue records are NOT scoped similarly. Since I had queues of the same name in both applications, one application was trying to run the queue of another, failing in the second instance, and leaving the Telescope record saying that the job is pending. The fix was to update database.php to make sure the applications used different Redis database numbers for the queue connection.

@erickneverson
Copy link

erickneverson commented Nov 19, 2022

This behavior happens even without telescope (as is the case for me). I've been using horizon with laravel 8 for months now and this is the 1st week where I've experienced this. Not sure exactly how to resolve this atm.

I can see that my job actually starts processing but never completes and therefore times out. It's status in the queue remains pending.

@adhityairvan
Copy link

I had a similar problem. I found that Telescope puts the job history into a Redis cache record. Creating a job, however, creates a Redis queue record. They have no relationship with each other.

In my case, I was running two Laravel applications on one Redis instance. All of the keys for session and cache data are prefixed (e.g. scoped) using the app.name config. However, the queue records are NOT scoped similarly. Since I had queues of the same name in both applications, one application was trying to run the queue of another, failing in the second instance, and leaving the Telescope record saying that the job is pending. The fix was to update database.php to make sure the applications used different Redis database numbers for the queue connection.

thanks, this error was so confusing because I was running 2 laravel instance (prod/stg) on the same redis server. When I read this, I check the prod worker log and it was true, the prod worker trying to run the staging jobs and failing to do so. fixed it with specifying different redis db on each env file. thanks!!!

@denissceluiko
Copy link

For me the perpetual hanging status was because Horizon's workers were running out of memory and failing silently. So if you're desparate and nothing makes sense, this might be the cause. Try monitoring memory usage. There's an article with an idea on how to do that.

While my struggle was with Horizon not Telescope, Google led me here so maybe tthis comment saves sombody a couple of hours of despair.

@erickneverson
Copy link

@rforced any idea on your memory consumption while this happens? Just curious.

@miaotiao
Copy link

我认为我找到了问题。每次请求结束后,telescope 才将 entry 插入到数据库中,但是如果 job 实际执行时间短,在请求结束前就已经完成( job 完成时会更新状态),数据库查找不到该条数据,所以就无法更新成 "processed" 了。

@miaotiao
Copy link

image

@miaotiao miaotiao mentioned this issue Apr 17, 2023
@miaotiao
Copy link

我认为我找到了问题。每次请求结束后,telescope 才将 entry 插入到数据库中,但是如果 job 实际执行时间短,在请求结束前就已经完成( job 完成时会更新状态),数据库查找不到该条数据,所以就无法更新成 "processed" 了。

When jobs are dispatched in a request, they are only inserted into the database when the request ends. If the job is processed before the request ends, the job cannot be updated.

@isaackearl
Copy link

still an issue as far as I can tell.

@tuktukvlad
Copy link

@nunomaduro Hi
Still an issue
I'm using latest versions of Laravel & Telescope and Vapor
image
You can try this in my environment
Run batch: https://iqumrcf5fkdnip766sgi7enppe0gimbz.lambda-url.eu-west-1.on.aws/batch
Telescope: https://iqumrcf5fkdnip766sgi7enppe0gimbz.lambda-url.eu-west-1.on.aws/telescope/batches

@rvzug
Copy link

rvzug commented Aug 15, 2023

Also seeing this issue on AWS SQS (vapor)

        queues:
            - default-production

@nunomaduro
Copy link
Member

@tuktukvlad @rvzug The fix I've made is disabled for Laravel Vapor, as it would dispatch some extra jobs for existing customers. This could represent an extra amount of jobs that customers would not be expecting.

@rvzug
Copy link

rvzug commented Aug 15, 2023

@nunomaduro Ok, I guess pending in our case is mainly "something went wrong". How can I see what went wrong, as we have no telescope-updates and also no Horizon or worker-cli to see what went wrong? The only way is the Vapor logs in the vapor backend?

@ChristopherDosin
Copy link

We have the same issue with Vapor and Laravel 9.

@c0ntr-all
Copy link

I also have this problem, but because of the wrong time zone in config.

@jonnywilliamson
Copy link
Author

@c0ntr-all Could you expand on that? What do you mean by a wrong timezone in the config. Surely all timezones are valid, no matter which one is used?

@tuyennv22ntq
Copy link

Still happening, Laravel 11.9, telescope 5.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests