Quartz - dependent bean jobs should reuse instance for interruption #42096

manovotn · 2024-07-24T08:48:12Z

Follows up on #42072
Related to #42034

ilia1243 · 2024-07-24T10:06:59Z

extensions/quartz/runtime/src/main/java/io/quarkus/quartz/runtime/CdiAwareJob.java

    public void interrupt() throws UnableToInterruptJobException {
-        Instance.Handle<? extends Job> handle = jobInstance.getHandle();
        // delegate if possible; throw an exception in other cases
        if (InterruptableJob.class.isAssignableFrom(handle.getBean().getBeanClass())) {
-            try {
-                ((InterruptableJob) handle.get()).interrupt();
-            } finally {
-                if (handle.getBean().getScope().equals(Dependent.class)) {
-                    handle.destroy();
-                }
-            }
+            ((InterruptableJob) handle.get()).interrupt();


This may call interrupt() on already destroyed instance. Though, not sure if this should be handled somehow...

That is something the bean (the job) should take into consideration. I don't think we can prevent it because the way interrupt and execute interact depend on the the job implementation anyway.
ATM we are delaying the destruction as far as we can - after the execute() is done.

That is something the bean (the job) should take into consideration. I don't think we can prevent it because the way interrupt and execute interact depend on the the job implementation anyway.

Job implementation can control only internals of CustomInterruptableJob.execute(context), and CustomInterruptableJob.interrupt(). It cannot control work of the CdiAwareJob wrapper. For example,

CustomInterruptableJob.execute(context) is finished and goes out of control of the CustomInterruptableJob.

Instance.Handle.destroy() is finished.

Accidentally, CdiAwareJob.interrupt() is invoked and fails with "Instance already destroyed". The control is not even passed to the CustomInterruptableJob.interrupt().

So this is not a matter of custom interrupt and execute interaction, it is a matter of CdiAwareJob's interrupt and execute interaction.

Hm, Quartz impl only calls interrupt() if it finds a running job but it doesn't seem to care about race conditions which is what you imply IIUIC.

In that case, for a dependent bean that has been destroyed prior to interrupt we could:

Not invoke interrupt at all

Keep contextual reference and invoke it anyway even though bean's @PreDestroy was already triggered

I assume the latter would be more user friendly as there can be logging or general logic not related to this bean that you want to happen when interruption is invoked.

Actually I was wrong, this is more complex because of how re-firing jobs work.
A job throwing a proper exception can trigger a re-fire (see here) which will use the same job instance and we want a new bean for that.

manovotn · 2024-07-24T12:43:44Z

Ok, second attempt.

After some digging I realized that we cannot just pass Instance.Handle into CdiAwareJob - we must have Instance as jobs can potentially re-fire (as shown in DependentBeanJobTest) which will use the same CdiAwareJob instance.

So, the current solution:

Passes an Instance into CdiAwareJob
Keeps track of Instance.Handle for a single job execution so that we can destroy the bean after the execute() method is done
Keeps track of contextual instance as we need that for interrupt() method which can, in a rare race condition, occur after `execute()
Creates new Handle and contextual ref. in the execute() method if not present and nullifies the Handle it afterwards. For most jobs, this does nothing but for re-fired jobs backed by dep. beans it makes sure we get a fresh bean on every invocation.

@mkouba @ilia1243 feedback welcome :)

ilia1243 · 2024-07-24T13:21:25Z

extensions/quartz/runtime/src/main/java/io/quarkus/quartz/runtime/CdiAwareJob.java

-                    handle.destroy();
-                }
-            }
+            ((InterruptableJob) contextualReference).interrupt();


Approximately the same rare race condition: the job can be interrupted before contextualReference is set...

I suppose custom interrupt() should be called anyway, so need to initialize bean.

Need not destroy the bean because execute() should be run on the same instance.

The only left is to believe that execute() will really be called to have a chance to destroy the bean...

Some alternative thought about interrupt() after execute().

Since CustomInterruptableJob.execute(context) is already finished in this case, there is seemingly nothing that can control whether CustomInterruptableJob.interrupt() will be called or not. I mean, just a moment later the job will leave the list of currently running job, Scheduler.interrupt() will not find it and return false.

If considering this as an undefined behavior, one could stick to one more option - always return false, but this requires support by Quartz inside QuartzScheduler.interrupt(), like this:

if (job instanceof InterruptableJob) { try { ((InterruptableJob)job).interrupt(); interrupted = true; } catch (JobWasNotReallyInterrupted) { interrupted = false; } }

and raise JobWasNotReallyInterrupted in CdiAwareJob if the job is (being) destroyed.

EDIT the same logic for not yet started to execute() jobs.

Approximately the same rare race condition: the job can be interrupted before contextualReference is set...

Hm, I was trying to avoid using locks here as it would affect all the jobs and make them more expensive for the sake of a corner case...
We could initialize the first state in constructor which would eliminate this issue for anything but re-firing jobs where this slim chance still exists.

I was trying to avoid using locks here as it would affect all the jobs and make them more expensive

Not dramatically more expensive as locks are already used, e.g. in LazyInstanceHandle.

We could initialize the first state in constructor

You mean to call Instance.getHandle() inside constructor? This will in particular rely on if execute() is called eventually to be able to destroy it. I am not sure if this is always the case.

EDIT the same logic for not yet started to execute() jobs.

I am not sure that's the case, quartz wouldn't find that job as running to invoke interrupt on it, would it?

As far as I see from JobRunShell.run(), it

Calls notifyListenersBeginning() that in particular registers the job as running.

Calls execute()

Calls notifyJobListenersComplete() that in particular unregisters the job.

The problem in what should happen with interrupt() in periods between 1 and 2, and between 2 and 3 (plus some time after 3, because Quartz iterates over copy of running jobs out of lock to find the job for interrupting).

In comment I thought about if behavior in these periods can be considered undefined, so that it would be possible to choose the specific one mentioned in that comment.

quarkus-bot · 2024-07-24T14:10:06Z

Status for workflow `Quarkus CI`

This is the status report for running Quarkus CI on commit 44c8cfa.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.

manovotn · 2024-07-30T13:34:16Z

Superseeded by #42223
I will move backport labels there as well.

manovotn requested a review from mkouba July 24, 2024 08:48

quarkus-bot bot added the area/scheduler label Jul 24, 2024

manovotn force-pushed the 42034_improve branch from 8d41297 to 1fc14bf Compare July 24, 2024 09:05

manovotn mentioned this pull request Jul 24, 2024

Quartz - allow bean based jobs to be interruptable #42072

Merged

This comment has been minimized.

Sign in to view

ilia1243 reviewed Jul 24, 2024

View reviewed changes

Quartz - dependent bean jobs should reuse instance for interruption

44c8cfa

manovotn force-pushed the 42034_improve branch from 1fc14bf to 44c8cfa Compare July 24, 2024 12:32

ilia1243 reviewed Jul 24, 2024

View reviewed changes

gsmet added triage/backport-3.8 triage/backport labels Jul 25, 2024

manovotn mentioned this pull request Jul 30, 2024

Quartz - fix job refire behavior, improve dep. bean job scenario #42223

Merged

manovotn closed this Jul 30, 2024

quarkus-bot bot added triage/invalid This doesn't seem right and removed triage/backport-3.8 triage/backport labels Jul 30, 2024

manovotn deleted the 42034_improve branch July 30, 2024 13:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Quartz - dependent bean jobs should reuse instance for interruption #42096

Quartz - dependent bean jobs should reuse instance for interruption #42096

manovotn commented Jul 24, 2024

This comment has been minimized.

ilia1243 Jul 24, 2024

manovotn Jul 24, 2024

ilia1243 Jul 24, 2024

manovotn Jul 24, 2024

manovotn Jul 24, 2024

manovotn commented Jul 24, 2024

ilia1243 Jul 24, 2024 •

edited

Loading

manovotn Jul 24, 2024

ilia1243 Jul 24, 2024

manovotn Jul 24, 2024

ilia1243 Jul 24, 2024 •

edited

Loading

quarkus-bot bot commented Jul 24, 2024 •

edited by github-actions bot

Loading

manovotn commented Jul 30, 2024

Quartz - dependent bean jobs should reuse instance for interruption #42096

Quartz - dependent bean jobs should reuse instance for interruption #42096

Conversation

manovotn commented Jul 24, 2024

This comment has been minimized.

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

manovotn commented Jul 24, 2024

ilia1243 Jul 24, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ilia1243 Jul 24, 2024 • edited Loading

Choose a reason for hiding this comment

quarkus-bot bot commented Jul 24, 2024 • edited by github-actions bot Loading

Status for workflow Quarkus CI

manovotn commented Jul 30, 2024

ilia1243 Jul 24, 2024 •

edited

Loading

ilia1243 Jul 24, 2024 •

edited

Loading

quarkus-bot bot commented Jul 24, 2024 •

edited by github-actions bot

Loading

Status for workflow `Quarkus CI`