Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Quarkus - SmallryeHealth - PreDestroy not called on HealthCheck #38878

Open
olivierbeltrandocintoo opened this issue Feb 20, 2024 · 33 comments
Open

Comments

@olivierbeltrandocintoo
Copy link

olivierbeltrandocintoo commented Feb 20, 2024

Describe the bug

I expect that having a class implementing HealthCheck that gets injected a RequestScoped object will call the @PreDestroy method of the injected object at the end of the check.

Expected behavior

On each call to http://localhost:8080/q/health/ the PreDestroy is called on the HealtCheck marked as RequestScoped

Actual behavior

On Quarkus 3.6.7:
On call http://localhost:8080/q/health/ the PreDestroy is called only the first time.

On Quarkus 3.7.3
On call http://localhost:8080/q/health/ the PreDestroy is never called.

How to Reproduce?

Clone this repo based on the SmallRyeHealth tutorial

https://github.com/olivierbeltrandocintoo/quarkus-health-noPreDestroy

Note: I usually work in Kotlin but to have the simplest minimal case, I made it in Java :)

mvn quarkus:dev

Call the health endpoints and see that in the logs that the PreDestroy is either not called (for quarkus 3.7.3) or only called the first time (for quarkus 3.6.7)

$ curl localhost:8080
manual%
$ curl localhost:8080
manual%
$ curl localhost:8080
manual%
$ curl localhost:8080
manual%
$ curl localhost:8080/auto
auto%
$ curl localhost:8080/auto
auto%
$ curl localhost:8080/auto
auto%
$ curl localhost:8080/auto
auto%
$ curl localhost:8080/auto
auto%
$ curl localhost:8080/q/health
{
    "status": "UP",
    "checks": [
        {
            "name": "DbConnection",
            "status": "UP"
        }
    ]
}%
$ curl localhost:8080/q/health
{
    "status": "UP",
    "checks": [
        {
            "name": "DbConnection",
            "status": "UP"
        }
    ]
}%
$ curl localhost:8080/q/health
^[[A
{
    "status": "DOWN",
    "checks": [
        {
            "name": "DbConnection",
            "status": "DOWN",
            "data": {
                "Exception": "Could not acquire on time"
            }
        }
    ]
}%

logs related

__  ____  __  _____   ___  __ ____  ______ 
 --/ __ \/ / / / _ | / _ \/ //_/ / / / __/ 
 -/ /_/ / /_/ / __ |/ , _/ ,< / /_/ /\ \   
--\___\_\____/_/ |_/_/|_/_/|_|\____/___/   
2024-02-20 11:28:50,928 INFO  [io.quarkus] (Quarkus Main Thread) microprofile-health-quickstart 1.0.0-SNAPSHOT on JVM (powered by Quarkus 3.7.3) started in 1.681s. Listening on: http://0.0.0.0:8080

2024-02-20 11:28:50,931 INFO  [io.quarkus] (Quarkus Main Thread) Profile dev activated. Live Coding activated.
2024-02-20 11:28:50,931 INFO  [io.quarkus] (Quarkus Main Thread) Installed features: [cdi, resteasy-reactive, smallrye-context-propagation, smallrye-health, vertx]
2024-02-20 11:28:52,135 INFO  [myp.Endpoint] (executor-thread-1) manual
2024-02-20 11:28:52,138 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:28:52,138 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:28:52,139 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:28:52,142 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:28:53,911 INFO  [myp.Endpoint] (executor-thread-1) manual
2024-02-20 11:28:53,912 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:28:53,913 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:28:53,913 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:28:53,914 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:28:54,579 INFO  [myp.Endpoint] (executor-thread-1) manual
2024-02-20 11:28:54,580 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:28:54,580 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:28:54,580 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:28:54,581 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:28:55,112 INFO  [myp.Endpoint] (executor-thread-1) manual
2024-02-20 11:28:55,112 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:28:55,113 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:28:55,113 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:28:55,114 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:28:59,089 INFO  [myp.Endpoint] (executor-thread-1) auto
2024-02-20 11:28:59,090 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:28:59,090 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:28:59,091 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:28:59,091 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:29:00,082 INFO  [myp.Endpoint] (executor-thread-1) auto
2024-02-20 11:29:00,083 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:29:00,083 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:29:00,084 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:29:00,084 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:29:00,620 INFO  [myp.Endpoint] (executor-thread-1) auto
2024-02-20 11:29:00,620 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:29:00,621 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:29:00,621 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:29:00,621 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:29:01,068 INFO  [myp.Endpoint] (executor-thread-1) auto
2024-02-20 11:29:01,069 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:29:01,069 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:29:01,069 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:29:01,070 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:29:01,505 INFO  [myp.Endpoint] (executor-thread-1) auto
2024-02-20 11:29:01,506 INFO  [myp.DbSource] (executor-thread-1) Acquire...
2024-02-20 11:29:01,506 INFO  [myp.DbSource] (executor-thread-1) Acquire OK!1
2024-02-20 11:29:01,507 WARN  [myp.DbSource] (executor-thread-1) endOfRequest called
2024-02-20 11:29:01,507 INFO  [myp.DbSource] (executor-thread-1) Release
2024-02-20 11:29:05,521 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire...
2024-02-20 11:29:05,521 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire OK!1
2024-02-20 11:29:06,485 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire...
2024-02-20 11:29:06,486 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire OK!0
2024-02-20 11:29:07,150 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire...
2024-02-20 11:29:10,153 INFO  [io.sma.health] (vert.x-eventloop-thread-1) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"DbConnection","status":"DOWN","data":{"Exception":"Could not acquire on time"}}]}
2024-02-20 11:29:10,173 INFO  [myp.DbSource] (vert.x-worker-thread-1) Acquire...
2024-02-20 11:29:13,175 INFO  [io.sma.health] (vert.x-eventloop-thread-0) SRHCK01001: Reporting health down status: {"status":"DOWN","checks":[{"name":"DbConnection","status":"DOWN","data":{"Exception":"Could not acquire on time"}}]}

Output of uname -a or ver

5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Output of java -version

OpenJDK Runtime Environment (build 17+35-2724)

Quarkus version or git rev

3.7.3 and 3.6.7

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.9.6

Additional information

No response

Copy link

quarkus-bot bot commented Feb 20, 2024

/cc @Ladicek (smallrye), @geoand (kotlin), @jmartisk (health,smallrye), @phillip-kruger (smallrye), @radcortez (smallrye), @xstefank (health)

@Ladicek
Copy link
Contributor

Ladicek commented Feb 20, 2024

I'll clone the reproducer and will investigate properly, but at the first sight, the @ActivateRequestContext annotation on a @RequestScoped bean seems 1. unnecessary, 2. potentially dangerous.

Also, the fact that a request-scoped bean is injected anywhere or what's the scope of the bean that injects it should not matter. A request-scoped bean must be destroyed when the request context terminates.

@gsmet
Copy link
Member

gsmet commented Feb 20, 2024

@Ladicek for your awareness, if you are having a look to this one. There are a few other oddities in progress:

Might not be related but better have the big picture :)

@Ladicek
Copy link
Contributor

Ladicek commented Feb 20, 2024

Thanks @gsmet, I'll keep that in mind!

@Ladicek
Copy link
Contributor

Ladicek commented Feb 20, 2024

Ah the README in the reproducer has an interesting bit of information: the issue occurs when calling the health check, but when calling any other endpoint, it doesn't. So the links above are indeed very much relevant.

@olivierbeltrandocintoo
Copy link
Author

I'll clone the reproducer and will investigate properly, but at the first sight, the @ActivateRequestContext annotation on a @RequestScoped bean seems 1. unnecessary, 2. potentially dangerous.

Also, the fact that a request-scoped bean is injected anywhere or what's the scope of the bean that injects it should not matter. A request-scoped bean must be destroyed when the request context terminates.

Hi @Ladicek ,
Sorry, the @ActivateRequestContext is an attempt of workaround (that did not work). Removing it from the sample. Thanks for the help :)

@xstefank
Copy link
Member

This is caused by manual Uni caching done in SmallRye Health for performance reasons. To correctly handle CDI context propagation, you need to set the quarkus.smallrye-health.context-propagation=true property - https://quarkus.io/guides/smallrye-health#context-propagation-into-the-health-check-invocations. We should add a warning if the user attempts to use CDI without this flag or maybe we should change the default to recreate Unis and just allow to set caching explicitly. WDYT @cescoffier I think that you originally proposed Uni caching in Health.

@cescoffier
Copy link
Member

Caching is fine if you handle the duplicated context correctly - See #37077

@olivierbeltrandocintoo
Copy link
Author

Hello @xstefank , I have tried adding the property quarkus.smallrye-health.context-propagation=true in the application.properties but the issue persists -> Ie. the @PreDestroy is not called.
You can test on the provided reproducer :)

  • add the line in the properties file
  • mvn clean quarkus:dev
  • call http://localhost:8080/q/health/ 3 times, the 4th time it is going to get DOWN -> see logs to check that the message endOfRequest called is missing

@gsmet
Copy link
Member

gsmet commented Feb 20, 2024

This is caused by manual Uni caching done in SmallRye Health for performance reasons.

What exactly are you caching?

@xstefank
Copy link
Member

xstefank commented Feb 20, 2024

@olivierbeltrandocintoo There was another issue that collided with this and will be fixed with #38888. With this PR, your reproducer works.

@gsmet I will need to fix it. In SR Health, when you invoke health, we compute a Uni for every major health group, let's say for readiness, we have readinessUni that is saved. On the next invocation, we just return this saved readinessUni once called again so Quarkus can resubscribe and call all end Uni subscriptions (health checks) again. But we save the finding and collecting of health checks into a parent Uni.

@olivierbeltrandocintoo
Copy link
Author

Hi ! I run the reproducer against Quarkus 3.7.4 that contains the fix #38888 and you are totally right, the @PreDestroy related to the RequestContext is invoked now :) !
And you were also right: the quarkus.smallrye-health.context-propagation=true needs to be set.
Thanks a lot.
Cheers,
Olivier

@xstefank
Copy link
Member

Just for the record, I will investigate whether we can also propagate context without recreating Unis.

@Ladicek
Copy link
Contributor

Ladicek commented Feb 28, 2024

@xstefank When we talked about this the other day, I learned that what we do is actually more correct than I thought, but it also got me thinking. On each call to a health endpoint, how many Unis are we creating and how many of them do we avoid creating by the virtue of caching?

Say that calling a health endpoint requires calling N health checks. We have cached a Uni that combines N different Unis (where each corresponds to 1 health check), those are also cached. So we have N+1 Unis cached. Is that right? However, each of those N Unis, when subscribed, creates a Uni that actually corresponds to the called health check. So we still create N Unis for each request. Is that still right? If so, I doubt we gain anything by caching (except overhead). I'd suggest to just get rid of it.

@olivierbeltrandocintoo
Copy link
Author

olivierbeltrandocintoo commented Mar 13, 2024

This same issue exists with Kotlin code, and is not solved.

Here is the reproducer (same code as in Java but in Kotlin): https://github.com/olivierbeltrandocintoo/quarkus-health-noPreDestroy-kotlin

Also it has been like this for as long as I know :)
Any idea ?

Also would you like me to open another ticket for this issue ?
(and there is also another issue I discovered (in Java (because the preDestroy is not called in Kotlin) if a probe is declared both as readiness and liveness then the preDestroy is called twice when calling the q/health endpoint -> should I open also another ticket for that or is it by design ?)

@Ladicek
Copy link
Contributor

Ladicek commented Mar 13, 2024

If a @PreDestroy method is called twice on a single object, that's definitely a bug. But please verify first if that's really on the same instance, and then file an issue.

@olivierbeltrandocintoo
Copy link
Author

Hi @Ladicek , this part behaves nicely (I wrongly set the counter as static in my example). Thanks :) I will monitor then how it behaves once the Kotlin version calls the preDestroy.

What do you think about the issue in Kotlin ?

@Ladicek
Copy link
Contributor

Ladicek commented Mar 13, 2024

I can't think of a reason why it should be different between Java and Kotlin.

@mkouba
Copy link
Contributor

mkouba commented Sep 17, 2024

Hi @Ladicek , this part behaves nicely (I wrongly set the counter as static in my example). Thanks :) I will monitor then how it behaves once the Kotlin version calls the preDestroy.

@olivierbeltrandocintoo Is this still a problem? Have you observed an incorrect behavior in Kotlin?

@geoand geoand added the triage/needs-feedback We are waiting for feedback. label Sep 17, 2024
@olivierbeltrandocintoo
Copy link
Author

Thanks for the ping. I have deactivated those in the meantime because the issue was not fixed. Let me check again and get back to you :)

@olivierbeltrandocintoo
Copy link
Author

Hi, so unfortunately the issue is still persisting even with latest versions of Quarkus (3.14.4) and Kotlin (from the bom 2.0.10).
image

Don't hesitate to use this project as reproducer: https://github.com/olivierbeltrandocintoo/quarkus-health-noPreDestroy-kotlin
steps:

  • mvn clean quarkus:dev
  • visit http://localhost:8080/q/health with your browser a few times, and notice the endOfRequest never being called for the health probe endpoint

Thanks :)

@geoand geoand removed the triage/needs-feedback We are waiting for feedback. label Sep 18, 2024
@mkouba
Copy link
Contributor

mkouba commented Sep 18, 2024

@olivierbeltrandocintoo Thanks! Will take a look...

@mkouba
Copy link
Contributor

mkouba commented Sep 18, 2024

I can confirm that the @PreDestroy callback is not invoked. And it's not a Kotlin-specific problem. If you add the following snippet to the SimpleHealthCheck.java from the quickstart:

    @PostConstruct
    void init() {
        Log.warn("Init called");
    }
    
    @PreDestroy
    void destroy() {
        Log.warn("Destroy called");
    } 

change the scope to @RequestScoped, and hit http://localhost:8080/q/health/live several times you'll see something like:

2024-09-18 09:12:06,983 WARN  [org.acm.mic.hea.SimpleHealthCheck] (vert.x-worker-thread-2) Init called
2024-09-18 09:12:06,988 WARN  [org.acm.mic.hea.SimpleHealthCheck] (vert.x-eventloop-thread-0) Destroy called
2024-09-18 09:12:10,664 WARN  [org.acm.mic.hea.SimpleHealthCheck] (vert.x-worker-thread-1) Init called
2024-09-18 09:12:14,016 WARN  [org.acm.mic.hea.SimpleHealthCheck] (vert.x-worker-thread-2) Init called
2024-09-18 09:12:14,201 WARN  [org.acm.mic.hea.SimpleHealthCheck] (vert.x-worker-thread-1) Init called

I'm actually quite surprised that it's not tested by the MP TCK. The spec does not explicitly require the support for @RequestScoped but it's IMO implicit.

CC @xstefank @Ladicek

@xstefank
Copy link
Member

@mkouba it's not, app scoped health checks are normal. I would think that handling of PreDestroy is CDI responsibility, not something I should do manually in sr-health. Except for Dependent health checks which are fixed (not yet in Quarkus) smallrye/smallrye-health#563.

@mkouba
Copy link
Contributor

mkouba commented Sep 18, 2024

@mkouba it's not, app scoped health checks are normal. I would think that handling of PreDestroy is CDI responsibility, not something I should do manually in sr-health. Except for Dependent health checks which are fixed (not yet in Quarkus) smallrye/smallrye-health#563.

I'm sorry, you have to change the scope of SimpleHealthCheck to @RequestScoped too.

@olivierbeltrandocintoo
Copy link
Author

Please keep in mind that originally this ticket described a bug that was present in both java and kotlin. And the bug was fixed for Java (see this repo https://github.com/olivierbeltrandocintoo/quarkus-health-noPreDestroy).
image

The Kotlin code tries to be similar to the java but the endOfRequest is not called.

@mkouba
Copy link
Contributor

mkouba commented Sep 18, 2024

And the bug was fixed for Java

I don't think it was fixed for Java...

@Ladicek
Copy link
Contributor

Ladicek commented Sep 18, 2024

For the record, I would be strongly in favor of removing the Uni caching in SmallRye Health. I don't believe there's any gain to be had from that caching, as explained in my previous comment.

@geoand
Copy link
Contributor

geoand commented Sep 18, 2024

+100

@mkouba
Copy link
Contributor

mkouba commented Sep 19, 2024

For the record, I would be strongly in favor of removing the Uni caching in SmallRye Health. I don't believe there's any gain to be had from that caching, as explained in my previous comment.

We could do it but this solution just masks the original problem.

I was digging in the code and did some experiments and it seems that the non-cached solution (with quarkus.smallrye-health.context-propagation=true) only works when SmallRye Context Propagation kicks in. If I disable CDI context propagation (with quarkus.arc.context-propagation.enabled=false) and rely purely on Vertx duplicated context a health check always fails with ContextNotActiveException because HealthChecks are always executed on a new duplicated context and the request context is not activated separately.

Now, if I replace VertxContext.createNewDuplicatedContext(vertx.getOrCreateContext()) with Vertx.currentContext() then it seems to work; except there's the BlockingChecksVertxContextDuplicationTest that fails because it expects different contexts.

I believe that health checks should be executed on the same Vertx context that is used to proceed the HTTP request. But I'm no SR Health/Vertx expert.

Also the necessity of using SR CP just for a health check seems like an overkill to me.

CC @xstefank @cescoffier

@geoand
Copy link
Contributor

geoand commented Sep 19, 2024

Also the necessity of using SR CP just for a health check seems like an overkill to me.

+100

@cescoffier
Copy link
Member

We can keep using the current duplicated context if we have one and create a new only if we are called from a root context (or no context).

In the latter case, activating the request scope can be an option (in the execute blocking block)

@xstefank
Copy link
Member

We had to introduce new vertx context for MDC logs propagation which are coupled to it. I'll give a shot to what @cescoffier is proposing if that would still work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants