Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observation in ServerHttpObservationFilter is never stopped for asynchronous requests #32730

Closed
wojtassi opened this issue Apr 29, 2024 · 5 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) status: backported An issue that has been backported to maintenance branches theme: observability An issue related to observability and tracing type: bug A general bug
Milestone

Comments

@wojtassi
Copy link

Spring Framework 6.0.19
CometD 7.0.10
Micrometer 1.12.5

I'm not exactly sure which component is responsible for this issue but since spring-web contains ServerHttpObservationFilter, I am submitting this bug here.
We have recently run into a memory leak of DefaultLongTaskTimer$SampleImpl. This was introduced with micrometer introducing histograms for http.server.requests.active.

The leak happens only for requests to CometD and it happens because those requests are asynchronous.
In https://github.com/spring-projects/spring-framework/blob/main/spring-web/src/main/java/org/springframework/web/filter/ServerHttpObservationFilter.java:

  1. Observation is created at the start of doFilterInternal.
  2. In synchronous requests, Observation is closed at the end of that function.
  3. For asynchronous requests, Observation is never closed, by design, which was fine until histograms were introduced that use Observation.stop to stop DefaultLongTaskTimer$SampleImpl.
  4. Since Observation is never stopped, we end up with DefaultLongTaskTimer$SampleImpl leak

The problem is that there is no corresponding callback that would close the Observation once asynchronous requests completes.

Thanks,

@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Apr 29, 2024
@jhoeller jhoeller added in: web Issues in web modules (web, webmvc, webflux, websocket) theme: observability An issue related to observability and tracing labels Apr 30, 2024
@bclozel
Copy link
Member

bclozel commented Jun 4, 2024

We do manage async requests for many use cases with Spring MVC and WebFlux - @Async annotated requests, methods return reactive types like Flux, etc.

We have not seen any leak for observations there. I have just checked with a simple application and I'm seeing observations being closed properly, and the "http.server.requests.active" request count decrease as expected when streaming requests are done.

The ServerHttpObservationFilter should work fine with asynchronous requests. The Servlet container is responsible for REQUEST and ASYNC dispatches, involving the filter again when async processing is done. What makes you think that this approach is not correct?

Can you share a minimal sample application that shows the problem? As we don't support CometD in our stack, please bear in mind that we can only justify spending time on a very minimal sample, ideally using an application generated on start.spring.io as a baseline.

@bclozel bclozel added the status: waiting-for-feedback We need additional information before we can continue label Jun 4, 2024
fogninid added a commit to fogninid/spring-framework-32730 that referenced this issue Jun 5, 2024
@fogninid
Copy link

fogninid commented Jun 5, 2024

We are experiencing a very similar leak on a spring boot application configured to use jersey asynchronous requests.

Can you have a look at the reproducer I pushed to https://github.com/fogninid/spring-framework-32730?

On my machine a few minutes are sufficient to get a heap dump showing this memory leak: simply run mvn test.

While the test is running, you can also observe how the values of the /actuator/metrics/http.server.requests.active delivers non-plausible and increasing values, although all requests are already completed.

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Jun 5, 2024
@wojtassi
Copy link
Author

wojtassi commented Jun 5, 2024

Thanks @fogninid for reproducer. That saved me from carving out a sample from our production code.

The ServerHttpObservationFilter should work fine with asynchronous requests. The Servlet container is responsible for REQUEST and ASYNC dispatches, involving the filter again when async processing is done. What makes you think that this approach is not correct?

This is the part that I did not understand. I just didn't see this behavior in the code for cometD and then I found this snippet:
https://docs.spring.io/spring-framework/reference/web/webmvc/mvc-ann-async.html#mvc-ann-async-vs-webflux.
Which let me to try it out using DeferredResult:
https://github.com/wojtassi/spring-framework-32730
I've modified @fogninid reproducer to also include a sample that uses DeferredResult that does not leak. If I put a breakpoint in ServerHttpObservationFilter, it indeed is called twice. Once when initial request is created and once when async response completes.

So it appears that this Observation mechanism does not work with jakarta.ws.rs annotations that returns AsyncResponse.
For those Observations to work it needs to use DeferredResult which is spring specific.

@wojtassi
Copy link
Author

wojtassi commented Jun 5, 2024

For people that are running into this leak here is a bean that can be used to disable observations for specific endpoints:

private static final List<String> COMETD_ENDPOINTS = List.of(
            "/notifications/connect",
           "/notifications/connect/",
            "/notifications/handshake",
            "/notifications/handshake/",
            "/notifications/subscribe",
            "/notifications/subscribe/",
            "/notifications/disconnect",
            "/notifications/disconnect/");

@Bean
    ObservationPredicate micrometerCometDContextExclusionPredicate() {
        return (name, context) -> {
            if (name.equals("http.server.requests") && context instanceof ServerRequestObservationContext serverContext) {
                final String requestURI = serverContext.getCarrier().getRequestURI().toLowerCase(Locale.ENGLISH);
                if (requestURI.contains("notifications")) {
                    return !COMETD_ENDPOINTS.stream().anyMatch(m -> requestURI.endsWith(m));
                }
            }
            return true;
        };
    }

based on this:
spring-projects/spring-boot#34801

This is specific to cometD but should be easily adaptable to other end points.

@bclozel bclozel self-assigned this Jun 7, 2024
@bclozel bclozel added type: bug A general bug and removed status: waiting-for-triage An issue we've not yet triaged or decided on status: feedback-provided Feedback has been provided labels Jun 7, 2024
@bclozel bclozel added this to the 6.1.9 milestone Jun 7, 2024
@bclozel
Copy link
Member

bclozel commented Jun 7, 2024

Previously, our filter implementation was relying on the fact that async implementations would dispatch back to the container so that our filter would be executed again after the chain. It turns out that calling AsyncContext#complete without dispatching back is legal according to the spec, so we will revisit our filter implementation to support this case.

Thanks a lot for your helpful report @wojtassi @fogninid ! I will schedule this issue for the next maintenance release and backport the fix accordingly.

@github-actions github-actions bot added status: backported An issue that has been backported to maintenance branches and removed for: backport-to-6.0.x labels Jun 7, 2024
@bclozel bclozel closed this as completed in 6681394 Jun 7, 2024
bclozel added a commit to bclozel/spring-framework that referenced this issue Jun 26, 2024
Prior to this commit, the fix for spring-projectsgh-32730 disabled the involvment of
the osbervation filter for async dispatches. Instead of relying on ASYNC
dispatches to close the observation for async requests, this is now
using an async listener instead: async dispatches are not guaranteed to
happen once the async request is handled.

This change caused another side-effect: because async dispatches are not
considered anymore by this filter, the observation scope is not
reinstated for async dispatches. For example, `ResponseBodyAdvice`
implementations do not have the observation scope opened during their
execution.

This commit re-enables async dispatches for this filter, but ensures
that observations are not closed during such dispatches as this will be
done by the async listener.

Fixes spring-projectsgh-33091
bclozel added a commit that referenced this issue Jul 1, 2024
Prior to this commit, the fix for gh-32730 disabled the involvment of
the osbervation filter for async dispatches. Instead of relying on ASYNC
dispatches to close the observation for async requests, this is now
using an async listener instead: async dispatches are not guaranteed to
happen once the async request is handled.

This change caused another side-effect: because async dispatches are not
considered anymore by this filter, the observation scope is not
reinstated for async dispatches. For example, `ResponseBodyAdvice`
implementations do not have the observation scope opened during their
execution.

This commit re-enables async dispatches for this filter, but ensures
that observations are not closed during such dispatches as this will be
done by the async listener.

Fixes gh-33091
bclozel added a commit that referenced this issue Jul 1, 2024
Prior to this commit, the fix for gh-32730 disabled the involvment of
the osbervation filter for async dispatches. Instead of relying on ASYNC
dispatches to close the observation for async requests, this is now
using an async listener instead: async dispatches are not guaranteed to
happen once the async request is handled.

This change caused another side-effect: because async dispatches are not
considered anymore by this filter, the observation scope is not
reinstated for async dispatches. For example, `ResponseBodyAdvice`
implementations do not have the observation scope opened during their
execution.

This commit re-enables async dispatches for this filter, but ensures
that observations are not closed during such dispatches as this will be
done by the async listener.

Fixes gh-33128
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) status: backported An issue that has been backported to maintenance branches theme: observability An issue related to observability and tracing type: bug A general bug
Projects
None yet
Development

No branches or pull requests

5 participants