Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement host configuration property for handling pre-cancelled invocation requests #9523

Merged
merged 18 commits into from
Jan 22, 2024

Conversation

liliankasem
Copy link
Member

@liliankasem liliankasem commented Sep 8, 2023

Issue describing the changes in this PR

resolves #9387

Pull request checklist

  • My changes do not require documentation changes
    • Otherwise: Documentation issue linked to PR (in progress)
  • My changes should not be added to the release notes for the next release
    • Otherwise: I've added my notes to release_notes.md
  • My changes do not need to be backported to a previous version
    • Otherwise: Backport tracked by issue/PR #issue_or_pr
  • My changes do not require diagnostic events changes
    • Otherwise: I have added/updated all related diagnostic events and their documentation (Documentation issue linked to PR)
  • I have added all required tests (Unit tests, E2E tests)

Additional information

Recently there has been an uptick of GitHub issues and CRIs related to the FunctionInvocationCanceled exception introduced in this PR. This is because the PR made changes that now expose existing issues that were hidden prior and thus has lead to an increased volume of invocations failures with this exception. Of the cases investigated, approximately 90% of them are due to client disconnects.

After some brainstorming, the team has decided that we should:

a) Change the log level of the "Cancellation has been requested" message to be information. Add to provide a link to documentation to explain to customers what could be causing a cancellation before an invocation is sent to the worker.

b) To make this behaviour configurable. By default, if cancellation has been requested before we send an invocation to the worker, we will fail the invocation with a FunctionInvocationCanceled exception (this is what happens today). If a worker handles the InvocationCancel capability, we will not cancel and will send the invocation to the worker. We are also adding a host configuration property to override that behaviour where if the worker handles InvocationCancel AND the property is set to false, we will cancel the result source and not send the cancelled invocation to the worker.

@jviau
Copy link
Contributor

jviau commented Sep 13, 2023

I am not sure I like the idea of option (b). If we look at our host as a reverse-proxy in this case (which we essentially are, especially in the AspNetCore efforts), I don't think it is standard to forward a canceled request to the next node (the worker in this case). To me, adding this capability seems like extra effort on our end for something that is arguably a bad pattern for a customer to even use.

Do we have customer demand for this capability? If so, what are they wanting to do with that disconnected request?

@liliankasem liliankasem marked this pull request as ready for review September 14, 2023 00:51
@liliankasem liliankasem requested a review from a team as a code owner September 14, 2023 00:51
@cjaliaga
Copy link
Member

Do we have customer demand for this capability? If so, what are they wanting to do with that disconnected request?

We've seen some customers implementing a (not properly) fire-and-forget pattern with Http Triggers and on those scenarios the response is not important for them. I do agree this is a bad pattern and we have other options on the platform, such Durable, to implement the same in a better a safer way.

I would let the new behavior as default, not sending pre-cancelled invocations to the worker.

@jviau
Copy link
Contributor

jviau commented Sep 14, 2023

We've seen some customers implementing a (not properly) fire-and-forget pattern with Http Triggers and on those scenarios the response is not important for them.

I think this is a pattern we explicitly do not want to support. Customer code is meant to run within the context of a trigger and not leak outside it. It is not just a bad pattern in your example, but an explicitly unsupported pattern.

@liliankasem liliankasem changed the title Add capability for handling pre-cancelled invocation requests [draft] Add capability for handling pre-cancelled invocation requests Sep 14, 2023
@liliankasem
Copy link
Member Author

liliankasem commented Sep 21, 2023

@jviau @brettsam - in the last design meeting we discussed setting the execution log message to "Canceled", this change goes into the WebJobs is SDK and roughly looks like this (and potentially more changes depending on the behaviour we want to expose):

With this change, the logs look like this:

warn: Function.Func[1]
      Executing 'Functions.Func' (Reason='This function was programmatically called via the host APIs.', Id=5050cfe6-9514-43a1-8ffe-caf3da59c9a1)
info: Function.Func[4]  // or we can log this as Warning
      Executed 'Functions.Func' (Canceled, Id=5050cfe6-9514-43a1-8ffe-caf3da59c9a1, Duration=9338ms)
fail: Host.Results[0]
      Microsoft.Azure.WebJobs.Host.FunctionInvocationException: Exception while executing function: Functions.Func
       ---> Microsoft.Azure.WebJobs.Host.FunctionInvocationCanceledException: The invocation request with id '5050cfe6-9514-43a1-8ffe-caf3da59c9a1' was canceled before the request was sent to the worker.

I think this is a breaking change potentially for WebJobs and for any services/detectors/portal etc. that rely on this log so I don't think is actually viable (at least not in v4).

Edit: Also I was looking into the ASP.NET Core behaviour for this and my understanding is that the client that disconnects gets a TaskCanceledException but the request still executes - might be worth a sanity check?

cc: @fabiocav fyi

@jviau
Copy link
Contributor

jviau commented Sep 27, 2023

Also I was looking into the ASP.NET Core behaviour for this and my understanding is that the client that disconnects gets a TaskCanceledException but the request still executes - might be worth a sanity check?

This depends entirely on individual apps. ASP.NET Core uses cooperating cancellation, so HttpContext.RequestAborted will be in a cancelled state. So, it fails on the first thing that decides to observe and throw that cancellation. This means it may throw in middleware, in MVC, in the customer controller - entirely depends on when it is canceled and what the request pipeline looks like for a given app.

@liliankasem liliankasem changed the title [draft] Add capability for handling pre-cancelled invocation requests Implement host configuration property for handling pre-cancelled invocation requests Oct 5, 2023
@liliankasem liliankasem requested review from kshyju and brettsam October 5, 2023 20:52
@liliankasem liliankasem force-pushed the liliankasem/cancellation/make-configurable branch from 4162ca3 to 7830810 Compare October 9, 2023 22:15
@liliankasem liliankasem force-pushed the liliankasem/cancellation/make-configurable branch from cc25d11 to 576d177 Compare October 23, 2023 23:20
@fabiocav fabiocav self-assigned this Nov 8, 2023
@liliankasem liliankasem force-pushed the liliankasem/cancellation/make-configurable branch from 659f169 to 05ab1fd Compare January 8, 2024 18:46
Copy link
Member

@fabiocav fabiocav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nits.

Would flagged as approved, but I think we still need to review the setup. Once that is addressed, we should be able to get the change in.

src/WebJobs.Script.Grpc/Channel/GrpcWorkerChannel.cs Outdated Show resolved Hide resolved
src/WebJobs.Script/Config/ConfigurationPropertyNames.cs Outdated Show resolved Hide resolved
src/WebJobs.Script/Config/ScriptJobHostOptionsSetup.cs Outdated Show resolved Hide resolved
src/WebJobs.Script.Grpc/Channel/GrpcWorkerChannel.cs Outdated Show resolved Hide resolved
src/WebJobs.Script.Grpc/Channel/GrpcWorkerChannel.cs Outdated Show resolved Hide resolved
release_notes.md Outdated Show resolved Hide resolved
@liliankasem liliankasem merged commit 761735e into dev Jan 22, 2024
9 checks passed
@liliankasem liliankasem deleted the liliankasem/cancellation/make-configurable branch January 22, 2024 22:52
v-smanchem pushed a commit that referenced this pull request Jan 22, 2024
v-smanchem added a commit that referenced this pull request Jan 22, 2024
…cation requests (#9523) (#9826)

Co-authored-by: Lilian Kasem <likasem@microsoft.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants