Implement a way to abort / cancel request #5340
Comments
Did you find a way to do this?
Nope. What I did to work around it:
So the need for this is kinda less pressing now, but it still seems like a must-have.
Hi, @dzmitry-kankalovich! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you are requesting a way to cancel requests in the Python version of LangChain. Currently, you are terminating the process to cancel requests, but you mentioned that this is not ideal for web applications. There has been some discussion in the comments, with you mentioning workarounds you have implemented. However, you still believe having a documented way to cancel requests is important.

Before we close this issue, we wanted to check with you if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days. Thank you for your understanding and contribution to the LangChain project!
Seems to still be relevant. I would be able to improve the UX of my software significantly if I could add a cancellation feature.
@baskaryan Could you please help @dzmitry-kankalovich with the cancellation feature they are requesting? They mentioned that it would significantly improve the UX of their software. Thank you!
I would also like to be able to cancel streams gracefully from the Python client.
Is there any update on the right patterns to cancel chains? This is an architectural need at multiple levels:
Supporting everywhere at once is obviously not easy, but providing a pattern for the basics -- batch and streaming chain calls to OpenAI -- seems like a clear start.
Agree that this is critical to building a scalable system.
Any updates on this? |
🤖 Thank you for your feature request and for providing your workaround. As of now, the Python version of LangChain does not support request cancellation. However, there are no inherent technical limitations within the Python language or the LangChain framework that would prevent the implementation of a request cancellation feature. In Python, you can use the

Please note that implementing a cancellation feature would require careful handling of exceptions and cleanup to ensure that resources are properly released when a request is cancelled. You can refer to the Requests class for more details.

As for the costs associated with abandoned requests, it would depend on the specific pricing model of the language model provider. It's recommended to check with the provider for accurate information.

Your suggestion for feature parity with the JS version of LangChain is valid and would be a valuable addition to the Python version. As you've mentioned that you may have the capacity to work on the implementation in the coming weeks, your contribution would be greatly appreciated. In the meantime, I'll make a note of this feature request for future reference. Please feel free to reach out if you have any further questions or need additional assistance.

Dosu

Sources

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
I have a requirement for this as well, and am looking for an update to langchain to support this generally. As a short term workaround I extended the client class `httpx.AsyncClient`, overriding the `send` method:

```python
async def send(self, request, *args, **kwargs)
```

and the response class `httpx.Response`, and used this client in places like:

```python
model = ChatOpenAI(
    model_name=model_id,
    streaming=True,
    http_client=client,
    callbacks=[callback]
)
```

and keep track of the pending requests within the client, exposing some methods to list the in-progress requests and interrupt them. Just mentioning this here in case that helps someone waiting for the more thorough solution, or if someone has a better idea 😄
Can you post that as a gist so at least it could be adopted in the meantime?
I think this should be straightforward if the stream context manager was used. The context manager will close the HTTP connection upon exit, which will happen if the generator is terminated, e.g. using a `break`.

Instead of doing this in `ChatOpenAI`:

```python
for chunk in self.client.create(messages=message_dicts, **params):
    yield chunk
```

Do this:

```python
with self.client.create(messages=message_dicts, **params) as response:
    for chunk in response:
        yield chunk
```
Showing interest as well. I don't have much to add beyond what has already been discussed here. Cancelling streaming requests is basic infrastructure for building production systems. Generating tokens that are never used is a waste of compute resources and money.
@snopoke Is there anything I can help with so that your PR gets merged?
@snopoke Would you kindly provide an example of how your change allows aborting a request? I was thinking that wrapping
With either the sync or async methods you can just stop iterating over the stream in order to cancel the generation:

```python
for chunk in model.stream(message):
    if _abort_stream(chunk):
        break
    print(chunk.content)
```

This works because it exits the internal context manager, which closes the HTTP connection to the API, effectively aborting the stream.
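For the async side, a minimal sketch of the same early-exit idea, assuming the `langchain_openai` package and an `OPENAI_API_KEY` in the environment; the model name and the length-based stop condition are placeholders, not part of the fix itself:

```python
import asyncio
from contextlib import aclosing  # Python 3.10+

from langchain_openai import ChatOpenAI  # assumed import; adjust for your LangChain version


async def main() -> None:
    model = ChatOpenAI(model="gpt-4o-mini", streaming=True)  # illustrative model name
    collected = ""
    # aclosing() makes sure the async generator (and its HTTP connection) is
    # closed as soon as we break out, rather than whenever it gets garbage-collected.
    async with aclosing(model.astream("Write a long poem about the sea.")) as stream:
        async for chunk in stream:
            collected += chunk.content
            if len(collected) > 200:  # stand-in for a real "user pressed stop" check
                break
    print(collected)


asyncio.run(main())
```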
**Description:** Use the `Stream` context managers in the `ChatOpenAI` `stream` and `astream` methods. Using the context manager returned by the OpenAI client makes it possible to terminate the stream early, since the response connection will be closed when the context manager exits. **Issue:** #5340 **Twitter handle:** @snopoke --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This would be a great feature. It'd be really nice if there was a cancellation token that could be passed in when invoking a chain, for example:

```python
import threading

# Create the cancellation token
cancellation_token = threading.Event()

# Timeout after 60 seconds (or it could be cancelled by a user action)
threading.Timer(60.0, cancellation_token.set).start()

# Run the chain with the cancellation token
chain.invoke(input, {"is_cancelled": cancellation_token.is_set})
```
Sorry but if I use:
@snopoke Although this is a nice solution, it is incomplete. It only works for streaming requests, and I assume it only works for `ChatOpenAI`.

I was proposing a general solution that can be used by any chain and in any of the invocation methods.

Now that I look closer at the JS cancellation docs, I see that they have the same interface I proposed, except using the keyword `signal`.
@brendanator your example above seems to be JavaScript. In Python you could do:

```python
response = await remoteChain.stream("What is your name?")
async for chunk in response:
    print(chunk)
    if condition():
        break
```
What is the goal here? Is it to cancel the request to prevent excess token usage? If so, I don't think that makes sense outside of the stream APIs, unless the API you're calling can detect the closed connection and abort the generation (which I think is unlikely).
This exists because the JavaScript fetch API supports aborting requests. I'm not sure what the best way to support this in Python is.
The goal as described at the top of this issue is to halt processing a request (whether in an LLM model invocation, chain, or otherwise), perhaps in response to a user hitting a "stop" button in a UI.

The main problem I found was that there was no way to proactively tell langchain to interrupt a call to a model, such as a call to the OpenAI API (whether synchronous or asynchronous). This is the case mentioned at the top of this issue, as the call may take a while.

The solution that I found was to pass in an http_client when instantiating the LLM model which tracks requests made with the client and is interruptible. When an interrupt is needed, the low-level http client request is interrupted, which triggers an exception that would hopefully be caught by langchain to clean things up, equivalent to the http_client being interrupted from the server side.

This just covers the one scenario where the LLM model call is the thing that needs to be interrupted to halt processing, and is not a more general solution that would cover other cases, for instance tool calls being interrupted.
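A rough, hypothetical sketch of that approach (not the exact code referenced above): an `httpx.AsyncClient` subclass that keeps in-flight requests as tasks so they can be listed and cancelled from outside. All names here are made up for illustration, and the constructor argument for passing the client into the model (`http_client` vs `http_async_client`) varies between LangChain versions:

```python
import asyncio

import httpx


class InterruptibleAsyncClient(httpx.AsyncClient):
    """Hypothetical httpx.AsyncClient that tracks in-flight requests so they can be interrupted."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._pending: set[asyncio.Task] = set()

    async def send(self, request: httpx.Request, *args, **kwargs) -> httpx.Response:
        # Wrap the real send in a task so it can be cancelled from outside.
        task = asyncio.create_task(super().send(request, *args, **kwargs))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
        return await task

    def pending_requests(self) -> list[asyncio.Task]:
        return list(self._pending)

    def interrupt_all(self) -> None:
        # Cancelling a task raises CancelledError in whatever is awaiting send(),
        # which surfaces as an exception from the model call and halts processing.
        for task in list(self._pending):
            task.cancel()


# Usage (parameter name depends on your LangChain version):
# client = InterruptibleAsyncClient()
# model = ChatOpenAI(model_name=model_id, streaming=True, http_async_client=client)
```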
Found a way to do this by cancelling the `asyncio.Task`. Do I assume correctly that token generation would stop, and so we will not be charged for unused output tokens?
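A minimal sketch of that task-cancellation approach, assuming the `langchain_openai` package and an illustrative model name; whether the provider actually stops generating (and billing) once the connection is dropped is up to the provider:

```python
import asyncio

from langchain_openai import ChatOpenAI  # assumed import; adjust for your LangChain version


async def generate(model: ChatOpenAI, prompt: str) -> str:
    parts = []
    async for chunk in model.astream(prompt):
        parts.append(chunk.content)
    return "".join(parts)


async def main() -> None:
    model = ChatOpenAI(model="gpt-4o-mini")  # illustrative model name
    task = asyncio.create_task(generate(model, "Summarise War and Peace."))
    await asyncio.sleep(2)  # stand-in for "the user hit stop after 2 seconds"
    task.cancel()           # raises CancelledError inside generate(), tearing down the stream
    try:
        await task
    except asyncio.CancelledError:
        print("generation cancelled")


asyncio.run(main())
```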
For what it's worth, 2 people in this thread commented that stopping the request (using
Feature request
I've been playing around with OpenAI GPT-4 and ran into a situation where response generation might take quite some time, say 5 minutes.
I switched over to streaming, but often I can immediately see that the response is not what I want, and therefore I'd like to cancel the request.
Now here is the part that is unclear to me:
is there an official way to cancel a request in the Python version of LangChain? I have found this described in the JS/TS version of the framework, however scanning the docs, sources and issues yields nothing for this repo.
For now I simply terminate the process, which works well enough for something like Jupyter notebooks, but is quite problematic for, say, a web application.
Besides termination, it's also unclear whether I may incur any unwanted costs for the abandoned request.
Should some sort of feature parity be made with JS LangChain?
Motivation
Provide a documented way to cancel long-running requests
Your contribution
At this point I only have capacity to test out a potential implementation. I may work on the implementation in later weeks.