[Bug] 如何提前终止流式推理pipe.stream_infer #3106

Open
3 tasks done
youyc22 opened this issue Jan 31, 2025 · 3 comments

Comments

@youyc22 commented Jan 31, 2025

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

Hello, I have run into the following problem: I want to terminate the first streaming inference early once a condition is met (e.g., tokens > 500 in the screenshot), and then immediately run streaming inference for a second question. However, I found that even though break exits the first loop, inference for the first question keeps running in the background (taking compute resources away from the second question).

Is there an interface to cancel the first request without affecting inference for the second question?
[screenshot of the reproduction code]

Reproduction

See the code in the screenshot above.
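The failure mode described in this issue can be sketched without lmdeploy: a producer thread keeps generating "tokens" after the consumer breaks out of its loop, until it is explicitly signaled to stop. All names here (stream_infer, stop_event, the 500-token threshold) are illustrative stand-ins, not lmdeploy APIs.

```python
import queue
import threading
import time

# Producer thread standing in for a streaming-inference backend: it keeps
# pushing tokens until either it finishes or a cancel signal is set.
def stream_infer(stop_event, out_q, total=1000):
    for i in range(total):
        if stop_event.is_set():      # cooperative cancellation point
            return
        out_q.put(i)
        time.sleep(0.001)

stop = threading.Event()
q = queue.Queue()
t = threading.Thread(target=stream_infer, args=(stop, q))
t.start()

tokens = 0
while True:
    q.get()
    tokens += 1
    if tokens > 500:
        break                        # break alone does NOT stop the producer...

stop.set()                           # ...an explicit cancel signal is needed
t.join()
print(tokens)  # → 501
```

Breaking out of the consumer loop only stops the caller from reading; without a cancellation signal reaching the backend, the producer keeps burning compute, which is exactly the symptom reported here.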

Environment

lmdeploy: 0.7.0.post1

Error traceback

@youyc22 (Author) commented Jan 31, 2025

I tried using stop_session, but it did not help.
Re-initializing the pipe solves the problem, but it still wastes some time.

@lzhangzz (Collaborator) commented Feb 1, 2025

I tested this: with the turbomind engine, generation can be terminated by breaking out of the loop, but the pytorch engine does not yet seem to support breaking out of the generator loop.

Currently there is one place in async_engine.py that does not handle CancelledError; as a workaround, you can comment out the add_done_callback part at this location:

asyncio.run_coroutine_threadsafe(_infer(), loop).add_done_callback(lambda x: x.result())

Also, since the inference thread runs asynchronously with respect to the interface, a few more tokens may still be generated after cancellation before it actually stops.
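The cancellation path described above can be illustrated with plain asyncio (a self-contained sketch, not lmdeploy code; _infer here is a dummy coroutine): a done-callback that blindly calls result() re-raises CancelledError when the request is canceled, while a callback that catches it lets cancellation complete cleanly.

```python
import asyncio
import concurrent.futures
import threading
import time

# Background event loop, mimicking an inference loop running in its own thread.
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def _infer():
    # Dummy stand-in for token generation; cancellation interrupts the sleep.
    await asyncio.sleep(10)

fut = asyncio.run_coroutine_threadsafe(_infer(), loop)

# Problematic pattern: x.result() re-raises CancelledError inside the callback.
# fut.add_done_callback(lambda x: x.result())

# Tolerant callback: treat cancellation as expected early termination.
def _done(f):
    try:
        f.result()
    except concurrent.futures.CancelledError:
        pass

fut.add_done_callback(_done)
time.sleep(0.1)   # let the coroutine start on the background loop
fut.cancel()      # request early termination from the caller's thread
time.sleep(0.2)   # give the loop time to actually cancel the task
print(fut.cancelled())  # → True
```

Because the cancel request crosses a thread boundary before reaching the event loop, the coroutine may still make a little progress before it stops, which matches the "a few more tokens" behavior noted above.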

@youyc22 (Author) commented Feb 1, 2025

> I tested this: with the turbomind engine, generation can be terminated by breaking out of the loop, but the pytorch engine does not yet seem to support breaking out of the generator loop.
>
> Currently there is one place in async_engine.py that does not handle CancelledError; as a workaround, you can comment out the add_done_callback part at this location:
>
> lmdeploy/lmdeploy/serve/async_engine.py, line 430 in 637435f:
> asyncio.run_coroutine_threadsafe(_infer(), loop).add_done_callback(lambda x: x.result())
>
> Also, since the inference thread runs asynchronously with respect to the interface, a few more tokens may still be generated after cancellation before it actually stops.

Thanks!
