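Thanks for sharing this excellent work! I have one question: the released code appears to evaluate the method through simulation. For the real system, did you collect in advance the inference performance of a batch of queries under different execution orders (to simulate different scheduling outcomes)? Or did you implement this prediction-based scheduler in a real LLM system?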
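We have a simple SJF (shortest-job-first) implementation on vLLM that replaces the FCFS scheduler, but it has not been released yet. In the meantime, you can refer to vLLM's priority scheduling feature and use the sequence length as the inverse of the priority, so that shorter sequences are scheduled first.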
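To illustrate the idea in the reply, here is a minimal pure-Python sketch of SJF ordering through a priority queue. It is not the authors' unreleased implementation and it does not call vLLM. The function names (`sjf_order`, `mean_completion_time`) and the example lengths are hypothetical, and the sketch assumes that all requests are queued at once, that they are served one at a time, and that a smaller priority value is served first (check the convention of the scheduler you actually use).

```python
import heapq

def sjf_order(requests):
    """Return request names in shortest-job-first order.

    `requests` is a list of (name, length) pairs, where `length` stands in
    for the (predicted) sequence length. Assumption: a smaller priority
    value is served first, so the length itself is used as the priority.
    """
    # The arrival index breaks ties, so equal lengths fall back to FCFS.
    heap = [(length, arrival, name)
            for arrival, (name, length) in enumerate(requests)]
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, name = heapq.heappop(heap)
        order.append(name)
    return order

def mean_completion_time(lengths):
    """Average completion time when jobs run back to back in the given order."""
    elapsed = 0
    total = 0
    for length in lengths:
        elapsed += length   # this job finishes once all earlier ones are done
        total += elapsed
    return total / len(lengths)

# Hypothetical workload: (request name, sequence length in tokens).
requests = [("a", 300), ("b", 20), ("c", 100)]
lengths = dict(requests)

fcfs = [name for name, _ in requests]   # arrival order
sjf = sjf_order(requests)               # shortest first

print("FCFS:", fcfs, mean_completion_time([lengths[n] for n in fcfs]))
print("SJF: ", sjf, mean_completion_time([lengths[n] for n in sjf]))
```

In a real serving system requests arrive over time and are batched, so the gain depends on the workload and on how accurate the length prediction is; the sketch only shows why running short requests first lowers the average completion time.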