GptManager pybind 2/4TP run demo #701
Comments
Thank you for reporting this. We shall add …
@byshiue @MartinMarciniszyn I implemented serialization like this:

```cpp
std::vector<int64_t> InferenceRequest::serialize() const
{
    std::shared_ptr<tb::InferenceRequest> ir = toTrtLlm();
    return ir->serialize();
}

std::shared_ptr<InferenceRequest> InferenceRequest::deserialize(const std::vector<int64_t>& packed)
{
    std::shared_ptr<tb::InferenceRequest> ir = tb::InferenceRequest::deserialize(packed);
    return TrtLlmTo(ir);
}
```

But I encountered the same problem as #782. Only by setting …

```
[c224f3d064d0:30219:0:31671] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[c224f3d064d0:30220:0:31666] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[c224f3d064d0:30221:0:31672] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
[c224f3d064d0:30222:0:31667] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace (tid: 31667) ====
 0 0x0000000000042520 __sigaction() ???:0
 1 0x0000000000236d0a tensorrt_llm::batch_manager::GptManager::returnCompletedRequests() ???:0
 2 0x000000000023c13e tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() ???:0
 3 0x00000000000dc253 std::error_code::default_error_condition() ???:0
 4 0x0000000000094ac3 pthread_condattr_setpshared() ???:0
 5 0x0000000000125bf4 clone() ???:0
=================================
==== backtrace (tid: 31671) ====
 0 0x0000000000042520 __sigaction() ???:0
 1 0x0000000000236d0a tensorrt_llm::batch_manager::GptManager::returnCompletedRequests() ???:0
 2 0x000000000023c13e tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() ???:0
 3 0x00000000000dc253 std::error_code::default_error_condition() ???:0
 4 0x0000000000094ac3 pthread_condattr_setpshared() ???:0
 5 0x0000000000125bf4 clone() ???:0
=================================
==== backtrace (tid: 31672) ====
 0 0x0000000000042520 __sigaction() ???:0
 1 0x0000000000236d0a tensorrt_llm::batch_manager::GptManager::returnCompletedRequests() ???:0
 2 0x000000000023c13e tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() ???:0
 3 0x00000000000dc253 std::error_code::default_error_condition() ???:0
 4 0x0000000000094ac3 pthread_condattr_setpshared() ???:0
 5 0x0000000000125bf4 clone() ???:0
=================================
==== backtrace (tid: 31666) ====
 0 0x0000000000042520 __sigaction() ???:0
 1 0x0000000000236d0a tensorrt_llm::batch_manager::GptManager::returnCompletedRequests() ???:0
 2 0x000000000023c13e tensorrt_llm::batch_manager::GptManager::decoupled_execution_loop() ???:0
 3 0x00000000000dc253 std::error_code::default_error_condition() ???:0
 4 0x0000000000094ac3 pthread_condattr_setpshared() ???:0
 5 0x0000000000125bf4 clone() ???:0
=================================
```
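For reference, the same "cannot pickle" failure pattern can often be worked around from the Python side by registering a reducer that round-trips the object through its packed representation. The sketch below is a minimal illustration of that `copyreg` pattern, using a hypothetical `PackedRequest` stand-in class (not the real `tensorrt_llm.bindings.InferenceRequest`) whose `serialize()`/`deserialize()` pair mirrors the C++ wrappers above:

```python
import copyreg
import pickle


class PackedRequest:
    """Hypothetical stand-in for an extension type that exposes
    serialize()/deserialize() but refuses default pickling."""

    def __init__(self, request_id, token_ids):
        self.request_id = request_id
        self.token_ids = token_ids

    def serialize(self):
        # Flatten to a list of ints, mirroring std::vector<int64_t>.
        return [self.request_id, len(self.token_ids), *self.token_ids]

    @staticmethod
    def deserialize(packed):
        request_id, n = packed[0], packed[1]
        return PackedRequest(request_id, list(packed[2:2 + n]))

    # Simulate a C-extension type that cannot be pickled by default.
    def __reduce_ex__(self, protocol):
        raise TypeError("cannot pickle 'PackedRequest' object")


def _reduce_packed_request(req):
    # Tell pickle to rebuild the object from its packed int vector.
    return (PackedRequest.deserialize, (req.serialize(),))


# copyreg's dispatch table is consulted before __reduce_ex__,
# so this registration overrides the simulated refusal above.
copyreg.pickle(PackedRequest, _reduce_packed_request)

req = PackedRequest(7, [1, 2, 3])
clone = pickle.loads(pickle.dumps(req))
```

Without the `copyreg.pickle(...)` registration, `pickle.dumps(req)` would raise the same kind of `TypeError` reported in this issue; with it, the object travels as a plain list of integers.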
Hi there! Will an official release containing the commit that introduces pickle support for …
Please expect to see a release in the next couple of weeks.
I'm trying to enable the GptManager pybinding according to this, but the demo only provides a simple scheduling API.
I'm trying to run a model with 2-way/4-way tensor parallelism (TP). Following the C++ gptManagerBenchmark, I ran into this error:

```
TypeError: cannot pickle 'tensorrt_llm.bindings.InferenceRequest' object
```

I investigated the corresponding functions `serialize()` and `deserialize()`, which are closed source. Can you provide a complete 2/4-TP execution demo, or share some ideas for serializing the `tensorrt_llm.bindings.InferenceRequest` Python object?
My demo:
The last line of the code raises:

```
TypeError: cannot pickle 'tensorrt_llm.bindings.InferenceRequest' object
```
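Until official pickle support lands, one workaround pattern (a sketch under assumptions, not the official TensorRT-LLM API) is to broadcast only plain-Python data across TP ranks and rebuild the request object locally on each rank. mpi4py's `comm.bcast` pickles whatever it is given, so a dict of ints and lists crosses the wire without any binding support; the example below simulates that wire trip with `pickle` directly so it runs without MPI installed:

```python
import pickle


def request_to_payload(request_id, input_ids, max_new_tokens):
    """Flatten the fields of a (hypothetical) request into plain
    Python data, which pickles without any extension-type support."""
    return {
        "request_id": request_id,
        "input_ids": list(input_ids),
        "max_new_tokens": max_new_tokens,
    }


# Simulate what an mpi4py broadcast does internally: pickle on the
# root rank, unpickle on each worker rank.
payload = request_to_payload(1, [101, 102, 103], 16)
received = pickle.loads(pickle.dumps(payload))

# On each worker rank, the real request object would be reconstructed
# from `received` (tensorrt_llm.bindings.InferenceRequest construction
# is elided here, since its constructor signature is not shown above).
assert received == payload
```

The design choice here is to keep the unpicklable binding object confined to a single process and move only its constituent data between ranks.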