
[Fix] fix torch allocator resource releasing #1708

Merged (3 commits) · Feb 6, 2023

Conversation

AllentDan (Member)

Code to reproduce on the master branch:

import torch

from mmdeploy.backend.tensorrt.wrapper import TRTWrapper

# Load a serialized TensorRT engine (path elided) and run one inference.
trt_model = TRTWrapper(
    '/path_to/cls/resnet/end2end.engine',
    ['output'])
input = torch.rand(1, 3, 224, 224).cuda()
output_trt = trt_model.forward({'input': input})
print(output_trt)

One of the following errors may be triggered:

/mmdeploy/utils/logging.py(28): get_logger
Exception caught in deallocate(): TypeError: 'NoneType' object is not callable

or

codes/mmdeploy/mmdeploy/backend/tensorrt/torch_allocator.py(61): deallocate

[ERROR] Exception caught in deallocate(): AttributeError: 'NoneType' object has no attribute 'caching_allocator_delete'

lvhan028 requested review from lvhan028 and grimoire on February 3, 2023
grimoire (Member) commented Feb 3, 2023

Unable to reproduce the error.
Why would logging and torch.cuda be None?

AllentDan (Member, Author) commented Feb 3, 2023

Unable to reproduce the error. Why would logging and torch.cuda be None?

I don't know either. It just happened at the end of the program. Could it be that torch.cuda and logging are released at that point, during interpreter shutdown?
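
For illustration, here is a minimal standalone sketch of that suspicion (hypothetical, not from this PR): during interpreter shutdown, module globals such as logging can already be cleared by the time a __del__ finalizer runs, which would produce exactly the kind of NoneType errors shown above. Whether this happens depends on the CPython version and the teardown order.

import logging

class Holder:

    def __del__(self):
        # If this finalizer runs during interpreter shutdown, the module
        # global `logging` may already have been cleared to None, so this
        # call can fail on a NoneType, matching the errors above.
        logging.getLogger(__name__).info('deallocating')

holder = Holder()  # never deleted explicitly; finalized at interpreter exit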

grimoire (Member) commented Feb 3, 2023

More investigation is required.

AllentDan (Member, Author) commented Feb 3, 2023

After some experiments, we may draw the following conclusions (a workaround sketch follows the list):

  1. Resources such as torch and logging may be released, in an arbitrary order, before deallocate is called.
  2. This releasing does not free the allocated GPU cache, since we can still delete the cache manually afterwards.
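
Based on those observations, one workaround (a minimal sketch, not necessarily the exact change merged in this PR) is to bind the torch.cuda callables at construction time, so that deallocate() no longer looks up module globals that may already be cleared at shutdown:

import torch

class TorchAllocator:
    # Sketch only; the real class in torch_allocator.py subclasses
    # tensorrt.IGpuAllocator.

    def __init__(self, device_id: int = 0):
        # Capture the callables needed later, so deallocate() keeps
        # working even if the `torch` module global is cleared during
        # interpreter shutdown.
        self.caching_allocator_alloc = torch.cuda.caching_allocator_alloc
        self.caching_allocator_delete = torch.cuda.caching_allocator_delete
        self.device_id = device_id
        self.mems = set()

    def allocate(self, size: int, alignment: int, flags: int) -> int:
        ptr = self.caching_allocator_alloc(size, self.device_id)
        self.mems.add(ptr)
        return ptr

    def deallocate(self, memory: int) -> bool:
        if memory not in self.mems:
            return False
        # Use the reference captured in __init__ instead of resolving
        # torch.cuda again, which may be None at shutdown.
        self.caching_allocator_delete(memory)
        self.mems.discard(memory)
        return True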

grimoire (Member) commented Feb 3, 2023

We could report this to the TensorRT team.

@lvhan028 lvhan028 merged commit 12b3d18 into open-mmlab:master Feb 6, 2023