[Enhancement]Support broadcast_object_list in multi-machines & support Searcher running in single GPU #153
Codecov Report
@@ Coverage Diff @@
## dev_v0.4.0 #153 +/- ##
==============================================
- Coverage 66.17% 65.51% -0.66%
==============================================
Files 92 93 +1
Lines 3376 3428 +52
Branches 615 630 +15
==============================================
+ Hits 2234 2246 +12
- Misses 1040 1080 +40
Partials 102 102
The modification of
object_list[i] = _tensor_to_object(obj_view, obj_size)

def broadcast_object_list(data: List[Any],
A warning needs to be added here
done
* [Enhance] Add extra dataloader settings in configs (#141)
* [Docs] fix md link failure in docs (#142)
  * [Docs] update Cream readme
  * delete 'readme.md' in model_zoo.md
  * fix md link failure in docs
  * [Docs] add myst_parser to extensions in conf.py
  * [Docs] delete the deprecated recommonmark
  * [Docs] delete recommandmark from conf.py
  * [Docs] fix md link failure and lint failture
* [Fix] Fix seed error in mmseg/train_seg.py and typos in train.md (#152)
  * [Docs] update Cream readme
  * delete 'readme.md' in model_zoo.md
  * fix cwd docs and fix seed in #151
  * delete readme of cream
* [Enhancement]Support broadcast_object_list in multi-machines & support Searcher running in single GPU (#153)
  * broadcast_object_list support multi-machines
  * add userwarning
* [Fix] Fix configs (#149)
  * fix configs
  * fix spos configs
  * fix readme
  * replace the official mutable_cfg with the mutable_cfg searched by ourselves
  * update https prefix
  Co-authored-by: pppppM <gjf_mail@126.com>
* [BUG]Support to prune models containing GroupNorm or InstanceNorm. (#144)
  * suport GN and IN
  * test pruner
  * limit pytorch version
  * fix pytest
  * throw an error when tracing groupnorm with torch version under 1.6.0
  Co-authored-by: caoweihan <caoweihan@sensetime.com>
* Bump version to 0.3.1
Co-authored-by: qiufeng <44188071+wutongshenqiu@users.noreply.github.com>
Co-authored-by: PJDong <1115957667@qq.com>
Co-authored-by: humu789 <88702197+humu789@users.noreply.github.com>
Co-authored-by: whcao <41630003+HIT-cwh@users.noreply.github.com>
Co-authored-by: caoweihan <caoweihan@sensetime.com>
* Add doc
* Remove spaces
* Solve comments
* Resolve comments
Motivation
Fix the init bug when running on a single GPU (#42).
Fix the bug that prevents broadcast_object_list from running across multiple machines.
Modification
Refactor broadcast_object_list to be consistent with PyTorch.
BC-breaking (Optional)
broadcast_object_list no longer returns a value.
The parameter of broadcast_object_list is renamed: object_list -> data.
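For callers, the two BC-breaking points above mean that the list passed in is the list to read afterwards. The snippet below is only a sketch of how a call site might be adapted: the function defined here is a hypothetical single-process stand-in (not the mmrazor implementation), it omits any parameters beyond `data`, and the "before" call in the comment is an assumption inferred from the description above.

```python
from typing import Any, List


def broadcast_object_list(data: List[Any]) -> None:
    """Hypothetical single-process stand-in, NOT the mmrazor implementation.

    With a single process there is nothing to send, so ``data`` is left
    untouched and, like the refactored function, nothing is returned.
    """
    return None


# Before the change, a call site presumably looked like (assumption):
#     data = broadcast_object_list(object_list=data)
# After the change, pass ``data`` and keep using the same list object.
data = ['foo', 12, {1: 2}]
result = broadcast_object_list(data)

print(result)  # None
print(data)    # ['foo', 12, {1: 2}]
```

Code that still assigns the return value would end up with `None`, which is why the change is listed as BC-breaking.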
Use cases (Optional)
Examples:
>>> import torch
>>> import mmrazor.core.utils as dist
>>> # non-distributed environment
>>> data = ['foo', 12, {1: 2}]
>>> dist.broadcast_object_list(data)
>>> data
['foo', 12, {1: 2}]
>>> # distributed environment
>>> # Assumes 2 ranks (world_size of 2).
>>> if dist.get_rank() == 0:
...     data = ["foo", 12, {1: 2}]  # any picklable object
... else:
...     data = [None, None, None]
>>> dist.broadcast_object_list(data)
>>> data
['foo', 12, {1: 2}]  # Rank 0
['foo', 12, {1: 2}]  # Rank 1
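To make the in-place behaviour in the example above more concrete, here is a minimal single-process sketch of the general idea: the source side pickles each object into one byte buffer plus a list of sizes, and the receiving side slices that buffer and overwrites its placeholder list. It uses only the standard library; `pack_objects` and `unpack_into` are hypothetical names chosen for illustration, and a real implementation would move the buffer between ranks with tensors and collective calls rather than a local variable.

```python
import pickle
from typing import Any, List, Tuple


def pack_objects(objects: List[Any]) -> Tuple[bytes, List[int]]:
    """Pickle each object and join the payloads into one byte buffer."""
    payloads = [pickle.dumps(obj) for obj in objects]
    sizes = [len(payload) for payload in payloads]
    return b"".join(payloads), sizes


def unpack_into(data: List[Any], buffer: bytes, sizes: List[int]) -> None:
    """Overwrite ``data`` in place with the objects stored in ``buffer``."""
    offset = 0
    for i, size in enumerate(sizes):
        data[i] = pickle.loads(buffer[offset:offset + size])
        offset += size


# The "source rank" owns the real objects; the "other rank" holds placeholders.
src_data = ["foo", 12, {1: 2}]
dst_data = [None, None, None]

buffer, sizes = pack_objects(src_data)  # what the source side would send
unpack_into(dst_data, buffer, sizes)    # what each receiving side would do

print(dst_data)  # ['foo', 12, {1: 2}]
```

Because the receiving list is filled in place, the placeholder list must have the same length as the source list, as in the `[None, None, None]` example above.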