[Feature] Add multi machine `dist_train`. #1383

linfangjian01 · 2022-03-16T14:33:21Z

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Add more detailed documentation on single-machine and multi-machine training so that users can launch jobs in different development environments.

ref: open-mmlab/mmselfsup#232

Modification

Add English and Chinese documentation describing how to launch training tasks. Modify `tools/dist_train.sh` and `tools/dist_test.sh`.
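
For readers unfamiliar with the launcher, the kind of change a multi-machine `tools/dist_train.sh` needs can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the environment variable names `NNODES`, `NODE_RANK`, `PORT`, and `MASTER_ADDR`, their defaults, the config path, the IP address, and the helper `build_launch_cmd` are all assumptions made for this example. The function only prints a `torch.distributed.launch` command instead of starting training.

```shell
#!/usr/bin/env bash
# Sketch only: prints the launch command instead of running it.
# NNODES, NODE_RANK, PORT and MASTER_ADDR are assumed names, not taken
# from the PR diff; build_launch_cmd is a hypothetical helper.

build_launch_cmd() {
    local config=$1   # path to the training config
    local gpus=$2     # number of processes (GPUs) on this machine

    # With nothing set in the environment, the defaults describe a
    # single-machine run.
    local nnodes=${NNODES:-1}
    local node_rank=${NODE_RANK:-0}
    local port=${PORT:-29500}
    local master_addr=${MASTER_ADDR:-127.0.0.1}

    # echo joins its arguments with single spaces.
    echo "python -m torch.distributed.launch" \
        "--nnodes=${nnodes}" \
        "--node_rank=${node_rank}" \
        "--master_addr=${master_addr}" \
        "--master_port=${port}" \
        "--nproc_per_node=${gpus}" \
        "tools/train.py ${config} --launcher pytorch"
}

# Single machine, 8 GPUs (placeholder config path).
build_launch_cmd configs/example.py 8

# Second of two machines; 10.0.0.1 is a placeholder for the first
# machine's address.
NNODES=2 NODE_RANK=1 MASTER_ADDR=10.0.0.1 build_launch_cmd configs/example.py 8
```

Under these assumptions, every machine runs the same script with the same `MASTER_ADDR` and `PORT`, and only `NODE_RANK` differs between machines.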

MeowZheng

please modify the tools/dist_train.py

MengzhangLI · 2022-03-16T15:54:24Z

These files should be modified:

docs/en/train.md
docs/zh_cn/train.md
tools/dist_train.sh
tools/dist_test.sh

codecov · 2022-03-16T18:22:46Z

Codecov Report

Merging #1383 (872fa6b) into master (3d0c2eb) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #1383   +/-   ##
=======================================
  Coverage   90.39%   90.39%           
=======================================
  Files         133      133           
  Lines        7906     7906           
  Branches     1318     1318           
=======================================
  Hits         7147     7147           
  Misses        536      536           
  Partials      223      223

Flag	Coverage Δ
unittests	`90.39% <ø> (ø)`

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
mmseg/core/evaluation/class_names.py	`88.33% <ø> (ø)`

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 3d0c2eb...872fa6b.

MeowZheng

also modify docs/cn/train.md

docs/en/train.md

docs/zh_cn/train.md

* [Feature] Support Resnet strikes back
* fix url
* [Feature] Add multi machine `dist_train`. (#1383)
* Add training startup documentation
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* modify R-50b rsb

Co-authored-by: FangjianLin <93248678+linfangjian01@users.noreply.github.com>

* [Feature] Provide URLs of Swin Transformer pretrained models
* [Feature] Add multi machine `dist_train`. (#1383)
* Add training startup documentation
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* delete pretrained=None in all six config files
* [Fix] make arguments effective in tools/confusion_matrix.py (#1401)
* add an argument for customizing `title` of the output figure
* fix `color_theme` arguments not passing to plot function

Signed-off-by: code14 <mob5566@gmail.com>

* colab notebook: fix outdated link for doc (#1392)
* colab notebook: fix outdated link for doc

Fixed outdated link for how to customize your datasets by reorganizing data.

* fix lint
* fix typo (#1405)
* [Fix] Fix windows-style path in `md2yml.py` in Windows pre-commit. (#1407)
* test
* avoid windows path
* [Fix] fix the config name style description (#1414)

Co-authored-by: FangjianLin <93248678+linfangjian01@users.noreply.github.com>
Co-authored-by: Cody Wong <mob5566@gmail.com>
Co-authored-by: Nemo Xiong <xiongnemo@126.com>
Co-authored-by: Xiangxu-0103 <xuxiang0103@gmail.com>
Co-authored-by: Rockey <41846794+RockeyCoss@users.noreply.github.com>

* Add training startup documentation
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix

linfangjian01 added 2 commits March 16, 2022 07:30

Add training startup documentation

f58a43a

fix

b536ba4

linfangjian01 requested review from MeowZheng, RockeyCoss and MengzhangLI March 16, 2022 14:56

MeowZheng reviewed Mar 16, 2022

fix

5bd85ab

MeowZheng reviewed Mar 17, 2022

fix

23c7ee8

MengzhangLI changed the title from "Add training startup documentation" to "[Feature] Add multi machine `dist_train`." on Mar 17, 2022

linfangjian01 added 5 commits March 17, 2022 00:49

fix

cade11e

fix

63850c4

fix

c332cfe

fix

adfd72f

fix

906be39

MeowZheng reviewed Mar 17, 2022

docs/en/train.md (outdated, resolved)

docs/en/train.md (outdated, resolved)

docs/zh_cn/train.md (outdated, resolved)

fix

872fa6b

MeowZheng approved these changes Mar 17, 2022

MeowZheng merged commit 1b24ad6 into open-mmlab:master Mar 18, 2022

mob5566 pushed a commit to mob5566/mmsegmentation that referenced this pull request Apr 13, 2022

[Feature] Add multi machine dist_train. (open-mmlab#1383)

e601d4d

* Add training startup documentation
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix

ZhimingNJ pushed a commit to AetrexTechnology/mmsegmentation that referenced this pull request Jun 29, 2022

[Feature] Add multi machine dist_train. (open-mmlab#1383)

415b20f

* Add training startup documentation
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
