Expose early stopping hyper-parameters for all tasks #1241

Merged: 35 commits, Sep 21, 2022

Conversation

supersoob
Contributor

@supersoob supersoob commented Sep 1, 2022

This PR includes:

  • Add early stopping parameters in config.py
  • Overwrite early stopping hook
  • Change hyper-parameters for multi-class classification to enable early stopping
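As a rough illustration of the first two items, the sketch below shows how user-facing early stopping parameters could be validated and turned into a hook configuration. It is not the code from this PR: the field names (`enable_early_stopping`, `early_stop_start`, `early_stop_patience`), the helper `build_hook_config`, and the hook type string are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EarlyStoppingParams:
    """Hypothetical user-facing early stopping hyper-parameters."""

    enable_early_stopping: bool = True
    early_stop_start: int = 3     # first epoch at which stopping may trigger
    early_stop_patience: int = 5  # non-improving evaluations to tolerate

    def __post_init__(self) -> None:
        # A patience of 0 is allowed; only negative values are rejected.
        if self.early_stop_start < 0:
            raise ValueError("early_stop_start must be >= 0")
        if self.early_stop_patience < 0:
            raise ValueError("early_stop_patience must be >= 0")


def build_hook_config(
    params: EarlyStoppingParams, metric: str = "accuracy"
) -> Optional[dict]:
    """Translate the parameters into a hook config dict, or None if disabled."""
    if not params.enable_early_stopping:
        return None
    return {
        "type": "EarlyStoppingHook",  # placeholder name, not a confirmed class
        "metric": metric,
        "start": params.early_stop_start,
        "patience": params.early_stop_patience,
    }
```

Keeping the parameters in one small structure means a task can overwrite its default hook settings from user input in a single place, and disabling early stopping amounts to not registering the hook.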

@supersoob supersoob requested a review from a team as a code owner September 1, 2022 06:36
@github-actions github-actions bot added the ALGO Any changes in OTX Algo Tasks implementation label Sep 1, 2022
@supersoob
Contributor Author

run ote-test

harimkang
harimkang previously approved these changes Sep 6, 2022
Contributor

@harimkang harimkang left a comment


If the adaptive interval of detection works well in SC, I think it can be merged.

@harimkang
Contributor

harimkang commented Sep 6, 2022

In addition, the minimum value of patience is currently 0 (editable in the UI), so I think we need to check that it works well with that value.

@supersoob
Contributor Author

Test results: [image]

@supersoob
Copy link
Contributor Author

supersoob commented Sep 7, 2022

I checked that all tasks work well with the early stopping hyper-parameters, including when patience=0. I will soon update the classification hyper-parameters based on the results of the early stopping experiments. Please review the related MPA PR, too.
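To make the patience=0 case concrete, here is a minimal sketch of patience-based early stopping. It assumes one common convention, where patience is the number of consecutive non-improving evaluations that are tolerated, so patience=0 stops at the first evaluation that fails to improve. The class `EarlyStopper` and its arguments are hypothetical and are not the hook from this PR; the actual hook may count differently.

```python
class EarlyStopper:
    """Tracks a validation score and signals when training should stop."""

    def __init__(self, patience: int, start: int = 0) -> None:
        if patience < 0:
            raise ValueError("patience must be >= 0")
        self.patience = patience   # tolerated non-improving evaluations
        self.start = start         # earliest epoch at which stopping may trigger
        self.best = float("-inf")  # best score seen so far (higher is better)
        self.bad_evals = 0         # consecutive evaluations without improvement

    def update(self, epoch: int, score: float) -> bool:
        """Record one evaluation; return True if training should stop."""
        if score > self.best:
            self.best = score
            self.bad_evals = 0
            return False
        self.bad_evals += 1
        return epoch >= self.start and self.bad_evals > self.patience


def first_stop_epoch(scores, patience, start=0):
    """Return the epoch at which training would stop, or None if it never does."""
    stopper = EarlyStopper(patience=patience, start=start)
    for epoch, score in enumerate(scores):
        if stopper.update(epoch, score):
            return epoch
    return None
```

Under this convention, `first_stop_epoch([0.5, 0.6, 0.6, 0.7], patience=0)` stops at epoch 2, the first evaluation that does not beat the best score, while `patience=1` lets the same run continue to the end.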

@supersoob supersoob changed the base branch from releases/v0.3.0-sc1.2 to develop September 19, 2022 02:08
Contributor

@JihwanEom JihwanEom left a comment


@supersoob
Contributor Author

Could you resolve the issue below? [image]

https://app.codacy.com/gh/openvinotoolkit/training_extensions/pullRequest?prid=10125167

I think it doesn't matter if that check fails. There are longer lines in QUICK_START_GUIDE.md, and older PRs were merged even though they failed some Codacy code analysis checks.

Contributor

@harimkang harimkang left a comment


I have no other comments on the implementation of the code.
Some model template values and hpo_config values were modified. What is the background for these changes? (Maybe I missed some of the experimental results...)

(I missed this at first, but the hpo values were modified by Eun-Woo. Forget about hpo_config.)

@supersoob supersoob merged commit f4a4827 into develop Sep 21, 2022
@supersoob supersoob deleted the early_stop_hp branch September 21, 2022 01:43
harimkang added a commit that referenced this pull request Sep 26, 2022
…1284)

* Update submodule branch (#1222)

* Enhance training schedule for multi-label classification (#1212)

* [CVS-88098] Remove initialize from export functions (#1226)

* Train graph added (#1211)

Co-authored-by: Lee, Soobee <soobeele@intel.com>

* Add @attrs decorator for base configs (#1229)

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>

* Pretrained weight download error in MobilenetV3-large-1 of deep-object-reid in SC (#1233)

* [Anomaly Task] Revert hpo template (#1230)

* 🐞 [Anomaly Task] Fix progress bar (#1223)

* [CVS-90555] Fix NaN value in classification (#1244)

* update hpo_config.yaml (#1240)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* Turned off pruning_support visibility for anomaly models (CVS-91015)

* Disabled pruning for EfficientNet-V2-S (CVS-90400)

* [Anomaly Task] 🐞 Fix inference when model backbone changes (#1242)

* Fix CVS-91469 sseg compatibility issue

* [CVS-91472] Add pruning_supported value (#1263)

* Pruning supported tweaks (#1256)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* Turned off pruning_support visibility for anomaly models (CVS-91015)

* Disabled pruning for EfficientNet-V2-S (CVS-90400)

* Revert "[CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)" (#1269)

* [OTE-TEST] Disable obsolete test cases (#1220)

* [OTE-TEST] hot-fix for MPA performance tests (#1273)

* Expose early stopping hyper-parameters for all tasks (#1241)

* Resolve pre-commit issues (#1272)

* Remove LazyEarlyStopHook in model_multilabel.py (#1281)

* Removed xfail (#1239)

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Vladisalv Sovrasov <sovrasov.vlad@gmail.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
yunchu pushed a commit that referenced this pull request Nov 8, 2022
goodsong81 added a commit that referenced this pull request Dec 28, 2022
* Add tiling module (#1200)

* Update submodule branch (#1222)

* Enhance training schedule for multi-label classification (#1212)

* [CVS-88098] Remove initialize from export functions (#1226)

* Train graph added (#1211)

* Add @attrs decorator for base configs (#1229)

* Pretrained weight download error in MobilenetV3-large-1 of deep-object-reid in SC (#1233)

* [Anomaly Task] Revert hpo template (#1230)

* 🐞 [Anomaly Task] Fix progress bar (#1223)

* [CVS-90555] Fix NaN value in classification (#1244)

* update hpo_config.yaml (#1240)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* [Anomaly Task] 🐞 Fix inference when model backbone changes (#1242)

* [CVS-91472] Add pruning_supported value (#1263)

* Pruning supported tweaks (#1256)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* Revert "[CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)" (#1269)

* [OTE-TEST] Disable obsolete test cases (#1220)

* [OTE-TEST] hot-fix for MPA performance tests (#1273)

* [Anomaly Task] ✨ Upgrade anomalib (#1243)

* Expose early stopping hyper-parameters for all tasks (#1241)

* Resolve pre-commit issues (#1272)

* Remove LazyEarlyStopHook in model_multilabel.py (#1281)

* Removed xfail (#1239)

* Implement IB loss for incremental learning in multi-class classification (#1289)

* Edit num_workers and change MPA repo as a latest (#1314)

* fix annotation bug (#1320)

* Valid POT configs for small HRNet models (#1313)

* Disable NNCF optimization for FP16 models (#1312)

* fliter object less than 1 pixel  (#1305)

* Fix some tests (#1322)

* [Develop] Move drop_last into MPA (#1357)

* Apply changes from releases/v0.3.1-geti1.0.0 (#1337)

* anomaly save_model bugfix (#1300)

* upgrade networkx module version (#1303)

* Forward CVS-94422 size bug fix PR to release branch (#1326)

* Valid POT configs for small HRNet models (#1317)

* [Release branch] Disable NNCF optimization for FP16 models  (#1319)

* [RELEASE] CVS-95549 - Hierarchical classification training failed without obvious reason (#1329)

* Fix h-label: per-group softmax (#1332)

* Fix dataset length bug in mpa task (#1338)

* Fix drop_last key issue for det/set (#1340)

* Hot-fix for OV inference for iseg output (#1345)

* Fix nncf model export bug (#1346)

* Fixed merge error (#1359)

* Update evaluation iou_thr of ins-seg (#1354)

* fix pre-commit test (#1366)

* Fix dataset item tests (#1360)

* Fix OV Inference issues (tiling tests & detection tests) (#1361)

* fix black & add xfail test cases (#1367)

* Update check_nncf_graph. (#1330)

* [Develop] Hot-fix OV inference issue in rotated detection (#1375)

* [Develop] updated documents (#1383)

* [CVS-94911] Fix difference between train and validation normalization pipeline (#1310)

* Update configs for padim model (#1378)

* updated QUICK_START_GUIDE.md (#1397)

* Change ote threshold of openvino test for cls (#1401)

* Normalize top-1 metrics to [0, 1] (#1394)

* Tiling deployment (#1387)

* Replace current saliency map generation with Recipro-CAM for cls (#1363)

* Class-wise saliency map generation for the detection task (#1402)

* Change submodule to develop (#1410)

* Send full dataset to POT optimization function (#1379) & Convert NaN to num to make visible in geti UI (#1413)

* Add active score evaluation to the classification task

* [release/0.4.0][OTX] Enabling GPU execution for exported code (#1416)

* [OTE][Release][XAI] Detection fix two stage bbox_head error (#1414)

* Update SDK commit for exportable code (#1423)

* HRNet-x and HRNe-18--mod2 configs update (#1419)

* [Release] Enable tiling oriented detection for v0.4.0/geti1.1.0 (#1427)

* [OTE][Releases v0.4.0][XAI] Hot-fix for Detection fix two stage error (#1433)

* Temporary MPA branch while dev->otx merge process

* Update doc & install for dev->otx changes

* Update ote_sdk -> otx.api

* Update ote_cli -> otx.cli

* Update external/mmsegmentation -> otx/algorithms/segmentation

* Align saliency map media instantiation over tasks (#1447)

* Update external/d-o-r -> otx/algorithms/classification

* Update external/mmdetection -> otx/algorithms/detection

* Update external/mpa -> otx/algorithms/*

* Fix CLI test run for better error message

* Numpy constraint for deprecated np.bool error

* Capture stderr only

* Align numpy requirement

* [OTX/Anomaly] Add changes from external to otx (#1452)

* Add changes from external to otx

* Address PR comments

* Update config files + remove backbone from base

* Fix pre-merge checks

* Fix pre-commit issues

* Update exportable code commit

* Fix indent error

* Fix flake8 issue

* Resolve softmax issue w/ FIXME for future work

* Add tiling tests

* Revert MPA branch to otx

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>
sungmanc added a commit that referenced this pull request Dec 28, 2022
* make logfile saved in save-model-to directory

* enable train

* make main process train also

* bugfix

* refactor multi gpu training

* make all processs have same output path

* prevent child process from being termated by fokred main process

* refactor multigpu implementation

* refactor multi gpu implementation

* modify argument help sentence

* add multi gpu test code

* align with pre-commit test

* separate multi GPU manager class

* modify train cli argument 'save-logs-to' to 'output-path'

* remove tray excpet during killing child process

* apply output_path to all tasks

* change print to logger

* skip multi gpu test if number of gpu is insufficient

* fix typo

* multi gpu test bugfix

* isort fix

* test case bugfix

* fix typo and change some variable name

* [OTX] Apply changes in develop to feature/otx branch (#1436)

* Add tiling module (#1200)

* Update submodule branch (#1222)

* Enhance training schedule for multi-label classification (#1212)

* [CVS-88098] Remove initialize from export functions (#1226)

* Train graph added (#1211)

* Add @attrs decorator for base configs (#1229)

* Pretrained weight download error in MobilenetV3-large-1 of deep-object-reid in SC (#1233)

* [Anomaly Task] Revert hpo template (#1230)

* 🐞 [Anomaly Task] Fix progress bar (#1223)

* [CVS-90555] Fix NaN value in classification (#1244)

* update hpo_config.yaml (#1240)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* [Anomaly Task] 🐞 Fix inference when model backbone changes (#1242)

* [CVS-91472] Add pruning_supported value (#1263)

* Pruning supported tweaks (#1256)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* Revert "[CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)" (#1269)

* [OTE-TEST] Disable obsolete test cases (#1220)

* [OTE-TEST] hot-fix for MPA performance tests (#1273)

* [Anomaly Task] ✨ Upgrade anomalib (#1243)

* Expose early stopping hyper-parameters for all tasks (#1241)

* Resolve pre-commit issues (#1272)

* Remove LazyEarlyStopHook in model_multilabel.py (#1281)

* Removed xfail (#1239)

* Implement IB loss for incremental learning in multi-class classification (#1289)

* Edit num_workers and change MPA repo as a latest (#1314)

* fix annotation bug (#1320)

* Valid POT configs for small HRNet models (#1313)

* Disable NNCF optimization for FP16 models (#1312)

* fliter object less than 1 pixel  (#1305)

* Fix some tests (#1322)

* [Develop] Move drop_last into MPA (#1357)

* Apply changes from releases/v0.3.1-geti1.0.0 (#1337)

* anomaly save_model bugfix (#1300)

* upgrade networkx module version (#1303)

* Forward CVS-94422 size bug fix PR to release branch (#1326)

* Valid POT configs for small HRNet models (#1317)

* [Release branch] Disable NNCF optimization for FP16 models  (#1319)

* [RELEASE] CVS-95549 - Hierarchical classification training failed without obvious reason (#1329)

* Fix h-label: per-group softmax (#1332)

* Fix dataset length bug in mpa task (#1338)

* Fix drop_last key issue for det/set (#1340)

* Hot-fix for OV inference for iseg output (#1345)

* Fix nncf model export bug (#1346)

* Fixed merge error (#1359)

* Update evaluation iou_thr of ins-seg (#1354)

* fix pre-commit test (#1366)

* Fix dataset item tests (#1360)

* Fix OV Inference issues (tiling tests & detection tests) (#1361)

* fix black & add xfail test cases (#1367)

* Update check_nncf_graph. (#1330)

* [Develop] Hot-fix OV inference issue in rotated detection (#1375)

* [Develop] updated documents (#1383)

* [CVS-94911] Fix difference between train and validation normalization pipeline (#1310)

* Update configs for padim model (#1378)

* updated QUICK_START_GUIDE.md (#1397)

* Change ote threshold of openvino test for cls (#1401)

* Normalize top-1 metrics to [0, 1] (#1394)

* Tiling deployment (#1387)

* Replace current saliency map generation with Recipro-CAM for cls (#1363)

* Class-wise saliency map generation for the detection task (#1402)

* Change submodule to develop (#1410)

* Send full dataset to POT optimization function (#1379) & Convert NaN to num to make visible in geti UI (#1413)

* Add active score evaluation to the classification task

* [release/0.4.0][OTX] Enabling GPU execution for exported code (#1416)

* [OTE][Release][XAI] Detection fix two stage bbox_head error (#1414)

* Update SDK commit for exportable code (#1423)

* HRNet-x and HRNe-18--mod2 configs update (#1419)

* [Release] Enable tiling oriented detection for v0.4.0/geti1.1.0 (#1427)

* [OTE][Releases v0.4.0][XAI] Hot-fix for Detection fix two stage error (#1433)

* Temporary MPA branch while dev->otx merge process

* Update doc & install for dev->otx changes

* Update ote_sdk -> otx.api

* Update ote_cli -> otx.cli

* Update external/mmsegmentation -> otx/algorithms/segmentation

* Align saliency map media instantiation over tasks (#1447)

* Update external/d-o-r -> otx/algorithms/classification

* Update external/mmdetection -> otx/algorithms/detection

* Update external/mpa -> otx/algorithms/*

* Fix CLI test run for better error message

* Numpy constraint for deprecated np.bool error

* Capture stderr only

* Align numpy requirement

* [OTX/Anomaly] Add changes from external to otx (#1452)

* Add changes from external to otx

* Address PR comments

* Update config files + remove backbone from base

* Fix pre-merge checks

* Fix pre-commit issues

* Update exportable code commit

* Fix indent error

* Fix flake8 issue

* Resolve softmax issue w/ FIXME for future work

* Add tiling tests

* Revert MPA branch to otx

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>

* Apply latest MPA openvinotoolkit/model_preparation_algorithm#105

Signed-off-by: Songki Choi <songki.choi@intel.com>

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: eunwoosh <eunwoo.shin@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>
sungmanc added a commit that referenced this pull request Dec 29, 2022
harimkang added a commit that referenced this pull request Dec 30, 2022
* Initial commit: integrate MPA intothe OTX

* Update Self-SL for seg

* Enable pre-commit tests by ignoring imports, remove useless requirements

* Edit the requirements

* Add dist folder

* fix black

* [OTX] Rebase latest changes for MPA merge (#1468)

* make logfile saved in save-model-to directory

* enable train

* make main process train also

* bugfix

* refactor multi gpu training

* make all processs have same output path

* prevent child process from being termated by fokred main process

* refactor multigpu implementation

* refactor multi gpu implementation

* modify argument help sentence

* add multi gpu test code

* align with pre-commit test

* separate multi GPU manager class

* modify train cli argument 'save-logs-to' to 'output-path'

* remove tray excpet during killing child process

* apply output_path to all tasks

* change print to logger

* skip multi gpu test if number of gpu is insufficient

* fix typo

* multi gpu test bugfix

* isort fix

* test case bugfix

* fix typo and change some variable name

* [OTX] Apply changes in develop to feature/otx branch (#1436)

* Add tiling module (#1200)

* Update submodule branch (#1222)

* Enhance training schedule for multi-label classification (#1212)

* [CVS-88098] Remove initialize from export functions (#1226)

* Train graph added (#1211)

* Add @attrs decorator for base configs (#1229)

* Pretrained weight download error in MobilenetV3-large-1 of deep-object-reid in SC (#1233)

* [Anomaly Task] Revert hpo template (#1230)

* 🐞 [Anomaly Task] Fix progress bar (#1223)

* [CVS-90555] Fix NaN value in classification (#1244)

* update hpo_config.yaml (#1240)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* [Anomaly Task] 🐞 Fix inference when model backbone changes (#1242)

* [CVS-91472] Add pruning_supported value (#1263)

* Pruning supported tweaks (#1256)

* [CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)

* Revert "[CVS-90400, CVS-91015] NNCF pruning supported tweaks (#1248)" (#1269)

* [OTE-TEST] Disable obsolete test cases (#1220)

* [OTE-TEST] hot-fix for MPA performance tests (#1273)

* [Anomaly Task] ✨ Upgrade anomalib (#1243)

* Expose early stopping hyper-parameters for all tasks (#1241)

* Resolve pre-commit issues (#1272)

* Remove LazyEarlyStopHook in model_multilabel.py (#1281)

* Removed xfail (#1239)

* Implement IB loss for incremental learning in multi-class classification (#1289)

* Edit num_workers and change MPA repo as a latest (#1314)

* fix annotation bug (#1320)

* Valid POT configs for small HRNet models (#1313)

* Disable NNCF optimization for FP16 models (#1312)

* filter objects smaller than 1 pixel (#1305)

* Fix some tests (#1322)

* [Develop] Move drop_last into MPA (#1357)

* Apply changes from releases/v0.3.1-geti1.0.0 (#1337)

* anomaly save_model bugfix (#1300)

* upgrade networkx module version (#1303)

* Forward CVS-94422 size bug fix PR to release branch (#1326)

* Valid POT configs for small HRNet models (#1317)

* [Release branch] Disable NNCF optimization for FP16 models  (#1319)

* [RELEASE] CVS-95549 - Hierarchical classification training failed without obvious reason (#1329)

* Fix h-label: per-group softmax (#1332)

* Fix dataset length bug in mpa task (#1338)

* Fix drop_last key issue for det/set (#1340)

* Hot-fix for OV inference for iseg output (#1345)

* Fix nncf model export bug (#1346)

* Fixed merge error (#1359)

* Update evaluation iou_thr of ins-seg (#1354)

* fix pre-commit test (#1366)

* Fix dataset item tests (#1360)

* Fix OV Inference issues (tiling tests & detection tests) (#1361)

* fix black & add xfail test cases (#1367)

* Update check_nncf_graph. (#1330)

* [Develop] Hot-fix OV inference issue in rotated detection (#1375)

* [Develop] updated documents (#1383)

* [CVS-94911] Fix difference between train and validation normalization pipeline (#1310)

* Update configs for padim model (#1378)

* updated QUICK_START_GUIDE.md (#1397)

* Change ote threshold of openvino test for cls (#1401)

* Normalize top-1 metrics to [0, 1] (#1394)

* Tiling deployment (#1387)

* Replace current saliency map generation with Recipro-CAM for cls (#1363)

* Class-wise saliency map generation for the detection task (#1402)

* Change submodule to develop (#1410)

* Send full dataset to POT optimization function (#1379) & Convert NaN to num to make visible in geti UI (#1413)

* Add active score evaluation to the classification task

* [release/0.4.0][OTX] Enabling GPU execution for exported code (#1416)

* [OTE][Release][XAI] Detection fix two stage bbox_head error (#1414)

* Update SDK commit for exportable code (#1423)

* HRNet-x and HRNet-18-mod2 configs update (#1419)

* [Release] Enable tiling oriented detection for v0.4.0/geti1.1.0 (#1427)

* [OTE][Releases v0.4.0][XAI] Hot-fix for Detection fix two stage error (#1433)

* Temporary MPA branch while dev->otx merge process

* Update doc & install for dev->otx changes

* Update ote_sdk -> otx.api

* Update ote_cli -> otx.cli

* Update external/mmsegmentation -> otx/algorithms/segmentation

* Align saliency map media instantiation over tasks (#1447)

* Update external/d-o-r -> otx/algorithms/classification

* Update external/mmdetection -> otx/algorithms/detection

* Update external/mpa -> otx/algorithms/*

* Fix CLI test run for better error message

* Numpy constraint for deprecated np.bool error

* Capture stderr only

* Align numpy requirement

* [OTX/Anomaly] Add changes from external to otx (#1452)

* Add changes from external to otx

* Address PR comments

* Update config files + remove backbone from base

* Fix pre-merge checks

* Fix pre-commit issues

* Update exportable code commit

* Fix indent error

* Fix flake8 issue

* Resolve softmax issue w/ FIXME for future work

* Add tiling tests

* Revert MPA branch to otx

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>

* Apply latest MPA openvinotoolkit/model_preparation_algorithm#105

Signed-off-by: Songki Choi <songki.choi@intel.com>

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: eunwoosh <eunwoo.shin@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Sungman Cho <sungman.cho@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>

* Resolve some flake8 issues

* Fix semisl trainer import (#1471)

Co-authored-by: Lee, Soobee <soobeele@intel.com>

* Fix isort

* Rebase and fix isort

Signed-off-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: Songki Choi <songki.choi@intel.com>
Co-authored-by: eunwoosh <eunwoo.shin@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Jaeguk Hyun <jaeguk.hyun@intel.com>
Co-authored-by: Nikita Savelyev <nikita.savelyev@intel.com>
Co-authored-by: Jihwan Eom <jihwan.eom@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Soobee Lee <soobee.lee@intel.com>
Co-authored-by: Lee, Soobee <soobeele@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: ljcornel <ludo.cornelissen@intel.com>
Co-authored-by: dlyakhov <daniil.lyakhov@intel.com>
Co-authored-by: kprokofi <kirill.prokofiev@intel.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwinitinvaidya@gmail.com>
Co-authored-by: Alexander Dokuchaev <alexander.dokuchaev@intel.com>
Co-authored-by: Vladislav Sovrasov <vladislav.sovrasov@intel.com>
Co-authored-by: Evgeny Tsykunov <e.tsykunov@gmail.com>
Co-authored-by: Galina Zalesskaya <galina.zalesskaya@intel.com>
Co-authored-by: dongkwan-kim <dongkwan.kim@intel.com>
Labels
ALGO Any changes in OTX Algo Tasks implementation
5 participants