WIP: feature(lk): add continuous version for acer algo #73

KeLiChloe · 2021-09-26T10:21:54Z

Description

Add a continuous-action version of the ACER algorithm.

Check List

merge the latest version source branch/repo, and resolve all the conflicts
pass style check
pass all the tests

* feature(nyz): add trueskill as league metric, naive elo calculator, fix game_env info bug
* fix(nyz): fix league player mutate bug
* fix(nyz): fix league unittest bug
* feature(nyz): add elo ranking in league metric env
* polish(nyz): modify fixed eval policy and trueskill init
* feature(nyz): add init main player in evaluation and fix stop_value bug
* style(nyz): rename test_league_metric to avoid pyc cache bug

* test rnd
* fix mz config
* fix config
* fix(pu): fix r2d2
* feature(puyuan): add minigrid r2d2 config
* polish minigrid config
* modified as review
* fix(pu): fix bug for compatibility
* polish(pu): add annotations and polish slice operation
* style(pu): run format.sh
* style(pu): correct yapf format

…emo (#27)
* feature(nyz): add resnet for cv sl task
* feature(nyz): add imagenet classification dataset and adapt compile config for sl
* feature(nyz): add naive image training entry demo
* style(nyz): polish image cls train log
* polish(nyz): polish multi gpu training setting
* feature(nyz): add nn training bp and update async execution
* feature(nyz): add distributed sampler for different dist backend
* fix(nyz): fix compile config collector and buffer compatibility problem
* style(nyz): correct yapf format
* fix(nyz): fix env manager compile config compatibility bug
* refactor(nyz): abstract ISerialEvaluator and rename serial evaluation implementation
* refactor(nyz): refactor collector name
* feature(nyz): add metric evaluator and image cls acc metric eval demo
* fix(nyz): fix cuda and multi gpu bug in image cls demo

* feat: add k8s launcher
* feat: install kubectl when install k3d
* feat: add orchestrator launcher and a test case
* ci: install kubernetes related package and cli
* style: format code
* style: flake check code
* test k8s launcher
* ci: change back to unit test
* feat: delete cert manager when delete orchestrator
* style: flake8 check
* feat: merge k8s-launcher with k8s-helper
  1. merge k8s-launcher with k8s-helper
  2. move kubernetes package import to where it will be used
  3. hack/install-k8s-tools.sh -> ding/scripts/install-k8s-tools.sh

…49)
* test dijob
* test: wait for dijob Succeeded phase, and read coordinator logs
* test: update wait condition
* ci: update algo_test.yaml and flake check
* test: move kubernetes package to where it will be used

* fix/fix_submodule_err (opendilab#61) * fix/fix_submodule_err --------- Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu> * fix issue templates (opendilab#65) * fix(tokenizer): refactor tokenizer and update usage in readme (opendilab#51) * update tokenizer example * fix(readme, requirements): fix typo at Chinese readme and select a lower version of transformers (opendilab#73) * fix a typo in readme * in order to find InternLMTokenizer, select a lower version of Transformers --------- Co-authored-by: gouhchangjiang <gouhchangjiang@gmail.com> * [Doc] Add wechat and discord link in readme (opendilab#78) * Doc：add wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * [Docs]: add Japanese README (opendilab#43) * Add Japanese README * Update README-ja-JP.md replace message * Update README-ja-JP.md * add repetition_penalty in GenerationConfig in web_demo.py (opendilab#48) Co-authored-by: YWMditto <862779238@qq.com> * use fp16 in instruction (opendilab#80) * [Enchancement] add more options for issue template (opendilab#77) * [Enchancement] add more options for issue template * update qustion icon * fix link * Use tempfile for convert2hf.py (opendilab#23) Fix InternLM/InternLM#50 * delete torch_dtype of README's example code (opendilab#100) * set the value of repetition_penalty to 1.0 to avoid random outputs (opendilab#99) * Update web_demo.py (opendilab#97) Remove meaningless log. 
* [Fix]Fix wrong string cutoff in the script for sft text tokenizing (opendilab#106) * docs(install.md): update dependency package transformers version to >= 4.28.0 (opendilab#124) Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> * docs(LICENSE): add license (opendilab#125) * add license of colossalai and flash-attn * fix lint * modify the name * fix AutoModel map in convert2hf.py (opendilab#116) * variables are not printly as expect (opendilab#114) * feat(solver): fix code to adapt to torch2.0 and provide docker images (opendilab#128) * feat(solver): fix code to adapt to torch2.0 * docs(install.md): publish internlm environment image * docs(install.md): update dependency packages version * docs(install.md): update default image --------- Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> * add demo test (opendilab#132) Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> * fix web_demo cache accelerate (opendilab#133) * fix(hybrid_zero_optim.py): delete math import * Update embedding.py --------- Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu> Co-authored-by: Kai Chen <chenkaidev@gmail.com> Co-authored-by: Yang Gao <Gary1546308416AL@gmail.com> Co-authored-by: Changjiang GOU <gouchangjiang@gmail.com> Co-authored-by: gouhchangjiang <gouhchangjiang@gmail.com> Co-authored-by: vansin <msnode@163.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: YWMditto <46778265+YWMditto@users.noreply.github.com> Co-authored-by: YWMditto <862779238@qq.com> Co-authored-by: WRH <12756472+wangruohui@users.noreply.github.com> Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com> Co-authored-by: x54-729 <45304952+x54-729@users.noreply.github.com> Co-authored-by: Shuo Zhang <zhangshuolove@live.com> Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com> Co-authored-by: huangting4201 <1538303371@qq.com> Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> Co-authored-by: ytxiong 
<45058324+yingtongxiong@users.noreply.github.com> Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by: kkscilife <126147887+kkscilife@users.noreply.github.com> Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> Co-authored-by: hw <45089338+MorningForest@users.noreply.github.com>

* fix/fix_submodule_err (opendilab#61) * fix/fix_submodule_err --------- Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu> * fix issue templates (opendilab#65) * fix(tokenizer): refactor tokenizer and update usage in readme (opendilab#51) * update tokenizer example * fix(readme, requirements): fix typo at Chinese readme and select a lower version of transformers (opendilab#73) * fix a typo in readme * in order to find InternLMTokenizer, select a lower version of Transformers --------- Co-authored-by: gouhchangjiang <gouhchangjiang@gmail.com> * [Doc] Add wechat and discord link in readme (opendilab#78) * Doc：add wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * Doc：update wechat and discord link * [Docs]: add Japanese README (opendilab#43) * Add Japanese README * Update README-ja-JP.md replace message * Update README-ja-JP.md * add repetition_penalty in GenerationConfig in web_demo.py (opendilab#48) Co-authored-by: YWMditto <862779238@qq.com> * use fp16 in instruction (opendilab#80) * [Enchancement] add more options for issue template (opendilab#77) * [Enchancement] add more options for issue template * update qustion icon * fix link * Use tempfile for convert2hf.py (opendilab#23) Fix InternLM/InternLM#50 * delete torch_dtype of README's example code (opendilab#100) * set the value of repetition_penalty to 1.0 to avoid random outputs (opendilab#99) * Update web_demo.py (opendilab#97) Remove meaningless log. 
* [Fix]Fix wrong string cutoff in the script for sft text tokenizing (opendilab#106) * docs(install.md): update dependency package transformers version to >= 4.28.0 (opendilab#124) Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> * docs(LICENSE): add license (opendilab#125) * add license of colossalai and flash-attn * fix lint * modify the name * fix AutoModel map in convert2hf.py (opendilab#116) * variables are not printly as expect (opendilab#114) * feat(solver): fix code to adapt to torch2.0 and provide docker images (opendilab#128) * feat(solver): fix code to adapt to torch2.0 * docs(install.md): publish internlm environment image * docs(install.md): update dependency packages version * docs(install.md): update default image --------- Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> * add demo test (opendilab#132) Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> * fix web_demo cache accelerate (opendilab#133) * Doc: add twitter link (opendilab#141) * Feat add checkpoint fraction (opendilab#151) * feat(config): add checkpoint_fraction into config * feat: remove checkpoint_fraction from configs/7B_sft.py --------- Co-authored-by: wangguoteng.p <wangguoteng925@qq.com> * [Doc] update deployment guide to keep consistency with lmdeploy (opendilab#136) * update deployment guide * fix error * use llm partition (opendilab#159) Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> * test(ci_scripts): clean test data after test, remove unnecessary global variables, and other optimizations (opendilab#165) * test: optimization of ci scripts(variables, test data cleaning, etc). * chore(workflows): disable ci job on push. 
* fix: update partition * test(ci_scripts): add install requirements automaticlly,trigger event about lint check and other optimizations (opendilab#174) * add pull_request in lint check * use default variables in ci_scripts * fix format * check and install requirements automaticlly * fix format --------- Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> * feat(profiling): add a simple memory profiler (opendilab#89) * feat(profiling): add simple memory profiler * feat(profiling): add profiling argument * feat(CI_workflow): Add PR & Issue auto remove workflow (opendilab#184) * feat(ci_workflow): Add PR & Issue auto remove workflow Add a workflow for stale PR & Issue auto remove - pr & issue well be labeled as stale for inactive in 7 days - staled PR & Issue well be remove in 7 days - run this workflow every day on 1:30 a.m. * Update stale.yml * feat(bot): Create .owners.yml for Auto Assign (opendilab#176) * Create .owners.yml: for issue/pr assign automatically * Update .owners.yml * Update .owners.yml fix typo * [feat]: add pal reasoning script (opendilab#163) * [Feat] Add PAL inference script * Update README.md * Update tools/README.md Co-authored-by: BigDong <yudongwang1226@gmail.com> * Update tools/pal_inference.py Co-authored-by: BigDong <yudongwang1226@gmail.com> * Update pal script * Update README.md * restore .ore-commit-config.yaml * Update tools/README.md Co-authored-by: BigDong <yudongwang1226@gmail.com> * Update tools/README.md Co-authored-by: BigDong <yudongwang1226@gmail.com> * Update pal inference script * Update READMD.md * Update internlm/utils/interface.py Co-authored-by: Wenwei Zhang <40779233+ZwwWayne@users.noreply.github.com> * Update pal script * Update pal script * Update script * Add docstring * Update format * Update script * Update script * Update script --------- Co-authored-by: BigDong <yudongwang1226@gmail.com> Co-authored-by: Wenwei Zhang <40779233+ZwwWayne@users.noreply.github.com> * test(ci_scripts): add timeout settings and 
clean work after the slurm job (opendilab#185) * restore pr test on develop branch * add mask * add post action to cancel slurm job * remove readonly attribute on job log * add debug info * debug job log * try stdin * use stdin * set default value avoid error * try setting readonly on job log * performance echo * remove debug info * use squeue to check slurm job status * restore the lossed parm * litmit retry times * use exclusive to avoid port already in use * optimize loop body * remove partition * add {} for variables * set env variable for slurm partition --------- Co-authored-by: qa-caif-cicd <qa-caif-cicd@pjlab.org.cn> * refactor(tools): move interface.py and import it to web_demo (opendilab#195) * move interface.py and import it to web_demo * typo * fix(ci): fix lint error * fix(ci): fix lint error --------- Co-authored-by: Sun Peng <sunpengsdu@gmail.com> Co-authored-by: ChenQiaoling00 <qiaoling_chen@u.nus.edu> Co-authored-by: Kai Chen <chenkaidev@gmail.com> Co-authored-by: Yang Gao <Gary1546308416AL@gmail.com> Co-authored-by: Changjiang GOU <gouchangjiang@gmail.com> Co-authored-by: gouhchangjiang <gouhchangjiang@gmail.com> Co-authored-by: vansin <msnode@163.com> Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com> Co-authored-by: YWMditto <46778265+YWMditto@users.noreply.github.com> Co-authored-by: YWMditto <862779238@qq.com> Co-authored-by: WRH <12756472+wangruohui@users.noreply.github.com> Co-authored-by: liukuikun <24622904+Harold-lkk@users.noreply.github.com> Co-authored-by: x54-729 <45304952+x54-729@users.noreply.github.com> Co-authored-by: Shuo Zhang <zhangshuolove@live.com> Co-authored-by: Miao Zheng <76149310+MeowZheng@users.noreply.github.com> Co-authored-by: 黄婷 <huangting3@CN0014010744M.local> Co-authored-by: ytxiong <45058324+yingtongxiong@users.noreply.github.com> Co-authored-by: Zaida Zhou <58739961+zhouzaida@users.noreply.github.com> Co-authored-by: kkscilife <126147887+kkscilife@users.noreply.github.com> Co-authored-by: qa-caif-cicd 
<qa-caif-cicd@pjlab.org.cn> Co-authored-by: hw <45089338+MorningForest@users.noreply.github.com> Co-authored-by: Guoteng <32697156+SolenoidWGT@users.noreply.github.com> Co-authored-by: wangguoteng.p <wangguoteng925@qq.com> Co-authored-by: lvhan028 <lvhan_028@163.com> Co-authored-by: zachtzy <141206206+zachtzy@users.noreply.github.com> Co-authored-by: cx <759046501@qq.com> Co-authored-by: Jaylin Lee <61487970+APX103@users.noreply.github.com> Co-authored-by: del-zhenwu <dele.zhenwu@gmail.com> Co-authored-by: Shaoyuan Xie <66255889+Daniel-xsy@users.noreply.github.com> Co-authored-by: BigDong <yudongwang1226@gmail.com> Co-authored-by: Wenwei Zhang <40779233+ZwwWayne@users.noreply.github.com> Co-authored-by: huangting4201 <huangting3@sensetime.com>

yifan123 and others added 30 commits August 23, 2021 14:53

Dev modified predator prey (#30)

98e2c13

* add modified predator_prey env
* add collision_ratio
* add readme and cfg for modified_predator_prey env
* add readme imgs for modified_predator_prey
* check format
* fix format

hotfix(nyz): fix c51 head dimension mismatch bug and ppo config mismatch bug

0453f9c

Merge branch 'main' of https://github.com/opendilab/DI-engine

cffb5b2

feature(ljw): add/delete/restart replicas via cli for k8s

2ef3ad6

hotfix(nyz): fix mujoco benchmark config typos

f84d76e

test(nyz): add sqil unittest and algotest, remove adder comment in policy, polish sqil config

42e31ea

style(nyz): update coveragerc with new entry cli code and fix config inf replace

6f500a5

style(nyz): rename advanced_buffer register name to advanced

84583d4

style(nyz): add roadmap link in readme and correct format

2c03244

polish(nyz): polish cartpole dqn visualize demo and add solo eval demo

020eba2

fix(zym): modify default setting in mujoco (#35)

6326529

hotfix(nyz): fix random policy typo in serial entry and base policy model device problem

60a6867

Merge branch 'main' of https://github.com/opendilab/DI-engine

608fee4

feature(nyz): add sample_range arg in replay buffer

7e9e4e8

add_scheduler_module

c2e63f3

add_scheduler_module

8ac7350

fix_change_range_and_factor

2d6b381

hotfix(nyz): fix max use and priority update special branch bug

1902039

cooldown_counter_bug_fix

e9bf572

add_div_mode

c068cac

code_format_fixed

0040832

code_format_fixed

dae4053

hotfix(nyz): fix cartpole ppg value buffer sample typo

da19fdb

Merge branch 'main' of https://github.com/opendilab/DI-engine

dd4472e

hotfix(nyz): fix ppo bug when use dual_clip and adv > 0

cd401b9

feature(zlx): add tb in naive buffer; modify tb in advanced buffer (#39)

0023438

* feature(zlx): Add tb in naive buffer; modify tb in advanced buffer
* feature(zlx): naive_buffer tb, fix bug in valid_count update

style(nyz): rename pull request template name and add slack invitation link

fae7926

enable user to use any expert model for sqil(#44)

5fbc945

* enable user to use any model generated here
* delete irrelevant package
* add test
* bash format.sh to reformat style

PaParaZz1 and others added 21 commits September 7, 2021 18:56

hotfix(nyz): polish lunarlander config

9a59e74

style(wyh): add env information in readme (#46)

fa453ef

* env-list
* env-list-fix-grammmer
* env-only-test
* modify-gif
* modify-gif-pendulum
* modify-gif-delect-maze

style(nyz): polish env table in README

1439e22

style(nyz): update sparse reward badge in env table

5e52c1a

feature(nyz): add bebold experiment env

e646add

fix_pr_bug

a4b6e08

style(nyz): change PULL_REQUEST_TEMPLATE location

d10c02a

style(nyz): create Code of Conduct file

808f9ed

add_unnitest_module

8b58ec3

Merge remote-tracking branch 'origin/main' into dev-league-scheduler

fbeaac8

add_unnitest_module

78bd712

add_unnitest_module

5423c63

Merge branch 'main' into HEAD

d8400b4

add_patience_test

a976017

Merge branch 'dev-league-scheduler' of https://github.com/LikeJulia/DI-engine into dev-league-scheduler

c86cff2

polish(nyz): polish scheduler design and fix league mode scheduler bug

e7d836e

fix(nyz): fix merge test_metric.py bug

945e413

add_continuous_head_option

9d97be3

KeLiChloe requested a review from PaParaZz1 September 26, 2021 10:21

add_continuous_head_option

ff14512

KeLiChloe changed the base branch from main to dev-acer-continuous September 26, 2021 10:44

KeLiChloe deleted the branch opendilab:dev-acer-continuous September 26, 2021 11:00

KeLiChloe closed this Sep 26, 2021

KeLiChloe deleted the dev-acer-continuous branch September 26, 2021 11:20
