debug for gpu rank for analyser #329

Merged
merged 4 commits into from
Jun 25, 2024

Conversation

BeachWang
Collaborator

As the title says.

@yxdyc yxdyc requested a review from drcege June 15, 2024 10:09
@BeachWang BeachWang self-assigned this Jun 17, 2024
Collaborator

@drcege drcege left a comment

LGTM. @garyzhang99 please take a look as well.

@drcege drcege requested a review from garyzhang99 June 25, 2024 06:09
@BeachWang BeachWang merged commit a8305bc into main Jun 25, 2024
4 checks passed
Collaborator

@garyzhang99 garyzhang99 left a comment

LGTM

yxdyc pushed a commit that referenced this pull request Jul 17, 2024
* modelscope-sora news (#323)

* News/modelscope sora (#327)

* modelscope-sora news

* remove empower

* debug for gpu rank for analyser (#329)

* debug for gpu rank for analyser

* spec_numprocs -> num_proc

* Add more unittest (#304)

* add unittest env with gpu

* fix unittest yml

* add environment for unittest

* update workflow trigger

* update install step

* fix install command

* update working dir

* update container

* update working dir

* change working directory

* change working directory

* change working directory

* change working directory

* change unittest

* use test tag

* finish tag support

* support run op with different executor

* fix pre-commit

* add hf mirror

* add hf mirror

* run all test in standalone mode by default

* ignore image face ratio

* update tags

* add ray testcase

* add ray test in workflow

* update ray unittest workflow

* delete old unittest

---------

Co-authored-by: root <panxuchen>

* Add source tag (#317)

* add source tag for some mapper op

* fix no attribute 'current_tag' when executing local tests

* move op process logic from executor to base op

* fix typo

* move export outside op

* init refactor

* update analyser

* fix format

* clean up

* bring back batch mapper

* Improve fault tolerance & Fix Ray executor

* fix wrapper

* fix batched filter

* Remove use_actor as it is not compatible with the refactored OP class, unless the dataset class is refactored

* make wrappers work with unittests

* Compatible with unit tests and works with ray

* fix unittest

* fix wrappers with ray, map, filter

* unify unittests

* wrap deduplicators

* Compatible with non-batched calls

* Class-level wrappers

- compatible with dataset.filter
- bring back nested wrappers

* Instance-level wrappers

* Refined instance-level wrappers

- Remove incomplete dataset.filter wrappers
- Simplify code
- Stack wrappers

* fix use_cuda

* Refactor dataset (#348)

* refactor dataset

* update unittest with DJDataset

* fix unittest

* update ray data load

* add test

* ray read json

* update docker image version

* actor is no longer supported

* Regress filter's stats export logic

---------

Co-authored-by: BeachWang <1400012807@pku.edu.cn>
Co-authored-by: Xuchen Pan <32844285+pan-x-c@users.noreply.github.com>
Co-authored-by: chenhesen <hesen.chs@alibaba-inc.com>
Co-authored-by: garyzhang99 <garyzhang99@163.com>
yxdyc added a commit that referenced this pull request Jul 18, 2024
* Refactor OP & Dataset (#336)

* modelscope-sora news (#323)

* News/modelscope sora (#327)

* modelscope-sora news

* remove empower

* debug for gpu rank for analyser (#329)

* debug for gpu rank for analyser

* spec_numprocs -> num_proc

* Add more unittest (#304)

* add unittest env with gpu

* fix unittest yml

* add environment for unittest

* update workflow trigger

* update install step

* fix install command

* update working dir

* update container

* update working dir

* change working directory

* change working directory

* change working directory

* change working directory

* change unittest

* use test tag

* finish tag support

* support run op with different executor

* fix pre-commit

* add hf mirror

* add hf mirror

* run all test in standalone mode by default

* ignore image face ratio

* update tags

* add ray testcase

* add ray test in workflow

* update ray unittest workflow

* delete old unittest

---------

Co-authored-by: root <panxuchen>

* Add source tag (#317)

* add source tag for some mapper op

* fix no attribute 'current_tag' when executing local tests

* move op process logic from executor to base op

* fix typo

* move export outside op

* init refactor

* update analyser

* fix format

* clean up

* bring back batch mapper

* Improve fault tolerance & Fix Ray executor

* fix wrapper

* fix batched filter

* Remove use_actor as it is not compatible with the refactored OP class, unless the dataset class is refactored

* make wrappers work with unittests

* Compatible with unit tests and works with ray

* fix unittest

* fix wrappers with ray, map, filter

* unify unittests

* wrap deduplicators

* Compatible with non-batched calls

* Class-level wrappers

- compatible with dataset.filter
- bring back nested wrappers

* Instance-level wrappers

* Refined instance-level wrappers

- Remove incomplete dataset.filter wrappers
- Simplify code
- Stack wrappers

* fix use_cuda

* Refactor dataset (#348)

* refactor dataset

* update unittest with DJDataset

* fix unittest

* update ray data load

* add test

* ray read json

* update docker image version

* actor is no longer supported

* Regress filter's stats export logic

---------

Co-authored-by: BeachWang <1400012807@pku.edu.cn>
Co-authored-by: Xuchen Pan <32844285+pan-x-c@users.noreply.github.com>
Co-authored-by: chenhesen <hesen.chs@alibaba-inc.com>
Co-authored-by: garyzhang99 <garyzhang99@163.com>

* minor fix

* fix num_proc default None

---------

Co-authored-by: Ce Ge (戈策) <gece@foxmail.com>
Co-authored-by: BeachWang <1400012807@pku.edu.cn>
Co-authored-by: Xuchen Pan <32844285+pan-x-c@users.noreply.github.com>
Co-authored-by: chenhesen <hesen.chs@alibaba-inc.com>
Co-authored-by: garyzhang99 <garyzhang99@163.com>
Co-authored-by: null <3213204+drcege@users.noreply.github.com>