Fix GQA permutation computation and sequential weight initialization / loading when doing TP #531

michaelbenayoun · 2024-03-27T14:27:02Z

What does this PR do?

Fixes the way indices are computed for the GQA permutation of the query and output projections, and adds a test case to make sure everything works.
Adds the possibility to specify the number of concurrent ranks that can initialize or load the model weights at the same time under TP. This can help avoid running out of memory.
Fixes a typo: indicies -> indices.

HuggingFaceDocBuilderDev · 2024-03-27T14:35:51Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

michaelbenayoun · 2024-03-27T16:46:32Z

examples/language-modeling/run_clm.py

Will restore this change before merging.

JingyaHuang

LGTM. Thanks for the fix!

JingyaHuang · 2024-03-28T10:03:55Z

optimum/neuron/distributed/base.py

+            local_rank = xm.get_local_ordinal()
+            if num_ranks_per_loading_step < 0:
+                num_ranks_per_loading_step = get_local_world_size()
+            for worker in range(math.ceil(get_local_world_size() / num_ranks_per_loading_step)):


So there will be two workers (0, 1) if get_local_world_size() / num_ranks_per_loading_step > 1 and < 2?

I can set num_ranks_per_loading_step = min(num_ranks_per_loading_step, get_local_world_size()) as a safety measure.
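To make the question above concrete, here is a minimal, standalone sketch of how the number of loading steps falls out of the `math.ceil` expression in the diff, with the proposed `min(...)` clamp applied. The function name `loading_schedule`, the contiguous assignment of local ranks to steps, and the `ValueError` on zero are illustrative assumptions, not the code in `optimum/neuron/distributed/base.py`.

```python
import math


def loading_schedule(local_world_size: int, num_ranks_per_loading_step: int) -> list[list[int]]:
    """Return, for each loading step, the local ranks that load weights in that step.

    Hypothetical helper for illustration only; the rank-to-step mapping is an assumption.
    """
    if num_ranks_per_loading_step == 0:
        # Assumption: zero is rejected here to avoid a division by zero below.
        raise ValueError("num_ranks_per_loading_step must be non-zero")
    if num_ranks_per_loading_step < 0:
        # A negative value means "all local ranks load at once", as in the diff.
        num_ranks_per_loading_step = local_world_size
    # Safety clamp suggested in the review reply.
    num_ranks_per_loading_step = min(num_ranks_per_loading_step, local_world_size)

    num_steps = math.ceil(local_world_size / num_ranks_per_loading_step)
    return [
        list(range(step * num_ranks_per_loading_step,
                   min((step + 1) * num_ranks_per_loading_step, local_world_size)))
        for step in range(num_steps)
    ]


if __name__ == "__main__":
    # 8 local ranks, 5 per step: 8 / 5 = 1.6, so ceil gives two steps (0 and 1).
    for step, ranks in enumerate(loading_schedule(8, 5)):
        print(step, ranks)
```

With 8 local ranks and 5 ranks per step, the ratio is 1.6, so there are two steps and the second one handles only the remaining 3 ranks. With the clamp, a value larger than the local world size collapses to a single step.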

michaelbenayoun · 2024-03-28T14:15:32Z

Ran the distributed tests locally and they pass. It's just a flaky test that needs to be solved.

michaelbenayoun added 3 commits March 26, 2024 16:12

Fix query compute indicies

b40ceb9

Add multiple features

f019780

Fix

f652943

michaelbenayoun added 2 commits March 27, 2024 15:42

Fix typo

77458dd

Rename variables

f05eea9

michaelbenayoun changed the title ~~Fix GQA and sequential weight initialization / loading when doing TP~~ Fix GQA permutation computation and sequential weight initialization / loading when doing TP Mar 27, 2024

michaelbenayoun marked this pull request as ready for review March 27, 2024 14:52

michaelbenayoun requested review from JingyaHuang and dacorvo March 27, 2024 14:53

michaelbenayoun added 2 commits March 27, 2024 15:55

Fix typo (part 2)

e2e7702

Fix issue

9aeb643

JingyaHuang approved these changes Mar 28, 2024

michaelbenayoun added 3 commits March 28, 2024 14:43

Merge branch 'main' into fix_gqa_compute_query_indicies

55924d9

Restore run_clm.py

a8c72fc

Styling

92e636b

michaelbenayoun merged commit 1bc0405 into main Mar 28, 2024
10 of 12 checks passed

michaelbenayoun deleted the fix_gqa_compute_query_indicies branch March 28, 2024 15:43
