Update unwrap from accelerate #29933
Conversation
src/transformers/modeling_utils.py
Outdated
@@ -2306,7 +2306,7 @@ def save_pretrained(
         files_timestamps = self._get_files_timestamps(save_directory)

         # Only save the model itself if we are using distributed training
-        model_to_save = unwrap_model(self)
+        model_to_save = unwrap_model(self) if is_accelerate_available() else Accelerator().unwrap_model(self)
Not sure what the best option is here. I don't think we want to force users to install accelerate just to save a model. If they are saving after training through Trainer or accelerate, they will already have accelerate installed.
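To make the concern concrete, here is a minimal sketch (not code from this PR; `get_model_to_save` is a made-up helper standing in for the line inside `save_pretrained`, with `model` playing the role of `self`) of a save path that only touches accelerate when it is actually installed:

```python
# Hypothetical sketch: prefer accelerate's unwrap when available, but never
# require accelerate just to save a model.
from torch import nn

from transformers.modeling_utils import unwrap_model
from transformers.utils import is_accelerate_available


def get_model_to_save(model: nn.Module) -> nn.Module:
    if is_accelerate_available():
        from accelerate import Accelerator

        return Accelerator().unwrap_model(model)  # accelerate code path
    # Plain recursive .module unwrap; no accelerate needed to save.
    return unwrap_model(model)
```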
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
My main concern is #29780, which looks to expand it.

That being said, everything here revolves around torch, and transformers with PyTorch requires Accelerate, so I think it's fair to assume accelerate is available in the env. @ArthurZucker let me know if you disagree with this.

If we agree, then I propose fully removing the current implementation and relying solely on the one in Accelerate. What's in the mentioned PR can then also be offloaded there, since the behaviors of the two already differ, and with that PR they will differ even more.

The other alternative is to add a test in test_trainer that verifies the same behavior between Accelerate's and transformers' model unwraps, so we can flag when they drift apart (see the sketch below).
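A minimal sketch of what such a parity test could look like (this is not the test that ended up in the repo; `test_unwrap_matches_accelerate` is a made-up name, and the wrapper used is simply `nn.DataParallel`):

```python
import torch.nn as nn

from accelerate.utils import extract_model_from_parallel
from transformers.modeling_utils import unwrap_model


def test_unwrap_matches_accelerate():
    base = nn.Linear(4, 4)
    wrapped = nn.DataParallel(base)

    # Both unwraps should recover the exact same underlying module.
    assert unwrap_model(wrapped) is base
    assert extract_model_from_parallel(wrapped) is unwrap_model(wrapped)
```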
src/transformers/modeling_utils.py
Outdated
@@ -105,7 +105,7 @@
 XLA_DOWNCAST_BF16 = os.environ.get("XLA_DOWNCAST_BF16", "0").upper()

 if is_accelerate_available():
-    from accelerate import dispatch_model, infer_auto_device_map, init_empty_weights
+    from accelerate import Accelerator, dispatch_model, infer_auto_device_map, init_empty_weights
Let's just use extract_model_from_parallel instead of going through the Accelerator, since that's all it's calling.
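For reference, a hedged sketch of that suggestion (the actual diff may have ended up different): call the utility directly instead of building an `Accelerator` just to reach its `unwrap_model`.

```python
import torch.nn as nn

from accelerate.utils import extract_model_from_parallel

# Accelerator().unwrap_model(...) is a thin wrapper around this utility,
# so it can be called directly on the wrapped model.
wrapped = nn.DataParallel(nn.Linear(4, 4))
model_to_save = extract_model_from_parallel(wrapped)
```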
Yeah, I thought about that too. Maybe we can put that in unwrap_model so that it is easier to understand. However, we need to add a test to make sure that we have the same behavior.
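A rough sketch of what "putting it in unwrap_model" could look like (illustrative only; the merged implementation may differ, and the fallback branch is the existing recursive `.module` unwrap):

```python
from torch import nn

from transformers.utils import is_accelerate_available


def unwrap_model(model: nn.Module) -> nn.Module:
    """Recursively unwrap a model from distributed / parallel containers."""
    if is_accelerate_available():
        from accelerate.utils import extract_model_from_parallel

        # Delegate to accelerate so both libraries unwrap the same way.
        return extract_model_from_parallel(model)
    # Fallback without accelerate: peel off .module wrappers recursively.
    if hasattr(model, "module"):
        return unwrap_model(model.module)
    return model
```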
Got word from tf-boi that generally… @amyeroberts @ArthurZucker, what do you say? :)
Current state looks good. Yes, torch requires accelerate, but let's keep unwrap anyway; that sounds simpler.
Hi @muellerzr @SunMarc, will this PR be merged? We need this.
I will finish this PR ASAP @zorrofox! @ArthurZucker, do you want to switch back to…
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
This looks great to me! (after quality 😉)
Thanks! LGTM
* Use unwrap with the one in accelerate
* oups
* update unwrap
* fix
* wording
* raise error instead
* comment
* doc
* Update src/transformers/modeling_utils.py

Co-authored-by: Zach Mueller <muellerzr@gmail.com>

* style
* put else

---------

Co-authored-by: Zach Mueller <muellerzr@gmail.com>
What does this PR do?
This PR updates the unwrap function to use the one from accelerate instead.
Fixes an issue reported by @abhishekkrthakur.
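As a quick end-to-end sanity check of the behavior this PR keeps (model name and output path are placeholders, and the snippet assumes both torch and accelerate are installed):

```python
import torch.nn as nn

from transformers import AutoModel
from transformers.modeling_utils import unwrap_model

model = AutoModel.from_pretrained("bert-base-uncased")
wrapped = nn.DataParallel(model)

# unwrap_model (backed by accelerate's extract_model_from_parallel when
# accelerate is installed) recovers the plain model, which is what gets saved.
assert unwrap_model(wrapped) is model
unwrap_model(wrapped).save_pretrained("/tmp/unwrapped-checkpoint")
```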