
fixing multiple LoRA in the same batch or vit #1990

Merged · 10 commits · Sep 17, 2024

Conversation

@saeid93 (Contributor) commented Aug 5, 2024

@BenjaminBossan
This is the initial fix for the batched LoRA inference problem explained in #1960.
For now, this only supports the ViT model; it is an example and serves as a template for gradually adding support for other models. Since we need to change model-specific details, there are generally two options, following the solution we discussed in #1960 (comment):

  1. Change the signature of the model forward functions in the transformers library. The problem with this approach is that it puts PEFT-specific logic into transformers, which I'm not sure is the best fit for a general-purpose library like transformers.
  2. Change the forward function in PEFT and patch it dynamically when multiple LoRA requests are in the inference batch.

I'm taking the second approach, but each model needs different changes. Also, generate functions for generative models should be added. I'm happy to go through models one by one and also fix #1967, but it is better to review this first and then decide whether we want to go down this route of dynamically patching the forward functions or fix it in the transformers library.
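
For illustration, a minimal sketch of the dynamic-patching idea in option 2; the helper name patch_forward_with_adapter_names is made up for this sketch and is not code from the PR:

def patch_forward_with_adapter_names(module, adapter_names):
    # Monkey-patch `module.forward` so that every call also receives the
    # per-sample `adapter_names`, letting LoRA layers route each sample in the
    # batch to its own adapter.
    original_forward = module.forward

    def patched_forward(*args, **kwargs):
        kwargs["adapter_names"] = adapter_names
        return original_forward(*args, **kwargs)

    module.forward = patched_forward
    return original_forward  # keep a handle so the patch can be undone after the batch

The drawback, as the comment above notes, is that this patching (and its cleanup) has to be tailored to each model type.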

@BenjaminBossan (Member) left a comment


Thanks a lot for following up on your issue. I can see that you put some thought into finding an elegant approach, and I think it would be viable. However, I wonder if there is another way, as sketched in my comment; please take a look. I may well have missed something, so LMK if my suggestion doesn't work.

@saeid93 (Contributor, Author) commented Aug 6, 2024

No problem! Glad to be of any help.
Sorry, I couldn't find any comments on the pull request. Do you mean this comment on the MobileViT issue? If so, that problem is different from this one: this PR addresses multiple LoRA adapters in the same batch, whereas the other issue is MobileViT-specific.

@BenjaminBossan (Member) commented

> Sorry, I couldn't find any comments on the pull request.

Wow, it's gone! No idea what happened, I did write it for sure...

Okay, so a second time. I was referring to these lines:

https://github.com/huggingface/peft/pull/1990/files#diff-b700510ad2034b549511a969d85f89f9094243a7f3c740e311dc1eb83ace9a79R57-R61

Are those the only real changes to the forward function that are required? If yes, would it be possible to instead register a pre-forward hook for the classifier to inject the argument? This could be easily achieved here:

for module in self.modules():
    if isinstance(module, LoraLayer):
        # Register a pre-forward hook on every LoRA layer that injects
        # `adapter_names` into its kwargs, so no forward signature needs patching.
        pre_forward = partial(_adapter_names_pre_forward_hook, adapter_names=adapter_names)
        handle = module.register_forward_pre_hook(pre_forward, with_kwargs=True)
        hook_handles.append(handle)

But maybe I'm missing something and more changes are necessary, or will be in the future for the issue. WDYT?
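
For context, a minimal sketch of what such a pre-forward hook boils down to, assuming PyTorch's with_kwargs=True hook signature of (module, args, kwargs):

def _adapter_names_pre_forward_hook(target, args, kwargs, adapter_names):
    # Called before the wrapped module's forward; with `with_kwargs=True` the
    # hook may return modified (args, kwargs), here with the routing info added.
    kwargs["adapter_names"] = adapter_names
    return args, kwargs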

@saeid93 (Contributor, Author) commented Aug 6, 2024

Yes, your suggestion worked perfectly!
Please check the new commit.

@BenjaminBossan (Member) left a comment


Thanks for the update, I think this version looks really nice.

I have added some comments where I think the code needs further adjustment, please take a look. Also, please make sure to run make style on the PR. Apart from that, I have two more requests:

  1. Let's update the docs to mention that modules_to_save is supported, but add the necessary caveats (depending on what we end up having in the code).
  2. The unit tests should be updated to check for this use case. We don't need to test every possible model type, but maybe a case similar to the original one, with a classifier layer at the end, and then perhaps one with two modules_to_save, say the embedding and the LM head (see the usage sketch below). The tests could go here. LMK if you feel like giving this a try.
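
A rough, self-contained sketch of the use case such a test would exercise. TinyClassifier and the adapter name "other" are placeholders; the adapter_names argument and the "__base__" marker are the existing mixed-adapter-batch API for LoRA layers, which this PR extends to modules_to_save:

import torch
from torch import nn
from peft import LoraConfig, get_peft_model

class TinyClassifier(nn.Module):
    # Stand-in for a model with a backbone plus a classifier head that is
    # replaced per adapter via modules_to_save.
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.classifier = nn.Linear(16, 4)

    def forward(self, x):
        return self.classifier(torch.relu(self.backbone(x)))

config = LoraConfig(target_modules=["backbone"], modules_to_save=["classifier"])
model = get_peft_model(TinyClassifier(), config)  # creates the "default" adapter
model.add_adapter("other", LoraConfig(target_modules=["backbone"], modules_to_save=["classifier"]))
model.eval()

x = torch.randn(4, 16)
# One forward pass, three routings in the same batch: the default adapter,
# the "other" adapter, and "__base__" for samples that bypass the adapters.
with torch.no_grad():
    out = model(x, adapter_names=["default", "default", "other", "__base__"])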

if "adapter_names" not in kwargs.keys():
    return self.modules_to_save[self.active_adapter](*args, **kwargs)
# Batches requests with similar LoRAs into microbatches
@BenjaminBossan (Member) commented

Let's move this to a sub-method, similar to how we do this for LoRA:

def _mixed_batch_forward(

Also, with this added, I think it makes sense to have a similar method as in LoRA to check the arguments:

def _check_forward_args(self, x, *args, **kwargs):

Of course, we have to be careful not to be too restrictive here, given the other issue that you raised, and since the underlying module could be of any type.
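
For orientation, a rough sketch of what such a sub-method on the modules_to_save wrapper could look like. This only illustrates the microbatching idea; it assumes torch is imported and that the wrapper exposes original_module and modules_to_save, and it is not the code that was merged:

def _mixed_batch_forward(self, input, *args, adapter_names=None, **kwargs):
    # Group sample indices by adapter, run each sub-batch through the matching
    # copy of the module, then scatter the results back into batch order.
    unique_adapters = set(adapter_names)
    sub_batch_indices_list = [
        [i for i, name in enumerate(adapter_names) if name == adapter]
        for adapter in unique_adapters
    ]

    results = [0 for _ in range(len(input))]
    for adapter, indices in zip(unique_adapters, sub_batch_indices_list):
        # "__base__" routes these samples through the original, unmodified module.
        module = self.original_module if adapter == "__base__" else self.modules_to_save[adapter]
        sub_batch_output = module(input[indices], *args, **kwargs)
        for idx, output in zip(indices, sub_batch_output):
            results[idx] = output
    return torch.stack(results)

A companion _check_forward_args could then validate that adapter_names has one entry per sample and is only accepted where supported, without being too restrictive about the type of the wrapped module.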

@saeid93 (Contributor, Author) commented

Both functions have been added in the new commit, please check.


results = [0 for i in range(len(batch))]
for i, active_adapter in enumerate(unique_adapters):
    sub_batch = batch[sub_batch_indices_list[i]]
@BenjaminBossan (Member) commented

Hmm, here we assume that there is only one positional arg, as any other args would be dropped, right? Also, what if other args or kwargs need to be sliced? We don't really know, so I think the best we can do is make a guess.

One suggestion that I have:

Check all args and kwargs: if an argument is a tensor and has the same length (i.e. the batch size), slice it too; otherwise, leave it as is. It's not perfect, but I'm not sure what else could be done. WDYT?
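
A sketch of this heuristic, assuming torch is in scope; the helper name _maybe_slice_batch_args is hypothetical:

def _maybe_slice_batch_args(args, kwargs, indices, batch_size):
    # Slice any tensor argument that shares the batch dimension; pass everything
    # else through unchanged, since we cannot know how non-batched args are used.
    def maybe_slice(value):
        if torch.is_tensor(value) and value.ndim > 0 and len(value) == batch_size:
            return value[indices]
        return value

    sliced_args = tuple(maybe_slice(a) for a in args)
    sliced_kwargs = {key: maybe_slice(value) for key, value in kwargs.items()}
    return sliced_args, sliced_kwargs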

@saeid93 (Contributor, Author) commented

In the new version I changed the signature so that x is the explicit input, to avoid the problems you mentioned.

@saeid93 (Contributor, Author) commented Aug 7, 2024

Sure, I'll be happy to add the tests. I'll add the updates when I get a chance.

@BenjaminBossan (Member) commented

@saeid93 Do you still plan on working on this?

@saeid93 (Contributor, Author) commented Aug 20, 2024

> @saeid93 Do you still plan on working on this?

Yes, sorry, I had some busy weeks. I'll work on it this weekend if there isn't a strict deadline.

@BenjaminBossan (Member) commented

> Yes, sorry, I had some busy weeks. I'll work on it this weekend if there isn't a strict deadline.

Thanks. No worries about the time, I just wanted to ensure that you're still on it. If not, that's also okay, just let me know.

@saeid93 (Contributor, Author) commented Aug 24, 2024

@BenjaminBossan I added the tests and also changed the code according to your comments; please let me know if further changes are required. About the docs, I wasn't sure whether anything still needs to be added, as I don't see any caveat regarding modules_to_save and the assumption is that it is supported out of the box. But please let me know if you still think the docs should be updated.

@BenjaminBossan (Member) left a comment


Thanks for making the adjustments. I think there is still a bit of an issue to figure out when it comes to the signature of the mixed batch forward call. Please check my comments.

Review threads (outdated, resolved): src/peft/utils/other.py, tests/test_custom_models.py
saeid93 marked this pull request as draft on September 1, 2024 at 14:41.
@BenjaminBossan (Member) commented

@saeid93 LMK when this is ready for review.

@BenjaminBossan (Member) commented

Gentle ping @saeid93

saeid93 marked this pull request as ready for review on September 15, 2024 at 12:07.
@saeid93 (Contributor, Author) commented Sep 15, 2024

@BenjaminBossan sorry for the delay, I was on annual leave. I made the changes you asked for; please let me know if anything else is needed.

@BenjaminBossan (Member) left a comment


Thanks so much for resuming the work and making the requested adjustments. Overall this looks good, I only have a few small comments. Could you please check?

Also, could you please run make style to silence the linter? Note that by now, we've reached ruff 0.6.5, so you may have to upgrade its version in your environment.

Review threads (outdated, resolved): docs/source/developer_guides/lora.md, src/peft/utils/other.py, tests/test_custom_models.py
@saeid93 (Contributor, Author) commented Sep 16, 2024

@BenjaminBossan sure, glad to be of any help. I think all the comments have now been addressed. I did run make style each time, so it was probably the version mismatch; hopefully it will go through this time. Let me know if further changes are needed.

@BenjaminBossan (Member) commented

Thanks for the latest changes @saeid93. The style check is still failing, did you successfully run make style? If that doesn't work for some reason, removing this import should be sufficient.

@saeid93 (Contributor, Author) commented Sep 17, 2024

@BenjaminBossan no problem! I did run make style; I'm not sure why it wasn't caught locally. I just removed the line. Hopefully it will go through this time.

@BenjaminBossan (Member) commented

The linter is happy now 🎉. However, now we get an error with Python 3.8 because it does not support list[str] etc. Could you please add the from __future__ import annotations import to the top of utils/other.py? That should fix it.
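
For reference, the suggested fix is a single line at the top of the module:

# Top of src/peft/utils/other.py: with postponed evaluation of annotations
# (PEP 563), built-in generics like list[str] in type hints no longer raise on Python 3.8.
from __future__ import annotations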

@saeid93 (Contributor, Author) commented Sep 17, 2024

Done 👍 I had to make a small change and use tuple instead of Tuple to avoid ruff complaints after adding from __future__ import annotations.

@BenjaminBossan (Member) left a comment


Thanks a lot for extending the functionality of having different adapters in the same batch to modules_to_save. The changes look good, are well covered by tests, and documented. Nothing more to add!

BenjaminBossan merged commit adf0a1d into huggingface:main on Sep 17, 2024 (14 checks passed).
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request on Sep 18, 2024:
Extend the functionality of having different adapters in the same batch to also work with `modules_to_save`.
Successfully merging this pull request may close these issues.

MobileViT does not work with Inference with different LoRA adapters in the same batch