ENH: Faster adapter loading if there are a lot of target modules #2045

@BenjaminBossan BenjaminBossan commented Aug 30, 2024

This is an optimization to reduce the number of entries in the target_modules set. The reason is that in some circumstances, target_modules can contain hundreds of entries. Since each target module is checked against each module of the net (which can be thousands), this can become quite expensive when many adapters are being added. Often, the target_modules can be condensed in such a case, which speeds up the process.
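
To make the cost concrete, here is a rough sketch of the pairwise matching described above. It assumes that a module name counts as matched when it equals a target or ends with "." plus the target. The helper `count_checks` and the toy module names are made up for illustration and are not PEFT code.

```python
def count_checks(module_names, target_modules):
    """Return (matched module names, number of name comparisons performed)."""
    checks = 0
    matched = []
    for key in module_names:
        for target in target_modules:
            checks += 1
            # Assumed matching rule: exact name or dotted-suffix match.
            if key == target or key.endswith("." + target):
                matched.append(key)
                break
    return matched, checks


# Toy network: 200 decoder projections that should be targeted, 100 encoder ones that should not.
module_names = [
    f"model.decoder.layers.{i}.self_attn.{proj}" for i in range(100) for proj in ("q_proj", "v_proj")
]
module_names += [f"model.encoder.layers.{i}.self_attn.k_proj" for i in range(100)]

long_targets = [name for name in module_names if ".decoder." in name]  # fully qualified names
short_targets = ["q_proj", "v_proj"]  # condensed form

matched_long, checks_long = count_checks(module_names, long_targets)
matched_short, checks_short = count_checks(module_names, short_targets)
print(checks_long, checks_short)
```

Both target lists select the same modules in this toy setup, but the condensed list needs far fewer comparisons, and that saving is paid once per adapter.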

A context in which this can happen is when diffusers loads non-PEFT LoRAs. As there is no meta info on target_modules in that case, they are inferred by listing all keys from the state_dict, which can yield a large number of entries. See: huggingface/diffusers#9297

As there is a small chance of undiscovered bugs, we apply this optimization only if the list of target_modules is sufficiently big (20 or more entries in this PR). Therefore, for normal PEFT users, this should not have any effect.

Example:

>>> from peft.tuners.tuners_utils import _find_minimal_target_modules
>>> target_modules = [f"model.decoder.layers.{i}.self_attn.q_proj" for i in range(100)]
>>> target_modules += [f"model.decoder.layers.{i}.self_attn.v_proj" for i in range(100)]
>>> other_module_names = [f"model.encoder.layers.{i}.self_attn.k_proj" for i in range(100)]
>>> _find_minimal_target_modules(target_modules, other_module_names)
{"q_proj", "v_proj"}
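
One way such a condensation could work is sketched below. This is not the implementation from this PR: `dotted_suffixes` and `minimal_suffixes` are hypothetical names, and the sketch assumes that a target matches a module when it equals the module name or is a dot-separated suffix of it. For each target, it picks the shortest suffix that no non-target module shares.

```python
from __future__ import annotations


def dotted_suffixes(name: str) -> list[str]:
    """Return the dot-separated suffixes of `name`, shortest first."""
    parts = name.split(".")
    return [".".join(parts[i:]) for i in range(len(parts) - 1, -1, -1)]


def minimal_suffixes(target_modules, other_module_names) -> set[str]:
    """Hypothetical sketch: condense targets into short, unambiguous suffixes."""
    # Any suffix of a module that must NOT be matched is off limits.
    forbidden = {s for name in other_module_names for s in dotted_suffixes(name)}
    chosen: set[str] = set()
    for name in target_modules:
        suffixes = dotted_suffixes(name)
        # Skip targets already covered by a suffix picked earlier.
        if any(s in chosen for s in suffixes):
            continue
        # Otherwise take the shortest suffix that cannot hit a non-target module.
        for s in suffixes:
            if s not in forbidden:
                chosen.add(s)
                break
    return chosen


targets = [f"model.decoder.layers.{i}.self_attn.q_proj" for i in range(100)]
targets += [f"model.decoder.layers.{i}.self_attn.v_proj" for i in range(100)]
others = [f"model.encoder.layers.{i}.self_attn.k_proj" for i in range(100)]
condensed = minimal_suffixes(targets, others)
print(sorted(condensed))
```

If a non-target module shares a short suffix with a target (say, an encoder `q_proj`), the sketch falls back to a longer suffix such as `decoder.layers.0.self_attn.q_proj`, so the condensed set should never match a module outside the original list.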

As shown in huggingface/diffusers#9297, the speed improvements for loading many diffusers LoRAs can be substantial. Without this change, when loading 30 adapters, the loading time climbs from 0.6 sec per adapter to 3 sec per adapter. With this change, it climbs from 0.6 sec per adapter to only 1 sec per adapter.



@sayakpaul sayakpaul left a comment


Excellent stuff! Would be nice to also update the original description of this PR to hint at the speedup achievable with this change.

# quite a lot. See: https://github.com/huggingface/diffusers/issues/9297
# As there is a small chance for undiscovered bugs, we apply this optimization only if the list of
# target_modules is sufficiently big.
if isinstance(peft_config.target_modules, (list, set)) and len(peft_config.target_modules) >= 20:

20 could be assigned to a variable.
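
A sketch of what this suggestion could look like. The constant name `MIN_TARGET_MODULES_FOR_OPTIMIZATION` and the helper `should_condense` are assumptions made for this illustration, not necessarily what the PR ended up using.

```python
# Hypothetical constant name, chosen for this illustration only.
MIN_TARGET_MODULES_FOR_OPTIMIZATION = 20


def should_condense(target_modules) -> bool:
    """Mirror the size check from the diff above using the named threshold."""
    return (
        isinstance(target_modules, (list, set))
        and len(target_modules) >= MIN_TARGET_MODULES_FOR_OPTIMIZATION
    )
```

Naming the threshold keeps the condition readable and gives tests a single value to reference.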

# target_modules is sufficiently big.
if isinstance(peft_config.target_modules, (list, set)) and len(peft_config.target_modules) >= 20:
    names_not_match = [n for n in key_list if n not in peft_config.target_modules]
    new_target_modules = find_minimal_target_modules(peft_config.target_modules, names_not_match)

Maybe we keep find_minimal_target_modules() as a pseudo-private method to denote the experimental nature of this feature?

@@ -781,6 +796,86 @@ def _move_adapter_to_device_of_base_layer(self, adapter_name: str, device: Optio
adapter_layer[adapter_name] = adapter_layer[adapter_name].to(device)


def find_minimal_target_modules(
    target_modules: list[str] | set[str], other_module_names: list[str] | set[str]

I would prefer this (list[str] | set[str]) to Union[List[str], Set[str]] more. Nice.

Member Author

Personally, I prefer the new style that requires no from typing import List, Set, Union etc. I think this will be adopted more and more going forward, as old Python versions that don't support it are phased out, so I'd rather keep it like this.

Member

Yeah same here.
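
For readers unfamiliar with the two annotation styles discussed in this thread, here is a small side-by-side sketch. The function names are made up for illustration. The `from __future__ import annotations` import defers evaluation of annotations, which is what allows the newer syntax in signatures on interpreters that predate built-in generics (Python 3.9) and the `|` union operator (Python 3.10).

```python
from __future__ import annotations  # annotations are stored as strings, not evaluated at definition time

from typing import List, Set, Union  # only needed for the older style


def count_unique_new_style(names: list[str] | set[str]) -> int:
    """Newer style: built-in generics and `|`, no typing imports required."""
    return len(set(names))


def count_unique_old_style(names: Union[List[str], Set[str]]) -> int:
    """Older style: the same contract spelled with typing.Union, List, and Set."""
    return len(set(names))


print(count_unique_new_style(["q_proj", "q_proj", "v_proj"]))
print(count_unique_old_style({"q_proj", "v_proj"}))
```

Both signatures describe the same contract; the difference is purely in spelling and in which imports are needed.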


@BenjaminBossan BenjaminBossan left a comment


Thanks for the review, your points should be addressed, please check again. Note that I made some small changes on top, like better variable names and adding another test.

@BenjaminBossan BenjaminBossan merged commit 01275b4 into huggingface:main Sep 2, 2024
14 checks passed
@BenjaminBossan BenjaminBossan deleted the enh-speed-up-adapter-loading-many-target-modules branch September 2, 2024 10:59