Add support for layer replication in LoRA (huggingface#1368)
* Add support for layer replication in LoRA

* Add test and update docs

* Address review comments

* Code cleanup and additional model support

* Add docs, address comments

* Add link to example model

* Improve test and fix typos

* Update src/peft/tuners/tuners_utils.py

Fix typo in doc string.

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
siddartha-RE and BenjaminBossan committed Mar 14, 2024
1 parent f8a45f0 commit 62e90f1
Showing 1 changed file with 23 additions and 0 deletions.
23 changes: 23 additions & 0 deletions src/peft/tuners/lora/config.py
@@ -105,6 +105,9 @@ class LoraConfig(PeftConfig):
Enable 'Weight-Decomposed Low-Rank Adaptation' (DoRA). This technique decomposes the updates of the weights
into two parts, magnitude and direction. Direction is handled by normal LoRA, whereas the magnitude is
handled by a separate learnable parameter. This can improve the performance of LoRA, especially at low
ranks. Right now, DoRA only supports non-quantized linear layers. DoRA introduces a bigger overhead than
pure LoRA, so it is recommended to merge weights for inference. For more information, see
https://arxiv.org/abs/2402.09353.
layer_replication(`List[Tuple[int, int]]`):
Build a new stack of layers by stacking the original model layers according to the ranges specified. This
allows expanding (or shrinking) the model without duplicating the base model weights. The new layers will
@@ -265,6 +268,26 @@ class LoraConfig(PeftConfig):
)
},
)
# Enables replicating layers in a model to expand it to a larger model.
layer_replication: Optional[list[tuple[int, int]]] = field(
default=None,
metadata={
"help": (
"This enables using LoRA to effectively expand a transformer model to a larger size by repeating some layers. "
"The transformation handles models (currently Llama, Bert or Falcon compatible architectures) with "
"a module list in the model which it modifies to expand the number of modules. "
"Base weights are shared so the memory usage is close to the original model. The intended use is these base weights "
"remain fixed during finetuning but each layer has a separate LoRA adapter so the layers can be specialed via "
"the adapter layers fit during fine tuning."
"The format is a list of [start, end) pairs which specify the layer ranges to stack. For example:\n"
" Original model has 5 layers labelled by their position in the model: `[0, 1, 2, 3, 4]`\n"
" layer_replication: `[[0, 4], [2, 5]]`\n"
" Final model will have this arrangement of original layers: `[0, 1, 2, 3, 2, 3, 4]`\n"
"This format is based on what is used for pass-through merges in mergekit. It makes it simple to select sequential "
"ranges of a model and stack them while reusing layers at either end of each sequence."
)
},
)

def __post_init__(self):
self.peft_type = PeftType.LORA
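As a usage illustration (not part of the diff above), here is a minimal sketch of how the new `layer_replication` field might be set, following the `[[0, 4], [2, 5]]` example from the help text. The values of `r` and `target_modules` are illustrative assumptions, and `replicated_layer_indices` is a hypothetical helper that only demonstrates how the ranges are concatenated.

```python
from peft import LoraConfig


def replicated_layer_indices(pairs):
    # Each (start, end) pair selects original layers [start, end); the selected
    # ranges are concatenated in order to build the new, deeper layer stack.
    return [i for start, end in pairs for i in range(start, end)]


# The docstring example: a 5-layer model [0, 1, 2, 3, 4] with ranges
# [0, 4) and [2, 5) becomes [0, 1, 2, 3, 2, 3, 4].
assert replicated_layer_indices([(0, 4), (2, 5)]) == [0, 1, 2, 3, 2, 3, 4]

config = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
    # Base weights of replicated layers are shared; each resulting layer gets
    # its own LoRA adapter, so the copies can specialize during fine-tuning.
    layer_replication=[(0, 4), (2, 5)],
)
# The config would then be applied with peft.get_peft_model(base_model, config),
# where base_model is a Llama/Bert/Falcon-style model with (here) 5 layers.
```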
