
Commit

Add link to example model
siddartha-RE committed Mar 4, 2024
1 parent ba0db26 commit 11b489e
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions docs/source/developer_guides/lora.md
@@ -89,6 +89,9 @@ config = LoraConfig(layer_replication=[[0,4], [2,5]], ...)
Given that the original model had 5 layers `[0, 1, 2, 3, 4]`, this would create a model with 7 layers arranged as `[0, 1, 2, 3, 2, 3, 4]`. This follows the mergekit passthrough merge convention, where sequences of layers specified as start-inclusive, end-exclusive tuples are stacked to build the final model. Note that each layer in the final model gets its own distinct set of LoRA adapters.

[Fewshot-Metamath-OrcaVicuna-Mistral-10B](https://huggingface.co/abacusai/Fewshot-Metamath-OrcaVicuna-Mistral-10B) is an example of a model trained with this method, expanding Mistral-7B to 10B. Its [adapter_config.json](https://huggingface.co/abacusai/Fewshot-Metamath-OrcaVicuna-Mistral-10B/blob/main/adapter_config.json) shows a sample LoRA adapter config applying this method for fine-tuning.
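
As a rough sketch of how such a configuration is wired together, the snippet below builds a LoRA config with `layer_replication` and attaches it to a base model. The base checkpoint, rank, scaling, and target modules are illustrative placeholders, not the settings used for the linked model.

```py
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base checkpoint; any causal LM with stacked decoder layers
# can be expanded the same way.
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")

config = LoraConfig(
    # Stack layers [0, 4) followed by [2, 5) of the base model; with a
    # 5-layer base this yields the 7-layer arrangement [0, 1, 2, 3, 2, 3, 4]
    # described above.
    layer_replication=[[0, 4], [2, 5]],
    r=8,                                  # illustrative rank
    lora_alpha=16,                        # illustrative scaling
    target_modules=["q_proj", "v_proj"],  # illustrative target modules
)

model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```

The replicated layers share the frozen base weights in memory, so only the per-layer LoRA adapter weights add to the trainable parameter count.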

## Merge adapters

While LoRA is significantly smaller and faster to train, you may encounter latency issues during inference due to separately loading the base model and the LoRA adapter. To eliminate latency, use the [`~LoraModel.merge_and_unload`] function to merge the adapter weights with the base model. This allows you to use the newly merged model as a standalone model. The [`~LoraModel.merge_and_unload`] function doesn't keep the adapter weights in memory.
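
For reference, a minimal sketch of the merge step; the base checkpoint and adapter path here are placeholders to substitute with your own.

```py
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Illustrative base checkpoint and adapter location.
base_model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
model = PeftModel.from_pretrained(base_model, "path/to/lora-adapter")

# Fold the LoRA weights into the base weights and drop the adapter modules;
# the result is a plain Transformers model with no extra inference latency.
model = model.merge_and_unload()
```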
