Train parameters exclusively in specific ranges #1390

seungduk-yanolja · 2024-03-11T09:28:12Z


Description

This PR lets the freezing configuration specify parameter ranges, so that gradients can be ignored for specific tokens.
For more details, see this technical report: https://arxiv.org/abs/2402.14714
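As a rough illustration of what ignoring gradients for specific tokens means, here is a minimal sketch in plain Python. It is not the code from this PR. It assumes the gradient of an embedding-like parameter is a list of rows (one row per token) and that ranges are half-open (start, end) pairs; the helper name mask_gradient_rows and the sample numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def mask_gradient_rows(
    grad: Sequence[Sequence[float]],
    trainable_ranges: Sequence[Tuple[int, int]],
) -> List[List[float]]:
    """Return a copy of grad with every row outside the given ranges zeroed.

    Hypothetical helper for illustration only; ranges are half-open [start, end).
    """

    def is_trainable(row_index: int) -> bool:
        return any(start <= row_index < end for start, end in trainable_ranges)

    return [
        list(row) if is_trainable(i) else [0.0] * len(row)
        for i, row in enumerate(grad)
    ]


# Hypothetical 5-token vocabulary in which rows 3 and 4 are newly added tokens.
grad = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8], [0.9, 1.0]]

# Only the new tokens keep their gradient; the original rows receive no update.
masked = mask_gradient_rows(grad, [(3, 5)])
```

With a zeroed gradient, a plain gradient-descent step leaves the original rows unchanged, so only the rows inside the range are trained.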

Motivation and Context

When new tokens are added to the vocabulary, we may want to train only those tokens. For example, during fine-tuning you may want to train only specific control tokens.
This technique can also be used when expanding the vocabulary for non-English languages.

How has this been tested?

I wrote unit tests for it.

Screenshots (if appropriate)

N/A

Types of changes

New feature (non-breaking change which adds functionality)

seungduk-yanolja · 2024-03-11T16:12:21Z

examples/mistral/mixtral.yml

@@ -16,12 +16,12 @@ output_dir: ./qlora-out

 ## You can optionally freeze the entire model and unfreeze a subset of parameters
 unfrozen_parameters:
-#  - lm_head.*


The previous example was technically wrong (though it happened to work), because the pattern becomes the following:
lm_head\.*
which only checks whether the dot repeats zero or more times.
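The behavior of that pattern can be checked with Python's re module. This is a sketch of the regex semantics only; whether the project applies the pattern as a prefix match or a full match is an assumption here, and the parameter name lm_head.weight is used purely as an example.

```python
import re

# The escaped dot is a literal ".", and "*" lets it repeat zero or more times.
pattern = re.compile(r"lm_head\.*")

# Zero dots and several dots both satisfy the whole pattern.
matches_no_dot = pattern.fullmatch("lm_head") is not None
matches_many_dots = pattern.fullmatch("lm_head...") is not None

# A full match against a real-looking parameter name fails,
# because "weight" is not a run of dots.
fullmatch_weight = pattern.fullmatch("lm_head.weight") is not None

# A prefix-style match still succeeds, which would explain why the
# old example worked despite being technically wrong.
prefix_weight = pattern.match("lm_head.weight") is not None
```

Under the prefix-match assumption, the pattern selects any name that starts with lm_head, so the old example gave the intended result by accident rather than by design.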

winglian

thanks!


seungduk-yanolja added 3 commits March 11, 2024 18:26

Train parameters exclusively in specific ranges

f0b474d

Fix the style and update docs

686f041

Update yaml example

19e5cb8

seungduk-yanolja commented Mar 11, 2024


winglian approved these changes Mar 12, 2024


winglian merged commit 05bcc9e into axolotl-ai-cloud:main Mar 14, 2024
6 checks passed

seungduk-yanolja added a commit to Y-IAB/axolotl that referenced this pull request Mar 19, 2024

Train parameters exclusively in specific ranges (axolotl-ai-cloud#1390)

d3e0a07

* Train parameters exclusively in specific ranges
* Fix the style and update docs
* Update yaml example

winglian mentioned this pull request Mar 30, 2024

Unfreeze layers in mixtral does not work as expected #1464

Open

8 tasks
