Unsloth optims for Llama #1609
Conversation
The cross_entropy_loss optimization is applicable even in a full fine tune, right?
Correct!
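The reason it applies to both is that this optimization only changes how the loss is computed from the logits; it never depends on which parameters are trainable. As a rough illustration of the general idea (not Unsloth's actual fused Triton kernel), a chunked cross entropy along the lines below avoids materializing the full float32 log-softmax for every token at once; the helper name and chunk size are placeholders.

```python
import torch
import torch.nn.functional as F
from torch.utils.checkpoint import checkpoint


def chunked_cross_entropy(logits, labels, chunk_size=8192, ignore_index=-100):
    """Token-level cross entropy computed in chunks over the flattened sequence.

    The float32 upcast and log-softmax are only materialized one chunk at a
    time (and recomputed during backward via checkpointing) instead of for the
    full (tokens, vocab) logits tensor at once.
    """
    logits = logits.reshape(-1, logits.size(-1))
    labels = labels.reshape(-1)
    n_valid = (labels != ignore_index).sum().clamp(min=1)

    def chunk_loss(chunk_logits, chunk_labels):
        # Sum reduction so chunks that are entirely ignore_index contribute 0.
        return F.cross_entropy(
            chunk_logits.float(), chunk_labels,
            ignore_index=ignore_index, reduction="sum",
        )

    total = logits.new_zeros((), dtype=torch.float32)
    for start in range(0, logits.size(0), chunk_size):
        total = total + checkpoint(
            chunk_loss,
            logits[start:start + chunk_size],
            labels[start:start + chunk_size],
            use_reentrant=False,
        )
    return total / n_valid
```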
Are these optimizations compatible with flash attention? (Complete noob here)
Is it possible to alter this to also patch Qwen? It was added to Unsloth and all optimizations work for it:
Let's tackle that in a follow-up PR
Does this only work for a single GPU? I'm getting a "Runtime Error: Model must be 2-D" error when enabling unsloth cross entropy loss on 2x 3090 Ti while fine-tuning Llama 3 8B with LoRA.
* WIP for unsloth integrations
* import the unsloth code in the right context
* add unsloth mlp, qkv, o lora optimizations
* apply unsloth mlp and qkv kernels
WIP to integrate Unsloth's optimizations into axolotl.
The manual autograd for the MLP, QKV, and O projections only seems to reduce VRAM by about 1%, as opposed to the reported 8%.
The cross entropy loss optimization does help significantly, but only reduced VRAM by 13%, as opposed to the reported 17%.
Edit: to clarify, the cross entropy loss optimization works for both full fine-tunes and LoRA. The MLP, QKV, and O optimizations only apply to 4-bit QLoRA with flash attention.
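For reference, the "manual autograd" for the LoRA projections boils down to writing the forward and backward passes by hand so the low-rank intermediate is recomputed in backward rather than saved as an activation. The sketch below is a simplified dense illustration of that idea only, not the actual Unsloth kernels (which operate on 4-bit quantized base weights with fused Triton matmuls); LoRA scaling and dropout are omitted and the class name is a placeholder.

```python
import torch


class LoRALinearFn(torch.autograd.Function):
    """Manual autograd for a LoRA linear: y = x @ W.T + (x @ A.T) @ B.T.

    Only x, W, A, B are saved; the low-rank projection x @ A.T is recomputed
    in backward instead of being stored, trading a little compute for
    activation memory. W is treated as frozen (as in QLoRA), so it gets no
    gradient. Shapes: x (..., in), W (out, in), A (r, in), B (out, r).
    """

    @staticmethod
    def forward(ctx, x, W, A, B):
        ctx.save_for_backward(x, W, A, B)
        return x @ W.t() + (x @ A.t()) @ B.t()

    @staticmethod
    def backward(ctx, grad_out):
        x, W, A, B = ctx.saved_tensors
        g = grad_out.reshape(-1, grad_out.shape[-1])   # (N, out)
        x2 = x.reshape(-1, x.shape[-1])                # (N, in)
        gB = g @ B                                     # (N, r)
        grad_x = (g @ W + gB @ A).reshape(x.shape)
        grad_A = gB.t() @ x2                           # (r, in)
        grad_B = g.t() @ (x2 @ A.t())                  # (out, r); recomputes x @ A.T
        return grad_x, None, grad_A, grad_B            # no grad for frozen W


# Hypothetical usage inside a patched attention/MLP projection:
# y = LoRALinearFn.apply(hidden_states, base_weight, lora_A.weight, lora_B.weight)
```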