examples : Fix llama-export-lora example #8607

Conversation
I haven't run tests, but I think it is OK.
I would recommend adding log messages with detailed information about:
- tensors being merged
- tensor sizes
- lora scaling
- ranks
- etc.
We can improve this as needed; a rough sketch of what such a log line could look like is below.
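For illustration only, a hypothetical version of such a log line (the function, its signature, and the assumption that the lora rank is `a->ne[1]` are mine, not code from this PR):

```cpp
#include <cstdint>
#include <cstdio>
#include "ggml.h"

// Hypothetical per-tensor logging along the lines suggested above; `base` is
// the base weight tensor, `a` the lora A matrix of one adapter, `scale` the
// user-provided lora scale. Not the code that was actually added in the PR.
static void log_merge_info(const struct ggml_tensor * base,
                           const struct ggml_tensor * a,
                           float scale) {
    // the lora rank is assumed to be the inner dimension shared by lora A and B
    const int64_t rank = a->ne[1];
    printf("merging %s: shape [%lld, %lld], lora rank %lld, scale %.3f\n",
           base->name,
           (long long) base->ne[0], (long long) base->ne[1],
           (long long) rank, scale);
}
```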
bool t_a = true;
bool t_b = true;
for (auto & adapter : adapters) {
    t_a &= nullptr != adapter->get_tensor(it.first + ".lora_a");
    t_b &= nullptr != adapter->get_tensor(it.first + ".lora_b");
}
Here we require that all adapters provide data for the base tensor. If any adapter does not have data for that tensor, then we keep the original data.
Maybe in the future we can improve this to support the case where only a subset of the adapters have data for a tensor.
Yeah right, thanks for spotting that. It's quite complicated to fix for now, so I'd prefer to merge this PR as-is.
To prevent producing a half-working model, I added a check that rejects that case: 0ec2b58. As a fallback, the user can always run llama-export-lora multiple times.
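A rough sketch of the kind of check described here (the names and structure are assumptions for illustration, not the actual export-lora code):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical illustration of the "reject merging subset" idea: if only some
// of the adapters provide lora data for a given base tensor, refuse to merge
// instead of silently producing a half-working model.
static void check_adapter_coverage(const std::string & tensor_name,
                                   size_t n_adapters_with_lora,
                                   size_t n_adapters_total) {
    if (n_adapters_with_lora > 0 && n_adapters_with_lora < n_adapters_total) {
        // the user can still merge the adapters one at a time with separate
        // llama-export-lora runs as a fallback
        throw std::runtime_error("tensor '" + tensor_name +
            "' is present in only a subset of the adapters");
    }
}
```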
examples : Fix `llama-export-lora` example (ggerganov#8607)
@@ -128,7 +128,6 @@ struct gpt_params {

     // TODO: avoid tuple, use struct
     std::vector<std::tuple<std::string, float>> lora_adapter; // lora adapter path with user defined scale
-    std::string lora_base = ""; // base model path for the lora adapter
Just for downstream consumers (like me): this PR removed lora_base, so I want to highlight this. Also, a quick question: is this no longer needed since #8332?
Yes, since #8332 lora_base is no longer needed, because we now merge lora at runtime (instead of merging it with the base model at start up). You can still pass your base model as -m in llama-export-lora, but IMO it won't change the end result very much.
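For context, a minimal sketch of the runtime lora path from #8332 that makes lora_base unnecessary. The function names below (llama_lora_adapter_init / llama_lora_adapter_set) reflect my understanding of the adapter API added there; check llama.h for the exact signatures.

```cpp
#include "llama.h"

// Sketch only: load a lora adapter against the base model and attach it to a
// context, so the lora delta is applied at inference time without rewriting
// the base weights on disk.
static bool apply_lora_at_runtime(struct llama_model * model,
                                  struct llama_context * ctx,
                                  const char * lora_path,
                                  float scale) {
    struct llama_lora_adapter * adapter = llama_lora_adapter_init(model, lora_path);
    if (adapter == nullptr) {
        return false; // failed to load the adapter
    }
    llama_lora_adapter_set(ctx, adapter, scale);
    return true;
}
```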
gotcha, thanks for the clarification!
* fix export-lora example
* add more logging
* reject merging subset
* better check
* typo
Resolves #8581.
This example can now accept the new lora format introduced in #8332.
The output merged tensors will be forced to f16, since ggml_cast does not support Q-type quants. It would be nice to move the threaded quantization functions from llama.cpp to ggml.
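As a small illustration of that last point (a sketch only; `ctx` and `merged` are placeholders, not variables from the PR): the merged result can be cast to a non-quantized type such as F16 with ggml_cast, but not to a Q-type.

```cpp
#include "ggml.h"

// Sketch: add a cast-to-f16 node for the merged tensor (base + scale * B*A).
// The actual conversion happens when the graph containing this node is
// computed. Casting to a Q-type such as GGML_TYPE_Q4_K is not supported by
// ggml_cast, which is why the output tensors are forced to f16 for now.
static struct ggml_tensor * merged_to_f16(struct ggml_context * ctx,
                                          struct ggml_tensor * merged) {
    return ggml_cast(ctx, merged, GGML_TYPE_F16);
}
```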