
examples : Fix llama-export-lora example #8607

Merged
merged 5 commits into ggerganov:master on Jul 23, 2024

Conversation

@ngxson ngxson (Collaborator) commented Jul 20, 2024

Resolves #8581

This example can now accept the new lora format introduced in #8332.

The merged output tensors will be forced to f16, since ggml_cast does not support casting to Q-type quants. It would be nice to move the threaded quantization functions from llama.cpp to ggml.
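For illustration, the per-tensor merge boils down to roughly the following (a sketch, not this PR's exact code; merge_ctx, base_f32, lora_a, lora_b and scale are placeholders, and base_f32 is assumed to be already dequantized to f32):

// delta = scale * (B·A); the exact ggml_mul_mat orientation depends on how
// the adapter stores the A and B factors
struct ggml_tensor * ab     = ggml_mul_mat(merge_ctx, lora_a, lora_b);
struct ggml_tensor * delta  = ggml_scale(merge_ctx, ab, scale);
struct ggml_tensor * merged = ggml_add(merge_ctx, base_f32, delta);
// ggml_cast cannot target Q-type quants, hence the forced f16 output
struct ggml_tensor * out    = ggml_cast(merge_ctx, merged, GGML_TYPE_F16);

These calls only build graph nodes; the actual work happens when the graph is computed.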


@ggerganov ggerganov (Owner) left a comment

Haven't done tests, but I think it is OK.

Would recommend adding log messages with detailed information about:

  • tensors being merged
  • tensor sizes
  • lora scaling
  • ranks
  • etc.

We can improve as needed
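For illustration, a hypothetical log line covering those fields (name, base, lora_a and scale are placeholder variables, not code from this PR):

printf("%s : merging %s, shape [%lld, %lld], rank %lld, scale %.2f\n",
        __func__, name.c_str(),
        (long long) base->ne[0], (long long) base->ne[1],
        (long long) lora_a->ne[1], scale);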

Comment on lines +190 to +195
bool t_a = true;
bool t_b = true;
for (auto & adapter : adapters) {
    // require every adapter to provide both lora_a and lora_b for this tensor;
    // if any adapter lacks them, the original tensor data is kept as-is
    t_a &= nullptr != adapter->get_tensor(it.first + ".lora_a");
    t_b &= nullptr != adapter->get_tensor(it.first + ".lora_b");
}
@ggerganov ggerganov (Owner)
Here we require that all adapters provide data for the base tensor. If any adapter does not have data for that tensor, we keep the original data.

Maybe in the future we can improve this to support the case where only a subset of the adapters has the data.

@ngxson ngxson (Collaborator, Author)

Yeah, right, thanks for spotting that. It's quite complicated to fix for now, so I'd prefer to merge this PR as-is.

To prevent producing a half-working model, I added a check that rejects this case. As a fallback, the user can always run llama-export-lora multiple times: 0ec2b58
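Roughly, the rejection looks like this (an illustrative sketch reusing adapters, it.first and get_tensor from the snippet above; the actual change is in 0ec2b58 and may differ):

// count adapters that carry lora data for this tensor and reject anything
// between "none" and "all"
size_t n_lora = 0;
for (auto & adapter : adapters) {
    if (adapter->get_tensor(it.first + ".lora_a") != nullptr) {
        n_lora++;
    }
}
if (n_lora > 0 && n_lora < adapters.size()) {
    fprintf(stderr, "tensor %s is present in only %zu of %zu adapters; "
                    "merging a subset of adapters is not supported\n",
            it.first.c_str(), n_lora, adapters.size());
    exit(1);
}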

@ngxson ngxson changed the title from "Fix export-lora example" to "Fix llama-export-lora example" Jul 23, 2024
@ngxson ngxson added the "merge ready" label (indicates that this may be ready to merge soon and is just holding out in case of objections) Jul 23, 2024
@ngxson ngxson changed the title from "Fix llama-export-lora example" to "examples : Fix llama-export-lora example" Jul 23, 2024
@ngxson ngxson merged commit de28008 into ggerganov:master Jul 23, 2024
53 checks passed
Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Jul 23, 2024
examples : Fix `llama-export-lora` example (ggerganov#8607)
@@ -128,7 +128,6 @@ struct gpt_params {

     // TODO: avoid tuple, use struct
     std::vector<std::tuple<std::string, float>> lora_adapter; // lora adapter path with user defined scale
-    std::string lora_base = ""; // base model path for the lora adapter
@mudler mudler (Contributor) commented Jul 24, 2024
Just for downstream consumers (like me): this PR removed lora_base, so I want to highlight it. Also a quick question: is this no longer needed since #8332?

@ngxson ngxson (Collaborator, Author)

Yes, since #8332 lora_base is no longer needed, because we now merge the lora at runtime (instead of merging it into the base model at start up). You can still pass your base model as -m in llama-export-lora, but IMO it won't change the end result very much.
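For reference, the runtime path from #8332 boils down to something like this (a minimal sketch using the llama.h adapter API; model and ctx are assumed to be already created, error handling omitted):

// load the adapter once against the model, then attach it to a context with a
// user-defined scale; the lora is applied during inference, with no merge step
struct llama_lora_adapter * adapter = llama_lora_adapter_init(model, "adapter.gguf");
llama_lora_adapter_set(ctx, adapter, 1.0f);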

@mudler mudler (Contributor)

gotcha, thanks for the clarification!

arthw pushed a commit to arthw/llama.cpp that referenced this pull request Jul 27, 2024
* fix export-lora example

* add more logging

* reject merging subset

* better check

* typo
Labels
examples, merge ready (indicates that this may be ready to merge soon and is just holding out in case of objections)
Projects
None yet
Development
Successfully merging this pull request may close these issues:
Bug: export-lora does not accept GGUF files
4 participants