
Add support for ggml models #58

Closed
hippalectryon-0 opened this issue May 19, 2023 · 15 comments

@hippalectryon-0

Issue: we can't import ggml models (e.g. llama.cpp models).

@abetlen
Contributor

abetlen commented May 19, 2023

@hippalectryon-0 I got a basic version of this working last night using the llama-cpp-python server and the OpenAI LLM implemented here:
abetlen/llama-cpp-python#241

Unfortunately, most of the really interesting features like the RegexProcessor, Token Healing, etc. are only available for transformer-based models. It looks like there are two options:

  • Extend the guidance.llms.LLM class directly with a LlamaCpp class that wraps llama_cpp.Llama. This may be the correct approach, but there's not much in the way of documentation on how to implement it, so I'd have to model it off of guidance.llms.Transformer.
  • Create a fake LlamaCppTransformer that looks enough like a huggingface transformers model to be used by guidance.
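A minimal sketch of the wrapper shape the first option implies, using hypothetical stand-ins (`BaseLLM` for `guidance.llms.LLM`, `FakeLlama` for `llama_cpp.Llama`) since the exact interface isn't documented in this thread:

```python
# Sketch of option 1: put a llama.cpp binding behind guidance's LLM
# interface. Both classes below are illustrative stubs, not real APIs.

class BaseLLM:
    """Stand-in for guidance.llms.LLM: subclasses implement __call__."""
    def __call__(self, prompt: str, **kwargs) -> str:
        raise NotImplementedError

class FakeLlama:
    """Stand-in for llama_cpp.Llama's completion-style API."""
    def create_completion(self, prompt: str, max_tokens: int = 16, stop=None):
        return {"choices": [{"text": " world"}]}

class LlamaCpp(BaseLLM):
    """The wrapper: adapts the binding's output to a plain string."""
    def __init__(self, model):
        self.model = model

    def __call__(self, prompt: str, max_tokens: int = 16, stop=None) -> str:
        out = self.model.create_completion(prompt, max_tokens=max_tokens, stop=stop)
        return out["choices"][0]["text"]

llm = LlamaCpp(FakeLlama())
print(llm("Hello"))  # -> " world"
```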

@hippalectryon-0
Author

hippalectryon-0 commented May 19, 2023

Yeah, I've been trying the first option for a few hours, but as you said it requires quite a bit of code-digging; I haven't made much progress.

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

I tried the second option, implementing my own provider for llama-cpp-python in guidance, but haven't gotten very far.
I will try again later today.

@bluecoconut

I think that this package, ctransformers (https://github.com/marella/ctransformers, new as of a few days ago), has the look and API feel of huggingface transformers, but works directly on ggml models and feels pretty good to use (simple, it works, etc.).

It also covers more than just llama, and supports models like starcoder.
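A hedged sketch of the transformers-style API ctransformers exposes; the model path and `model_type` value are placeholders, and the import and path are guarded so the snippet is a no-op without the package and weights:

```python
# Illustrative only: ctransformers mirrors the huggingface "Auto" loader
# style but loads GGML weights directly. Path and model_type are assumed.
import os

try:
    from ctransformers import AutoModelForCausalLM
except ImportError:
    AutoModelForCausalLM = None

MODEL_PATH = "/path/to/model.bin"  # placeholder GGML weights file

if AutoModelForCausalLM is not None and os.path.exists(MODEL_PATH):
    llm = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        model_type="starcoder",  # e.g. starcoder, gpt2, gptj, ...
    )
    print(llm("def fibonacci(", max_new_tokens=32))
```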

@hippalectryon-0
Author

hippalectryon-0 commented May 19, 2023

It doesn't support llama, though, judging from the README.

@abetlen
Contributor

abetlen commented May 19, 2023

ctransformers is really cool, but yes, it currently doesn't support llama.cpp, only other ggml-based models. Additionally, the current Llama class does have the same signature (i.e. eval, sample, and generate), so I'm not sure it would be any more compatible than llama-cpp-python.
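To illustrate the eval/sample loop shape behind that signature, a toy stand-in class (not the real `llama_cpp.Llama`) showing the feed-then-sample cycle any wrapper would have to drive:

```python
# Toy stand-in for the low-level llama.cpp-style generation loop:
# eval() feeds tokens into the context, sample() picks the next token.
# The "model" here just emits each byte's successor, for demonstration.

class StubLlama:
    def __init__(self):
        self._ctx = []
    def tokenize(self, text: bytes):
        return list(text)                 # one token per byte
    def eval(self, tokens):
        self._ctx.extend(tokens)          # extend the context window
    def sample(self):
        return (self._ctx[-1] + 1) % 256  # deterministic "next token"
    def detokenize(self, tokens):
        return bytes(tokens)

llm = StubLlama()
tokens = llm.tokenize(b"hi")
generated = []
for _ in range(4):                        # generate 4 tokens greedily
    llm.eval(tokens)
    tok = llm.sample()
    generated.append(tok)
    tokens = [tok]                        # feed back only the new token

print(llm.detokenize(generated))
```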

@bluecoconut

Ah yeah, I guess I read this issue as being about ggml models in general rather than just llama.cpp when I replied with the suggestion. To be honest, I didn't appreciate the difference between these: I had always thought llama.cpp sat "on top" of ggml, so that anything ggml was a superset of llama.cpp. That said, looking more at this, there does seem to be a subtlety I didn't understand before.

The good news: the creator of ctransformers responded on a thread about llama.cpp support (marella/ctransformers#4), so hopefully it will be available within the week.

It might make sense to have both backends: llama-cpp-python (llama.cpp derivatives) and ctransformers (other ggml derivatives).
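One possible shape for that dual-backend idea, with purely hypothetical wrapper names, dispatching on an explicit backend string:

```python
# Design sketch: a single factory that routes to one of two backend
# wrappers. "LlamaCppLLM" and "CTransformersLLM" are placeholder names
# standing in for real wrapper classes; here they are plain tuples.

def make_llm(backend: str, model_path: str):
    registry = {
        "llama-cpp": lambda p: ("LlamaCppLLM", p),        # llama.cpp derivatives
        "ctransformers": lambda p: ("CTransformersLLM", p),  # other ggml models
    }
    try:
        factory = registry[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}, expected one of {sorted(registry)}"
        )
    return factory(model_path)

print(make_llm("llama-cpp", "/models/7B/ggml-model.bin"))
```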

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

Ok, I got a basic version working with guidance and llama-cpp-python. Will clean it up and test a little bit and then post a link here!

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

Here is the fork of guidance with llama-cpp-python support:
https://github.com/Maximilian-Winter/guidance
I'm not sure there aren't any bugs, but this is the result of the RPG Character example:
The following is a character profile for an RPG game in JSON format.

{
    "id": "e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    "description": "A quick and nimble fighter.",
    "name": "Katana",
    "age": 26,
    "armor": "leather",
    "weapon": "sword",
    "class": "fighter",
    "mantra": "I am the sword of justice.",
    "strength": 10,
    "items": ["a katana", "a leather jacket", "a backpack", "traveler's rations", "water bottle"]
}
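As a quick sanity check, the profile above parses as valid JSON; the string below is copied verbatim from the output:

```python
# Verify the generated character profile is well-formed JSON and has
# the expected fields. The literal is the exact output shown above.
import json

profile = json.loads("""
{
    "id": "e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    "description": "A quick and nimble fighter.",
    "name": "Katana",
    "age": 26,
    "armor": "leather",
    "weapon": "sword",
    "class": "fighter",
    "mantra": "I am the sword of justice.",
    "strength": 10,
    "items": ["a katana", "a leather jacket", "a backpack", "traveler's rations", "water bottle"]
}
""")

print(profile["name"])  # -> Katana
```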

abetlen added a commit to abetlen/guidance that referenced this issue May 19, 2023
…r additional models

Hi @slundberg and all, first off great work on this project, I'm very excited to see how it develops.

As per guidance-ai#58 it would be very useful to be able to extend guidance to support additional LLM models and make use of all of the features.

I understand this project is quite new and you probably want to avoid the cost of maintaining N implementations. If it's an acceptable solution, can we export `LLMSession` and `SyncSession` so external projects can add support on their own?
@abetlen
Contributor

abetlen commented May 19, 2023

@Maximilian-Winter very impressive thank you!

I've created a PR to export the two missing guidance.llms classes; this way, if the maintainers prefer to keep this implementation separate, we can merge it into llama-cpp-python under a contrib subpackage or something like that.

@hippalectryon-0
Author

@Maximilian-Winter Nice! How does it handle stop words, though? I don't see where they're forwarded.

@Maximilian-Winter

@hippalectryon-0 I'm currently making some changes to llama-cpp-python to add logits processors and stopping-criteria lists.
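For illustration, simplified stand-alone versions of the two hooks being discussed; the signatures are assumptions modeled loosely on the transformers-style interfaces, not the exact llama-cpp-python API:

```python
# A logits processor rewrites next-token scores before sampling; a
# stopping criterion inspects the generated tokens and says when to halt.
# Both are shown here as plain closures over Python lists.

def ban_tokens_processor(banned):
    """Return a processor that forces banned token scores to -inf."""
    def process(input_ids, scores):
        return [float("-inf") if i in banned else s for i, s in enumerate(scores)]
    return process

def stop_on_sequence(stop_seq):
    """Return a criterion that fires once `stop_seq` ends the output."""
    def should_stop(output_ids):
        n = len(stop_seq)
        return len(output_ids) >= n and output_ids[-n:] == stop_seq
    return should_stop

scores = [0.1, 0.9, 0.5]
processed = ban_tokens_processor({1})([], scores)
print(processed)  # token 1 is now unsampleable

stop = stop_on_sequence([7, 8])
print(stop([1, 7]), stop([1, 7, 8]))  # False True
```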

@Maximilian-Winter

@hippalectryon-0 I have fixed all bugs on my side and added all the needed things to my fork of llama-cpp-python.

You will find the fork here:
https://github.com/Maximilian-Winter/llama-cpp-python

@Picus303

Picus303 commented Jul 11, 2023

@Maximilian-Winter Do you have an example? A sample or a list of dependencies? I tried using your fork (of guidance), but just importing it raises an error. I fixed a few errors on my side, but I'd prefer to ask you, as I haven't been able to get it to work so far.

@marcotcr
Collaborator

In the new release, we support llama-cpp models. Sorry it took us so long to get to this! They are great, and very fast.
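A hedged sketch of what using a llama.cpp model from the new release might look like; the `guidance.llms.LlamaCpp` class name, the template syntax, and the model path are assumptions (check the release notes for the exact API), with guards so the snippet is a no-op unless everything exists:

```python
# Illustrative only: load a ggml model via guidance's llama-cpp support
# and fill a template field. Class name and path are assumed, not
# confirmed by this thread, so everything is guarded.
import os

try:
    import guidance
    llms = getattr(guidance, "llms", None)
except ImportError:
    guidance = None
    llms = None

MODEL_PATH = "/path/to/ggml-model.bin"  # placeholder weights file

if llms is not None and hasattr(llms, "LlamaCpp") and os.path.exists(MODEL_PATH):
    llm = llms.LlamaCpp(MODEL_PATH)
    program = guidance(
        'The capital of France is {{gen "answer" max_tokens=5}}', llm=llm
    )
    print(program())
```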
