
Add support for ggml models #58

Closed
hippalectryon-0 opened this issue May 19, 2023 · 15 comments

@hippalectryon-0

Issue: we can't import ggml models (e.g. llama.cpp models).

@abetlen
Contributor

abetlen commented May 19, 2023

@hippalectryon-0 I got a basic version of this working last night using the llama-cpp-python server and the OpenAI LLM implemented here:
abetlen/llama-cpp-python#241

Unfortunately, most of the really interesting features like the RegexProcessor, Token Healing, etc. are only available for transformer-based models. It looks like there are two options:

  • Extend the guidance.llms.LLM class directly with a LlamaCpp class that wraps llama_cpp.Llama. This may be the correct approach, but there's not much in the way of documentation on how to implement it, so I'd have to model it off of guidance.llms.Transformer.
  • Create a fake LlamaCppTransformer that looks enough like a huggingface transformers model to be used by guidance.
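A minimal sketch of the wrapper shape the first option implies, using hypothetical stand-ins (`BaseLLM` for `guidance.llms.LLM`, `FakeLlama` for `llama_cpp.Llama`) since the exact interface isn't documented in this thread:

```python
# Sketch of option 1: put a llama.cpp binding behind guidance's LLM
# interface. Both classes below are illustrative stubs, not real APIs.

class BaseLLM:
    """Stand-in for guidance.llms.LLM: subclasses implement __call__."""
    def __call__(self, prompt: str, **kwargs) -> str:
        raise NotImplementedError

class FakeLlama:
    """Stand-in for llama_cpp.Llama's completion-style API."""
    def create_completion(self, prompt: str, max_tokens: int = 16, stop=None):
        return {"choices": [{"text": " world"}]}

class LlamaCpp(BaseLLM):
    """The wrapper: adapts the binding's output to a plain string."""
    def __init__(self, model):
        self.model = model

    def __call__(self, prompt: str, max_tokens: int = 16, stop=None) -> str:
        out = self.model.create_completion(prompt, max_tokens=max_tokens, stop=stop)
        return out["choices"][0]["text"]

llm = LlamaCpp(FakeLlama())
print(llm("Hello"))  # -> " world"
```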

@hippalectryon-0
Author

hippalectryon-0 commented May 19, 2023

Yeah, I've been trying the first option for a few hours, but as you said it requires quite a bit of code-digging; I haven't made much progress.

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

I tried the second option, implementing my own provider for llama-cpp-python in guidance, but haven't gotten very far.
I will try again later today.

@bluecoconut

I think that this package, ctransformers (https://github.com/marella/ctransformers, new as of a few days ago), has the look and API feel of huggingface transformers, but works directly on ggml models and feels pretty good to use (simple, it works, etc.).

It also covers more than just llama, and supports models like starcoder.
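A hedged sketch of the transformers-style API ctransformers exposes; the model path and `model_type` value are placeholders, and the import and path are guarded so the snippet is a no-op without the package and weights:

```python
# Illustrative only: ctransformers mirrors the huggingface "Auto" loader
# style but loads GGML weights directly. Path and model_type are assumed.
import os

try:
    from ctransformers import AutoModelForCausalLM
except ImportError:
    AutoModelForCausalLM = None

MODEL_PATH = "/path/to/model.bin"  # placeholder GGML weights file

if AutoModelForCausalLM is not None and os.path.exists(MODEL_PATH):
    llm = AutoModelForCausalLM.from_pretrained(
        MODEL_PATH,
        model_type="starcoder",  # e.g. starcoder, gpt2, gptj, ...
    )
    print(llm("def fibonacci(", max_new_tokens=32))
```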

@hippalectryon-0
Author

hippalectryon-0 commented May 19, 2023

It doesn't support llama, though, judging from the README.

@abetlen
Contributor

abetlen commented May 19, 2023

ctransformers is really cool, but yes, it currently doesn't support llama.cpp, only other ggml-based models. Additionally, the current Llama class does have the same signature (i.e. eval, sample, and generate), so I'm not sure it would be any more compatible than llama-cpp-python.
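To illustrate the eval/sample loop shape behind that signature, a toy stand-in class (not the real `llama_cpp.Llama`) showing the feed-then-sample cycle any wrapper would have to drive:

```python
# Toy stand-in for the low-level llama.cpp-style generation loop:
# eval() feeds tokens into the context, sample() picks the next token.
# The "model" here just emits each byte's successor, for demonstration.

class StubLlama:
    def __init__(self):
        self._ctx = []
    def tokenize(self, text: bytes):
        return list(text)                 # one token per byte
    def eval(self, tokens):
        self._ctx.extend(tokens)          # extend the context window
    def sample(self):
        return (self._ctx[-1] + 1) % 256  # deterministic "next token"
    def detokenize(self, tokens):
        return bytes(tokens)

llm = StubLlama()
tokens = llm.tokenize(b"hi")
generated = []
for _ in range(4):                        # generate 4 tokens greedily
    llm.eval(tokens)
    tok = llm.sample()
    generated.append(tok)
    tokens = [tok]                        # feed back only the new token

print(llm.detokenize(generated))
```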

@bluecoconut

Ah yeah, I guess I read this issue as being about ggml models in general rather than just llama.cpp when I replied with the suggestion. To be honest, I didn't appreciate the difference between these: I had always thought llama.cpp sat "on top" of ggml, so that anything ggml was a superset of llama.cpp. That said, looking more at this, there does seem to be a subtlety I didn't understand before.

The good news: the creator of ctransformers responded on a thread about llama.cpp support (marella/ctransformers#4), so hopefully it will be available within the week.

It might make sense to have both backends: llama-cpp-python (llama.cpp derivatives) and ctransformers (other ggml derivatives).
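One possible shape for that dual-backend idea, with purely hypothetical wrapper names, dispatching on an explicit backend string:

```python
# Design sketch: a single factory that routes to one of two backend
# wrappers. "LlamaCppLLM" and "CTransformersLLM" are placeholder names
# standing in for real wrapper classes; here they are plain tuples.

def make_llm(backend: str, model_path: str):
    registry = {
        "llama-cpp": lambda p: ("LlamaCppLLM", p),        # llama.cpp derivatives
        "ctransformers": lambda p: ("CTransformersLLM", p),  # other ggml models
    }
    try:
        factory = registry[backend]
    except KeyError:
        raise ValueError(
            f"unknown backend {backend!r}, expected one of {sorted(registry)}"
        )
    return factory(model_path)

print(make_llm("llama-cpp", "/models/7B/ggml-model.bin"))
```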

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

Ok, I got a basic version working with guidance and llama-cpp-python. Will clean it up and test a little bit and then post a link here!

@Maximilian-Winter

Maximilian-Winter commented May 19, 2023

Here is the fork of guidance with llama-cpp-python support:
https://github.com/Maximilian-Winter/guidance
I'm not sure there aren't any bugs, but this is the result of the RPG Character example:
The following is a character profile for an RPG game in JSON format.

{
    "id": "e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    "description": "A quick and nimble fighter.",
    "name": "Katana",
    "age": 26,
    "armor": "leather",
    "weapon": "sword",
    "class": "fighter",
    "mantra": "I am the sword of justice.",
    "strength": 10,
    "items": ["a katana", "a leather jacket", "a backpack", "traveler's rations", "water bottle"]
}
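As a quick sanity check, the profile above parses as valid JSON; the string below is copied verbatim from the output:

```python
# Verify the generated character profile is well-formed JSON and has
# the expected fields. The literal is the exact output shown above.
import json

profile = json.loads("""
{
    "id": "e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    "description": "A quick and nimble fighter.",
    "name": "Katana",
    "age": 26,
    "armor": "leather",
    "weapon": "sword",
    "class": "fighter",
    "mantra": "I am the sword of justice.",
    "strength": 10,
    "items": ["a katana", "a leather jacket", "a backpack", "traveler's rations", "water bottle"]
}
""")

print(profile["name"])  # -> Katana
```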

abetlen added a commit to abetlen/guidance that referenced this issue May 19, 2023
…r additional models

Hi @slundberg and all, first off great work on this project, I'm very excited to see how it develops.

As per guidance-ai#58 it would be very useful to be able to extend guidance to support additional LLM models and make use of all of the features.

I understand this project is quite new and you probably want to avoid the cost of maintaining N implementations. If it's an acceptable solution, can we export `LLMSession` and `SyncSession` so external projects can add support on their own?
@abetlen
Contributor

abetlen commented May 19, 2023

@Maximilian-Winter very impressive thank you!

I've created a PR to export the two missing guidance.llms classes; this way, if the maintainers prefer to keep this implementation separate, we can merge it into llama-cpp-python under a contrib subpackage or something like that.

@hippalectryon-0
Author

@Maximilian-Winter Nice! How does it handle stop words, though? I don't see where they're forwarded.

@Maximilian-Winter

@hippalectryon-0 I'm currently making some changes to llama-cpp-python to add logits processors and stopping-criteria lists.
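For illustration, simplified stand-alone versions of the two hooks being discussed; the signatures are assumptions modeled loosely on the transformers-style interfaces, not the exact llama-cpp-python API:

```python
# A logits processor rewrites next-token scores before sampling; a
# stopping criterion inspects the generated tokens and says when to halt.
# Both are shown here as plain closures over Python lists.

def ban_tokens_processor(banned):
    """Return a processor that forces banned token scores to -inf."""
    def process(input_ids, scores):
        return [float("-inf") if i in banned else s for i, s in enumerate(scores)]
    return process

def stop_on_sequence(stop_seq):
    """Return a criterion that fires once `stop_seq` ends the output."""
    def should_stop(output_ids):
        n = len(stop_seq)
        return len(output_ids) >= n and output_ids[-n:] == stop_seq
    return should_stop

scores = [0.1, 0.9, 0.5]
processed = ban_tokens_processor({1})([], scores)
print(processed)  # token 1 is now unsampleable

stop = stop_on_sequence([7, 8])
print(stop([1, 7]), stop([1, 7, 8]))  # False True
```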

@Maximilian-Winter

@hippalectryon-0 I have fixed all bugs on my side and added all the needed things to my fork of llama-cpp-python.

You will find the fork here:
https://github.com/Maximilian-Winter/llama-cpp-python

@Picus303

Picus303 commented Jul 11, 2023

@Maximilian-Winter Do you have an example? A sample or a list of dependencies? I tried using your fork (of guidance), but just importing it raises an error. I fixed a few errors on my side, but I'd prefer to ask you, as I haven't been able to get it to work so far.

@marcotcr
Collaborator

In the new release, we support llama-cpp models. Sorry it took us so long to get to this! They are great, and very fast.
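A hedged sketch of what using a llama.cpp model from the new release might look like; the `guidance.llms.LlamaCpp` class name, the template syntax, and the model path are assumptions (check the release notes for the exact API), with guards so the snippet is a no-op unless everything exists:

```python
# Illustrative only: load a ggml model via guidance's llama-cpp support
# and fill a template field. Class name and path are assumed, not
# confirmed by this thread, so everything is guarded.
import os

try:
    import guidance
    llms = getattr(guidance, "llms", None)
except ImportError:
    guidance = None
    llms = None

MODEL_PATH = "/path/to/ggml-model.bin"  # placeholder weights file

if llms is not None and hasattr(llms, "LlamaCpp") and os.path.exists(MODEL_PATH):
    llm = llms.LlamaCpp(MODEL_PATH)
    program = guidance(
        'The capital of France is {{gen "answer" max_tokens=5}}', llm=llm
    )
    print(program())
```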
