For llama.cpp, model_kwargs gets ignored #743

Closed
isamu-isozaki opened this issue Mar 14, 2024 · 0 comments · Fixed by #744

isamu-isozaki (Contributor) commented Mar 14, 2024

Describe the issue as clearly as possible:

I found that when we do

llm = outlines.models.llamacpp(
    model_path,
    model_kwargs={
        "n_gpu_layers": 15,
        "n_batch": 2048,
        "n_ctx": 2048,
    },
    device="cpu",
)

as shown in the documentation, the contents of model_kwargs get ignored. I think the reason is here:

from typing import Optional

def llamacpp(model_path: str, device: Optional[str] = None, **model_kwargs) -> LlamaCpp:
    from llama_cpp import Llama

    if device == "cuda":
        # default to offloading all layers to the GPU
        model_kwargs.setdefault("n_gpu_layers", -1)

    model = Llama(model_path, **model_kwargs)
    return LlamaCpp(model=model)
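
To illustrate, here is a minimal standalone sketch of the capture behavior (f is just a throwaway name for this demo):

def f(**model_kwargs):
    return model_kwargs

print(f(model_kwargs={"n_ctx": 2048}))
# prints {'model_kwargs': {'n_ctx': 2048}} -- the options end up nested one level too deep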

As the sketch shows, **model_kwargs captures the keyword argument that is literally named model_kwargs, so inside the function model_kwargs is a dict whose only key is "model_kwargs", and the nested options never reach Llama. To fix this I just did:

def llamacpp(model_path: str, device: Optional[str] = None, **model_kwargs) -> LlamaCpp:
    from llama_cpp import Llama

    # unwrap the options that **model_kwargs captured under the "model_kwargs" key
    model_kwargs = model_kwargs["model_kwargs"]
    if device == "cuda":
        model_kwargs.setdefault("n_gpu_layers", -1)

    model = Llama(model_path, **model_kwargs)
    return LlamaCpp(model=model)

I wanted to make sure this is actually a bug; I opened a PR for it, assuming the documented calling convention is the intended one. Note that passing the keywords directly to llamacpp (instead of wrapping them in model_kwargs=) does work, but that would differ from the transformers API.
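
For comparison, one way to keep the documented call working would be to take model_kwargs as an explicit parameter, transformers-style. This is only a sketch of the idea and may not match what #744 actually does (it assumes the LlamaCpp wrapper from the snippets above is in scope):

from typing import Optional

def llamacpp(
    model_path: str,
    device: Optional[str] = None,
    model_kwargs: Optional[dict] = None,
) -> LlamaCpp:
    from llama_cpp import Llama

    # Take the options dict explicitly instead of via **kwargs, so
    # callers can pass model_kwargs={...} as the documentation shows.
    model_kwargs = dict(model_kwargs or {})
    if device == "cuda":
        model_kwargs.setdefault("n_gpu_layers", -1)

    model = Llama(model_path, **model_kwargs)
    return LlamaCpp(model=model)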

Steps/code to reproduce the bug:

import outlines

llm = outlines.models.llamacpp(
    model_path,
    model_kwargs={
        "n_gpu_layers": 15,
        "n_batch": 2048,
        "n_ctx": 2048,
    },
    device="cpu",
)

Expected result:

llm should be initialized with a 2048-token context length, but it ends up with the default 512-token context because model_kwargs are never passed through to the model.
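
One way to observe this (assuming the wrapper exposes the underlying Llama instance as .model, as in the constructor calls above):

print(llm.model.n_ctx())
# expected 2048, but prints 512 because the kwargs were dropped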

Error message:

No response

Outlines/Python version information:

latest

Context for the issue:

No response
