I apologize in advance if I omit any useful details; I'm just a simple dev with no background in data science, so I'm in trial-and-error land.
I followed the llama.cpp instructions for the llama-2-13b-chat model, and I now have the q4_0 file: llama-2-13b-chat/ggml-model-q4_0.gguf.
I'm using the example code from this repo, changed of course to point at the model file, but loading fails:
The code:
import { LLM } from 'llama-node';
import { LLamaCpp } from 'llama-node/dist/llm/llama-cpp.js';
import path from 'path';

const model = path.resolve(
  process.cwd(),
  '../llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf',
);
console.log(model);

const llama = new LLM(LLamaCpp);

/** @type {import('llama-node/dist/llm/llama-cpp').LoadConfig} */
const config = {
  modelPath: model,
  enableLogging: true,
  nCtx: 1024,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: false,
  useMmap: true,
  nGpuLayers: 128,
};

const template = `How are you?`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const params = {
  nThreads: 4,
  nTokPredict: 2048,
  topK: 40,
  topP: 0.1,
  temp: 0.2,
  repeatPenalty: 1,
  prompt,
};

const run = async () => {
  await llama.load(config);
  await llama.createCompletion(params, (response) => {
    process.stdout.write(response.token);
  });
};

run();
The error:
Debugger listening on ws://127.0.0.1:59899/c72280cb-a098-4c15-859f-54025e513896
For help, see: https://nodejs.org/en/docs/inspector
Debugger attached.
/Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
llama.cpp: loading model from /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?
llama_init_from_file: failed to load model
Waiting for the debugger to disconnect...
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
[Error: Failed to initialize LLama context from file: /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf] {
code: 'GenericFailure'
}
Node.js v18.17.1
I can see that the error refers to constants it doesn't expect in the file (error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?), and the file is indeed a .gguf file, not a .ggml one. If I read it right, 46554747 is just the file's first four bytes, the ASCII string "GGUF", interpreted as a little-endian integer, so the loader is seeing a GGUF header it doesn't recognize.
From a quick Google search I got to this post on r/LocalLLaMA, which states that GGUF is sort of a successor to GGML.
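To double-check, I dumped the first four bytes of the file with a tiny Node script (just a sanity check I hacked together, using the model path from my example above):

// check-magic.mjs — print the model file's 4-byte magic.
import { openSync, readSync, closeSync } from 'node:fs';

const file = process.argv[2] ?? '../llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf';
const fd = openSync(file, 'r');
const buf = Buffer.alloc(4);
readSync(fd, buf, 0, 4, 0); // read exactly the first 4 bytes
closeSync(fd);

console.log('magic (ascii):', buf.toString('ascii'));                    // prints: GGUF
console.log('magic (uint32 LE): 0x' + buf.readUInt32LE(0).toString(16)); // prints: 0x46554747

It prints GGUF / 0x46554747, matching the number in the error, so the file is definitely GGUF.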
Beyond that, I have literally 0 understanding of what I'm doing, and I'd appreciate it if someone could point me in the right direction. Even just pointing out keywords I might have missed that could have led me to a better answer in the first place 😅
Thanks in advance for your time!
Exact same issue here. Did you manage to find a workaround? I might be wrong, but it doesn't look like this library's llama.cpp binding has been updated in ~4 months; I wonder if that's the issue.
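In the meantime I've put a quick guard in front of llama.load so it fails with a readable error instead of GenericFailure. Just a sketch of my own helper, assuming (based on the error above, not confirmed) that this build only understands the pre-GGUF formats:

// ggufGuard.mjs — refuse GGUF files up front, since this llama-node version
// appears to predate GGUF support (an assumption on my part).
import { openSync, readSync, closeSync } from 'node:fs';

export function assertNotGguf(modelPath) {
  const fd = openSync(modelPath, 'r');
  const buf = Buffer.alloc(4);
  readSync(fd, buf, 0, 4, 0); // first 4 bytes = file magic
  closeSync(fd);
  if (buf.toString('ascii') === 'GGUF') {
    throw new Error(
      `${modelPath} is a GGUF model; this build of llama-node seems to load ` +
      'only the older GGML-family formats. Try a pre-GGUF conversion of the weights.',
    );
  }
}

Not a fix, obviously. The actual workaround I'm looking at is re-running the llama.cpp conversion from a checkout that predates the GGUF switch, so it emits the old GGML-format .bin file this binding expects.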