Simple script like:
import llm
model = llm.get_model("orca-mini-3b-gguf2-q4_0")
for i in range(70):
    print(i, model.prompt("How are you today?"))
This seems to always crash on the 61st prompt() call.
It doesn't seem to be caused by running out of memory (a rough way to check is sketched after the second script below) but by something else.
I'm not quite sure whether this is actually an llm-gpt4all issue, an issue in gpt4all, or even one in llama.cpp.
At least I didn't see the crash when using gpt4all directly (the following version works):
from gpt4all import GPT4All

MODEL = "orca-mini-3b-gguf2-q4_0.gguf"  # same model file as above
model = GPT4All(MODEL)
for i in range(70):
    print(model.generate("How are you", max_tokens=5))
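For what it's worth, a rough way to back up the out-of-memory observation is to log the resident set size and the number of open file descriptors on each iteration (a diagnostic sketch only; the /proc path is Linux-specific, and psutil would be a portable alternative):

import os
import resource

import llm

model = llm.get_model("orca-mini-3b-gguf2-q4_0")
for i in range(70):
    # Peak resident set size; ru_maxrss is reported in kilobytes on Linux.
    rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Number of file descriptors currently open in this process (Linux).
    n_fds = len(os.listdir("/proc/self/fd"))
    print(i, "maxrss_kb:", rss_kb, "open_fds:", n_fds)
    print(model.prompt("How are you today?"))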
Anyway, the gpt4all Python API gets used quite differently in the two cases.
E.g., llm-gpt4all re-creates the LLModel object in Python for each prompt; a rough sketch of the difference follows.
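(This is a paraphrase of what llm-gpt4all appears to do, reusing the gpt4all API from above; it is not llm-gpt4all's actual code.)

from gpt4all import GPT4All

MODEL = "orca-mini-3b-gguf2-q4_0.gguf"

# Pattern A, roughly what llm-gpt4all does: a fresh GPT4All (and thus a
# fresh C++ LLModel) is constructed for every single prompt.
for i in range(70):
    model = GPT4All(MODEL)
    print(i, model.generate("How are you", max_tokens=5))

# Pattern B, what the working script above does: one model, many prompts.
model = GPT4All(MODEL)
for i in range(70):
    print(i, model.generate("How are you", max_tokens=5))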
The core dump shows that the ctx variable seen by the C++ code is NULL, but not how exactly that happens:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
2430 struct ggml_object * obj_cur = ctx->objects_end;
(gdb) bt
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
#1  0x00007f5a024cc3c2 in ggml_new_graph_custom (ctx=0x0, size=8192, grads=false)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:15834
#2  0x00007f5a02488bae in llm_build_context::build_llama (this=this@entry=0x7ffc5fb27d60)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:4326
#3  0x00007f5a024606c4 in llama_build_graph (lctx=..., batch=...) at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:6191
#4  0x00007f5a0246e405 in llama_new_context_with_model (model=<optimized out>, params=...)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:9514
#5  0x00007f5a02456a98 in LLamaModel::loadModel (this=0x419eed0, modelPath="/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf", n_ctx=<optimized out>)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llamamodel.cpp:215
#6  0x00007f5a03f770df in llmodel_loadModel (model=<optimized out>, model_path=0x7f5a02acbdd0 "/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf",
    n_ctx=2048) at /usr/include/c++/13/bits/basic_string.tcc:238
#7  0x00007f5a03f898b6 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#8  0x00007f5a03f8634d in ffi_call_int (cif=cif@entry=0x7ffc5fb297f0, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>,
    closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#9  0x00007f5a03f88f33 in ffi_call (cif=cif@entry=0x7ffc5fb297f0, fn=fn@entry=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>,
    rvalue=rvalue@entry=0x7ffc5fb29700, avalue=<optimized out>) at ../src/x86/ffi64.c:710
#10 0x00007f5a042142e9 in _call_function_pointer (argtypecount=<optimized out>, argcount=3, resmem=0x7ffc5fb29700, restype=<optimized out>,
    atypes=<optimized out>, avalues=<optimized out>, pProc=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>, flags=<optimized out>)
    at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:923
#11 _ctypes_callproc (pProc=<optimized out>, argtuple=<optimized out>, flags=<optimized out>, argtypes=<optimized out>, restype=<optimized out>,
    checker=<optimized out>) at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:1262
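Two details stand out in the trace: the segfault happens inside LLamaModel::loadModel (frame #5), i.e. while the model is being loaded yet again, which fits the per-prompt re-creation above; and frames #7-#11 are plain ctypes/ffi dispatch. The call shape visible in frame #6 is roughly the following (the shared-library name and the bool return type are my assumptions; the llmodel_loadModel signature is taken from the backtrace):

import ctypes

# Assumed name for gpt4all's backend library; the real binding locates
# its own shared object. The prototype mirrors frame #6:
#   llmodel_loadModel(llmodel_model, char const*, int)
llmodel = ctypes.CDLL("libllmodel.so")
llmodel.llmodel_loadModel.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
llmodel.llmodel_loadModel.restype = ctypes.c_bool  # return type assumed

# ok = llmodel.llmodel_loadModel(
#     handle,  # hypothetical opaque llmodel_model handle from the binding
#     b"/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf",
#     2048,  # n_ctx, as seen in the backtrace
# )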
I'm using Ubuntu and Python 3.11.6; no GPU is used here.