Simple script like:
import llm
model = llm.get_model("orca-mini-3b-gguf2-q4_0")
for i in range(70):
    print(i, model.prompt("How are you today?"))
This seems to always crash on the 61st prompt() call.
It doesn't seem to be caused by running out of memory (a rough way to check is sketched after the second script below) but by something else.
I'm not quite sure whether this is actually an llm-gpt4all issue, an issue in gpt4all, or even one in llama.cpp.
At least I didn't see the crash when using gpt4all directly (the following version works):
from gpt4all import GPT4All

MODEL = "orca-mini-3b-gguf2-q4_0.gguf"  # same model file as above
model = GPT4All(MODEL)
for i in range(70):
    print(model.generate("How are you", max_tokens=5))
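For what it's worth, a rough way to back up the out-of-memory observation is to log the resident set size and the number of open file descriptors on each iteration (a diagnostic sketch only; the /proc path is Linux-specific, and psutil would be a portable alternative):

import os
import resource

import llm

model = llm.get_model("orca-mini-3b-gguf2-q4_0")
for i in range(70):
    # Peak resident set size; ru_maxrss is reported in kilobytes on Linux.
    rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Number of file descriptors currently open in this process (Linux).
    n_fds = len(os.listdir("/proc/self/fd"))
    print(i, "maxrss_kb:", rss_kb, "open_fds:", n_fds)
    print(model.prompt("How are you today?"))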
Anyway, the gpt4all Python API gets used quite differently in the two cases.
E.g., llm-gpt4all re-creates the LLModel object in Python for each prompt; a rough sketch of the difference follows.
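(This is a paraphrase of what llm-gpt4all appears to do, reusing the gpt4all API from above; it is not llm-gpt4all's actual code.)

from gpt4all import GPT4All

MODEL = "orca-mini-3b-gguf2-q4_0.gguf"

# Pattern A, roughly what llm-gpt4all does: a fresh GPT4All (and thus a
# fresh C++ LLModel) is constructed for every single prompt.
for i in range(70):
    model = GPT4All(MODEL)
    print(i, model.generate("How are you", max_tokens=5))

# Pattern B, what the working script above does: one model, many prompts.
model = GPT4All(MODEL)
for i in range(70):
    print(i, model.generate("How are you", max_tokens=5))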
The core dump shows that the ctx variable seen by the C++ code is NULL, but not how exactly that happens:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
2430 struct ggml_object * obj_cur = ctx->objects_end;
(gdb) bt
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
#1  0x00007f5a024cc3c2 in ggml_new_graph_custom (ctx=0x0, size=8192, grads=false)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:15834
#2  0x00007f5a02488bae in llm_build_context::build_llama (this=this@entry=0x7ffc5fb27d60)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:4326
#3  0x00007f5a024606c4 in llama_build_graph (lctx=..., batch=...) at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:6191
#4  0x00007f5a0246e405 in llama_new_context_with_model (model=<optimized out>, params=...)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:9514
#5  0x00007f5a02456a98 in LLamaModel::loadModel (this=0x419eed0, modelPath="/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf", n_ctx=<optimized out>)
    at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llamamodel.cpp:215
#6  0x00007f5a03f770df in llmodel_loadModel (model=<optimized out>, model_path=0x7f5a02acbdd0 "/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf",
    n_ctx=2048) at /usr/include/c++/13/bits/basic_string.tcc:238
#7  0x00007f5a03f898b6 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#8  0x00007f5a03f8634d in ffi_call_int (cif=cif@entry=0x7ffc5fb297f0, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>,
    closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#9  0x00007f5a03f88f33 in ffi_call (cif=cif@entry=0x7ffc5fb297f0, fn=fn@entry=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>,
    rvalue=rvalue@entry=0x7ffc5fb29700, avalue=<optimized out>) at ../src/x86/ffi64.c:710
#10 0x00007f5a042142e9 in _call_function_pointer (argtypecount=<optimized out>, argcount=3, resmem=0x7ffc5fb29700, restype=<optimized out>,
    atypes=<optimized out>, avalues=<optimized out>, pProc=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>, flags=<optimized out>)
    at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:923
#11 _ctypes_callproc (pProc=<optimized out>, argtuple=<optimized out>, flags=<optimized out>, argtypes=<optimized out>, restype=<optimized out>,
    checker=<optimized out>) at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:1262
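Two details stand out in the trace: the segfault happens inside LLamaModel::loadModel (frame #5), i.e. while the model is being loaded yet again, which fits the per-prompt re-creation above; and frames #7-#11 are plain ctypes/ffi dispatch. The call shape visible in frame #6 is roughly the following (the shared-library name and the bool return type are my assumptions; the llmodel_loadModel signature is taken from the backtrace):

import ctypes

# Assumed name for gpt4all's backend library; the real binding locates
# its own shared object. The prototype mirrors frame #6:
#   llmodel_loadModel(llmodel_model, char const*, int)
llmodel = ctypes.CDLL("libllmodel.so")
llmodel.llmodel_loadModel.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_int]
llmodel.llmodel_loadModel.restype = ctypes.c_bool  # return type assumed

# ok = llmodel.llmodel_loadModel(
#     handle,  # hypothetical opaque llmodel_model handle from the binding
#     b"/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf",
#     2048,  # n_ctx, as seen in the backtrace
# )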
I'm using Ubuntu and Python 3.11.6; no GPU is used here.