Repeated prompt segfaults on 61st iteration #22

Open
jonppe opened this issue Jan 26, 2024 · 1 comment

jonppe commented Jan 26, 2024

A simple script like:

import llm
model = llm.get_model("orca-mini-3b-gguf2-q4_0")
for i in range(70):
    # printing the Response forces the lazy prompt to actually execute
    print(i, model.prompt("How are you today?"))

seems to always crash on the 61st prompt() call.
It doesn't seem to be related to running out of memory; something else is going on.

I'm not quite sure whether this is actually an llm-gpt4all issue, an issue in gpt4all, or even one in llama.cpp.
At least I didn't see problems when using gpt4all directly (the following variant works):

from gpt4all import GPT4All

# assuming MODEL points at the same model file as above
MODEL = "orca-mini-3b-gguf2-q4_0.gguf"
model = GPT4All(MODEL)
for i in range(70):
    print(model.generate("How are you", max_tokens=5))

Anyway, the gpt4all Python API behaves quite a bit differently here.
E.g., llm-gpt4all re-creates the LLModel object in Python for each prompt (see the sketch below).
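
To make that concrete, here is a minimal sketch (an assumption about what the plugin effectively does, not its actual code) that reproduces the re-creation pattern directly against the gpt4all API; the model filename is taken from the coredump below. If native state leaks on each construction, this variant should eventually fail the same way:

from gpt4all import GPT4All

# Hypothetical repro of the suspected pattern: construct a fresh backend
# object (and thus fresh native state) on every iteration instead of
# reusing a single instance across the loop.
for i in range(70):
    model = GPT4All("orca-mini-3b-gguf2-q4_0.gguf")
    print(i, model.generate("How are you", max_tokens=5))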

The coredump shows that the ctx variable seen by the C++ code is null, though how exactly that happens is unclear:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
2430 struct ggml_object * obj_cur = ctx->objects_end;
(gdb) bt
#0 0x00007f5a024a8ac4 in ggml_new_object (ctx=ctx@entry=0x0, type=type@entry=GGML_OBJECT_GRAPH, size=262440)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:2430
#1 0x00007f5a024cc3c2 in ggml_new_graph_custom (ctx=0x0, size=8192, grads=false)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/ggml.c:15834
#2 0x00007f5a02488bae in llm_build_context::build_llama (this=this@entry=0x7ffc5fb27d60)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:4326
#3 0x00007f5a024606c4 in llama_build_graph (lctx=..., batch=...) at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:6191
#4 0x00007f5a0246e405 in llama_new_context_with_model (model=<optimized out>, params=...)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llama.cpp-mainline/llama.cpp:9514
#5 0x00007f5a02456a98 in LLamaModel::loadModel (this=0x419eed0, modelPath="/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf", n_ctx=<optimized out>)
at /home/johannes/ai/llm/gpt4all/gpt4all-backend/llamamodel.cpp:215
#6 0x00007f5a03f770df in llmodel_loadModel (model=<optimized out>, model_path=0x7f5a02acbdd0 "/home/johannes/.cache/gpt4all/orca-mini-3b-gguf2-q4_0.gguf",
n_ctx=2048) at /usr/include/c++/13/bits/basic_string.tcc:238
#7 0x00007f5a03f898b6 in ffi_call_unix64 () at ../src/x86/unix64.S:104
#8 0x00007f5a03f8634d in ffi_call_int (cif=cif@entry=0x7ffc5fb297f0, fn=<optimized out>, rvalue=<optimized out>, avalue=<optimized out>,
closure=closure@entry=0x0) at ../src/x86/ffi64.c:673
#9 0x00007f5a03f88f33 in ffi_call (cif=cif@entry=0x7ffc5fb297f0, fn=fn@entry=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>,
rvalue=rvalue@entry=0x7ffc5fb29700, avalue=<optimized out>) at ../src/x86/ffi64.c:710
#10 0x00007f5a042142e9 in _call_function_pointer (argtypecount=<optimized out>, argcount=3, resmem=0x7ffc5fb29700, restype=<optimized out>,
atypes=<optimized out>, avalues=<optimized out>, pProc=0x7f5a03f77060 <llmodel_loadModel(llmodel_model, char const*, int)>, flags=<optimized out>)
at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:923
#11 _ctypes_callproc (pProc=<optimized out>, argtuple=<optimized out>, flags=<optimized out>, argtypes=<optimized out>, restype=<optimized out>,
checker=<optimized out>) at /usr/src/python3.11-3.11.6-3/Modules/_ctypes/callproc.c:1262
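
Frame #0 dereferences ctx->objects_end with ctx == 0x0, i.e. ggml received a null context. One guess (an assumption, not confirmed here): mainline ggml.c keeps a fixed pool of GGML_MAX_CONTEXTS (64) contexts and ggml_init() returns NULL once the pool is exhausted, so if every re-created LLModel leaks a context, the null would surface after roughly 60 iterations, which would match the symptom. If that's the cause, a plugin-side fix would be to construct the backend object once and reuse it. A minimal sketch (hypothetical helper, not actual llm-gpt4all code):

from gpt4all import GPT4All

_MODEL_CACHE = {}

def get_cached_model(model_name):
    # Build the native backend (and its ggml context) once per model
    # name, then hand back the same instance on every later prompt.
    if model_name not in _MODEL_CACHE:
        _MODEL_CACHE[model_name] = GPT4All(model_name)
    return _MODEL_CACHE[model_name]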

I'm using Ubuntu with Python 3.11.6; no GPU is used here.


Rubiel1 commented Mar 6, 2024

Hi,
I use Fedora 38, Python 3.11.8, no GPU, and I have the same problem.
