
Error loading model #9

Closed

mateenmalik opened this issue May 22, 2023 · 6 comments

Comments

@mateenmalik

Hi,

I get the following error when I try to load the model:

python3.10/site-packages/ctransformers/lib/basic/libctransformers.so: cannot open shared object file: No such file or directory

using:
llm = AutoModelForCausalLM.from_pretrained('/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin', model_type='gpt2', lib='basic')

I am running this on an aarch64 Ubuntu 22.04 system.

Please let me know how to fix this.

Thank you.

@marella
Owner

marella commented May 22, 2023

Hi,

Precompiled libs are not available for ARM processors. If you remove lib='basic', you should get an error "The current platform is not supported."

Can you please run the following command and let me know its output:

python3 -c 'import platform; print(platform.processor())'

You can build the library from source to make it work:

git clone --recurse-submodules https://github.com/marella/ctransformers
cd ctransformers
./scripts/build.sh

The compiled binary will be located at build/lib/libctransformers.so, which can be used as:

llm = AutoModelForCausalLM.from_pretrained(..., lib='/path/to/ctransformers/build/lib/libctransformers.so')
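
As a quick sanity check, you can confirm the built library loads before pointing ctransformers at it (a sketch; adjust the path to your checkout):

python3 -c "import ctypes; ctypes.CDLL('/path/to/ctransformers/build/lib/libctransformers.so')"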

Please let me know if this works.

@mateenmalik
Author

Hi Marella,

Thanks for the reply.

I installed from source as you advised.
It works :)

Thanks again for the good work you are doing and for helping people with AI and LLMs. Kudos to you :).

@xdevfaheem

xdevfaheem commented May 23, 2023

Hey @marella, you are awesome, man! I mean really... I tried so many FOSS projects: llama.cpp, the gpt4all library, rwkv.cpp... nothing gave me the inference speed and low RAM usage that ctransformers did. I wonder what the reason for this might be?

Please don't stop improving this!

@mateenmalik
Author

mateenmalik commented May 24, 2023

Hi Marella,

Just wondering if ctransformers can be used with the Nvidia Triton Inference Server (https://developer.nvidia.com/nvidia-triton-inference-server) for inference on both CPU and GPU. Triton takes care of batching, concurrent model execution, and support for both GPU and CPU, which maximizes performance and reduces end-to-end latency.

I guess a custom backend would have to be created for this: https://github.com/triton-inference-server/backend/blob/main/examples/README.md

Example of a minimal backend (for your quick reference):
https://github.com/triton-inference-server/backend/blob/main/examples/backends/minimal/src/minimal.cc
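
Alternatively, since Triton also ships a Python backend, something like the following model.py might work as a starting point. This is only a rough, untested sketch: the model path and the PROMPT/GENERATED_TEXT tensor names are placeholders I made up, not part of either project.

import numpy as np
import triton_python_backend_utils as pb_utils
from ctransformers import AutoModelForCausalLM

class TritonPythonModel:
    def initialize(self, args):
        # Load the GGML model once when Triton loads this model instance.
        self.llm = AutoModelForCausalLM.from_pretrained(
            '/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin', model_type='gpt2')

    def execute(self, requests):
        responses = []
        for request in requests:
            # Triton delivers string tensors as numpy arrays of bytes.
            prompt = pb_utils.get_input_tensor_by_name(
                request, 'PROMPT').as_numpy()[0].decode('utf-8')
            # Generate a completion with ctransformers.
            text = self.llm(prompt)
            out = pb_utils.Tensor(
                'GENERATED_TEXT',
                np.array([text.encode('utf-8')], dtype=object))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses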

I believe an enterprise-level inference server that supports GGML models and LangChain would benefit the community, enterprises, and the environment.

Looking forward to your views on this.

Thank you.

@marella
Owner

marella commented May 24, 2023

> Thanks again for the good work you are doing and for helping people with AI and LLMs. Kudos to you :).

Thanks @mateenmalik, I will add the build instructions to the README and will see if I can simplify this further.

> I believe an enterprise-level inference server that supports GGML models and LangChain would benefit the community, enterprises, and the environment.

Currently the GGML library is not ready for production use and changes rapidly, so integrating it with an enterprise-level server might not be feasible. Also, its main goal is to run large models on CPU on consumer hardware, without the need for a GPU. However, this may change over time as the GGML library evolves.

> Hey @marella, you are awesome, man! I mean really... I tried so many FOSS projects: llama.cpp, the gpt4all library, rwkv.cpp... nothing gave me the inference speed and low RAM usage that ctransformers did. I wonder what the reason for this might be?

Thanks @TheFaheem, one reason for the performance could be that most libraries don't enable AVX2 by default, while this library enables it by default and asks users to switch to the 'avx' or 'basic' versions if AVX2 doesn't work.
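
For example, a variant could be picked at runtime with something like this sketch (it assumes the third-party py-cpuinfo package, pip install py-cpuinfo; 'avx2', 'avx' and 'basic' are the precompiled variants):

# Sketch: choose a precompiled lib variant based on the CPU's feature flags.
import cpuinfo
from ctransformers import AutoModelForCausalLM

flags = cpuinfo.get_cpu_info().get('flags', [])
lib = 'avx2' if 'avx2' in flags else 'avx' if 'avx' in flags else 'basic'
llm = AutoModelForCausalLM.from_pretrained(
    '/models/gpt-2-1558M-ggml/ggml-model-q4_1.bin', model_type='gpt2', lib=lib)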

@Ajayvenki

For some reason this solution did not work for me. I tried all the steps mentioned by @marella.
The build only generated a .dylib file, and I could not find a .so after the build.sh execution. Screenshot attached.

[screenshot: ctransformers build output]

(venv) (base) ajayvenkatesan@Ajays-Air sample % python3 -c 'import platform; print(platform.processor())'
arm

Is there anything I could do to solve this?
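
(Untested guess on my side: since .dylib is the native shared-library extension on macOS, maybe the generated file can be passed directly, e.g.:

llm = AutoModelForCausalLM.from_pretrained(..., lib='/path/to/ctransformers/build/lib/libctransformers.dylib')

but I am not sure if that is supported.)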
