Faster loading of the model #85

Closed
kig opened this issue Mar 13, 2023 · 5 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), performance (Speed related topics)

Comments

@kig
kig commented Mar 13, 2023

I was playing with the 65B model, and it took a minute to read the files. If you wrap the model loader loop with a #pragma omp parallel for and add -fopenmp to the compiler flags, you can drop it to 18 seconds.
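For illustration, the OpenMP idea above might look roughly like the sketch below. This is not the project's actual loader: the function name `load_parts_omp` and the one-buffer-per-file layout are assumptions made for the example. Each iteration reads one file into its own slot, so the iterations do not share any mutable state.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <limits>
#include <string>
#include <vector>

// Hypothetical loader (not llama.cpp code): reads each file in `paths`
// into its own buffer. When compiled with -fopenmp the loop iterations
// are spread across threads; without the flag the pragma is ignored
// and the loop simply runs sequentially.
static std::vector<std::vector<char>> load_parts_omp(const std::vector<std::string> & paths) {
    // OpenMP loops traditionally use a signed index, so guard the cast.
    assert(paths.size() <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
    const int n = static_cast<int>(paths.size());

    std::vector<std::vector<char>> parts(paths.size());

    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Each iteration writes only to parts[i], so no locking is needed.
        std::ifstream in(paths[i], std::ios::binary);
        parts[i].assign(std::istreambuf_iterator<char>(in),
                        std::istreambuf_iterator<char>());
    }
    return parts;
}
```

A file that cannot be opened yields an empty buffer in this sketch; a real loader would report the error instead.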

ggerganov added the enhancement, good first issue, and performance labels Mar 13, 2023
@ggerganov
Owner

Great idea. We prefer to not use -fopenmp.
The implementation should use #include <thread>
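A minimal sketch of what a `<thread>`-based version could look like, assuming the model is split into several part files that can be read independently. The names `read_whole_file` and `load_parts_parallel` are hypothetical and do not come from the project; the striding scheme is just one simple way to divide the work.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: read one file fully into memory.
// Returns an empty buffer if the file cannot be opened.
static std::vector<char> read_whole_file(const std::string & path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return {};
    }
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}

// Hypothetical loader: reads every file in `paths` using at most
// `n_threads` worker threads. Worker w handles files w, w + n_workers,
// w + 2 * n_workers, ... so each slot of `parts` has exactly one writer.
static std::vector<std::vector<char>> load_parts_parallel(
        const std::vector<std::string> & paths, unsigned n_threads) {
    assert(n_threads > 0);

    std::vector<std::vector<char>> parts(paths.size());
    const std::size_t n_workers = std::min<std::size_t>(n_threads, paths.size());

    std::vector<std::thread> workers;
    workers.reserve(n_workers);
    for (std::size_t w = 0; w < n_workers; ++w) {
        workers.emplace_back([&parts, &paths, w, n_workers]() {
            for (std::size_t i = w; i < paths.size(); i += n_workers) {
                parts[i] = read_whole_file(paths[i]);
            }
        });
    }
    // Wait for all workers before handing the buffers back.
    for (auto & t : workers) {
        t.join();
    }
    return parts;
}
```

A caller might pass `std::thread::hardware_concurrency()` as `n_threads`, falling back to 1 when it returns 0. Whether this helps depends on the storage device; on some toolchains the program also needs to be linked with `-pthread`.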

ggerganov pinned this issue Mar 13, 2023
@kassane
Contributor

kassane commented Mar 15, 2023

What about TBB?
https://github.com/oneapi-src/oneTBB (license: Apache 2.0)

I remember that the mold linker project also uses it.

@ggerganov
Owner

Not familiar with TBB, but most likely the answer is no

@kig
Author

kig commented Mar 16, 2023

I have some experiments with optimizing large file read I/O in https://gist.github.com/kig/357a4193be54915d142f1db6063bc929 and https://github.com/kig/fast_read_optimizer if you want to overkill it...

@maxtriano

Has this been implemented yet?

rooprob pushed a commit to rooprob/llama.cpp that referenced this issue Aug 2, 2023