Faster loading of the model #85

kig · 2023-03-13T08:04:28Z

I was playing with the 65B model, and it took a minute to read the files. If you wrap the model loader loop with a #pragma omp parallel for and add -fopenmp to the compiler flags, you can drop it to 18 seconds.
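For illustration, here is a minimal sketch of what that suggestion could look like. It assumes the loader loop runs one iteration per model part file and that the iterations are independent. The function name `read_parts_omp` and the one-string-per-file layout are hypothetical, not taken from the project's loader.

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the loader loop: one iteration per model part file.
// Built with -fopenmp, the iterations are spread across threads. Without the
// flag, GCC and Clang ignore the pragma and the loop runs serially.
static std::vector<std::string> read_parts_omp(const std::vector<std::string> & paths) {
    std::vector<std::string> parts(paths.size());
    const int n = static_cast<int>(paths.size());
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        std::ifstream in(paths[i], std::ios::binary);
        // Each iteration writes only parts[i], so no synchronization is needed.
        // A file that cannot be opened yields an empty string.
        parts[i].assign(std::istreambuf_iterator<char>(in),
                        std::istreambuf_iterator<char>());
    }
    return parts;
}
```

Whether this reproduces the speedup reported above depends on the storage device and on how the real loader is structured.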


ggerganov · 2023-03-13T08:14:52Z

Great idea. We prefer to not use -fopenmp.
The implementation should use #include <thread>
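A rough sketch of how the same idea might be expressed with `<thread>` instead of OpenMP, assuming each model part file can be read independently. The names `read_file` and `load_parts_parallel` are hypothetical and are not part of the project's code. On some toolchains this needs `-pthread` at link time.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: read one whole file into a byte buffer.
// Returns false if the file cannot be opened or fully read.
static bool read_file(const std::string & path, std::vector<char> & out) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;
    const std::streamsize size = in.tellg();
    if (size < 0) return false;
    in.seekg(0, std::ios::beg);
    out.resize(static_cast<std::size_t>(size));
    if (size == 0) return true;
    return static_cast<bool>(in.read(out.data(), size));
}

// Hypothetical loader: starts one thread per part file and waits for all of them.
// Each thread writes only to its own slot, so no locking is needed.
static bool load_parts_parallel(const std::vector<std::string> & paths,
                                std::vector<std::vector<char>> & parts) {
    parts.assign(paths.size(), std::vector<char>());
    assert(parts.size() == paths.size());
    // vector<char> rather than vector<bool>, whose packed bits are not
    // safe to write from different threads.
    std::vector<char> ok(paths.size(), 0);
    std::vector<std::thread> workers;
    workers.reserve(paths.size());
    for (std::size_t i = 0; i < paths.size(); ++i) {
        workers.emplace_back([&, i] {
            ok[i] = read_file(paths[i], parts[i]) ? 1 : 0;
        });
    }
    for (std::thread & t : workers) t.join();
    for (char flag : ok) {
        if (!flag) return false;
    }
    return true;
}
```

One thread per file keeps the sketch short. A real loader with many parts would probably cap the thread count, for example using `std::thread::hardware_concurrency()` as a guide.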

kassane · 2023-03-15T17:08:48Z

What about TBB?
https://github.com/oneapi-src/oneTBB (license: Apache)

I remember that the mold linker project also uses it.

ggerganov · 2023-03-15T20:44:53Z

Not familiar with TBB, but most likely the answer is no

kig · 2023-03-16T01:49:33Z

I have some experiments with optimizing large file read I/O in https://gist.github.com/kig/357a4193be54915d142f1db6063bc929 and https://github.com/kig/fast_read_optimizer if you want to overkill it...

maxtriano · 2023-06-07T14:35:41Z

Has this been implemented yet?


ggerganov added the enhancement (New feature or request), good first issue (Good for newcomers), and performance (Speed related topics) labels Mar 13, 2023

ggerganov pinned this issue Mar 13, 2023

setzer22 mentioned this issue Mar 15, 2023

Good ideas from llama.cpp rustformers/llm#15

Closed

6 tasks

gjmulder unpinned this issue Mar 20, 2023

philpax mentioned this issue Mar 26, 2023

Parallel loading of the model tensors rustformers/llm#79

Open

ggerganov closed this as completed Jul 28, 2023

rooprob pushed a commit to rooprob/llama.cpp that referenced this issue Aug 2, 2023

Merge pull request ggerganov#85 from python273/export-llama-without-l…

5bcd19a

…lama Export llama without llama
