-
I believe you mean something similar to what I have suggested in #7012. 🙂
-
Llama-3 70B has been self-merged by @mlabonne into a 120B model, and someone reported that it performs quite well. He even apologized in advance in case AGI is achieved by duplicating random layers. On the other hand, others have tried deleting some layers, such as here.
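For context, here is a minimal sketch of what such a self-merge or layer deletion amounts to in memory, assuming the Hugging Face transformers API and a Llama-style checkpoint. The model name and slice boundaries below are only illustrative, not the actual recipe used for the 120B merge:

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder checkpoint; any Llama-style model with model.model.layers works.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-70B-Instruct",
    torch_dtype=torch.bfloat16,
)

layers = model.model.layers  # nn.ModuleList of decoder blocks (80 for the 70B)

# "Self-merge": repeat overlapping slices of the stack, passthrough-style.
# Illustrative ranges only; duplicated entries share weights because they
# are the same module object added twice.
new_order = list(range(0, 40)) + list(range(20, 60)) + list(range(40, 80))

# Layer deletion instead: drop some indices, e.g. every other upper layer.
# new_order = [i for i in range(len(layers)) if i < 40 or i % 2 == 0]

model.model.layers = torch.nn.ModuleList(layers[i] for i in new_order)
model.config.num_hidden_layers = len(new_order)
```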
Layer duplication or removal can be done on the fly, so I implemented it in chatllm.cpp:
foldl/chatllm.cpp@fb8690c
Doc: https://github.com/foldl/chatllm.cpp/blob/master/docs/fun.md#layer-shuffling
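In case it helps to see the idea, here is a minimal sketch of what "on the fly" means (illustrative Python, not the actual chatllm.cpp code): the runtime simply walks the decoder blocks in a user-supplied order during the forward pass, so the checkpoint on disk is never rewritten.

```python
from typing import Callable, Sequence

def forward_with_layer_spec(
    hidden_states,
    layers: Sequence[Callable],
    layer_spec: Sequence[int],
):
    # layer_spec examples:
    #   [0, 1, 1, 2, 2, 3, ...]  -> duplicate layers on the fly
    #   [0, 2, 4, 6, ...]        -> drop layers on the fly
    for idx in layer_spec:
        hidden_states = layers[idx](hidden_states)
    return hidden_states
```

Because the index list is just runtime configuration, shuffling layers costs nothing compared to producing and storing a merged checkpoint.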
I think it would be nice for llama.cpp to have this functionality, too. Why? It is fun, isn't it?