I understand why F16 is required for linear and slerp, but can we do passthrough of quantized layers? Currently it is necessary to go via the huge full-precision models and requantize afterwards, which is a big pain point.
It would be possible to write a standalone script to do this. I don't think working with quantized GGUF models in mergekit-yaml makes much sense, as there are very few operations it would actually be able to support. Just stacking layers would be reasonable, though. I'll add this to my list of things to investigate when I have the time. Thanks for the suggestion!
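In the meantime, a standalone script along these lines might work, using the `gguf` Python package that ships with llama.cpp. This is only a minimal sketch, not anything mergekit provides: the function name `stack_quantized_layers`, the layer-selection logic, and the defaults are all hypothetical, and it deliberately skips the KV-metadata handling a real script would need.

```python
import re

from gguf import GGUFReader, GGUFWriter


def stack_quantized_layers(src_path: str, dst_path: str, layers: list[int],
                           arch: str = "llama") -> None:
    """Copy the listed transformer blocks (plus all non-block tensors) from
    a quantized GGUF into a new file, renumbering blocks 0..N-1, without
    ever dequantizing the weights."""
    reader = GGUFReader(src_path)
    writer = GGUFWriter(dst_path, arch)

    # Group block tensors by their original block index; everything else
    # (embeddings, output head, final norm, ...) passes through unchanged.
    blocks: dict[int, list] = {}
    non_block = []
    for tensor in reader.tensors:
        m = re.match(r"blk\.(\d+)\.(.+)", tensor.name)
        if m:
            blocks.setdefault(int(m.group(1)), []).append((m.group(2), tensor))
        else:
            non_block.append(tensor)

    def emit(name, tensor):
        # GGUF stores dimensions in ggml order, so flip the reader's shape
        # back to numpy order; raw_dtype carries the quantization type
        # through untouched, so the raw block data is copied verbatim.
        writer.add_tensor(name, tensor.data,
                          raw_shape=tensor.shape.tolist()[::-1],
                          raw_dtype=tensor.tensor_type)

    for tensor in non_block:
        emit(tensor.name, tensor)
    for new_idx, old_idx in enumerate(layers):
        for suffix, tensor in blocks[old_idx]:
            emit(f"blk.{new_idx}.{suffix}", tensor)

    # NOTE: a real script must also copy the KV metadata from the source
    # file and update e.g. {arch}.block_count to len(layers); that part
    # is omitted here for brevity.
    writer.write_header_to_file()
    writer.write_kv_data_to_file()
    writer.write_tensors_to_file()
    writer.close()
```

Repeating a block index in `layers` duplicates that layer in the output, which is the usual passthrough/frankenmerge use case. Extending this to pull layer ranges from multiple GGUF files would just mean reading blocks from more than one `GGUFReader`, as long as the quantization formats and tensor shapes line up.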
Is there a way to do this?