Add FLUX nf4 quantization support #525

agv-zx82 · 2024-12-17T06:12:16Z

Please add FLUX nf4 quantization support

stduhpf · 2024-12-17T10:27:25Z

I'm not familiar with nf4. From what I managed to understand it's basically just 4 bit fixed point numbers with a range of [-1,1]? Or am I misunderstanding?

cb88 · 2024-12-19T20:30:56Z

> I'm not familiar with nf4. From what I managed to understand it's basically just 4 bit fixed point numbers with a range of [-1,1]? Or am I misunderstanding?

Not fixed point, apparently. Just a very limited, non-standard float.

https://huggingface.co/blog/4bit-transformers-bitsandbytes

stduhpf · 2024-12-20T15:58:24Z

https://www.ai-bites.net/qlora-train-your-llms-on-a-single-gpu/#normalfloat

This looks somewhat similar to GGML's IQ4_NL type in principle? It's not quite the same though.
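To make the comparison concrete, here is a minimal sketch of how NF4 values could be decoded, assuming the 16-level codebook published with QLoRA/bitsandbytes and one absmax scale per block. The function name `dequantize_nf4`, the default block size of 64, and the nibble order (high nibble first) are assumptions for illustration, not something confirmed in this thread. It shows that NF4 is a lookup into a fixed non-uniform table rather than fixed point or a float with exponent bits.

```python
from typing import List, Sequence

# NF4 codebook as published with QLoRA / bitsandbytes: 16 non-uniform
# levels in [-1, 1], placed at quantiles of a normal distribution.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def dequantize_nf4(packed: bytes, absmax: Sequence[float],
                   block_size: int = 64) -> List[float]:
    """Unpack two 4-bit codebook indices per byte and scale each by its block's absmax.

    Hypothetical helper: block_size=64 and the nibble order are assumptions.
    """
    out: List[float] = []
    for byte in packed:
        # Assumption: the high nibble holds the first value of each pair.
        for idx in (byte >> 4, byte & 0x0F):
            block = len(out) // block_size  # which scale applies to this value
            out.append(NF4_LEVELS[idx] * absmax[block])
    return out
```

For example, `dequantize_nf4(bytes([0x0F]), [2.0], block_size=2)` decodes index 0 and index 15, which are the codebook endpoints scaled by 2.0.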

Green-Sky · 2024-12-20T19:40:30Z

Pretty sure all that's needed is the ability to read it and convert it to whatever quant the user wants.
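The "convert" half of that idea could look roughly like the sketch below, which takes already-decoded float weights and re-quantizes them into an 8-bit block format in the style of GGML's Q8_0 (one scale per block of 32, values stored as signed 8-bit integers). The function name `quantize_q8_0_style` and the tuple-based block representation are hypothetical and chosen only for illustration. A real converter would write the target format's actual binary layout.

```python
from typing import List, Sequence, Tuple


def quantize_q8_0_style(values: Sequence[float],
                        block_size: int = 32) -> List[Tuple[float, List[int]]]:
    """Re-quantize floats into (scale, int8 values) blocks.

    Hypothetical helper: mimics the Q8_0 idea (scale = absmax / 127),
    not the exact on-disk GGML layout.
    """
    blocks: List[Tuple[float, List[int]]] = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        amax = max(abs(v) for v in chunk)
        scale = amax / 127.0
        # An all-zero block has scale 0; store zeros instead of dividing.
        quants = [round(v / scale) if scale else 0 for v in chunk]
        blocks.append((scale, quants))
    return blocks
```

Each stored value is recovered as `scale * q`, so the round trip from NF4 through float to the target type only needs a decoder for the source format and an encoder like this one for the destination.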
