Support for 4 bit Quantization #580

Open
vikigenius opened this issue Mar 17, 2023 · 3 comments
@vikigenius
Contributor

Language model progress has been rapid recently, and with the LLaMA weights being released, a lot of progress is being made on the C++ side:

https://github.com/ggerganov/llama.cpp

I see that fp16 is on the roadmap soon.

But it might also be a good idea to add support for 4-bit quantization and related techniques. Is that something that will be considered?

@coreylowman
Owner

Does anyone know how they represent tensors with 4 bit data? Would this be some packed structure where they store 2 i4s with a u8?

struct I4x2(u8);
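
For what it's worth, a minimal sketch of such a packed pair (the `I4x2` name and the signed-nibble encoding are illustrative assumptions, not anything llama.cpp does verbatim):

```rust
/// Hypothetical pair of 4-bit signed integers packed into one byte:
/// the low nibble holds the first value, the high nibble the second.
struct I4x2(u8);

impl I4x2 {
    /// Pack two values in -8..=7 into one byte.
    fn pack(a: i8, b: i8) -> Self {
        debug_assert!((-8..=7).contains(&a) && (-8..=7).contains(&b));
        I4x2(((a as u8) & 0x0F) | (((b as u8) & 0x0F) << 4))
    }

    /// Unpack, sign-extending each nibble back to an i8.
    fn unpack(self) -> (i8, i8) {
        let lo = (((self.0 & 0x0F) as i8) << 4) >> 4; // arithmetic shift restores the sign
        let hi = (self.0 as i8) >> 4;
        (lo, hi)
    }
}
```

Indexing element `i` of a tensor would then mean fetching byte `i / 2` and selecting a nibble by `i % 2`.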

@nkoppel
Contributor

nkoppel commented Mar 21, 2023

Here is a relevant description of the implementation llama.cpp uses to represent floats in 4 bits. It essentially boils down to storing some number of 4-bit integers along with an f32 scaling factor and an optional f32 offset. From what I have read of the source code, it seems possible to do a lot of the math very efficiently on the CPU using SIMD on packed 8/16-bit integers, without touching floating point at all.
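
A rough sketch of that block layout in Rust (loosely modeled on llama.cpp's Q4_0 format, with a block size of 32 and no offset; the names and exact rounding here are my assumptions):

```rust
/// One quantization block: 32 weights stored as 4-bit integers
/// sharing a single f32 scale (loosely modeled on llama.cpp's Q4_0).
struct BlockQ4 {
    scale: f32,        // per-block scaling factor
    nibbles: [u8; 16], // 32 quantized values, two per byte
}

impl BlockQ4 {
    /// Quantize 32 f32 weights into one block.
    fn quantize(xs: &[f32; 32]) -> Self {
        // Pick the scale so the largest magnitude maps into [-8, 7].
        let amax = xs.iter().fold(0.0f32, |m, x| m.max(x.abs()));
        let scale = amax / 7.0;
        let inv = if scale != 0.0 { 1.0 / scale } else { 0.0 };
        // Bias by 8 so the stored nibble is unsigned (0..=15).
        let q = |x: f32| ((x * inv).round().clamp(-8.0, 7.0) as i8 + 8) as u8;
        let mut nibbles = [0u8; 16];
        for i in 0..16 {
            nibbles[i] = q(xs[2 * i]) | (q(xs[2 * i + 1]) << 4);
        }
        BlockQ4 { scale, nibbles }
    }

    /// Dequantize the block back to 32 f32 values.
    fn dequantize(&self) -> [f32; 32] {
        let mut out = [0.0f32; 32];
        for i in 0..16 {
            out[2 * i] = ((self.nibbles[i] & 0x0F) as i32 - 8) as f32 * self.scale;
            out[2 * i + 1] = ((self.nibbles[i] >> 4) as i32 - 8) as f32 * self.scale;
        }
        out
    }
}
```

At 4 bits per weight plus 4 bytes of scale per block of 32, that works out to 5 bits per weight, versus 32 for f32.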

While this representation is excellent for specialized inference libraries, I don't think it's practical for a generalist library like dfdx, because dfdx must deal with strided indexing and must support CUDA. Furthermore, I'm not sure how efficient we could make operations on this representation, considering that SIMD support in stable Rust is very limited.

@vikigenius
Contributor Author

Yep, I was also looking into this. It would be very nice to have, but I am not sure we can make it even remotely approach the efficiency of what the C++ side is achieving, considering how general-purpose dfdx is.

I think the SIMD concerns are not that big of a deal, since dfdx already relies heavily on nightly for some features anyway, and the SIMD support there is OK from what I have seen so far.
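
For example, something along these lines with nightly's `portable_simd` feature (just an illustration of the kind of integer kernel involved, not dfdx code; the API is unstable and details shift between nightlies):

```rust
#![feature(portable_simd)]
use std::simd::i16x8;

/// Dot product of two slices of 8-bit quantized values, widened to
/// i16 lanes so the per-lane products cannot overflow.
fn dot_i8(a: &[i8], b: &[i8]) -> i32 {
    assert!(a.len() == b.len() && a.len() % 8 == 0);
    let mut acc = 0i32;
    for (ca, cb) in a.chunks_exact(8).zip(b.chunks_exact(8)) {
        let va = i16x8::from_array(core::array::from_fn(|i| ca[i] as i16));
        let vb = i16x8::from_array(core::array::from_fn(|i| cb[i] as i16));
        // Lane-wise multiply, then widen to i32 and reduce on the scalar side.
        acc += (va * vb).to_array().iter().map(|&x| x as i32).sum::<i32>();
    }
    acc
}
```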

But I take your point on how specialized this is, and I am not sure if it is worth the effort to have this representation as an option for dfdx tensors.
