Replies: 5 comments 1 reply
-
Lossy image compression works by transforming the data into a representation that looks similar to the human eye, but a model layer compressed to JPEG would likely break.
-
There are lossless image encoding schemes; PNG, for example, has a lossless mode. However, isn't llamafile relying on memory-mapping these areas? If the weights were compressed, you would have to decompress them into a staging area in RAM.
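A minimal sketch of that distinction, assuming Python with NumPy as a stand-in (llamafile itself is C/C++): a raw weight file can be memory-mapped and read in place, while a compressed blob has to be inflated into a separate staging buffer before any value is readable.

```python
import mmap
import tempfile
import zlib

import numpy as np

# Raw weights on disk can be memory-mapped and used in place.
weights = np.arange(16, dtype=np.float32)
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(weights.tobytes())
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
mapped = np.frombuffer(mm, dtype=np.float32)  # zero-copy view; pages fault in on demand
assert np.array_equal(mapped, weights)

# A compressed blob cannot be used in place: it must first be inflated
# into a staging buffer, costing extra RAM and CPU.
blob = zlib.compress(weights.tobytes())
staging = zlib.decompress(blob)  # full decompressed copy in memory
restored = np.frombuffer(staging, dtype=np.float32)
assert np.array_equal(restored, weights)
```

The mmap path touches only the pages actually read; the compressed path always materializes the whole decompressed tensor.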
-
The concept is the same: it is all just a list of values, whether brightness or weights. The values get leveled to a certain range by some algorithm, just as happens when reducing precision through existing weight quantizations. But images have not only a lot of quantization methods, they also have a lot of methods to compress the values.

Yes, it would require decompressing on the fly, but that is a trade-off between the maximum model size you can fit into fast memory and the compute you have: you spend compute on decompression as weights move to the chip. The important thing is that modern GPUs have hardware support for decompressing images, so it can happen very fast, allowing MUCH bigger models to be kept in fast memory at the cost of some precision (just as with quantization). Or it may allow keeping losslessly compressed weights in fast memory, which would still let us load much bigger models without ANY loss of precision compared to the uncompressed formats used today.
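A CPU-only sketch of the decompress-on-the-fly trade-off (NumPy and zlib as stand-ins; real GPU hardware decoders are not exposed this way): each layer is kept compressed in "fast memory" and inflated only for the moment it participates in a matmul.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)

# Keep each layer's weights zlib-compressed; remember the shape for reconstruction.
layers = []
for _ in range(3):
    w = rng.standard_normal((64, 64)).astype(np.float32)
    layers.append((zlib.compress(w.tobytes()), w.shape))

def forward(x, layers):
    for blob, shape in layers:
        # Pay CPU for decompression right before use, then discard the copy.
        w = np.frombuffer(zlib.decompress(blob), dtype=np.float32).reshape(shape)
        x = np.maximum(x @ w, 0.0)  # matmul + ReLU
    return x

x = rng.standard_normal((1, 64)).astype(np.float32)
y = forward(x, layers)
assert y.shape == (1, 64)
```

Only one decompressed layer is live at a time, so peak memory is roughly one layer plus the compressed blobs, at the cost of inflating every layer on every forward pass.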
-
Oh, so experimenting with JPEG-style lossy, frequency-based compression? Interesting. Well, there are a bunch of assumptions we would need to investigate first, e.g. whether the positions of weights in an X-dimensional array have spatial relations that won't be significantly impacted by discarding 'high-frequency' information. It certainly needs a proof of concept showing this won't break anything. Has anyone tested the idea in the Python ML / Hugging Face community? (Experimenting with PyTorch is likely a bit easier than testing it in here.)
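A hypothetical pure-NumPy proof of concept along those lines: apply the orthonormal 8x8 DCT-II (the transform JPEG uses), zero the high-frequency coefficients, invert, and look at the reconstruction error. On random weights, which have none of the spatial smoothness of photographs, a noticeable error is expected, which is exactly the assumption that needs checking.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    # Orthonormal DCT-II basis matrix (what JPEG applies to 8x8 blocks).
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    d = np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0] *= np.sqrt(1.0 / n)
    d[1:] *= np.sqrt(2.0 / n)
    return d

def lossy_roundtrip(w: np.ndarray, keep: int) -> np.ndarray:
    # Transform a square weight tile, discard "high-frequency" coefficients
    # (those with row + col index >= keep), then transform back.
    d = dct_matrix(w.shape[0])
    coeffs = d @ w @ d.T
    r, c = np.indices(coeffs.shape)
    coeffs[r + c >= keep] = 0.0
    return d.T @ coeffs @ d  # inverse of an orthonormal transform is its transpose

rng = np.random.default_rng(0)
tile = rng.standard_normal((8, 8))
approx = lossy_roundtrip(tile, keep=8)      # roughly half the coefficients dropped
err = np.abs(tile - approx).max()            # nonzero: information was discarded
assert err > 0.0
```

Running this per 8x8 tile over a real checkpoint and measuring perplexity, rather than raw reconstruction error, would be the actual test of whether weights tolerate frequency truncation.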
-
I was thinking about formats and realized that layers of weights fit all the existing image libraries perfectly. In other words, there is a hell of a lot of algorithms for image compression, including quantizing down to N bits, as well as lossless ones.
Couldn't we store model weights with image libraries like PNG or WebP and get good results? Possibly even accessing values without fully unpacking, greatly reducing the memory footprint but paying with CPU for value access?
I feel that image compression algorithms should work well on models, keeping the whole "picture" of the model while reducing its size. It would be very interesting to see how this compares to QLoRA, etc. And perhaps it might allow running much bigger models in less memory without having to write a lot of new code, since the code already exists for images.
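As a rough sketch of the lossless route (stdlib zlib plus NumPy; DEFLATE is the codec inside PNG, so compressing the quantized bytes approximates what a lossless PNG of the weight "image" would store, without pulling in an image library): quantize to 8 bits with a per-tensor scale, deflate the bytes, and check both the size and the worst-case quantization error.

```python
import zlib

import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

# 8-bit quantization with a single per-tensor scale (a simplified Q8-style scheme).
scale = float(np.abs(w).max()) / 127.0
q = np.round(w / scale).astype(np.int8)

# DEFLATE is the compression PNG uses internally.
raw = w.tobytes()
packed = zlib.compress(q.tobytes(), level=9)
ratio = len(packed) / len(raw)

# Reconstruction error is bounded by half a quantization step.
recon = q.astype(np.float32) * scale
max_err = float(np.abs(w - recon).max())
assert max_err <= scale / 2 + 1e-6
```

On Gaussian random weights the int8 bytes are nearly incompressible, so almost all of the saving comes from the 4-to-1 quantization itself; real weight tensors may deflate somewhat better, and a lossy codec like WebP would trade further size for further error.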
What do you think?