This repository has been archived by the owner on Mar 19, 2021. It is now read-only.

Reduce model load time with quantized embeddings

@danieldk danieldk released this 09 Oct 13:51
· 67 commits to master since this release

This release contains one large change: loading of quantized models is sped up by computing the unknown word embedding as an average of the subquantizer centroids, rather than as an average of all in-vocabulary word embeddings.
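The speedup can be illustrated with a small sketch (the shapes, variable names, and random data below are illustrative assumptions, not the library's actual API): averaging the centroids of each subquantizer costs work proportional to the number of centroids, while averaging all in-vocabulary embeddings requires reconstructing every word first.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical product-quantized embeddings: each word is stored as one
# centroid index per subquantizer.
n_words, n_sub, n_centroids, sub_dim = 10_000, 4, 256, 16
subquantizers = rng.normal(size=(n_sub, n_centroids, sub_dim))
codes = rng.integers(0, n_centroids, size=(n_words, n_sub))

# Old approach: reconstruct every in-vocabulary embedding, then average.
# Cost grows with the vocabulary size.
reconstructed = np.concatenate(
    [subquantizers[s, codes[:, s]] for s in range(n_sub)], axis=1
)
unk_slow = reconstructed.mean(axis=0)

# New approach: average each subquantizer's centroids and concatenate the
# per-subquantizer means. Cost is independent of the vocabulary size.
unk_fast = np.concatenate(
    [subquantizers[s].mean(axis=0) for s in range(n_sub)]
)

print(unk_slow.shape, unk_fast.shape)
```

The two vectors are not identical: the centroid average weights every centroid equally, while the vocabulary average weights centroids by how often words use them. When centroid usage is roughly uniform, the results are close, which is what makes the cheaper computation a reasonable substitute.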