
Consider including small and compressed language models as candidates? #1

Open
GeneZC opened this issue Sep 25, 2024 · 1 comment
GeneZC commented Sep 25, 2024

Alongside the development of small language models, compressed language models have played a crucial role as well.

Typical representatives (in time order) would be:

  1. Sheared-LLaMA (https://arxiv.org/abs/2310.06694), a language model pruned from LLaMA.
  2. MiniMA (https://arxiv.org/abs/2311.07052), a language model distilled from LLaMA.
  3. Gemma-2-2B (https://arxiv.org/abs/2408.00118), a distilled language model.
  4. Minitron-4B (https://arxiv.org/abs/2408.11796), a language model pruned and distilled from Llama.
    etc.
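For readers less familiar with the distinction: pruning removes weights or whole structures from a larger model, while distillation trains a smaller student to match a teacher's softened output distribution. A minimal sketch of the classic temperature-scaled distillation loss (pure Python over plain lists, for illustration only; real implementations operate on batched logit tensors):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax over a list of logits.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 as in the standard distillation formulation.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# A student that exactly matches the teacher incurs zero loss.
print(distillation_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # → 0.0
```

In practice this KL term is usually combined with the ordinary cross-entropy loss on the ground-truth labels, with a weight balancing the two.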

I believe including discussions of the above small language models would make this survey even stronger : )

@Luzhenyan (Collaborator)

Thanks for your valuable suggestion! We will continue to update our survey and include discussions on compressed language models in future versions.
