Awesome opengpt

English
- Foundation
  - OLMo
  - Phi
  - Mistral
  - StripedHyena-7B
  - BLOOM
  - Mosaic pretrained transformers (MPT)
  - h2oGPT
  - LLaMA
  - Falcon
  - Pythia
  - Other
- Finetuning
  - Vicuna
  - Alpaca
  - Dolly
  - Misc
Multilingual (Chinese)
- Foundation
  - Yi-01
  - InternLM
  - DeepSeek
  - Xverse
  - Qwen
  - Baichuan
  - ChatGLM
- Finetuning
- Other
Extra
Toolkits
Extra reference

English

Foundation

GEB-1.3B: Open Lightweight Large Language Model, arXiv, 2406.09900, arxiv, pdf, citation: -1

Jie Wu, Yufeng Zhu, Lei Shen, Xuqing Lu
NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models | NVIDIA Blog

· (research.nvidia)
Nemotron-4-340B-Instruct - nvidia 🤗
WizardLM-2-8x22B - alpindale 🤗

· (wizardlm.github) · (WizardLM - victorsungo)
MAP-NEO - multimodal-art-projection
Snowflake Arctic: The Best LLM for Enterprise AI — Efficiently Intelligent, Truly Open

· (snowflake-arctic - Snowflake-Labs) · (huggingface) · (twitter)
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework, arXiv, 2404.14619, arxiv, pdf, citation: -1

Sachin Mehta, Mohammad Hossein Sekhavat, Qingqing Cao, Maxwell Horton, Yanzi Jin, Chenfan Sun, Iman Mirzadeh, Mahyar Najibi, Dmitry Belenko, Peter Zatloukal · (huggingface) · (corenet - apple)
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models, arXiv, 2404.12387, arxiv, pdf, citation: -1

Aitor Ormazabal, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Deyu Fu, Donovan Ong, Eric Chen, Eugenie Lamprecht, Hai Pham, Isaac Ong
- http://showcase.reka.ai
Rho-1: Not All Tokens Are What You Need, arXiv, 2404.07965, arxiv, pdf, citation: -1

Zhenghao Lin, Zhibin Gou, Yeyun Gong, Xiao Liu, Yelong Shen, Ruochen Xu, Chen Lin, Yujiu Yang, Jian Jiao, Nan Duan · (rho - microsoft)
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies, arXiv, 2404.06395, arxiv, pdf, citation: -1

Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao · (MiniCPM - OpenBMB)
Stable LM 2 1.6B Technical Report, arXiv, 2402.17834, arxiv, pdf, citation: -1

Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta
stablelm-2-12b - stabilityai 🤗
c4ai-command-r-plus - CohereForAI 🤗
JetMoE: Reaching Llama2 Performance with 0.1M Dollars, arXiv, 2404.07413, arxiv, pdf, citation: -1

Yikang Shen, Zhen Guo, Tianle Cai, Zengyi Qin
JetMoE - myshell-ai

Reaching LLaMA2 Performance with 0.1M Dollars · (research.myshell) · (huggingface) · (qbitai)
Poro 34B and the Blessing of Multilinguality, arXiv, 2404.01856, arxiv, pdf, citation: -1

Risto Luukkonen, Jonathan Burdge, Elaine Zosa, Aarne Talman, Ville Komulainen, Väinö Hatanpää, Peter Sarlin, Sampo Pyysalo · (huggingface)
MicroLlama - keeeeenw

Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget
grok-1 - xai-org

Grok open release · (x) · (huggingface) · (qbitai)

· (huggingface) · (mp.weixin.qq)

· (x) · (qbitai)
c4ai-command-r-v01 - CohereForAI 🤗

· (txt.cohere)

· (huggingface)
miqu-1-70b - miqudev 🤗
H2O-Danube-1.8B Technical Report, arXiv, 2401.16818, arxiv, pdf, citation: -1

Philipp Singer, Pascal Pfeiffer, Yauhen Babakhin, Maximilian Jeblick, Nischay Dhankhar, Gabor Fodor, Sri Satish Ambati
Smaug-72B-v0.1 - abacusai 🤗
Smaug-34B-v0.1 - abacusai 🤗
bagel-34b-v0.2 - jondurbin 🤗
TinyLlama: An Open-Source Small Language Model, arXiv, 2401.02385, arxiv, pdf, citation: -1

Peiyuan Zhang, Guangtao Zeng, Tianduo Wang, Wei Lu · (TinyLlama - jzhang38)
TigerBot: An Open Multilingual Multitask LLM, arXiv, 2312.08688, arxiv, pdf, citation: -1

Ye Chen, Wei Cai, Liangmin Wu, Xiaowei Li, Zhanxuan Xin, Cong Fu
DeciLM-7B - Deci 🤗
DeciLM-7B-instruct - Deci 🤗

· (huggingface)
LLM360: Towards Fully Transparent Open-Source LLMs, arXiv, 2312.06550, arxiv, pdf, citation: -1

Zhengzhong Liu, Aurick Qiao, Willie Neiswanger, Hongyi Wang, Bowen Tan, Tianhua Tao, Junbo Li, Yuqi Wang, Suqi Sun, Omkar Pangarkar
- https://www.llm360.ai
GPT4All: An Ecosystem of Open Source Compressed Language Models, arXiv, 2311.04931, arxiv, pdf, citation: -1

Yuvanesh Anand, Zach Nussbaum, Adam Treat, Aaron Miller, Richard Guo, Ben Schmidt, GPT4All Community, Brandon Duderstadt, Andriy Mulyar · (gpt4all - nomic-ai)
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data, arXiv, 2309.11235, arxiv, pdf, citation: -1

Guan Wang, Sijie Cheng, Xianyuan Zhan, Xiangang Li, Sen Song, Yang Liu · (openchat - imoneoi) · (huggingface) · (openchat)
Zephyr: Direct Distillation of LM Alignment, arXiv, 2310.16944, arxiv, pdf, citation: 1

Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib · (alignment-handbook - huggingface)
H2O Open Ecosystem for State-of-the-art Large Language Models, arXiv, 2310.13012, arxiv, pdf, citation: -1

Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Chun Ming Lee, Marcos V. Conde
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model, arXiv, 2309.11568, arxiv, pdf, citation: -1

Nolan Dey, Daria Soboleva, Faisal Al-Khateeb, Bowen Yang, Ribhu Pathria, Hemant Khachane, Shaheer Muhammad, Zhiming Chen, Robert Myers
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch, arXiv, 2309.10706, arxiv, pdf, citation: -1

Juntao Li, Zecheng Tang, Yuyang Ding, Pinzheng Wang, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu · (openba - opennlg)
XGen-7B Technical Report, arXiv, 2309.03450, arxiv, pdf, citation: 3

Erik Nijkamp, Tian Xie, Hiroaki Hayashi, Bo Pang, Congying Xia, Chen Xing, Jesse Vig, Semih Yavuz, Philippe Laban, Ben Krause
FLM-101B: An Open LLM and How to Train It with $100K Budget, arXiv, 2309.03852, arxiv, pdf, citation: 3

Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Xuying Meng, Siqi Fan, Peng Han, Jing Li, Li Du, Bowen Qin · (huggingface)
adept-inference - persimmon-ai-labs

Inference code for Persimmon-8B
WizardLM - nlpxucan

Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder
FreeWilly2 - stabilityai 🤗
xgen - salesforce

Salesforce open-source LLMs with 8k sequence length.
PolyLM: An Open Source Polyglot Large Language Model, arXiv, 2307.06018, arxiv, pdf, citation: 5

Xiangpeng Wei, Haoran Wei, Huan Lin, Tianhao Li, Pei Zhang, Xingzhang Ren, Mei Li, Yu Wan, Zhiwei Cao, Binbin Xie
A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models, arXiv, 2306.02254, arxiv, pdf, citation: -1

Hyunwoong Ko, Kichang Yang, Minho Ryu, Taekyoon Choi, Seungmu Yang, Jiwung Hyun, Sungho Park, Kyubyong Park
Examining User-Friendly and Open-Sourced Large GPT Models: A Survey on Language, Multimodal, and Scientific GPT Models, arXiv, 2308.14149, arxiv, pdf, citation: -1

Kaiyuan Gao, Sunan He, Zhenyu He, Jiacheng Lin, QiZhi Pei, Jie Shao, Wei Zhang · (gpt_alternatives - GPT-Alternatives) · (jiqizhixin)

Llama 3

Welcome Llama 3 - Meta's new open LLM
llama3 - meta-llama

The official Meta Llama 3 GitHub site
Meta-Llama-3-8B-Instruct - meta-llama 🤗
Meta-Llama-3-70B-Instruct - meta-llama 🤗
Llama-3-Smaug-8B - abacusai 🤗
Llama-3-8B-16K - mattshumer 🤗
Llama-3-8B-Special-Tokens-Adjusted - astronomer 🤗
Google Colab
Llama-3-8b-64k-PoSE - winglian 🤗
dolphin-2.9-llama3-70b - cognitivecomputations 🤗
Llama-3-8B-Instruct-262k - gradientai 🤗
Meditron: An LLM suite especially suited for low-resource medical settings leveraging Meta Llama
llama-3-8b-256k-PoSE - winglian 🤗
Llama-3-8B-Instruct-Gradient-1048k - gradientai 🤗
Hermes-2-Pro-Llama-3-8B - NousResearch 🤗
Llama3-ChatQA-1.5-8B - nvidia 🤗
Meta-Llama-3-120B-Instruct - mlabonne 🤗
What’s up with Llama 3? Arena data analysis | LMSYS Org
Planning for Distillation of Llama 3 70b -> 4x8b / 25b : r/LocalLLaMA
Llama-3-Refueled - refuelai 🤗
SFR-Instruct-LLaMA-3-8B-R - a Salesforce Collection
Smaug-Llama-3-70B-Instruct - abacusai 🤗

· (reddit)
Higgs-Llama-3-70B - bosonai 🤗
Announcing the Higgs Family of LLMs
Hermes-2-Theta-Llama-3-70B - NousResearch 🤗

Jamba

Jamba: A Hybrid Transformer-Mamba Language Model, arXiv, 2403.19887, arxiv, pdf, citation: -1

Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz · (huggingface)
Jamba-v0.1-chat-multilingual - lightblue 🤗
Jambatypus-v0.1 - mlabonne 🤗

Databricks

Introducing DBRX: A New State-of-the-Art Open LLM | Databricks

· (twitter) · (twitter)
dbrx - databricks

Code examples and resources for DBRX, a large language model developed by Databricks
dbrx-instruct - databricks 🤗

Gemma

Gemma 2 Release - a google Collection

· (x)
gemma-1.1-7b-it - google 🤗
Gemma: Open Models Based on Gemini Research and Technology, arXiv, 2403.08295, arxiv, pdf, citation: -1

Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context, arXiv, 2403.05530, arxiv, pdf, citation: -1

Machel Reid, Nikolay Savinov, Denis Teplyashin, Dmitry Lepikhin, Timothy Lillicrap, Jean-baptiste Alayrac, Radu Soricut, Angeliki Lazaridou, Orhan Firat, Julian Schrittwieser
Unsloth Fixing Gemma bugs

· (twitter)
gemma_pytorch - google

The official PyTorch implementation of Google's Gemma models
gemma-7b - google 🤗

· (ai.google) · (huggingface)
gemma.cpp - google

lightweight, standalone C++ inference engine for Google's Gemma models.
gemma-peft - 🤗
Understanding, Using, and Finetuning Gemma - a Lightning Studio by sebastian

· (jiqizhixin)
gemma-7b-dolly-chatml - philschmid 🤗
catch-me-if-you-can - cyzgab 🤗
GemMoE-Beta-1 - Crystalcareai 🤗
Hebrew-Gemma-11B - yam-peleg 🤗
Gemma finetuning should be much better now : r/LocalLLaMA
Gemma-10M Technical Overview. Motivation | by Akshgarg | May, 2024 | Medium

· (gemma-2B-10M - mustafaaljadery)

OLMo

OLMo 1.7–7B: A 24 point improvement on MMLU | by AI2 | Apr, 2024 | AI2 Blog
OLMo-1.7-7B - allenai 🤗
OLMo: Accelerating the Science of Language Models, arXiv, 2402.00838, arxiv, pdf, citation: -1

Dirk Groeneveld, Iz Beltagy, Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang · (OLMo - allenai) · (allenai) · (allenai)
OLMo-7B - allenai 🤗
OLMo-7B-Instruct - allenai 🤗

Phi

Phi-3-mini-4k-instruct - microsoft 🤗
Phi-3-vision-128k-instruct - microsoft 🤗
models - 🤗
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone, arXiv, 2404.14219, arxiv, pdf, citation: -1

Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl
phi-2 - microsoft 🤗
phi-1_5 - microsoft 🤗
phi-1 - microsoft 🤗
phi-2 - randomblock1 🤗
phixtral-4x2_8 - mlabonne 🤗

Mistral

Mistral-7B-Instruct-v0.3 - mistralai 🤗
Cheaper, Better, Faster, Stronger | Mistral AI | Frontier AI in your hands
mixtral-8x22b-instruct-oh - fireworks-ai 🤗
Mistral-22B-v0.2 - Vezora 🤗
Mistral-22B-v0.1 - Vezora 🤗
zephyr-orpo-141b-A35b-v0.1 - HuggingFaceH4 🤗

· (huggingface)
Mixtral-8x22B-v0.1 - mistral-community 🤗
hackathon - mistralai-sf24

· (models.mistralcdn) · (jiqizhixin)
Mistral-7B-Instruct-v0.2 - mistralai 🤗
mistral-src - mistralai

Reference implementation of Mistral AI 7B v0.1 model. · (jiqizhixin)
Mixtral of experts | Mistral AI | Open source models
mixtral - 🤗
llama-mistral - dzhulgakov

Inference code for Mistral and Mixtral hacked up into original Llama implementation
DiscoLM-mixtral-8x7b-v2 - DiscoResearch 🤗
Mixtral-8x7B-Instruct-v0.1 - mistralai 🤗

· (mp.weixin.qq)
mixtral-7b-8expert - DiscoResearch 🤗

· (huggingface)
mixtral-8x7b-32kseqlen - someone13574 🤗
mixtral-46.7b-chat - openskyml 🤗
Mixtral-8x7B-v0.1-GPTQ - TheBloke 🤗
Mixtral 8x7B on Fireworks: faster, cheaper, even before the official release | by Fireworks.ai | Dec, 2023 | Medium
MixtralKit - open-compass

A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
mistral-playground - marcofrodl 🤗
Mixtral-8x7B-Instruct-v0.1-bnb-4bit - ybelkada 🤗
notux-8x7b-v1 - argilla 🤗
mixtral-offloading - dvmazur

Run Mixtral-8x7B models in Colab or consumer desktops
mixtral-test-46.7b-chat - johann22 🤗
Nous-Hermes-2-Mixtral-8x7B-SFT - NousResearch 🤗

· (jiqizhixin)
Nous-Hermes-2-Mixtral-8x7B-DPO - NousResearch 🤗
Nous-Hermes-2-Mixtral-8x7B-DPO-adapter - NousResearch 🤗
miqu-1-70b - miqudev 🤗
Hermes-2-Pro-Mistral-7B - NousResearch 🤗
dolphin-2.8-mistral-7b-v02 - cognitivecomputations 🤗
dolphin-2.9.1-mixtral-1x22b - cognitivecomputations 🤗

StripedHyena-7B

StripedHyena-Hessian-7B - togethercomputer 🤗
StripedHyena-Nous-7B - togethercomputer 🤗

· (together)

BLOOM

BLOOMChat-176B-v1-GPTQ - TheBloke 🤗

Mosaic pretrained transformers (MPT)

GitHub - mosaicml/llm-foundry: LLM training code for MosaicML foundation models

mpt-30b-chat - mosaicml 🤗

h2oGPT

h2oGPT: Democratizing Large Language Models, arXiv, 2306.08161, arxiv, pdf, citation: -1

Arno Candel, Jon McKinney, Philipp Singer, Pascal Pfeiffer, Maximilian Jeblick, Prithvi Prabhu, Jeff Gambera, Mark Landry, Shivam Bansal, Ryan Chesler

LLaMA

LiteLlama-460M-1T - ahxt 🤗

· (jiqizhixin)
Llama-2-7b-chat-mlx - mlx-llama 🤗
TinyLlama - jzhang38

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
llama-recipes - facebookresearch

Examples and recipes for Llama 2 model · (mp.weixin.qq) · (jiqizhixin) · (mp.weixin.qq) · (d7mv45xi4m.feishu)
llama2-13b-orca-8k-3319 - OpenAssistant 🤗
pyllama - juncongmoo

LLaMA: Open and Efficient Foundation Language Models
llama-gpt - getumbrel

A self-hosted, offline, ChatGPT-like chatbot. Powered by Llama 2. 100% private, with no data leaving your device.
LLongMA-2-13b-16k - conceptofmind 🤗
LLongMA-2-13b - conceptofmind 🤗
LLongMA-2-7b-16k - conceptofmind 🤗
Llama 2: an incredible open LLM - by Nathan Lambert
llama2-webui - liltom-eth

Run Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac). Supporting Llama-2-7B/13B/70B with 8-bit, 4-bit. Supporting GPU inference (6 GB VRAM) and CPU inference.
Fine-tune Llama 2 with DPO
Flan-Open-Llama-13b - conceptofmind 🤗
Llama-2 - amitsangani

All the projects related to Llama

Falcon

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages
falcon-11B - tiiuae 🤗
Spread Your Wings: Falcon 180B is here
Falcon-LLM - Sentdex

Helper scripts and examples for exploring the Falcon LLM models · (huggingface) · (huggingface)

Pythia

[2304.01373] Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

· ([pythia](https://github.com/EleutherAI/pythia) - EleutherAI)

Other

Timeline of recent major LLM releases (past 2 months) : r/LocalLLaMA
The History of Open-Source LLMs: Imitation and Alignment (Part Three)

· (mp.weixin.qq)
os-llms - blog 🤗
A16Z has just officially announced support for 8 open-source AI communities
A summary of open-source large language models (LLMs)

Finetuning

Vicuna

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning, arXiv, 2307.02053, arxiv, pdf, citation: 3

Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria
FastChat - lm-sys

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Alpaca

stanford_alpaca - tatsu-lab

Code and documentation to train Stanford's Alpaca models, and generate the data. · (crfm.stanford)

Dolly

dolly - databrickslabs

Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform · (huggingface) · (databricks)

Misc

WizardLM - a microsoft Collection
YamshadowExperiment28-7B - automerger 🤗

· (twitter)
Starling-7B: Increasing LLM Helpfulness & Harmlessness with RLAIF

· (huggingface) · (huggingface)
Beagle14-7B - mlabonne 🤗
Improving Open-Source LLMs - Datasets, Merging and Stacking - The Abacus.AI Blog
CrystalChat - LLM360 🤗
btlm-3b-8k-chat - cerebras 🤗
stablelm-zephyr-3b - stabilityai 🤗

· (huggingface)
smol-7b - rishiraj 🤗
LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion, arXiv, 2306.02561, arxiv, pdf, citation: -1

Dongfu Jiang, Xiang Ren, Bill Yuchen Lin · (huggingface) · (LLM-Blender - yuchenlin)
Intel Neural-Chat 7b: Fine-Tuning on Gaudi2 for Top LLM Performance
sparse-llama-gsm8k - neuralmagic 🤗
DeciLM-6b - Deci 🤗
GOAT-7B-Community - GOAT-AI 🤗
openchat - imoneoi

OpenChat: Less is More for Open-source Models · (mp.weixin.qq)
GPT-4-LLM - Instruction-Tuning-with-GPT-4

Instruction Tuning with GPT-4
Instruction Tuning with GPT-4, arXiv, 2304.03277, arxiv, pdf, citation: 182

Baolin Peng, Chunyuan Li, Pengcheng He, Michel Galley, Jianfeng Gao · (instruction-tuning-with-gpt-4.github)
deepseek-coder-7b-instruct - deepseek-ai 🤗
UltraChat - thunlp

Large-scale, Informative, and Diverse Multi-round Chat Data (and Models) · (mp.weixin.qq) · (qbitai)
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources, arXiv, 2306.04751, arxiv, pdf, citation: 40

Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy · (jiqizhixin) · (open-instruct - allenai)

Multilingual (Chinese)

Foundation

Xmodel-LM Technical Report, arXiv, 2406.02856, arxiv, pdf, citation: -1

Yichuan Wang, Yang Liu, Yu Yan, Xucheng Huang, Ling Jiang

· (XmodelLM - XiaoduoAILab)
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series, arXiv, 2405.19327, arxiv, pdf, citation: -1

Ge Zhang, Scott Qu, Jiaheng Liu, Chenchen Zhang, Chenghua Lin, Chou Leuang Yu, Danny Pan, Esther Cheng, Jie Liu, Qunshu Lin · (map-neo.github)
Yuan 2.0-M32: Mixture of Experts with Attention Router, arXiv, 2405.17976, arxiv, pdf, citation: -1

Shaohua Wu, Jiangang Luo, Xi Chen, Lingjun Li, Xudong Zhao, Tong Yu, Chao Wang, Yue Wang, Fei Wang, Weixu Qiao

· (huggingface)
ChuXin: 1.6B Technical Report, arXiv, 2405.04828, arxiv, pdf, citation: -1

Xiaomin Zhuang, Yufan Jiang, Qiaozhi He, Zhihua Wu
Tele-FLM Technical Report, arXiv, 2404.16645, arxiv, pdf, citation: -1

Xiang Li, Yiqun Yao, Xin Jiang, Xuezhi Fang, Chao Wang, Xinzhang Liu, Zihan Wang, Yu Zhao, Xin Wang, Yuyao Huang
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model, arXiv, 2404.04167, arxiv, pdf, citation: -1

Xinrun Du, Zhouliang Yu, Songyang Gao, Ding Pan, Yuyang Cheng, Ziyang Ma, Ruibin Yuan, Xingwei Qu, Jiaheng Liu, Tianyu Zheng · (huggingface)
Mengzi3 - Langboat

· (qbitai)
MiniCPM - OpenBMB

MiniCPM-2.4B: An end-side LLM that outperforms Llama2-13B.

· (huggingface)

· (shengdinghu.notion)
iFlytekSpark-13B: iFlytek Spark open-source 13B model (iFlytekSpark-13B)
Orion-14B: Open-source Multilingual Large Language Models, arXiv, 2401.12246, arxiv, pdf, citation: -1

Du Chen, Yi Huang, Xiaopu Li, Yongqiang Li, Yongqiang Liu, Haihui Pan, Leichao Xu, Dacheng Zhang, Zhipeng Zhang, Kun Han
Orion - OrionStarAI

Orion-14B is a family of models that includes a 14B-parameter multilingual foundation LLM and a series of derived models: a chat model, a long-context model, a quantized model, a RAG fine-tuned model, and an Agent fine-tuned model. · (Orion - OrionStarAI)
TeleChat Technical Report, arXiv, 2401.03804, arxiv, pdf, citation: -1

Zihan Wang, Xinzhang Liu, Shixuan Liu, Yitong Yao, Yuyao Huang, Zhongjiang He, Xuelong Li, Yongxiang Li, Zhonghao Che, Zhaoxi Zhang
YAYI 2: Multilingual Open-Source Large Language Models, arXiv, 2312.14862, arxiv, pdf, citation: -1

Yin Luo, Qingchao Kong, Nan Xu, Jia Cao, Bao Hao, Baoyu Qu, Bo Chen, Chao Zhu, Chenyang Zhao, Donglei Zhang
SeaLLMs -- Large Language Models for Southeast Asia, arXiv, 2312.00738, arxiv, pdf, citation: -1

Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu

· (SeaLLMs - DAMO-NLP-SG)
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention, arXiv, 2311.15786, arxiv, pdf, citation: -1

Shaohua Wu, Xudong Zhao, Shenling Wang, Jiangang Luo, Lingjun Li, Xi Chen, Bing Zhao, Wei Wang, Tong Yu, Rongguo Zhang · (Yuan-2.0 - IEIT-Yuan)
Ziya2: Data-centric Learning is All LLMs Need, arXiv, 2311.03301, arxiv, pdf, citation: -1

Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, Xiaojun Wu, Dixiang Zhang, Kunhao Pan, Ping Yang, Qi Yang, Jiaxing Zhang

· (huggingface)
Skywork: A More Open Bilingual Foundation Model, arXiv, 2310.19341, arxiv, pdf, citation: 1

Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu · (jiqizhixin) · (qbitai) · (skywork - skyworkai)
Aquila2 - FlagAI-Open

The official repo of Aquila2 series proposed by BAAI, including pretrained & chat large language models. · (mp.weixin.qq)
ColossalAI - hpcaitech

Making large AI models cheaper, faster and more accessible · (qbitai)
VisCPM - OpenBMB

A series of Chinese-English bilingual multimodal large models built on the CPM foundation models · (jiqizhixin)

Yi-01

Yi-1.5 (2024/05) - a 01-ai Collection
Yi: Open Foundation Models by 01.AI, arXiv, 2403.04652, arxiv, pdf, citation: -1

01.AI: Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu
Yi-9B - 01-ai 🤗
Yi - 01-ai

A series of large language models trained from scratch by developers @01-ai

· (jiqizhixin)

InternLM

InternLM-Math - InternLM

· (huggingface)
internlm2-math-plus-mixtral8x22b - internlm 🤗
InternLM2 Technical Report, arXiv, 2403.17297, arxiv, pdf, citation: -1

Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu
InternLM - InternLM

InternLM has open-sourced a 7 billion parameter base model, a chat model tailored for practical scenarios and the training system. · (qbitai) · (qbitai)

· (mp.weixin.qq) · (huggingface)
internlm2-chat-7b - internlm 🤗

DeepSeek

DeepSeek-V2 - deepseek-ai

· (DeepSeek-V2 - deepseek-ai)
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence, arXiv, 2401.14196, arxiv, pdf, citation: -1

Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, Xiao Bi, Y. Wu, Y. K. Li
DeepSeek-MoE - deepseek-ai

· (huggingface)
DeepSeek-LLM - deepseek-ai

DeepSeek LLM: Let there be answers · (huggingface) · (mp.weixin.qq)

Xverse

GitHub - xverse-ai/XVERSE-65B: XVERSE-65B: A multilingual large language model developed by XVERSE Technology Inc.

· (huggingface) · (jiqizhixin)
XVERSE-13B - xverse-ai

XVERSE-13B: A multilingual large language model developed by XVERSE Technology Inc. · (qbitai) · (huggingface)
xverse/XVERSE-13B-256K · Hugging Face

Qwen

Qwen2 - QwenLM

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.
Hello Qwen2 | Qwen
Qwen2-72B-Instruct - Qwen 🤗
Qwen1.5-110B-Chat · Model Hub
Qwen1.5-110B-Chat-demo - Qwen 🤗
Qwen1.5-32B - Qwen 🤗
Qwen1.5-32B: Fitting the Capstone of the Qwen1.5 Language Model Series | Qwen
Qwen-Agent - QwenLM

Agent framework and applications built upon Qwen1.5, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
Qwen1.5 - QwenLM

Qwen1.5 is the improved version of Qwen, the large language model series developed by Qwen team, Alibaba Cloud. · (qwenlm.github)
Qwen - QwenLM

The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
- A free, open-source, Chinese-developed 72B-parameter model is here! Benchmarked against Llama2 70B, with a first-hand test | QbitAI
Qwen-7B - QwenLM

The official repo of Qwen-7B (通义千问-7B) chat & pretrained large language model proposed by Alibaba Cloud. · (mp.weixin.qq) · (qbitai)
Qwen1.5-MoE-A2.7B - Qwen 🤗
qwen1.5-MoE-A2.7B-Chat-demo - Qwen 🤗
Qwen-72B-Chat-Demo - Qwen 🤗
Quyen - a vilm Collection
d-Qwen1.5-0.5B - aloobun 🤗
Arcee-Spark - arcee-ai 🤗

Baichuan

Baichuan 2: Open Large-scale Language Models, arXiv, 2309.10305, arxiv, pdf, citation: 16

Aiyuan Yang, Bin Xiao, Bingning Wang, Borong Zhang, Ce Bian, Chao Yin, Chenxu Lv, Da Pan, Dian Wang, Dong Yan · (Baichuan2 - baichuan-inc) · (cdn.baichuan-ai) · (mp.weixin.qq) · (jiqizhixin)
Baichuan-13B - baichuan-inc

A 13B large language model developed by Baichuan Intelligent Technology · (mp.weixin.qq)
baichuan-7B - baichuan-inc

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

ChatGLM

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools, arXiv, 2406.12793, arxiv, pdf, citation: -1

Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao
GLM-4 - THUDM

GLM-4 series: Open Multilingual Multimodal Chat LMs · (huggingface)
ChatGLM3 - THUDM

ChatGLM3 series: Open Bilingual Chat LLMs · (qbitai)
ChatGLM2-6B - THUDM

ChatGLM2-6B: An Open Bilingual Chat LLM · (qbitai)
chatglm.cpp - li-plus

C++ implementation of ChatGLM-6B & ChatGLM2-6B
TigerBot - TigerResearch

TigerBot: A multi-language multi-task LLM · (qbitai)

Finetuning

Llama-Chinese - LlamaFamily

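The Llama Chinese community. A Llama3 online demo and fine-tuned models are now available, the latest Llama3 learning resources are collected in real time, and all code has been updated for Llama3. Aims to build the best Chinese Llama model, fully open source and available for commercial use.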
llama3-Chinese-chat - CrazyBoyM

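Llama3 Chinese repository (aggregated resources: interesting fine-tuned and modified weights from community members and vendors, plus training, inference, and deployment tutorial videos and docs)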
Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral, arXiv, 2403.01851, arxiv, pdf, citation: -1

Yiming Cui, Xin Yao · (Chinese-Mixtral - ymcui)
Aurora: Activating Chinese chat capability for Mistral-8x7B sparse Mixture-of-Experts through Instruction-Tuning, arXiv, 2312.14557, arxiv, pdf, citation: -1

Rongsheng Wang, Haoming Chen, Ruizhe Zhou, Yaofei Duan, Kunyan Cai, Han Ma, Jiaxi Cui, Jian Li, Patrick Cheong-Iao Pang, Yapeng Wang

· (Aurora - WangRongsheng)
Taiwan-LLaMa - MiuLab

Traditional Mandarin LLMs for Taiwan
Chinese-LLaMA-Alpaca-2 - ymcui

Chinese LLaMA-2 & Alpaca-2 LLMs
TransGPT - DUOMO

· (jiqizhixin)
Llama2-Chinese - FlagAlpha

The Llama Chinese community: the best Chinese Llama model, fully open source and available for commercial use
Chinese-Llama-2-7b - LinkSoul-AI

The open-source community's first downloadable, runnable Chinese LLaMA2 model!
ChatGLM-Efficient-Tuning - hiyouga

Efficient fine-tuning of ChatGLM-6B with PEFT
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models, arXiv, 2306.10968, arxiv, pdf, citation: -1

Shaolei Zhang, Qingkai Fang, Zhuocheng Zhang, Zhengrui Ma, Yan Zhou, Langlin Huang, Mengyu Bu, Shangtong Gui, Yunji Chen, Xilin Chen · (jiqizhixin) · (BayLing - ictnlp) · (huggingface)

Other

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Gemini: A Family of Highly Capable Multimodal Models
LLMs-In-China - wgwang

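Large models in China
Chinese large language models take the exam: SenseTime, Shanghai AI Lab, and others release InternLM (书生·浦语) | Jiqizhixin
ChatGLM2 step-by-step fine-tuning tutorial | bilibili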

· (mp.weixin.qq)

Extra

LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages, arXiv, 2407.05975, arxiv, pdf, citation: -1

Yinquan Lu, Wenhao Zhu, Lei Li, Yu Qiao, Fei Yuan

· (LLaMAX - CONE-MT)
Aya 23: Open Weight Releases to Further Multilingual Progress, arXiv, 2405.15032, arxiv, pdf, citation: -1

Viraat Aryabumi, John Dang, Dwarak Talupuru, Saurabh Dash, David Cairuz, Hangyu Lin, Bharat Venkitesh, Madeline Smith, Kelly Marchisio, Sebastian Ruder
aya-101 - CohereForAI 🤗

· (huggingface) · (cohere)
SUTRA: Scalable Multilingual Language Model Architecture, arXiv, 2405.06694, arxiv, pdf, citation: -1

Abhijit Bendale, Michael Sapienza, Steven Ripplinger, Simon Gibbs, Jaewon Lee, Pranav Mistry
SambaLingo: Teaching Large Language Models New Languages, arXiv, 2404.05829, arxiv, pdf, citation: -1

Zoltan Csaki, Bo Li, Jonathan Li, Qiantong Xu, Pian Pawakapan, Leon Zhang, Yun Du, Hengyu Zhao, Changran Hu, Urmish Thakker
Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order, arXiv, 2404.00399, arxiv, pdf, citation: -1

Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo · (huggingface)
Sailor: Open Language Models for South-East Asia, arXiv, 2404.03608, arxiv, pdf, citation: -1

Longxu Dou, Qian Liu, Guangtao Zeng, Jia Guo, Jiahui Zhou, Wei Lu, Min Lin · (twitter)
- Learning rate can have an even more impact on the dreaded phenomenon known as catastrophic forgetting?
HyperCLOVA X Technical Report, arXiv, 2404.01954, arxiv, pdf, citation: -1

Kang Min Yoo, Jaegeun Han, Sookyo In, Heewon Jeon, Jisu Jeong, Jaewook Kang, Hyunwook Kim, Kyung-Min Kim, Munhyong Kim, Sungju Kim
Nemotron-4 15B Technical Report, arXiv, 2402.16819, arxiv, pdf, citation: -1

Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model, arXiv, 2402.07827, arxiv, pdf, citation: -1

Ahmet Üstün, Viraat Aryabumi, Zheng-Xin Yong, Wei-Yin Ko, Daniel D'souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid · (hf)
CroissantLLM: A Truly Bilingual French-English Language Model, arXiv, 2402.00786, arxiv, pdf, citation: -1

Manuel Faysse, Patrick Fernandes, Nuno Guerreiro, António Loison, Duarte Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro Martins
MaLA-500: Massive Language Adaptation of Large Language Models, arXiv, 2401.13303, arxiv, pdf, citation: -1

Peiqin Lin, Shaoxiong Ji, Jörg Tiedemann, André F. T. Martins, Hinrich Schütze · (huggingface)
Multilingual Instruction Tuning With Just a Pinch of Multilinguality, arXiv, 2401.01854, arxiv, pdf, citation: -1

Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
LLaMA Beyond English: An Empirical Study on Language Capability Transfer, arXiv, 2401.01055, arxiv, pdf, citation: -1

Jun Zhao, Zhihao Zhang, Qi Zhang, Tao Gui, Xuanjing Huang
2023, year of open LLMs
FinGPT: Large Generative Models for a Small Language, arXiv, 2311.05640, arxiv, pdf, citation: -1

Risto Luukkonen, Ville Komulainen, Jouni Luoma, Anni Eskelinen, Jenna Kanerva, Hanna-Mari Kupari, Filip Ginter, Veronika Laippala, Niklas Muennighoff, Aleksandra Piktus · (turkunlp)

Toolkits

LLMZoo - FreedomIntelligence

⚡LLM Zoo is a project that provides data, models, and evaluation benchmark for large language models.⚡

Products

The Claude 3 Model Family: Opus, Sonnet, Haiku

· (anthropic)

Extra reference

The Foundation Model Transparency Index
Ecosystem Graphs for Foundation Models
LLM Collection | Prompt Engineering Guide
open-llms - eugeneyan

📋 A list of open LLMs available for commercial use.
List of Open Sourced Fine-Tuned Large Language Models (LLM) | by Sung Kim | Medium
Awesome-Chinese-LLM - HqWu-HITCS

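A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and trained at lower cost, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.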
self-llm - datawhalechina

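"A Guide to Using Open-Source Large Models": quick deployment of open-source LLMs on AutoDL, with deployment tutorials tailored to users in China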
