
Xmodel_LM-1.1B


🌟 Introduction

We introduce Xmodel-LM, a compact and efficient 1.1B-parameter language model pre-trained on around 2 trillion tokens from our self-built dataset (Xdata), which balances Chinese and English corpora according to downstream task optimization. Despite its small size, Xmodel-LM exhibits remarkable performance, notably surpassing existing open-source language models of similar scale.

📊 Benchmark

Commonsense Reasoning

| Model | ARC-c | ARC-e | BoolQ | HellaSwag | OpenBookQA | PIQA | SciQ | TriviaQA | Winogrande | Avg |
|---|---|---|---|---|---|---|---|---|---|---|
| OPT-1.3B | 23.29 | 57.03 | 57.80 | 41.52 | 23.20 | 71.71 | 84.30 | 7.48 | 59.59 | 47.32 |
| Pythia-1.4B | 25.60 | 57.58 | 60.34 | 39.81 | 20.20 | 71.06 | 85.20 | 5.01 | 56.20 | 47.00 |
| TinyLLaMA-3T-1.1B | 27.82 | 60.31 | 57.83 | 44.98 | 21.80 | 73.34 | 88.90 | 11.30 | 59.12 | 48.59 |
| MobileLLaMA-1.4B | 26.28 | 61.32 | 57.92 | 42.87 | 23.60 | 71.33 | 87.40 | 12.02 | 58.25 | 49.00 |
| Qwen1.5-1.8B | 32.25 | 64.69 | 66.48 | 45.49 | 23.80 | 73.45 | 92.90 | 1.01 | 61.17 | 51.25 |
| Xmodel-LM-1.1B | 28.16 | 62.29 | 61.44 | 45.96 | 24.00 | 72.03 | 89.70 | 18.46 | 60.62 | 51.41 |
| H2O-danube-1.8B | 32.94 | 67.42 | 65.75 | 50.85 | 27.40 | 75.73 | 91.50 | 25.05 | 62.35 | 55.44 |
| InternLM2-1.8B | 37.54 | 70.20 | 69.48 | 46.52 | 24.40 | 75.57 | 93.90 | 36.67 | 65.67 | 57.77 |

Problem Solving

| Model | BBH (3-shot) | GLUE (5-shot) | GSM8K (5-shot) | MMLU (5-shot) | Avg | Avg w/o GSM8K |
|---|---|---|---|---|---|---|
| OPT-1.3B | 22.67 | 51.06 | 0.83 | 26.70 | 25.32 | 33.48 |
| Pythia-1.4B | 25.37 | 52.23 | 1.63 | 25.40 | 26.16 | 34.33 |
| MobileLLaMA-1.4B | 23.48 | 43.34 | 1.44 | 24.60 | 23.22 | 30.47 |
| TinyLLaMA-3T-1.1B | 26.75 | 48.25 | 1.97 | 25.70 | 25.67 | 33.57 |
| H2O-danube-1.8B | 27.31 | 49.83 | 1.90 | 25.70 | 26.19 | 34.28 |
| Xmodel-LM-1.1B | 27.34 | 52.61 | 2.58 | 25.90 | 27.11 | 35.28 |
| InternLM2-1.8B | 16.86 | 58.96 | 23.50 | 42.00 | 35.34 | 39.27 |
| Qwen1.5-1.8B | 13.84 | 64.57 | 33.59 | 45.10 | 39.28 | 41.17 |

Chinese Ability

| Model | ARC-zh | XCOPA-zh | XNLI-zh | Avg |
|---|---|---|---|---|
| OPT-1.3B | 18.80 | 53.00 | 33.45 | 35.08 |
| Pythia-1.4B | 21.03 | 52.60 | 34.06 | 35.90 |
| MobileLLaMA-1.4B | 20.26 | 52.80 | 33.82 | 35.63 |
| TinyLLaMA-3T-1.1B | 21.37 | 56.80 | 33.25 | 37.14 |
| H2O-danube-1.8B | 21.79 | 55.60 | 34.74 | 37.38 |
| Xmodel-LM-1.1B | 26.24 | 60.60 | 36.02 | 40.95 |
| InternLM2-1.8B | 27.69 | 66.80 | 34.58 | 43.00 |
| Qwen1.5-1.8B | 32.14 | 66.00 | 39.28 | 45.81 |
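
In principle, the commonsense scores above can be reproduced with EleutherAI's lm-evaluation-harness (pip install lm-eval). The sketch below is a minimal, unverified example: the local weight path, the trust_remote_code flag, and the default few-shot settings are assumptions, and exact numbers will vary with harness version and configuration.

    # Hedged sketch: score a local checkpoint on the commonsense-reasoning
    # tasks with EleutherAI's lm-evaluation-harness (lm-eval >= 0.4).
    # "path/to/folder" and trust_remote_code=True are assumptions.
    import lm_eval

    results = lm_eval.simple_evaluate(
        model="hf",
        model_args="pretrained=path/to/folder,trust_remote_code=True",
        tasks=["arc_challenge", "arc_easy", "boolq", "hellaswag",
               "openbookqa", "piqa", "sciq", "triviaqa", "winogrande"],
    )
    # Print the per-task metric dictionaries reported by the harness.
    for task, metrics in results["results"].items():
        print(task, metrics)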

🛠️ Install

  1. Clone this repository and navigate to the XmodelLM folder

    git clone https://github.com/XiaoduoAILab/XmodelLM.git
    cd XmodelLM
  2. Install the required packages

    pip install -r requirements.txt

🗝️ Quick Start

Download the Xmodel_LM model

Our model files are fully open source on Hugging Face; you can download them here.

Example of Xmodel_LM model inference

Download the model files first and save them to a local folder. Then run the script below; we recommend passing an absolute path as the --model_path parameter.

    python generate.py --model_path path/to/folder --device cuda:0
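
Alternatively, the checkpoint should be loadable directly with Hugging Face transformers. The following is a minimal sketch, assuming the model exposes the standard AutoModelForCausalLM interface and that its custom modeling code needs trust_remote_code=True; the prompt and generation settings are illustrative only.

    # Minimal inference sketch with Hugging Face transformers.
    # Assumptions: the local folder holds the downloaded checkpoint and
    # the custom architecture requires trust_remote_code=True.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_path = "path/to/folder"  # absolute path recommended
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,
        trust_remote_code=True,
    ).to("cuda:0")

    prompt = "The capital of France is"
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))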

✏️ Reference

If you find Xmodel_LM useful in your research or applications, please consider giving the repository a star ⭐ and citing it with the following BibTeX:

@misc{wang2024xmodellm,
      title={Xmodel-LM Technical Report}, 
      author={Yichuan Wang and Yang Liu and Yu Yan and Qun Wang and Shulei Wu and Xucheng Huang and Ling Jiang},
      year={2024},
      eprint={2406.02856},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
