mod: README.md
duterscmy committed Nov 22, 2024
1 parent 5fa0652 commit 13cede9
Showing 2 changed files with 20 additions and 3 deletions.
19 changes: 18 additions & 1 deletion README.md
@@ -80,4 +80,21 @@ For some intermediate variables, we provide some already generated results. The

## Evaluation

-TODO
+Install [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
+Evaluate the pruned model:
+```bash
+lm_eval --model hf \
+--model_args $modelpath \
+--tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+--device cuda:0 \
+--batch_size 8
+```
+Evaluate the fine-tuned model:
+```bash
+lm_eval --model hf \
+--model_args $modelpath \
+--tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+--device cuda:0 \
+--batch_size 8 \
+--ignore_mismatched_sizes
+```
4 changes: 2 additions & 2 deletions cd-moe/finetune/finetune.py
@@ -32,7 +32,7 @@
help="finetune data")
parser.add_argument("--c4-input", default="datasets/c4-train.00000-of-01024.1w.json",
help="finetune data")
-parser.add_argument("--input-name", default="",
+parser.add_argument("--input-name", default="c4",
help="finetune data name")
parser.add_argument("--model", default="./deepseek",
help="path to the pretrained model")
@@ -46,7 +46,7 @@
help="defaults to the number of layers in qw16B") # deepseek 27 qw24
parser.add_argument("--num-expert", type=int, default=64, help="defaults to the number of experts in qw16B")

-parser.add_argument("--score-mode", type=str, default="l1", help="metric for ranking experts across layers")
+parser.add_argument("--score-mode", type=str, default="greedy_jl", help="metric for ranking experts across layers")
parser.add_argument("--prune-num-expert", default=6, type=int,
help="number of experts remaining after pruning")
parser.add_argument("--prune-num-layer", default=9, type=int,
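The practical effect of the two default changes in `finetune.py` can be sketched with a stand-alone parser. This is a minimal illustration, not the script itself: `build_parser` is a hypothetical helper that reproduces only the options visible in this diff, and the override at the end assumes the script still accepts the previous values (`""` and `l1`).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser holding only the options shown in this diff."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--c4-input",
                        default="datasets/c4-train.00000-of-01024.1w.json",
                        help="finetune data")
    # Default changed in this commit: "" -> "c4"
    parser.add_argument("--input-name", default="c4",
                        help="finetune data name")
    parser.add_argument("--num-expert", type=int, default=64)
    # Default changed in this commit: "l1" -> "greedy_jl"
    parser.add_argument("--score-mode", type=str, default="greedy_jl")
    parser.add_argument("--prune-num-expert", default=6, type=int)
    parser.add_argument("--prune-num-layer", default=9, type=int)
    return parser


# With no flags, the new defaults apply.
args = build_parser().parse_args([])
print(args.input_name, args.score_mode)  # prints: c4 greedy_jl

# The previous behaviour now has to be requested explicitly.
overridden = build_parser().parse_args(["--input-name", "", "--score-mode", "l1"])
```

Runs that relied on the old defaults without passing `--input-name` or `--score-mode` will pick up `c4` and `greedy_jl` after this commit.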
