---
layout: default
title: Home
nav_order: 0
image: Twitter_card.png
description: Model-recycling - the best model per architecture. Comparing finetuned models from HF, as base models for future finetuning.
---

# Welcome to the model-recycling page

Hardly anyone trains from scratch anymore; we all finetune over a pretrained model.

Research is slowly reaching a consensus that some finetuned models make better base models than the pretrained models themselves.

This site presents a dynamic view of the best models to choose for a given model size and architecture. We follow the findings and methodology from our paper: for each architecture, we download finetuned models found on the Hugging Face Hub and efficiently rank them on a representative task. We then evaluate the top-ranked models by finetuning them on a large set of 36 target tasks, and report the average performance of each base model.
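As a rough illustration of the first step, candidate finetuned models for a given architecture can be enumerated through the Hub API. The snippet below is a minimal sketch only, not our actual ranking pipeline; it assumes the `huggingface_hub` client and uses the architecture tag (e.g. `roberta`) as a filter.

```python
# Minimal sketch: enumerate candidate finetuned models for one architecture
# on the Hugging Face Hub. An illustration, not our exact pipeline.
from huggingface_hub import HfApi

api = HfApi()

# Models on the Hub are tagged with their architecture family (e.g. "roberta");
# sorting by downloads surfaces popular finetuned variants first.
candidates = api.list_models(
    filter="roberta",
    sort="downloads",
    direction=-1,
    limit=20,
)

for model in candidates:
    print(model.id)  # e.g. "ibm/ColD-Fusion"
```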

Models tested so far: 2685 (and counting)

## Best models per architecture


| Pretrained | Best model | Best model avg. | Pretrained avg. | Ranking |
|------------|------------|-----------------|-----------------|---------|
| roberta-base | ibm/ColD-Fusion | 78.47 | 76.22 | link |
| bert-base-uncased | ibm/ColD-Fusion-bert-base-uncased-itr23-seed0 | 75.64 | 72.20 | link |
| bert-base-cased | skim945/bert-finetuned-squad | 74.43 | 72.43 | link |
| t5-base | adit94/nlpcharade | 78.23 | 75.45 | link |
| google/t5-v1_1-base | shaiman12/flan-t5-base-samsum | 78.18 | 68.82 | link |
| microsoft/deberta-v3-base | sileod/deberta-v3-base-tasksource-nli | 80.73 | 79.04 | link |

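To reuse a model from the table as a base for your own finetuning, load it in place of the original pretrained checkpoint. The snippet below is a minimal sketch with `transformers`, assuming a hypothetical 3-label classification target task.

```python
# Sketch: using a recycled model from the table above as the base checkpoint
# for a new finetuning run, in place of the original pretrained model.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base = "ibm/ColD-Fusion"  # drop-in replacement for "roberta-base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=3,                  # set to your target task's label count
    ignore_mismatched_sizes=True,  # discard the old head if its shape differs
)
# ...continue with your usual Trainer / training loop on the target task.
```

Everything downstream, from tokenization to the training loop, stays the same as when starting from the original pretrained checkpoint.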

To learn more, see our FAQ or read the paper. Detailed evaluation results for each architecture are available here. If you have any feedback or questions, please contact us.

This work was performed at IBM Research by Leshem Choshen, Elad Venezian, Shachar Don-Yehiya, Noam Slonim and Yoav Katz.