---
layout: default
title: google_t5-v1_1-base
parent: Rankings
has_children: true
---
[comment]: # (This page contains a link to a table with the ranking and performance of all ranked google_t5-v1_1-base models. In addition, it contains a table with the baseline and the 10 best models. The original ranking was produced by fine-tuning only the classification head of each model (linear probing) on the MNLI dataset. The best models under this ranking were then re-ranked by their average accuracy after fine-tuning on all 36 datasets, using the Spearman correlation instead of accuracy for the stsb dataset.)
Ranking and performance of all 140 ranked google_t5-v1_1-base models (full table). The top 45 models were fully tested.
Notes:
- The baseline results can be found here
- While the average improvement is small, many datasets show large gains
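The `avg` column is the mean of the 36 per-dataset scores, where each dataset contributes its test accuracy except stsb, which contributes the Spearman correlation. The sketch below illustrates that aggregation; the helper implementations and toy numbers are illustrative assumptions, not the evaluation harness used to build the table.

```python
def accuracy(preds, labels):
    """Percent of exact matches between predictions and labels."""
    return 100.0 * sum(p == y for p, y in zip(preds, labels)) / len(labels)

def spearman(xs, ys):
    """Spearman correlation = Pearson correlation of the ranks (no tie handling)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    mean = (len(xs) - 1) / 2.0
    cov = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    var = sum((a - mean) ** 2 for a in rx)
    return cov / var

def average_score(per_dataset):
    """per_dataset maps dataset name -> accuracy (Spearman * 100 for stsb)."""
    return sum(per_dataset.values()) / len(per_dataset)

# Hypothetical three-dataset subset, just to show the aggregation:
print(round(average_score({"mnli": 87.19, "stsb": 89.60, "rte": 85.56}), 2))  # 87.45
```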
rank | model_name | avg | mnli_lp | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
baseline | google/t5-v1_1-base | 68.82 | nan | 82.88 | 88.18 | 66.91 | 38.06 | 65.57 | 55.45 | 70.18 | 40.50 | 70.77 | 85.58 | 66.74 | 92.99 | 71.06 | 75.51 | 72.83 | 56.14 | 68.08 | 89.37 | 83.60 | 86.05 | 60.58 | 93.72 | 51.84 | 68.79 | 93.25 | 82.07 | 33.46 | 75.61 | 51.52 | 67.62 | 82.61 | 69.88 | 55.84 | 46.90 | 48.32 | 69.26 |
1 | shaiman12/flan-t5-base-samsum | 78.18 | 84.77 | 86.92 | 89.87 | 66.60 | 52.91 | 82.29 | 80.36 | 80.35 | 67.00 | 76.53 | 90.20 | 86.40 | 93.26 | 72.95 | 87.19 | 89.46 | 62.27 | 84.62 | 93.39 | 89.49 | 90.06 | 85.56 | 94.38 | 57.92 | 89.60 | 97.40 | 92.60 | 46.85 | 80.86 | 48.62 | 74.62 | 83.84 | 70.95 | 70.06 | 56.34 | 69.23 | 73.43 |
2 | shaiman12/flan-t5-base-samsum | 78.18 | 0.00 | 86.92 | 89.87 | 66.60 | 52.91 | 82.29 | 80.36 | 80.35 | 67.00 | 76.53 | 90.20 | 86.40 | 93.26 | 72.95 | 87.19 | 89.46 | 62.27 | 84.62 | 93.39 | 89.49 | 90.06 | 85.56 | 94.38 | 57.92 | 89.60 | 97.40 | 92.60 | 46.85 | 80.86 | 48.62 | 74.62 | 83.84 | 70.95 | 70.06 | 56.34 | 69.23 | 73.43 |
3 | emozilla/flan-t5-base-sat-reading-comprehension | 78.07 | 85.01 | 87.11 | 89.97 | 86.90 | 52.75 | 82.29 | 78.57 | 80.35 | 69.00 | 76.83 | 52.75 | 46.49 | 93.27 | 72.82 | 93.45 | 87.50 | 62.27 | 88.46 | 86.64 | 89.71 | 89.59 | 56.34 | 67.14 | 93.46 | 89.61 | 97.80 | 92.40 | 80.93 | 50.91 | 76.28 | 83.95 | 70.74 | 87.10 | 69.75 | 90.01 | 64.42 | 72.87 |
4 | google/flan-t5-base | 77.98 | 84.92 | 86.22 | 89.67 | 67.12 | 51.97 | 82.32 | 78.57 | 80.15 | 75.00 | 77.67 | 90.95 | 85.40 | 93.32 | 72.43 | 87.25 | 89.46 | 62.38 | 82.69 | 92.79 | 89.77 | 89.02 | 84.84 | 94.38 | 57.29 | 89.48 | 97.20 | 92.80 | 46.85 | 80.23 | 54.98 | 76.66 | 84.30 | 70.64 | 70.06 | 56.34 | 53.85 | 73.40 |
5 | shri07/babi_qa | 77.87 | 84.92 | 87.28 | 89.93 | 87.30 | 53.09 | 82.17 | 76.79 | 81.02 | 71.00 | 76.27 | 53.09 | 47.24 | 93.24 | 73.27 | 93.36 | 85.54 | 62.25 | 86.54 | 84.48 | 89.79 | 89.31 | 60.56 | 67.20 | 94.38 | 89.57 | 97.60 | 92.20 | 81.56 | 49.26 | 72.58 | 83.49 | 70.92 | 87.40 | 69.75 | 90.26 | 59.62 | 74.00 |
6 | andreaparker/flan-t5-base-samsum | 77.86 | 84.76 | 86.43 | 89.83 | 67.10 | 52.59 | 82.17 | 80.36 | 80.54 | 66.00 | 76.50 | 90.89 | 86.70 | 93.04 | 71.64 | 87.25 | 88.73 | 62.13 | 91.35 | 93.30 | 89.14 | 89.59 | 84.48 | 93.58 | 56.97 | 89.37 | 97.40 | 93.00 | 46.33 | 81.63 | 51.48 | 74.74 | 84.77 | 69.88 | 67.87 | 56.34 | 57.69 | 72.30 |
7 | talhaa/flant5 | 77.86 | 84.74 | 87.07 | 89.53 | 67.14 | 52.19 | 82.84 | 78.57 | 80.15 | 70.00 | 77.27 | 90.70 | 84.90 | 93.51 | 72.49 | 87.48 | 86.27 | 61.84 | 87.50 | 93.12 | 90.72 | 89.68 | 85.92 | 93.81 | 56.56 | 89.44 | 97.40 | 91.60 | 47.05 | 80.51 | 52.59 | 74.87 | 84.77 | 71.76 | 68.81 | 56.34 | 55.77 | 72.63 |
8 | mrm8488/flan-t5-base-finetuned-gsm8k | 77.84 | 84.55 | 86.62 | 89.57 | 66.84 | 53.12 | 82.69 | 78.57 | 79.96 | 67.00 | 75.93 | 90.36 | 87.40 | 93.31 | 72.29 | 87.93 | 88.97 | 61.96 | 86.54 | 93.08 | 90.07 | 89.21 | 84.84 | 94.15 | 56.70 | 89.40 | 97.20 | 90.80 | 47.34 | 81.77 | 49.76 | 78.32 | 85.35 | 71.16 | 69.44 | 56.34 | 55.77 | 72.63 |
9 | ybagoury/flan-t5-base-tldr_news | 77.83 | 84.80 | 86.79 | 89.90 | 66.70 | 51.44 | 81.96 | 76.79 | 81.21 | 70.00 | 77.23 | 90.98 | 87.90 | 93.43 | 73.01 | 87.17 | 87.75 | 61.82 | 84.62 | 93.34 | 90.29 | 89.49 | 83.75 | 94.27 | 57.15 | 89.67 | 97.20 | 92.80 | 47.40 | 80.37 | 50.00 | 75.77 | 83.72 | 71.26 | 68.81 | 56.34 | 58.65 | 72.97 |
10 | spacemanidol/flan-t5-base-cnndm | 77.75 | 84.22 | 85.91 | 89.83 | 66.98 | 51.38 | 81.93 | 80.36 | 80.54 | 64.00 | 76.63 | 90.38 | 84.70 | 93.20 | 72.82 | 86.65 | 89.22 | 61.61 | 89.42 | 93.34 | 89.85 | 89.49 | 80.51 | 94.04 | 55.66 | 88.89 | 97.80 | 91.40 | 46.51 | 80.79 | 50.64 | 75.51 | 84.42 | 70.10 | 68.65 | 56.34 | 66.35 | 73.30 |
Download the full model ranking table: [csv](./results/google_t5-v1_1-base_table.csv)
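Once downloaded, the CSV can be explored with nothing but the Python standard library. The two-row sample below mimics the table's column layout; the exact header of `google_t5-v1_1-base_table.csv` is an assumption, and the real file also carries one column per evaluation dataset.

```python
import csv
import io

# Hypothetical sample rows in the same shape as the ranking table above.
sample = """model_name,avg,mnli_lp
shaiman12/flan-t5-base-samsum,78.18,84.77
google/flan-t5-base,77.98,84.92
"""

rows = list(csv.DictReader(io.StringIO(sample)))
# Sort by the fine-tuned average, best first, as the page does.
rows.sort(key=lambda r: float(r["avg"]), reverse=True)
print(rows[0]["model_name"])  # shaiman12/flan-t5-base-samsum
```

To run this against the real file, replace `io.StringIO(sample)` with an open file handle for the downloaded CSV.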