From 0689d2a689ea4266be62cf06241c55061285f865 Mon Sep 17 00:00:00 2001
From: Yifan Mai
Date: Thu, 26 Oct 2023 14:36:50 -0700
Subject: [PATCH 1/5] Release v0.3.0

---
 CHANGELOG.md | 57 ++++++++++++++++++++++++++++++++++++++++++++++++++--
 setup.cfg    |  2 +-
 2 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f250b5ea71..613959abd4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,58 @@
 
 ## [Upcoming]
 
+## [v0.3.0] - 2023-10-26
+
+### Models
+
+- Added support for Lit-GPT (#1792)
+- Added support for stop sequences in HuggingFaceClient (#1892, #1909)
+- Added Mistral 7B model (#1906)
+
+### Scenarios
+
+- Added 31 scenarios from [CLEVA](https://arxiv.org/abs/2308.04813) for evaluation of Chinese language models (#1824, #1864)
+
+### Metrics
+
+- Fixed a bug that prevented using Anthropic Claude for model critique (#1862)
+
+### Frontend
+
+- Added an React frontend (#1819, #1893)
+
+### Framework
+
+- Added support for multi-modal scenarios and Vision Language Model (VLM) evaluation (#1871)
+- Added support for Python 3.9 and 3.10 (#1897)
+
+### Evaluation Results
+
+- Added evaluation results for Stanford Alpaca, MosaicML MPT, TII UAE Falcon, LMSYS Vicuna
+
+### Contributors
+
+Thank you to the following contributors for your work on this HELM release!
+
+- @aniketmaurya
+- @Anindyadeep
+- @brianwgoldman
+- @drisspg
+- @farzaank
+- @fzyxh
+- @HenryHZY
+- @Jianqiao-Zhao
+- @JosselinSomervilleRoberts
+- @LoryPack
+- @lyy1994
+- @mkly
+- @msaroufim
+- @percyliang
+- @RossBencina
+- @teetone
+- @yifanmai
+- @zd11024
+
 ## [v0.2.4] - 2023-09-20
 
 ### Models
@@ -177,8 +229,9 @@ Thank you to the following contributors for your contributions to this HELM rele
 
 - Initial release
 
-[upcoming]: https://github.com/stanford-crfm/helm/compare/v0.2.4...HEAD
-[v0.2.3]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.4
+[upcoming]: https://github.com/stanford-crfm/helm/compare/v0.3.0...HEAD
+[v0.3.0]: https://github.com/stanford-crfm/helm/releases/tag/v0.3.0
+[v0.2.4]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.4
 [v0.2.3]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.3
 [v0.2.2]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.2
 [v0.2.1]: https://github.com/stanford-crfm/helm/releases/tag/v0.2.1
diff --git a/setup.cfg b/setup.cfg
index 83e33cf859..7b491c9019 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -1,6 +1,6 @@
 [metadata]
 name = crfm-helm
-version = 0.2.4
+version = 0.3.0
 author = Stanford CRFM
 author_email = contact-crfm@stanford.edu
 description = Benchmark for language models

From 6444e1f105af5cdf475d39e785c2e327ae82aa10 Mon Sep 17 00:00:00 2001
From: Yifan Mai
Date: Fri, 27 Oct 2023 18:07:38 -0700
Subject: [PATCH 2/5] Update changelog

---
 CHANGELOG.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 613959abd4..f70b6b3150 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,17 +2,20 @@
 
 ## [Upcoming]
 
-## [v0.3.0] - 2023-10-26
+## [v0.3.0] - 2023-10-27
 
 ### Models
 
 - Added support for Lit-GPT (#1792)
 - Added support for stop sequences in HuggingFaceClient (#1892, #1909)
 - Added Mistral 7B model (#1906)
+- Added IDEFICS model (#1871)
 
 ### Scenarios
 
 - Added 31 scenarios from [CLEVA](https://arxiv.org/abs/2308.04813) for evaluation of Chinese language models (#1824, #1864)
+- Added VQA scenario model (#1871)
+- Adddd support for running MCQA scenarios from users' JSONL files (#1889)
 
 ### Metrics
 
@@ -20,12 +23,15 @@
 
 ### Frontend
 
-- Added an React frontend (#1819, #1893)
+- Added a React frontend (#1819, #1893)
 
 ### Framework
 
 - Added support for multi-modal scenarios and Vision Language Model (VLM) evaluation (#1871)
 - Added support for Python 3.9 and 3.10 (#1897)
+- Added a new `Tokenizer` class in preparation for removing `tokenize()` and `decode()` from `Client` in a future release (#1874)
+- Made more dependencies optional instead of required and added install command suggestions (#1834, #1961)
+- Added support for configuring users' model deployments through YAML configuration files (#1861)
 
 ### Evaluation Results
 

From cb927817252d40a555a663320ec551209a65e5ea Mon Sep 17 00:00:00 2001
From: Yifan Mai
Date: Tue, 31 Oct 2023 19:18:30 -0700
Subject: [PATCH 3/5] Updated changelog

---
 CHANGELOG.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f70b6b3150..ddeb1926bf 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,7 @@
 - Added support for stop sequences in HuggingFaceClient (#1892, #1909)
 - Added Mistral 7B model (#1906)
 - Added IDEFICS model (#1871)
+- Added Anthropic Claude 2 (#1900)
 
 ### Scenarios
 

From 3368690da63f618498bd36d0f3ffd625163981f4 Mon Sep 17 00:00:00 2001
From: Yifan Mai
Date: Tue, 31 Oct 2023 20:08:17 -0700
Subject: [PATCH 4/5] Update date

---
 CHANGELOG.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index ddeb1926bf..4b392e9193 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,7 +2,7 @@
 
 ## [Upcoming]
 
-## [v0.3.0] - 2023-10-27
+## [v0.3.0] - 2023-10-31
 
 ### Models
 
@@ -31,7 +31,7 @@
 - Added support for multi-modal scenarios and Vision Language Model (VLM) evaluation (#1871)
 - Added support for Python 3.9 and 3.10 (#1897)
 - Added a new `Tokenizer` class in preparation for removing `tokenize()` and `decode()` from `Client` in a future release (#1874)
-- Made more dependencies optional instead of required and added install command suggestions (#1834, #1961)
+- Made more dependencies optional instead of required, and added install command suggestions (#1834, #1961)
 - Added support for configuring users' model deployments through YAML configuration files (#1861)
 
 ### Evaluation Results

From 6b392f25ee4e2c8e21e6daea15af320046b02905 Mon Sep 17 00:00:00 2001
From: Yifan Mai
Date: Wed, 1 Nov 2023 09:43:04 -0700
Subject: [PATCH 5/5] Update date again

---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4b392e9193..a67550d266 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,7 +2,7 @@
 
 ## [Upcoming]
 
-## [v0.3.0] - 2023-10-31
+## [v0.3.0] - 2023-11-01
 
 ### Models
 