Actions: openai/evals

Run new evals

120 workflow runs

Drop two datasets from steganography
Run new evals #2210: Pull request #1477 synchronize by thesofakillers
March 12, 2024 07:44 2m 17s thesofakillers:change-steg-datasets
Drop two datasets from steganography
Run new evals #2209: Pull request #1477 synchronize by JunShern
March 12, 2024 04:45 2m 3s thesofakillers:change-steg-datasets
Adding Indian Women Menstrual Health Chatbot Eval
Run new evals #2204: Pull request #1430 synchronize by phalgunagopal
February 27, 2024 18:57 2m 8s cranberrydeveloper:main
Updates for Solvers
Run new evals #2201: Pull request #1461 synchronize by JunShern
January 29, 2024 05:04 2m 4s jun/solvers-update
Updates for Solvers
Run new evals #2200: Pull request #1461 synchronize by JunShern
January 29, 2024 04:48 2m 5s jun/solvers-update
Updates for Solvers
Run new evals #2198: Pull request #1461 synchronize by JunShern
January 26, 2024 16:47 2m 0s jun/solvers-update
Updates for Solvers
Run new evals #2197: Pull request #1461 synchronize by JunShern
January 26, 2024 08:25 1m 56s jun/solvers-update
Updates for Solvers
Run new evals #2196: Pull request #1461 opened by JunShern
January 26, 2024 08:22 2m 11s jun/solvers-update
Add eval yaml for Theory of Mind eval
Run new evals #2192: Pull request #1453 opened by ojaffe
January 8, 2024 10:37 1m 54s ojaffe:ollie/tom_fix
Add MMMU evals and runner
Run new evals #2189: Pull request #1442 synchronize by etr2460
December 21, 2023 00:19 2m 27s erik/mmmu
Add MMMU evals and runner
Run new evals #2188: Pull request #1442 synchronize by etr2460
December 20, 2023 22:36 1m 52s erik/mmmu
Add MMMU evals and runner
Run new evals #2187: Pull request #1442 opened by etr2460
December 20, 2023 22:17 1m 55s erik/mmmu
Add eval japanese prime minister
Run new evals #2186: Pull request #1422 synchronize by return-nil
December 16, 2023 02:07 1m 59s return-nil:japanese_prime_minister
Change wrong kwargs name
Run new evals #2184: Pull request #1435 opened by LoryPack
December 15, 2023 11:28 1m 58s LoryPack:fix_wrong_kwargs_name
Add eval japanese prime minister
Run new evals #2182: Pull request #1422 synchronize by return-nil
December 12, 2023 22:39 1m 56s return-nil:japanese_prime_minister
Adding Indian Women Menstrual Health Chatbot Eval
Run new evals #2181: Pull request #1430 opened by cranberrydeveloper
December 11, 2023 13:31 2m 1s cranberrydeveloper:main
icelandic gec eval
Run new evals #2179: Pull request #1400 synchronize by svanhvitlilja
December 6, 2023 15:16 1m 59s svanhvitlilja:gec-icelandic
Add eval japanese prime minister
Run new evals #2178: Pull request #1422 synchronize by return-nil
December 6, 2023 11:53 1m 54s return-nil:japanese_prime_minister
Upgrade openai to >=1.0.0
Run new evals #2176: Pull request #1420 synchronize by etr2460
December 5, 2023 02:11 2m 5s erik/openai-1.0
Upgrade openai to >=1.0.0
Run new evals #2175: Pull request #1420 synchronize by etr2460
December 5, 2023 01:57 2m 1s erik/openai-1.0
Upgrade openai to >=1.0.0
Run new evals #2174: Pull request #1420 synchronize by etr2460
December 5, 2023 01:16 2m 6s erik/openai-1.0
Upgrade openai to >=1.0.0
Run new evals #2173: Pull request #1420 synchronize by etr2460
December 5, 2023 00:51 1m 59s erik/openai-1.0