Commit

Add non-programmatic BIG-bench-hard tasks (#406)
* Support bigbench-hard json tasks using multiple_choice_grade

* Add support for greedy decoding in bigbench tasks

* move bigbench_resources to datasets

* rectify changes to rf.greedy_until with upstream

* make path to resource import reflect new location

---------
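The multiple_choice_grade metric named in the first bullet follows BIG-bench's convention: an example scores 1.0 when the model's highest-likelihood answer choice is marked correct, else 0.0. A minimal sketch (illustrative helper, not part of this diff):

```python
def multiple_choice_grade(log_likelihoods, is_correct):
    # BIG-bench-style scoring over answer choices: 1.0 when the single most
    # likely choice is a correct answer, 0.0 otherwise.
    best = max(range(len(log_likelihoods)), key=lambda i: log_likelihoods[i])
    return 1.0 if is_correct[best] else 0.0
```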

Co-authored-by: haileyschoelkopf <hailey.schoelkopf@yale.edu>
yurodiviy and haileyschoelkopf authored Apr 28, 2023
1 parent e47e01b commit 602abce
Showing 37 changed files with 699,688 additions and 19 deletions.
13 changes: 10 additions & 3 deletions lm_eval/base.py
@@ -342,18 +342,25 @@ def _collate(x):

         re_ord = utils.Reorderer(requests, _collate)

-        for context, until in tqdm(re_ord.get_reordered()):
+        for context, request_args in tqdm(re_ord.get_reordered()):
+            until = request_args['until']
             if isinstance(until, str):
                 until = [until]

-            (primary_until,) = self.tok_encode(until[0])
+            if until:
+                (primary_until,) = self.tok_encode(until[0])
+            else:
+                primary_until = None

             context_enc = torch.tensor(
                 [self.tok_encode(context)[self.max_gen_toks - self.max_length :]]
             ).to(self.device)

+            max_gen_tokens = min(
+                self.max_gen_toks, request_args.get('max_length', self.max_gen_toks)
+            )
             cont = self._model_generate(
-                context_enc, context_enc.shape[1] + self.max_gen_toks, primary_until
+                context_enc, context_enc.shape[1] + max_gen_tokens, primary_until
             )

             s = self.tok_decode(cont[0].tolist()[context_enc.shape[1] :])
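The new request-argument handling in this hunk can be exercised in isolation; a simplified sketch (hypothetical helper name, logic condensed from the diff):

```python
def resolve_generation_args(request_args, max_gen_toks=256):
    # Mirrors the logic this commit adds to greedy generation: normalize
    # `until` to a list, pick the primary stop sequence (or None when no
    # stop sequences are given), and cap generation length by an optional
    # per-request `max_length`.
    until = request_args['until']
    if isinstance(until, str):
        until = [until]
    primary_until = until[0] if until else None
    max_gen_tokens = min(max_gen_toks, request_args.get('max_length', max_gen_toks))
    return until, primary_until, max_gen_tokens
```

For example, a request of `{'until': '\n'}` is normalized to the list `['\n']` with the default token cap, while `{'until': ['.'], 'max_length': 64}` caps generation at 64 tokens.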
1,541 changes: 1,541 additions & 0 deletions lm_eval/datasets/bigbench_resources/causal_judgement.json

Large diffs are not rendered by default.
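The JSON resources added by this commit (such as causal_judgement.json) follow the BIG-bench task schema: a list of examples, each pairing an input with per-choice target scores that multiple_choice_grade consumes. An abridged, illustrative entry (invented text, not the file's actual contents):

```python
import json

# Illustrative BIG-bench-style task resource (schema shape only; the example
# text and scores here are made up for demonstration).
task = json.loads("""
{
  "name": "causal_judgement",
  "examples": [
    {"input": "How would a typical person answer this question about causation?",
     "target_scores": {"Yes": 1, "No": 0}}
  ]
}
""")
```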

