
single GPU automatic batching logic #394

Merged: 12 commits merged into EleutherAI:master on May 3, 2023

Conversation

@fattorib (Contributor) commented Mar 9, 2023

This PR addresses the single GPU component of https://github.com/EleutherAI/lm-eval2/issues/7.

The current method adds extra logic to determine a maximum batch size based on the longest sample across all provided tasks, using find_executable_batch_size from Accelerate to find the largest batch size that runs without OOM. Tested with both the gpt2 and hf-causal model classes on both loglikelihood_rolling and _loglikelihood_tokens requests.

Example use:

python main.py \
	--model gpt2 \
	--tasks lambada_openai,wikitext \
	--batch_size auto 

or

python main.py \
	--model hf-causal \
	--model_args pretrained=EleutherAI/pythia-70m \
	--tasks lambada_openai,wikitext \
	--batch_size auto

Output for both looks like:

...
Running loglikelihood requests
Passed argument batch_size = auto. Detecting largest batch size
Determined Largest batch size: 128
...
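For context, the detection relies on Accelerate's find_executable_batch_size utility, which reruns a decorated function with a halved batch size whenever it raises an out-of-memory error. Below is a minimal standalone sketch of the idea; the helper name detect_batch_size and its arguments are hypothetical, and the model, context length, and device are illustrative rather than the harness's actual internals.

    import torch
    import torch.nn.functional as F
    from accelerate.utils import find_executable_batch_size

    def detect_batch_size(model, max_length, device, starting_batch_size=512):
        # Retries with batch_size halved each time the body raises a CUDA OOM error.
        @find_executable_batch_size(starting_batch_size=starting_batch_size)
        def forward_batch(batch_size):
            # Dummy batch at the maximum context length, i.e. the worst case.
            test_batch = torch.ones((batch_size, max_length), device=device).long()
            # Assumes a Hugging Face causal LM whose forward pass returns .logits.
            F.log_softmax(model(test_batch).logits, dim=-1).cpu()
            return batch_size

        return forward_batch()

A call such as detect_batch_size(model, 2048, "cuda") would then return the largest batch size that completes a full-context forward pass without OOM.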

@haileyschoelkopf (Collaborator)
Thank you very much for this PR! I'll try to review this today.

@haileyschoelkopf (Collaborator)
What is the expected behavior / output of this PR when you test it?

When I test on my local machine, I get a RuntimeError: CUDA error: an illegal memory access was encountered, or a similar error message. I tried decreasing the starting batch size to one I knew would fit, but I got the same errors. So either the Accelerate decorator is causing an OOM that does not occur at the same batch size without it, or something else has gone wrong, possibly on my end, but I'm not sure.

@fattorib fattorib marked this pull request as draft March 14, 2023 15:13
@fattorib (Contributor, Author)

Marking this as draft until MultipleChoiceTask task types (e.g., Piqa) have been debugged.

@haileyschoelkopf (Collaborator)
Hi @fattorib , how's this PR going? Happy to take a look if things have changed in it, or to help debug / talk it through if you need!

@fattorib fattorib marked this pull request as ready for review April 17, 2023 09:15
@fattorib (Contributor, Author)
Updated the batch size logic to improve robustness. Tested and confirmed no OOMs across the following models on two different setups with a clean install:

  • EleutherAI/pythia-1b
  • EleutherAI/pythia-160m
  • gpt2

@CLAassistant commented Apr 23, 2023

CLA assistant check
All committers have signed the CLA.

@haileyschoelkopf (Collaborator)
I'll most likely be merging this PR over the weekend. After discussion with @fattorib, it seems that although I get errors in this mode on my local setup, this PR has been robust in his testing and in my remote testing across machines.

@juletx (Contributor) left a comment

I have tested this code and it works properly for me. All the results in this issue were calculated using this code: #443. I have added some comments to extend it to the greedy_until method in huggingface.py.

lm_eval/base.py (outdated review thread)
Comment on lines +191 to +206

# automatic batch size detection for vectorization
adaptive_batch_size = None
if self.batch_size == 'auto':
    # using rolling window with maximum context
    print('Passed argument batch_size = auto. Detecting largest batch size')

    @find_executable_batch_size(starting_batch_size=512)  # if OOM, then halves batch_size and tries again
    def forward_batch(batch_size):
        test_batch = torch.ones((batch_size, self.max_length), device=self.device).long()
        for _ in range(5):
            out = F.log_softmax(self._model_call(test_batch), dim=-1).cpu()
        return batch_size

    batch_size = forward_batch()
    print(f"Determined Largest batch size: {batch_size}")
    adaptive_batch_size = batch_size
@juletx (Contributor)

I have tested this and it works correctly. You can also add this logic to the greedy_until method in huggingface.py and it works.

@fattorib (Contributor, Author)

I added this in. What tasks did you test this on? Just want to make sure everything works.

@juletx (Contributor)

I just realized that you added the wrong code to the greedy_until method in huggingface.py. The correct code is the one in the loglikelihood_rolling method.

lm_eval/base.py (review thread)
@fattorib (Contributor, Author) commented May 2, 2023

Quoting @juletx: "I have tested this code and it works properly for me. All the results in this issue were calculated using this code: #443. I have added some comments to extend it to the greedy_until method in huggingface.py."

Thank you for the revisions and the extra verification that the code works. Will add these changes in now.

@StellaAthena (Member) left a comment

Excellent contribution! Thank you.

@StellaAthena StellaAthena merged commit 9a87719 into EleutherAI:master May 3, 2023
@fattorib fattorib deleted the auto-batching branch May 3, 2023 12:47
juletx added a commit to juletx/lm-evaluation-harness that referenced this pull request on May 3, 2023:
  Fix automatic batching for greedy_until
  Fix stop sequences bug for greedy_until
  Fix max length bug when model max length is not specified in tokenizer

StellaAthena added a commit that referenced this pull request on May 3, 2023:
  Fix bugs introduced in #394 #406 and max length bug

qmdnls pushed commits to qmdnls/lm-evaluation-harness that referenced this pull request on Aug 17, 2023

LZY-the-boys pushed commits to LZY-the-boys/lm-evaluation-harness-fast that referenced this pull request on Sep 12, 2023