single GPU automatic batching logic #394
Conversation
…_ll_tokens to avoid recomputation
Thank you very much for this PR! I'll try to review this today.
What is the expected behavior / output of this PR when you test it? When I test on my local machine, I get a
Marking this as draft until
Hi @fattorib, how's this PR going? Happy to take a look if things have changed in it, or to help debug / talk it through if you need!
Updated batch size logic to improve robustness. Tested and confirmed no OOMs across the following models on two different setups with a clean install:
I'll most likely be merging this PR over the weekend. After discussion with @fattorib, it seems that although I get errors in this mode on my local setup, this PR has been robust in his testing and in my remote testing across machines.
I have tested this code and it works properly for me. All the results in this issue were calculated using this code: #443. I have added some comments to extend it to the greedy_until method in huggingface.py.
```python
# automatic batch size detection for vectorization
adaptive_batch_size = None
if self.batch_size == 'auto':
    # using rolling window with maximum context
    print('Passed argument batch_size = auto. Detecting largest batch size')

    @find_executable_batch_size(starting_batch_size=512)  # if OOM, then halves batch_size and tries again
    def forward_batch(batch_size):
        # probe with a worst-case batch: every row filled to the model's maximum context length
        test_batch = torch.ones((batch_size, self.max_length), device=self.device).long()
        for _ in range(5):
            out = F.log_softmax(self._model_call(test_batch), dim=-1).cpu()
        return batch_size

    batch_size = forward_batch()
    print(f"Determined Largest batch size: {batch_size}")
    adaptive_batch_size = batch_size
```
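For readers unfamiliar with the decorator used above, here is a minimal, self-contained sketch (not part of the PR) of how Accelerate's find_executable_batch_size retries with a halved batch size whenever the wrapped function raises a CUDA out-of-memory error. The simulated OOM exists only so the example runs without a GPU, and it depends on Accelerate matching the error message text, which is an implementation detail.

```python
# Minimal sketch of the retry-on-OOM behavior of accelerate's find_executable_batch_size.
# The RuntimeError below merely imitates a CUDA OOM message so the example runs on CPU;
# in the PR, the OOM comes from a real forward pass through the model.
from accelerate.utils import find_executable_batch_size

@find_executable_batch_size(starting_batch_size=512)
def probe(batch_size):
    # pretend that anything larger than 64 sequences does not fit in memory
    if batch_size > 64:
        raise RuntimeError("CUDA out of memory. (simulated)")
    return batch_size

print(probe())  # tries 512 -> 256 -> 128 -> 64, then returns 64
```

In the PR's code above, the same mechanism is driven by real forward passes: the probe fills a batch to the model's maximum context length and runs it several times, so the returned size is one that survives a worst-case forward pass.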
I have tested this and it works correctly. You can also add this logic to the greedy_until method in huggingface.py and it works.
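To make "this logic" concrete, here is a rough method-body sketch of reusing the same probe inside a greedy_until-style method and then consuming the detected size when batching requests. This is an assumption about how the reuse could look, not necessarily the PR's final code, and the helper names utils.chunks and re_ord.get_reordered mirror the harness's existing request-batching pattern.

```python
# Rough sketch: run the same probe as in loglikelihood_rolling at the top of
# greedy_until, then fall back to the configured batch size when no detection ran.
adaptive_batch_size = None
if self.batch_size == 'auto':
    print('Passed argument batch_size = auto. Detecting largest batch size')

    @find_executable_batch_size(starting_batch_size=512)  # halves batch_size on OOM and retries
    def forward_batch(batch_size):
        test_batch = torch.ones((batch_size, self.max_length), device=self.device).long()
        out = F.log_softmax(self._model_call(test_batch), dim=-1).cpu()
        return batch_size

    adaptive_batch_size = forward_batch()
    print(f"Determined largest batch size: {adaptive_batch_size}")

# consume the detected size when batching the (re-ordered) generation requests
effective_batch_size = adaptive_batch_size if adaptive_batch_size is not None else self.batch_size
for chunk in utils.chunks(re_ord.get_reordered(), effective_batch_size):
    ...  # run generation on each chunk as before
```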
I added this in. What tasks did you test this on? Just want to make sure everything works
I just realized that you added the wrong code to the greedy_until method in huggingface.py. The correct code is the one in the loglikelihood_rolling method.
Thank you for the revisions + extra verification that the code works. Will add these changes in now.
Excellent contribution! Thank you.
Fix automatic batching for greedy_until; fix stop sequences bug for greedy_until; fix max length bug when model max length is not specified in tokenizer
single GPU automatic batching logic
Fix bugs introduced in EleutherAI#394, EleutherAI#406, and max length bug
This PR addresses the single GPU component of https://github.com/EleutherAI/lm-eval2/issues/7.
The current method adds extra logic to determine the maximum batch size based on the longest sample over all provided tasks, and uses find_executable_batch_size from Accelerate to determine the largest batch size. Tested with both the gpt2 and hf-causal model classes on both loglikelihood_rolling and _loglikelihood_tokens tasks.

Example use (with either the gpt2 or the hf-causal model class):
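A hypothetical Python-level invocation is sketched below; the evaluator.simple_evaluate call, the checkpoint, and the task name are illustrative assumptions rather than the PR's original example, and the key point is passing batch_size="auto" so the detection logic above runs.

```python
# Hypothetical usage sketch: evaluate with batch_size="auto" so the harness
# probes the largest batch size that fits before running the tasks.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="hf-causal",                    # or "gpt2"; both model classes were tested in this PR
    model_args="pretrained=gpt2-medium",  # illustrative checkpoint
    tasks=["hellaswag"],                  # illustrative task
    batch_size="auto",                    # triggers the detection logic shown in the diff
    device="cuda:0",
)
print(results["results"])
```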
Output for both looks like: