FIX: Generating with mixed adapter batches and with beam search enabled #2287
+125
−0
See #2283
Right now, using mixed adapter batches (introduced in #1558) together with beam search generation does not work. This is because users need to pass the adapter name associated with each sample, i.e. the number of adapter names must be identical to the number of samples in the input.
When applying beam search, transformers internally repeats each sample once per beam (or so it looks like). Therefore, there are more samples during generation than in the input, and the adapter names have to be extended accordingly. This PR takes care of that.
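To illustrate the extension described above, here is a minimal sketch, not the code from this PR. `extend_adapter_names` is a hypothetical helper name, and it assumes that the beams belonging to one sample sit next to each other in the expanded batch, so each name is repeated in place rather than the whole list being tiled.

```python
def extend_adapter_names(adapter_names, num_beams):
    """Repeat each adapter name once per beam.

    Hypothetical helper for illustration only. Assumes the expanded batch
    keeps all beams of a sample adjacent (sample 0 beams, then sample 1
    beams, ...), so names are repeated in place, not tiled.
    """
    if num_beams < 1:
        raise ValueError("num_beams must be at least 1")
    # One entry per (sample, beam) pair, ordered sample by sample.
    return [name for name in adapter_names for _ in range(num_beams)]


# Two input samples, three beams each: six adapter names are needed.
names = extend_adapter_names(["adapter_a", "adapter_b"], num_beams=3)
print(len(names), names)
```

With `num_beams=1` the list comes back unchanged, so the same code path would also cover generation without beam search.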
For encoder-decoder models, we need to be careful. It seems like only the decoder inputs are repeated per beam, whereas the encoder receives the original number of inputs. Therefore, when an encoder-decoder model is identified, the extension is only applied to the decoder part.
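The encoder/decoder distinction can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this PR: `adapter_names_for_module` and its boolean flags are hypothetical, and how a module is recognised as belonging to the encoder is left out.

```python
def adapter_names_for_module(adapter_names, num_beams, is_encoder_decoder, module_in_encoder):
    """Pick the adapter name list a given module should see.

    Hypothetical helper for illustration only. Encoder modules of an
    encoder-decoder model get the original list, because the encoder is
    assumed to run on the unexpanded batch; everything else gets one
    name per (sample, beam) pair.
    """
    if is_encoder_decoder and module_in_encoder:
        return list(adapter_names)
    return [name for name in adapter_names for _ in range(num_beams)]


original = ["adapter_a", "adapter_b"]
# Encoder of an encoder-decoder model: batch size stays at 2.
print(adapter_names_for_module(original, 2, is_encoder_decoder=True, module_in_encoder=True))
# Decoder of the same model: batch size grows to 2 * num_beams.
print(adapter_names_for_module(original, 2, is_encoder_decoder=True, module_in_encoder=False))
```

For a decoder-only model, `is_encoder_decoder` is false and every module receives the extended list.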