
Handle batch input inference for MoD Infini-Former more gracefully #10

Open
dingo-actual opened this issue Apr 23, 2024 · 3 comments

@dingo-actual (Owner)

Currently, the token sampling for the MoD Infini-Former at inference time can produce a different sequence length for each observation in the batch. The current workaround is to force the batch size to one and loop over the observations, which is highly inefficient.

There are two main options for handling this efficiently:

  1. Pad the sampled sequences to the longest sequence length in such a way that the additional tokens contribute nothing to downstream calculations.
  2. Wait for PyTorch to implement a ragged tensor type.

I'm likely to pursue the first because there's no telling how long it'll be before the PyTorch devs add ragged tensors.
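For reference, a minimal sketch of what option 1 could look like in PyTorch, assuming each observation's sampled tokens arrive as a separate (seq_len_i, dim) tensor; the function and variable names below are illustrative, not the actual MoD Infini-Former code:

```python
# Illustrative only: pad per-observation sampled sequences to a common length
# and return a mask marking which positions hold real tokens vs. padding.
import torch
from torch.nn.utils.rnn import pad_sequence

def pad_sampled_batch(sampled_seqs):
    """sampled_seqs: list of (seq_len_i, dim) tensors, one per observation."""
    lengths = torch.tensor([s.size(0) for s in sampled_seqs],
                           device=sampled_seqs[0].device)
    # Zero-pad along the token dimension to the longest sampled length.
    padded = pad_sequence(sampled_seqs, batch_first=True)  # (batch, max_len, dim)
    # True where the position is a real sampled token, False where it is padding.
    mask = torch.arange(padded.size(1), device=padded.device)[None, :] < lengths[:, None]
    return padded, mask

# Downstream, the mask can be turned into an additive attention bias, e.g.:
# attn_bias = torch.zeros_like(mask, dtype=padded.dtype).masked_fill(~mask, float("-inf"))
```

With a bias like that applied, padded key positions receive zero attention weight from the real positions.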

@dingo-actual added the enhancement (New feature or request) and help wanted (Extra attention is needed) labels on Apr 23, 2024
@muditbhargava66 (Contributor)

I worked on this issue in #15.

@muditbhargava66 (Contributor)

Should this issue be closed, or do you need any more changes? Please let me know if you have any further questions.

@dingo-actual (Owner, Author)

Unfortunately, the fix you introduced assumes that calling .forward_() on the original input produces a valid result. What needs to happen during inference is for .forward() to use sample_mask_seg to pad the samples along the token dimension until they all have the same length. The part I haven't gotten around to is working through the math to determine a choice of padding token that doesn't affect downstream calculations.
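A rough sketch of that padding step might look like the following, assuming sample_mask_seg is a (batch, seg_len) boolean mask over the segment's tokens; the function name and padding value here are illustrative, not the actual implementation:

```python
# Illustrative sketch: left-align each observation's sampled tokens and pad the
# remainder, returning a mask of the real (non-padding) positions.
import torch

def gather_and_pad(x_seg, sample_mask_seg, pad_value=0.0):
    """x_seg: (batch, seg_len, dim); sample_mask_seg: (batch, seg_len) bool."""
    counts = sample_mask_seg.sum(dim=-1)            # sampled tokens per observation
    max_len = int(counts.max())
    batch, _, dim = x_seg.shape
    padded = x_seg.new_full((batch, max_len, dim), pad_value)
    keep = torch.arange(max_len, device=x_seg.device)[None, :] < counts[:, None]
    # Boolean indexing is row-major, so each observation's sampled tokens land
    # left-aligned in its own row of the padded tensor.
    padded[keep] = x_seg[sample_mask_seg]
    return padded, keep                             # keep marks real (non-pad) positions
```

Returning keep alongside the padded tensor leaves room to mask the padded positions out of downstream attention explicitly, whatever padding value is ultimately chosen.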

For the moment, I'm going to revert the change, just to maintain functionality (slow as it is). I really appreciate your putting in time on this though!
