Enable usage of batch_by_size for other code paths than just train #17
Conversation
        required_batch_size_multiple=required_batch_size_multiple,
    )
else:
    batch_sampler = data_utils.batch_by_size_tpu(
Where is this implemented?
Did I review it?
Yeah, it was implemented a long time ago; let me find the PR.
This needs to be "world aware", otherwise you will end up with issues at the end of the sample set.
So, this is the PR that made the tpu branch in this repo. But it was really first implemented in our examples repo, when our runner was still there. See here.
When we moved our runner inside the fairseq repo, it got carried over.
Ah, here is the actual PR that introduced batch_by_size_tpu.
The classes that use this sampler downstream are world aware. Unfortunately, due to the `.ceil` thing discussed before, we do get empty batches at the end for some cores. We do, however, always get the same number of iterations per core. All we then have to do is drop the last batches, which we do here.
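For context, here is a minimal sketch of that behavior, with hypothetical helper names rather than the actual fairseq code: `math.ceil` pads the batch list so every core runs the same number of iterations, and the trailing (possibly empty) batch is then dropped on every core.

```python
import math

def shard_batches(batches, num_cores, rank):
    """Hypothetical strided sharding sketch (not the actual fairseq code).

    math.ceil guarantees every core sees the same number of iterations,
    at the cost of empty batches at the tail for some cores.
    """
    iters_per_core = math.ceil(len(batches) / num_cores)
    padding = iters_per_core * num_cores - len(batches)  # always < num_cores
    padded = list(batches) + [[]] * padding  # empty batches pad the tail
    shard = padded[rank::num_cores]
    # If any core was padded, drop the last batch on *every* core, so that
    # iteration counts stay equal and no core steps on an empty batch.
    return shard[:-1] if padding else shard
```

For example, with 5 batches on 4 cores, each core first gets 2 iterations (3 of them with an empty tail batch); after dropping the last batch everywhere, every core runs exactly 1 non-empty batch.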
In train.py, due to our custom data preparation requirements, we use `--input_shapes`. This argument doesn't exist if we train on GPU, or if we are in the sequence generation code path. Thus, we get a `Namespace doesn't have attribute` error. This PR:
- uses `batch_by_size` instead of `batch_by_size_tpu` if we're training on GPU;
- defaults `input_shapes` to `None` if `args` doesn't have `input_shapes` (currently the case for `generate.py`, `interactive.py`, etc.).
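A minimal sketch of the dispatch this PR describes, assuming the call-site names from the diff above (`indices`, `dataset`, `max_tokens`, `max_sentences`, `required_batch_size_multiple`); the `batch_by_size_tpu` signature is assumed, while `batch_by_size` matches upstream fairseq:

```python
from fairseq.data import data_utils

def build_batch_sampler(args, indices, dataset, max_tokens=None,
                        max_sentences=None, required_batch_size_multiple=1):
    """Hypothetical wrapper around the call site shown in the diff above."""
    # getattr defaults to None when --input_shapes was never registered
    # (generate.py, interactive.py, GPU training), avoiding the
    # "Namespace doesn't have attribute" error.
    input_shapes = getattr(args, "input_shapes", None)
    if input_shapes is not None:
        # TPU training path; batch_by_size_tpu's signature is assumed here.
        return data_utils.batch_by_size_tpu(
            indices, dataset.num_tokens, input_shapes,
        )
    # GPU training, generate.py, interactive.py, etc.
    return data_utils.batch_by_size(
        indices, dataset.num_tokens,
        max_tokens=max_tokens,
        max_sentences=max_sentences,
        required_batch_size_multiple=required_batch_size_multiple,
    )
```

Keying the dispatch on the presence of `input_shapes` rather than on a device flag keeps both fixes in one place: the same fallback that silences the `AttributeError` also selects the GPU/generation code path.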