[benchmarks] Default to bfloat16 (inference) and AMP (training) precision. #6518
Conversation
- Pick data type based on `test`.
- Create `cast_to_dtype` function.

(branch updated from f83895a to 4864f13)
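For context on the second bullet, here is a minimal sketch of what a `cast_to_dtype` helper could look like. This is an assumption about its shape, not the PR's actual implementation; the real signature and scope may differ.

```python
import torch

def cast_to_dtype(value, dtype):
    # Hypothetical sketch: recursively cast floating-point tensors to `dtype`,
    # leaving integer/bool tensors and non-tensor values untouched.
    if isinstance(value, torch.Tensor):
        return value.to(dtype) if value.is_floating_point() else value
    if isinstance(value, (list, tuple)):
        return type(value)(cast_to_dtype(v, dtype) for v in value)
    if isinstance(value, dict):
        return {k: cast_to_dtype(v, dtype) for k, v in value.items()}
    return value
```

A benchmark runner could then apply it to the example inputs, e.g. `inputs = cast_to_dtype(inputs, torch.bfloat16)`, before an eval run.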
I'm still running the benchmarks to get a sense of the regressions this PR introduces.
Thank you!
@@ -144,6 +144,18 @@
    "hf_T5_generate",
}

FORCE_AMP_FOR_FP16_BF16_MODELS = {
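For readers skimming the diff, a rough sketch of how an override set like this could feed into precision selection; the `pick_precision` helper and its logic are illustrative assumptions, not the code in this PR.

```python
# Illustrative sketch only: the real set's contents come from the upstream
# PyTorch benchmark scripts and are elided here.
FORCE_AMP_FOR_FP16_BF16_MODELS = {
    # "some_model", ...
}

def pick_precision(model_name, test):
    # Models in the override set run under AMP even when a plain
    # fp16/bf16 cast would otherwise have been used.
    if model_name in FORCE_AMP_FOR_FP16_BF16_MODELS:
        return "amp"
    # Default policy described in this PR: bf16 for inference, AMP for training.
    return "bf16" if test == "eval" else "amp"
```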
Do you know how this is configured on PyTorch HUD? Maintaining a list like this feels prone to divergence.
These lists are taken from the scripts in the PyTorch main repo. They are used for generating the PyTorch HUD results.
I will leave a comment.
Ty
Could you share the link to the scripts in the PyTorch main repo that are used to generate the HUD result?
Ty
Can you post the BERT_pytorch kernel profile after the change?
The profiling results (in the following posts) were generated with the following command:

python xla/benchmarks/experiment_runner.py \
    --suite-name torchbench --accelerator cuda --dump-pytorch-profiles \
    --xla PJRT --dynamo openxla --test eval --repeat 8 --iterations-per-run 1 \
    -k BERT_pytorch
BERT_pytorch (before): [kernel profile attachment]

BERT_pytorch (after): [kernel profile attachment]
Thank you! Looks good to push forward.
Fix: #6483

This PR makes bfloat16 the default data type for inference, and AMP the default execution mode for training. This follows the execution setup used for the PyTorch HUD results.

cc @miladm
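As a rough illustration of that policy (the `run_once` helper and its arguments below are invented for this sketch and are not the PR's actual code):

```python
import torch

def run_once(model, example_inputs, test, device_type="cuda"):
    if test == "eval":
        # Inference default: cast the model and floating-point inputs to bfloat16.
        model = model.to(torch.bfloat16)
        example_inputs = tuple(
            x.to(torch.bfloat16)
            if isinstance(x, torch.Tensor) and x.is_floating_point()
            else x
            for x in example_inputs
        )
        with torch.no_grad():
            return model(*example_inputs)
    # Training default: keep float32 weights and run the step under AMP autocast.
    with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
        return model(*example_inputs)
```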