BERT nightly benchmark on Inferentia2 #2283
Codecov Report
@@            Coverage Diff             @@
##           master    #2283      +/-   ##
==========================================
+ Coverage   69.39%   69.82%   +0.42%
==========================================
  Files          77       77
  Lines        3441     3420      -21
  Branches       57       57
==========================================
  Hits         2388     2388
+ Misses       1050     1029      -21
  Partials        3        3
see 2 files with indirect coverage changes
@namannandan If the transformers version expected is 4.19.0, where is this being set?
@agunapal the issue with the transformers version is only observed when tracing the model. Loading the traced model and inference works as expected even with more recent versions of |
@namannandan Is the issue with the validate_benchmark.py resolved now?
42ac457 to 4146b29
Successful benchmark run with validation: https://github.com/pytorch/serve/actions/runs/4986426850
Description
Benchmark BERT model on Inferentia2 instance
Model artifacts:
Self-hosted runner (inf2.8xlarge):
Type of change
Feature testing
Checkpoint file generation
Note: The artifacts above were traced using transformers version 4.19.0. With more recent transformers versions, the traced model for Neuron may generate incorrect inference results: the model output is NaN.
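The version caveat in this note could be guarded against before tracing and after inference. The following is a minimal sketch and not part of this PR: the helper names parse_version, check_tracing_version and has_nan are hypothetical, and the pinned value 4.19.0 is taken from the note. It uses only the standard library, so the installed version is passed in as a string; in practice it could be read with importlib.metadata.version("transformers").

```python
import math

# Version the note says the published artifacts were traced with.
PINNED_TRANSFORMERS = "4.19.0"


def parse_version(text):
    """Turn a plain numeric 'X.Y.Z' string into a tuple of ints.

    Assumption: no pre-release or local suffixes (e.g. '.dev0').
    """
    return tuple(int(part) for part in text.split("."))


def check_tracing_version(installed, pinned=PINNED_TRANSFORMERS):
    """Return True only if the installed version equals the pinned one."""
    return parse_version(installed) == parse_version(pinned)


def has_nan(values):
    """Return True if any float in a flat list of model outputs is NaN."""
    return any(math.isnan(v) for v in values)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(check_tracing_version("4.19.0"))
    print(has_nan([0.12, float("nan")]))
```

A tracing script could refuse to run when check_tracing_version returns False, and a smoke test could call has_nan on a flattened output of the traced model to catch the failure mode described in the note.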
MAR file generation
Workflow test
Test branch:
test-inf2-benchmark
Workflow run and artifacts: https://github.com/pytorch/serve/actions/runs/4834127396
(Artifacts and metrics are being published but validation fails currently).
Benchmark results:
TorchServe Benchmark on neuronx
Date: 2023-04-28 20:57:15
TorchServe Version: torchserve-nightly==2023.4.27
scripted_mode_bert_neuronx_batch_1
scripted_mode_bert_neuronx_batch_2
scripted_mode_bert_neuronx_batch_4
scripted_mode_bert_neuronx_batch_8
Checklist: