support remove_input_padding for BertForSequenceClassification models #1834

Altair-Alpha · 2024-06-25T11:30:59Z

Related issue: #1755
Content:

Support remove_input_padding for BertForSequenceClassification models (implementation details are given in code comments).
Refine the build.py script. For example, the original script has no input model directory parameter and initializes the model with random weights, which is not intuitive.
Since the number of model inputs changes from 3 to 5 (input_ids, input_lengths, token_type_ids, position_ids, max_input_length), I added a standalone run_remove_input_padding.py demo script that shows how to build them from only input_ids and token_type_ids.

I have only implemented and tested this for BertForSequenceClassification, not for other BERT models yet. Please feel free to do further work on this :)
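
For readers unfamiliar with packed inputs, here is a minimal sketch of how the five inputs named above could be derived from only input_ids and token_type_ids. It is not the code from run_remove_input_padding.py. The helper name `pack_inputs` is hypothetical, plain Python lists stand in for tensors, and two details are assumptions: position_ids restart at 0 for every sequence, and max_input_length is shown as a plain integer (the form the engine actually expects is not stated in this thread).

```python
from typing import Dict, List


def pack_inputs(input_ids: List[List[int]],
                token_type_ids: List[List[int]]) -> Dict[str, object]:
    """Flatten a batch of unpadded sequences into five packed inputs.

    Hypothetical helper for illustration; not part of TensorRT-LLM.
    """
    if len(input_ids) != len(token_type_ids):
        raise ValueError("input_ids and token_type_ids have different batch sizes")

    flat_ids: List[int] = []
    flat_types: List[int] = []
    position_ids: List[int] = []
    input_lengths: List[int] = []

    for ids, types in zip(input_ids, token_type_ids):
        if len(ids) != len(types):
            raise ValueError("a sequence and its token types differ in length")
        flat_ids.extend(ids)                  # tokens of all sequences, back to back
        flat_types.extend(types)
        position_ids.extend(range(len(ids)))  # assumption: positions restart per sequence
        input_lengths.append(len(ids))        # needed to split the flat tensor again

    return {
        "input_ids": flat_ids,
        "input_lengths": input_lengths,
        "token_type_ids": flat_types,
        "position_ids": position_ids,
        # assumption: represented as an int here, the longest sequence in the batch
        "max_input_length": max(input_lengths, default=0),
    }


# Two sequences of length 3 and 2; no padding tokens are ever materialized.
batch = pack_inputs([[101, 7592, 102], [101, 102]],
                    [[0, 0, 0], [0, 0]])
print(batch["input_ids"])      # [101, 7592, 102, 101, 102]
print(batch["input_lengths"])  # [3, 2]
print(batch["position_ids"])   # [0, 1, 2, 0, 1]
```

The saving comes from the flat layout: a padded batch would hold batch_size × max_input_length tokens, while the packed form holds only the sum of input_lengths.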

Altair-Alpha · 2024-06-25T11:40:15Z

For our data, latency improves by about 35% and throughput by about 28% when batch_size is 16, and the difference in output logits between the PyTorch model and the TRT engine with remove_input_padding is below 0.01.
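
A logits check like the one described above can be sketched as a maximum absolute difference over all outputs. This is an illustrative sketch, not the script used for the numbers in this comment. The function name `max_abs_diff` and the sample values are hypothetical, and plain Python lists stand in for the framework tensors.

```python
from typing import List


def max_abs_diff(reference: List[List[float]],
                 candidate: List[List[float]]) -> float:
    """Return the largest element-wise absolute difference between two logit batches."""
    if len(reference) != len(candidate):
        raise ValueError("batches have different sizes")
    worst = 0.0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("rows have different numbers of classes")
        for ref_value, cand_value in zip(ref_row, cand_row):
            worst = max(worst, abs(ref_value - cand_value))
    return worst


# Hypothetical logits for one sample with two classes.
pytorch_logits = [[0.5, -1.25]]
engine_logits = [[0.5078125, -1.25]]

diff = max_abs_diff(pytorch_logits, engine_logits)
assert diff < 0.01, f"logits diverge by {diff}"
```

Taking the maximum rather than the mean keeps a single badly wrong logit from being averaged away.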

nv-guomingz · 2024-06-25T11:41:03Z

Thanks @Altair-Alpha, we'll merge your changes and upstream them to GitHub next week.

examples/bert/build.py

nv-guomingz · 2024-07-09T04:36:33Z

Hi @Altair-Alpha, we've merged your PR into our internal code base with several minimal changes for internal CI.
This MR will be available on the next Tensor Tuesday.

magpiezhang added 2 commits June 25, 2024 14:29

support remove_input_padding for BertForSequenceClassification models

3b43f4c

support remove_input_padding for BertForSequenceClassification models

cbf60cc

Altair-Alpha mentioned this pull request Jun 25, 2024

Support --remove_input_padding for BERT models? #1755

Closed

nv-guomingz reviewed Jun 26, 2024

examples/bert/build.py (outdated)

backward compatibility for no model_dir arg

37e3579

nv-guomingz added the Merged label Jul 10, 2024

kaiyux mentioned this pull request Jul 16, 2024

Update TensorRT-LLM #1954

Merged

nv-guomingz closed this Jul 16, 2024

Shixiaowei02 mentioned this pull request Aug 29, 2024

TensorRT-LLM v0.12 Update #2164

Merged
