
[Longformer] fix longformer slow-down #5811

Merged

Conversation

patrickvonplaten
Contributor

@patrickvonplaten patrickvonplaten commented Jul 16, 2020

A drastic slow-down of Longformer after PR #5219 was detected by @HHousen (thanks a lot!) here: #4406 (comment).

After digging a bit into the code, it turns out that the line
`is_global_attn = any(is_index_global_attn.flatten())` is the reason for the drastic slow-down.
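For reference, the fix presumably amounts to replacing the Python built-in with the tensor-native reduction, roughly along these lines (a sketch, not the exact diff):

```python
# Before: Python's built-in any() iterates over the tensor element by element.
is_global_attn = any(is_index_global_attn.flatten())

# After: tensor.any() runs as a single reduction inside PyTorch.
is_global_attn = is_index_global_attn.flatten().any()
```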

Running the following benchmark on master:

python examples/benchmarking/run_benchmark.py --models allenai/longformer-base-4096 --no_memory --sequence_length 512 1024

yields the following result:

====================       INFERENCE - SPEED - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length     Time in s   
--------------------------------------------------------------------------------
 allenai/longformer-base-4096        8              512            0.647     
 allenai/longformer-base-4096        8              1024           1.284     
--------------------------------------------------------------------------------

====================        ENVIRONMENT INFORMATION         ====================
- transformers_version: 3.0.2
- framework: PyTorch
- use_torchscript: False
- framework_version: 1.5.0
- python_version: 3.7.7
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-07-16
- time: 14:07:02.719531
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32089
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 0
- use_tpu: False

Now on this branch the results are as follows:

====================       INFERENCE - SPEED - RESULT       ====================
--------------------------------------------------------------------------------
          Model Name             Batch Size     Seq Length     Time in s   
--------------------------------------------------------------------------------
 allenai/longformer-base-4096        8              512            0.141     
 allenai/longformer-base-4096        8              1024           0.271     
--------------------------------------------------------------------------------

====================        ENVIRONMENT INFORMATION         ====================
- transformers_version: 3.0.2
- framework: PyTorch
- use_torchscript: False
- framework_version: 1.5.0
- python_version: 3.7.7
- system: Linux
- cpu: x86_64
- architecture: 64bit
- date: 2020-07-16
- time: 14:07:57.623378
- fp16: False
- use_multiprocessing: True
- only_pretrain_model: False
- cpu_ram_mb: 32089
- use_gpu: True
- num_gpus: 1
- gpu: TITAN RTX
- gpu_ram_mb: 24217
- gpu_power_watts: 280.0
- gpu_performance_state: 0
- use_tpu: False

So this single line is responsible for a slow-down of roughly a factor of 4.

Moral of the story: never use Python's built-in any() on a PyTorch tensor => always use tensor.any(). I wonder if this is actually a known problem of PyTorch/Python. It might be a good idea to check our code for more statements like this.
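To make the difference concrete, here is a small stand-alone micro-benchmark (illustrative only, not part of this PR); the exact numbers will depend on hardware:

```python
import time

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.zeros(8 * 1024, dtype=torch.bool, device=device)

start = time.time()
_ = any(x)  # built-in any(): iterates the tensor element by element in Python
print(f"built-in any(): {time.time() - start:.4f}s")

start = time.time()
_ = bool(x.any())  # tensor.any(): a single reduction kernel (bool() forces the sync)
print(f"tensor.any():   {time.time() - start:.4f}s")
```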

Another lesson for me is that one should always run the benchmarks before and after a big refactoring such as #5219.
Running the benchmark script for a model is very simple and usually takes only a couple of seconds. Ideally, we should have performance regression tests to automatically detect such slow-downs (a rough sketch follows below).
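Such a regression test could look roughly like this (a hedged sketch; the threshold, marker, and setup are placeholders, not an existing test in the repository):

```python
import time

import pytest
import torch

from transformers import LongformerModel


@pytest.mark.slow
def test_longformer_forward_speed():
    # Placeholder threshold; in practice this would be calibrated per CI machine.
    max_seconds = 5.0

    model = LongformerModel.from_pretrained("allenai/longformer-base-4096").eval()
    input_ids = torch.ones(1, 1024, dtype=torch.long)

    with torch.no_grad():
        model(input_ids)  # warm-up pass
        start = time.time()
        model(input_ids)
        elapsed = time.time() - start

    assert elapsed < max_seconds, f"Longformer forward pass took {elapsed:.2f}s"
```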

Pinging @thomwolf @mfuntowicz @sshleifer @LysandreJik @ibeltagy - I think this is quite interesting.

@patrickvonplaten patrickvonplaten changed the title [Longformer] fix longformer slow down [Longformer] fix longformer slow-down Jul 16, 2020
@codecov

codecov bot commented Jul 16, 2020

Codecov Report

Merging #5811 into master will decrease coverage by 0.80%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #5811      +/-   ##
==========================================
- Coverage   78.19%   77.38%   -0.81%     
==========================================
  Files         146      146              
  Lines       26047    26047              
==========================================
- Hits        20367    20157     -210     
- Misses       5680     5890     +210     
| Impacted Files | Coverage Δ |
|---|---|
| src/transformers/modeling_longformer.py | 89.21% <100.00%> (ø) |
| src/transformers/modeling_tf_mobilebert.py | 23.38% <0.00%> (-73.39%) ⬇️ |
| src/transformers/generation_tf_utils.py | 86.21% <0.00%> (ø) |
| src/transformers/generation_utils.py | 97.11% <0.00%> (+0.28%) ⬆️ |
| src/transformers/modeling_openai.py | 82.31% <0.00%> (+1.28%) ⬆️ |
| src/transformers/modeling_tf_openai.py | 95.18% <0.00%> (+74.91%) ⬆️ |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 89a78be...25c6419. Read the comment docs.

@patrickvonplaten patrickvonplaten merged commit 057411c into huggingface:master Jul 16, 2020
@sshleifer
Contributor

Great find. +1 to tests!
