Optimize Gelu module for caffe2 export (facebookresearch#918)
Summary:
Pull Request resolved: facebookresearch#918

TIL that ONNX->Caffe2 conversion is very memory inefficient: it creates an intermediate blob for each intermediate output. The Gelu operator therefore produces a lot of intermediate blobs, since its formula decomposes into many elementary math ops.

The fix is to use the Caffe2 Gelu operator directly, so all of that computation is captured in a single op.
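For context, the module's tanh formulation is the standard approximation of the exact erf-based GELU. A minimal pure-Python check (no torch; `gelu_exact` and `gelu_tanh` are hypothetical helper names for this sketch) shows the two agree closely:

```python
import math

def gelu_exact(x):
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF (via erf).
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x):
    # Tanh approximation used in the module (Hendrycks & Gimpel, 2016).
    return 0.5 * x * (
        1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x * x * x))
    )

# The approximation tracks the exact GELU to within ~1e-3 on typical inputs.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(gelu_exact(x) - gelu_tanh(x)) < 1e-3
```

Since the two paths match numerically, swapping in the fused Caffe2 op at export time does not change model outputs in any meaningful way.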

https://pxl.cl/HzGf

Reviewed By: seayoung1112

Differential Revision: D16849396

fbshipit-source-id: 40920b74caa3ad4244f2af5e24f11d4f123b935f
geof90 authored and facebook-github-bot committed Aug 16, 2019
1 parent 2fc87e8 commit e32c2a5
Showing 1 changed file with 20 additions and 8 deletions.
28 changes: 20 additions & 8 deletions pytext/optimizer/activations.py
@@ -10,21 +10,33 @@

 class GeLU(nn.Module):
     """
-    Implements Gaussian Error Linear Units (GELUs). Note: x * x * x is used
-    instead of torch.pow(x, 3) due to issues with ONNX compatibility:
-    https://github.com/pytorch/pytorch/issues/18475
+    Implements Gaussian Error Linear Units (GELUs).
     Reference:
         Gaussian Error Linear Units (GELUs). Dan Hendrycks, Kevin Gimpel.
         Technical Report, 2017. https://arxiv.org/pdf/1606.08415.pdf
     """

     def forward(self, x):
-        return (
-            0.5
-            * x
-            * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * (x * x * x))))
-        )
+        if torch.onnx.is_in_onnx_export():
+            # ONNX -> Caffe2 conversion will create an intermediate blob for
+            # each intermediate math output, which is very memory inefficient.
+            # We use the Gelu operator directly to reduce the memory footprint
+            # in the exported model.
+            return torch.ops._caffe2.Gelu(x, True)
+        else:
+            return (
+                0.5
+                * x
+                * (
+                    # Note: x * x * x is used instead of torch.pow(x, 3) due to
+                    # issues with ONNX compatibility:
+                    # https://github.com/pytorch/pytorch/issues/18475
+                    1
+                    + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * (x * x * x)))
+                )
+            )
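The shape of the patched forward() can be sketched without torch as a plain dispatch on an export flag. Here `fused_gelu` is a hypothetical stand-in for the single fused `torch.ops._caffe2.Gelu` op, and `exporting` stands in for `torch.onnx.is_in_onnx_export()`:

```python
import math

def fused_gelu(x):
    # Hypothetical stand-in for the fused Caffe2 Gelu op: one call,
    # no intermediate blobs exposed to the converter.
    return 0.5 * x * (
        1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x * x * x))
    )

def gelu_forward(x, exporting=False):
    # Mirrors the patched forward(): one fused op on the export path,
    # decomposed elementwise math on the eager path.
    if exporting:
        return fused_gelu(x)
    return (
        0.5
        * x
        * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * (x * x * x))))
    )

# Both paths compute the same value; only the op granularity differs.
assert abs(gelu_forward(1.5, exporting=True) - gelu_forward(1.5)) < 1e-12
```

The design point is that the branch changes how the computation is traced, not what it computes: the exported graph sees a single node instead of the chain of mul/add/tanh nodes that each become a blob in Caffe2.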


def get_activation(name):
