Commit

apply activation if transformer used
tabergma committed Apr 15, 2020
1 parent 61b6767 commit 9d1802c
Showing 3 changed files with 6 additions and 5 deletions.
2 changes: 1 addition & 1 deletion changelog/5626.msic.rst

@@ -1 +1 @@
-Move ``tfa.activations.gelu(x)`` from ``DIETClassifier`` to transformer block.
+Apply ``tfa.activations.gelu(x)`` only if min 1 transformer block is used in ``DIETClassifier``.
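
For context on the activation this entry refers to: tfa.activations.gelu is the Gaussian Error Linear Unit from TensorFlow Addons, GELU(x) = x * Phi(x). The following is a minimal sketch of the definition, not part of this commit; note the library call may use a tanh approximation depending on its approximate flag.

import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa

x = tf.constant([-2.0, -0.5, 0.0, 0.5, 2.0])

# reference definition via the Gaussian CDF expressed with erf
exact = 0.5 * x * (1.0 + tf.math.erf(x / np.sqrt(2.0)))

print(tfa.activations.gelu(x).numpy())  # library implementation
print(exact.numpy())                    # reference values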
4 changes: 4 additions & 0 deletions rasa/nlu/classifiers/diet_classifier.py

@@ -1250,6 +1250,10 @@ def _create_sequence(
             inputs, 1 - mask, self._training
         )

+        if self.config[TRANSFORMER_SIZE] > 0:
+            # apply final activation
+            outputs = tfa.activations.gelu(outputs)
+
         return outputs, inputs, seq_ids, lm_mask_bool

     def _create_all_labels(self) -> Tuple[tf.Tensor, tf.Tensor]:
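
A minimal sketch of the guard this hunk introduces, with hypothetical names (encode_sequence, config, TRANSFORMER_SIZE as a plain string key) standing in for the DIETClassifier internals: GELU is applied to the sequence output only when the configured transformer size is positive, i.e. at least one transformer layer exists; with no transformer the features pass through unchanged.

import tensorflow as tf
import tensorflow_addons as tfa

TRANSFORMER_SIZE = "transformer_size"  # assumed config key, for illustration only

def encode_sequence(sequence_features: tf.Tensor, config: dict) -> tf.Tensor:
    outputs = sequence_features
    if config[TRANSFORMER_SIZE] > 0:
        # a real transformer stack would run here; this sketch only mirrors the
        # control flow: GELU is applied solely on the transformer path
        outputs = tfa.activations.gelu(outputs)
    return outputs

features = tf.random.normal((2, 5, 16))
# with transformer_size == 0 the features pass through untouched
assert tf.reduce_all(encode_sequence(features, {TRANSFORMER_SIZE: 0}) == features)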
5 changes: 1 addition & 4 deletions rasa/utils/tensorflow/transformer.py

@@ -626,7 +626,4 @@ def call(
         # if normalization is done in encoding layers, then it should also be done
         # on the output, since the output can grow very large, being the sum of
         # a whole stack of unnormalized layer outputs.
-        normalized_x = self._layer_norm(x)  # (batch_size, length, units)
-
-        # apply final activation
-        return tfa.activations.gelu(normalized_x)
+        return self._layer_norm(x)  # (batch_size, length, units)
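
The comment kept in this hunk explains why the final layer normalization stays inside the encoder even though the GELU no longer does: with pre-layer-norm residual blocks the raw output is a sum of unnormalized layer outputs, so its scale grows with depth. A small illustration under assumed shapes and layer sizes, not the Rasa encoder itself:

import tensorflow as tf

units, num_layers = 16, 8
layer_norms = [tf.keras.layers.LayerNormalization() for _ in range(num_layers)]
dense_layers = [tf.keras.layers.Dense(units) for _ in range(num_layers)]
final_norm = tf.keras.layers.LayerNormalization()

x = tf.random.normal((2, 5, units))
for norm, dense in zip(layer_norms, dense_layers):
    x = x + dense(norm(x))  # pre-LN residual block: unnormalized outputs accumulate

print(float(tf.math.reduce_std(x)))              # typically grows with num_layers
print(float(tf.math.reduce_std(final_norm(x))))  # back to roughly 1 after the final layer norm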
