2022-06-28-ali22a.md

File metadata and controls

58 lines (58 loc) · 2.17 KB
---
title: "XAI for Transformers: Better Explanations through Conservative Propagation"
booktitle: Proceedings of the 39th International Conference on Machine Learning
abstract: Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction. We identify Attention Heads and LayerNorm as main reasons for such unreliable explanations and propose a more stable way for propagation through these layers. Our proposal, which can be seen as a proper extension of the well-established LRP method to Transformers, is shown both theoretically and empirically to overcome the deficiency of a simple gradient-based approach, and achieves state-of-the-art explanation performance on a broad range of Transformer models and datasets.
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: ali22a
month: 0
tex_title: "{XAI} for Transformers: Better Explanations through Conservative Propagation"
firstpage: 435
lastpage: 451
page: 435-451
order: 435
cycles: false
bibtex_author: Ali, Ameen and Schnake, Thomas and Eberle, Oliver and Montavon, Gr{\'e}goire and M{\"u}ller, Klaus-Robert and Wolf, Lior
author:
- given: Ameen
  family: Ali
- given: Thomas
  family: Schnake
- given: Oliver
  family: Eberle
- given: Grégoire
  family: Montavon
- given: Klaus-Robert
  family: Müller
- given: Lior
  family: Wolf
date: 2022-06-28
address:
container-title: Proceedings of the 39th International Conference on Machine Learning
volume: 162
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 6
  - 28
pdf:
extras:
---