IntegratedGradient with XLM type models (FlauBERT) #414
Hi carodupdup,

Check out how the embedding layer combines its sub-embeddings (this is the relevant part of BERT's embedding forward):

```python
if inputs_embeds is None:
    inputs_embeds = self.word_embeddings(input_ids)
position_embeddings = self.position_embeddings(position_ids)
token_type_embeddings = self.token_type_embeddings(token_type_ids)
embeddings = inputs_embeds + position_embeddings + token_type_embeddings
```

The code written by @vfdev-5 was written before this trick with input-output mirroring surfaced, I guess. I've actually used a modified ...
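If it helps, a quick way to see what the corresponding embedding layer is called in an XLM/FlauBERT checkpoint (the model class and checkpoint name below are assumptions; substitute your own fine-tuned model):

```python
from transformers import FlaubertForSequenceClassification

model = FlaubertForSequenceClassification.from_pretrained("flaubert/flaubert_base_cased")

# list every sub-module whose qualified name mentions "embedding" to find
# the XLM/FlauBERT equivalent of BERT's `bert.embeddings` module path
for name, module in model.named_modules():
    if "embedding" in name.lower():
        print(name, type(module).__name__)
```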
Sorry it took so long for me to answer; I have been trying to figure it out since you replied, but without managing to. I am aware these questions might be at a beginner's level, so I am sorry if I lack some expertise. Finally, to come back to your answer, I do not understand what you mean by using the ...
I'll try to answer your questions one by one.

With LayerIntegratedGradients, from the tutorial:

```python
lig = LayerIntegratedGradients(squad_pos_forward_func, model.bert.embeddings)
```

As you can see, you still need to pass a forward function; one reason is that, under the hood, the whole network is run and the chosen layer's activations are picked up during that forward pass.

If all you care about is the whole embedding layer, go for LayerIntegratedGradients. If you would like to attribute to sub-embeddings, look at how IntegratedGradients scales the inputs between baseline and input before computing gradients:

```python
# scale features and compute gradients. (batch size is abbreviated as bsz)
# scaled_features' dim -> (bsz * #steps x inputs[0].shape[1:], ...)
scaled_features_tpl = tuple(
    torch.cat(
        [baseline + alpha * (input - baseline) for alpha in alphas], dim=0
    ).requires_grad_()
    for input, baseline in zip(inputs, baselines)
)
```

The tutorial shows you the calculations for all sub-embeddings, but you could be more specific and interpret only one of those sub-embeddings.
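For instance, a minimal sketch of attributing only to the word-embedding sub-layer (reusing names from the SQUAD tutorial, so `model`, `squad_pos_forward_func`, `input_ids`, `ref_input_ids`, `token_type_ids`, `position_ids`, and `attention_mask` are assumed to be defined as they are there):

```python
from captum.attr import LayerIntegratedGradients

# point LayerIntegratedGradients at one sub-embedding instead of the whole
# embeddings module, so attributions cover only the word embeddings
lig_word = LayerIntegratedGradients(
    squad_pos_forward_func, model.bert.embeddings.word_embeddings
)

attributions_word, delta = lig_word.attribute(
    inputs=input_ids,
    baselines=ref_input_ids,
    additional_forward_args=(token_type_ids, position_ids, attention_mask, 0),
    return_convergence_delta=True,
)
```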
The input is forwarded through the whole network, not only one layer, meaning Captum computes the gradient of the output w.r.t. the inputs of the network's forward function (see also the note in the SQuAD tutorial).
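Concretely, that means the forward function handed to plain IntegratedGradients must already accept the tensors you want gradients for. A minimal sketch, assuming `model` is a transformers model that supports the `inputs_embeds` keyword:

```python
from captum.attr import IntegratedGradients

def forward_with_embeddings(inputs_embeds, attention_mask):
    # the whole model runs; gradients flow back to inputs_embeds directly
    return model(inputs_embeds=inputs_embeds, attention_mask=attention_mask)[0]

ig = IntegratedGradients(forward_with_embeddings)
```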
Had to refactor Flair's ...
Hi @lipka-clazzpl, how did you manage to integrate Flair with IntegratedGradients? Do you have a code snippet to help me out? I have fine-tuned a transformers-based classification model via Flair, and I am looking to explain what drives its predictions.
Hi there,
I am facing some issues trying to implement the IntegratedGradients algorithm with the pretrained FlauBERT model I have made.
First, I used the tutorial for BERT on SQuAD and, scrolling through the issues, I came upon this gist: https://gist.github.com/davidefiocco/3e1a0ed030792230a33c726c61f6b3a5
which allowed me to apply it to the FlauBERT model with only small changes.
However, this notebook lets me use the LayerIntegratedGradients (LIG) algorithm but not IntegratedGradients (IG). Going through the tutorial, I saw this paragraph explaining how to switch from LIG to IG:
"we can also use IntegratedGradients class instead (of LayerIntegratedGradients), however in that case we need to precompute the embeddings and wrap Embedding layer with InterpretableEmbeddingBase module. This is necessary because we cannot perform input scaling and subtraction on the level of word/token indices and need access to the embedding layer."
I'm afraid that paragraph confused me very much. Finally, looking at issue #150 (the code written by vfdev-5) is where I am blocked. First, I am wondering where that code uses the InterpretableEmbeddingBase module. Furthermore, I am trying to use the code, but the encoder, which is an attribute of the BERT models, does not exist in XLM (at least to my knowledge). So I am wondering whether it is possible to use the IntegratedGradients algorithm with a model such as XLM.
If anyone has been working on this specific problem or is willing to help me, it would be much appreciated. Thank you in advance!
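For reference, a minimal sketch of one way plain IntegratedGradients could be applied to an XLM/FlauBERT classifier by feeding precomputed word embeddings through the `inputs_embeds` argument (which sidesteps InterpretableEmbeddingBase); the checkpoint name, the `transformer.embeddings` sub-module path, and the classification head are assumptions and may need adjusting for your model:

```python
import torch
from captum.attr import IntegratedGradients
from transformers import FlaubertForSequenceClassification, FlaubertTokenizer

# checkpoint name is an assumption; substitute your fine-tuned model directory
model_name = "flaubert/flaubert_base_cased"
tokenizer = FlaubertTokenizer.from_pretrained(model_name)
model = FlaubertForSequenceClassification.from_pretrained(model_name)
model.eval()

enc = tokenizer("Ceci est un exemple.", return_tensors="pt")
input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

# precompute word embeddings; `transformer.embeddings` is the assumed path to
# the word-embedding table in XLM-style models (it differs from BERT's layout)
word_embeddings = model.transformer.embeddings
input_embeds = word_embeddings(input_ids)
ref_ids = torch.full_like(input_ids, tokenizer.pad_token_id)  # all-PAD baseline
ref_embeds = word_embeddings(ref_ids)

def forward_func(inputs_embeds, attention_mask):
    # the whole model runs on the precomputed embeddings; position (and
    # language) embeddings are still added inside the model's forward pass
    return model(inputs_embeds=inputs_embeds, attention_mask=attention_mask)[0]

pred_class = forward_func(input_embeds, attention_mask).argmax(dim=-1).item()

ig = IntegratedGradients(forward_func)
attributions = ig.attribute(
    input_embeds,
    baselines=ref_embeds,
    target=pred_class,
    additional_forward_args=(attention_mask,),
    n_steps=50,
)

# collapse the embedding dimension to get one attribution score per token
token_scores = attributions.sum(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
print(list(zip(tokens, token_scores.tolist())))
```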