Attention weight logging #5673
@JEM-Mosig what's up with this PR?
The corresponding project is on hold due to a deadline for dataset collection. It should continue mid-June.
How is the implementation going? Is this still relevant, @JEM-Mosig?
I'm still on this. It's a side project and might take a bit more time. Not critical for 2.0 or anything. I only created a draft PR so it's easier for others to see the code changes when we discuss.
@JEM-Mosig I just had a look and it's making me wonder what the easiest way might be to go about visualising this. Your PR opens up the attention weights, but there are also some other vectors that I'd love to get hold of too. I'm wondering what is practical here. I'd love to make DIET more "peekable", but doing so might add unwanted complexity.
@JEM-Mosig why do we need special
If we output the attention weights (and maybe other data) with the predict / process functions, then messages get a lot bigger. We could, though, add a parameter that decides whether or not to add the diagnostic data. I think I'll change it to be parameters, @koaning.
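A minimal sketch of the opt-in parameter idea from the comment above. All names here (`process`, `include_diagnostics`, `diagnostic_data`) are hypothetical and only illustrate the trade-off. This is not the actual DIET or Rasa API, and the uniform attention matrix is a stand-in for real model output.

```python
from typing import Any, Dict, List


def process(text: str, include_diagnostics: bool = False) -> Dict[str, Any]:
    """Hypothetical process step that attaches diagnostic data only on request."""
    tokens: List[str] = text.split()
    n = len(tokens)

    # Stand-in for real attention weights: a uniform n x n matrix.
    attention: List[List[float]] = [[1.0 / n] * n for _ in range(n)]

    # The regular message stays small by default.
    message: Dict[str, Any] = {"text": text, "tokens": tokens}

    # Only add the (potentially large) diagnostic payload when asked to.
    if include_diagnostics:
        message["diagnostic_data"] = {"attention_weights": attention}

    return message


small = process("book a table")
large = process("book a table", include_diagnostics=True)
```

With the flag left at its default, the message carries no extra payload, so callers who do not need the weights pay no size cost.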
I see. @wochinge what do you think would be the best approach here?
Proposed changes:
Status (please check what you already did):
black
(please check Readme for instructions)