Adding nn.Embedding layer. #406
Conversation
src/nn/embedding.rs
Outdated
{
    type Output = Tensor<Rank2<SEQ, DIM>, f32, D>;
    fn forward(&self, input: Tensor<Rank1<SEQ>, usize, D>) -> Self::Output {
        self.weight.retaped().gather(input)
This is the particular line that doesn't seem to be working.
In other layers I found

    self.weight.retaped::<T>() // T is the tape of the input

which seems to be the trick. However, it seems to me that here the weight could carry the tape, while the input cannot (since it's only indexing into the weight tensor).
I'm out of ideas to make this work. Could you provide any help @coreylowman?
I figured out a way by making GatherTo generic over the Tape too. I'm not too sure about the modifications, though.
src/nn/embedding.rs
Outdated
fn forward(&self, input: Tensor<Rank2<BATCH, SEQ>, usize, D, T>) -> Self::Output {
    self.weight.retaped::<T>().gather(input)
One trick we could do here is with SplitTape and PutTape:
// Current:
fn forward(&self, input: Tensor<Rank2<BATCH, SEQ>, usize, D, T>) -> Self::Output {
    self.weight.retaped::<T>().gather(input)

// Suggested:
fn forward(&self, input: Tensor<Rank2<BATCH, SEQ>, usize, D, T>) -> Self::Output {
    let (input, tape) = input.split_tape();
    self.weight.clone().put_tape(tape).gather(input)
I think this should avoid the need to change select/gather?
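For clarity, here is the same suggested forward pass with comments on where the tape ends up; the surrounding impl block and trait bounds are assumed to be the same as in the snippet above, this is just a sketch:

fn forward(&self, input: Tensor<Rank2<BATCH, SEQ>, usize, D, T>) -> Self::Output {
    // The usize index tensor is the one carrying the gradient tape T, but the
    // indices themselves aren't differentiable, so take the tape off the input...
    let (input, tape) = input.split_tape();
    // ...and attach it to a clone of the weight, so that gather records its
    // backward op against the embedding table rather than the indices.
    self.weight.clone().put_tape(tape).gather(input)
}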
It works!
Still somewhat mystical how the tape thing works :)
Nice! Hmm, I wonder what we could do to make it easier to understand intuitively... That is a big difference from PyTorch, and while most use cases shouldn't need to touch tapes directly, it would be helpful for understanding the internals.
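Maybe a short end-to-end example in the docs would help. A rough sketch of the tape's life cycle, assuming the current trace/backward API (the model and names here are illustrative, not from this PR):

// .trace() attaches an OwnedTape to the input; everything downstream records onto it.
let out = model.forward(x.trace());
// Ops keep passing the tape along until we reach a scalar loss...
let loss = out.square().mean();
// ...and backward() consumes the tape, replaying the recorded ops to produce Gradients.
let grads = loss.backward();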
src/nn/embedding.rs
Outdated
fn forward(&self, input: Tensor<Rank2<BATCH, SEQ>, usize, D, T>) -> Self::Output {
    self.weight.retaped::<T>().gather(input)
Also, do you still need the retaped even with the modifications to gather?
Not with the manual split_tape.
Looks great, thanks for the contribution!
Attempt to create an nn.Embedding layer. However, I am not able to finish the gradients part.
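For reference, a hypothetical usage sketch of the finished layer, based on the unbatched forward signature shown above. The struct literal, the weight initialization via sample_normal, and the device/tensor-creation calls are assumptions about the surrounding dfdx API and may not match the final code exactly:

use dfdx::prelude::*;

// Illustrative sizes.
const VOCAB: usize = 10;
const DIM: usize = 4;
const SEQ: usize = 3;

fn main() {
    let dev: Cpu = Default::default();
    // Assumed struct shape: a single `weight` tensor of shape (VOCAB, DIM).
    let weight: Tensor<Rank2<VOCAB, DIM>, f32, _> = dev.sample_normal();
    let emb = Embedding { weight };
    // Token indices are plain usize values into the vocabulary.
    let tokens: Tensor<Rank1<SEQ>, usize, _> = dev.tensor([1, 0, 7]);
    // Each index selects its row of the weight, giving a (SEQ, DIM) output.
    let vectors: Tensor<Rank2<SEQ, DIM>, f32, _> = emb.forward(tokens);
    dbg!(vectors.array());
}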