Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1D Convolutions not Expected Functionality (Kim 2014) #236

Closed
neubig opened this issue Jan 11, 2017 · 9 comments
Closed

1D Convolutions not Expected Functionality (Kim 2014) #236

neubig opened this issue Jan 11, 2017 · 9 comments
Labels
moderate bug Issues that should be fixed but only affect less common environments or functionality

Comments

@neubig
Copy link
Contributor

neubig commented Jan 11, 2017

(Edited for clarity)
We were taking a look at 1D convolutions now, and it looks like they aren't doing what we would expect according to Kim 2014: http://aclweb.org/anthology/D/D14/D14-1181.pdf

For example, for conv1d, we would expect:

  • Input X is a {L,M} matrix.
  • There are N filters F, each of size {L,O}.
  • The computation result Y is an {N,M-O+1} matrix. Y_{i,j} is equal to the dot product of filter F[i] and matrix slice X[:,j:j+O].

On the other hand, what we have now is:

  • Input X is a {L,M} matrix.
  • There are L filters F, each of size {1,O}.
  • The computation result Y is an {L, M-O+1} matrix, where Y_{i,j} is equal to the dot product of filter F[i] and matrix slice F[i,j:j+O].

In other words, filters are run row-wise over the input instead of being calculated with respect to the whole input.

@neubig neubig added the major bug Issues that silently cause incorrect results, break installation on common environments, etc. label Jan 11, 2017
@neubig neubig changed the title 1D Convolutions not Correct 1D Convolutions not Expected Functionality (Kim 2014) Jan 11, 2017
@neubig neubig added minor bug Bugs that aren't too bad, only concern documentation, or have easy work-arounds moderate bug Issues that should be fixed but only affect less common environments or functionality and removed major bug Issues that silently cause incorrect results, break installation on common environments, etc. minor bug Bugs that aren't too bad, only concern documentation, or have easy work-arounds labels Jan 11, 2017
@sunalbert
Copy link
Contributor

Hi @neubig , does L, M represent the length of each word vector and the length of input sentence respectively? According to Kim 2014: http://aclweb.org/anthology/D/D14/D14-1181.pdf, I think Y[i][j] may be equal to the dot product of filter F[i] and matrix slice X[:,j:j+O] instead of X[:j,j+O].

@neubig
Copy link
Contributor Author

neubig commented Jan 17, 2017

@MalcolmSun Thanks for the catch. I just had gotten my slice notation wrong. I've fixed the original post.

@sunalbert
Copy link
Contributor

Hi @neubig , I am attempting to modify the 1D convolution operation. To the best of my knowledge, 1D convolution is between two 1D sequences, the convolution in Kim 2014 seems like a 2D convolution operation in computer vision. So I decide to implement 2D convolution function wich supports padding and stride. What do you think about it?

@neubig
Copy link
Contributor Author

neubig commented Jan 25, 2017

Hi @MalcolmSun , yes we've had some discussions about this offline and basically the current understanding is this:

  1. Kim 2014 is basically a 2D convolution where the height of the filter is equal to the word embedding. So to answer your question, yes it would be sufficient for you to implement the more general class of 2D convolutions, and it would be greatly appreciated.
  2. The 1D convolution in DyNet is implementing something else entirely. I'm not aware of any research paper that uses a function like this, and don't have any immediate ideas about how we could use it, although it might be useful for something. I think perhaps we can rename it something else more intuitive, or remove it entirely?

@pmichel31415
Copy link
Collaborator

@MalcolmSun @neubig Just to give my understanding :

I think you should not consider the embedding dimension as the "second dimension" of the convolution.

To me it is more like the number of channels that you have in an image (R, G, B, alpha,...), or like a "depth" and you only "shift" the filter along 1 dimension : the time dimension

So

  • 1D dimension = filters have the shape [temporal span]x[number of channels] and you only shift the filters along the time dimension (1D)

  • 2D dimension = filters have the shape [width]x[height]x[number of channels] and you shift the filter along the width and height dimension of the 2D input (image,...)

etc...

@neubig
Copy link
Contributor Author

neubig commented Jan 25, 2017

@pmichel31415 Thinking about channels is a good idea. Just putting this out there, but can we make convolutions general where the filter has the same number of dimensions of the input, and convolutions are shifted along all dimensions where the size is less than the input? In the case of channels, the filter width along the "channel" dimension is equal to the number of channels, so we don't do any shifting. It would be great if we could think about things this way and not have to worry about naming our dimensions at all.

@dbc148
Copy link
Contributor

dbc148 commented Jan 25, 2017

I like the idea of having it be essentially an n-dimensional convolution, but it still might be a good idea to use it to make conventional conv2d and conv1d operators available for ease of use. I think we'd have to do this manually, though, along the same way you implemented conv1d but just covering all of the dimensions, as I don't think eigen supports anything higher than conv2d+channels.

@dpressel
Copy link

I have the Kim algorithm implemented in Keras, PyTorch, Torch7 and TensorFlow. I'm interested in implementing it in DyNet as well, once this issue is resolved. If helpful for reference of what other FWs do, code is here:

https://github.com/dpressel/baseline/tree/master/classify

@neubig
Copy link
Contributor Author

neubig commented Apr 21, 2017

1D convolutions have been deprecated, because things should be able to be handled by conv2d: #388

@neubig neubig closed this as completed Apr 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
moderate bug Issues that should be fixed but only affect less common environments or functionality
Projects
None yet
Development

No branches or pull requests

5 participants