1D Convolutions not Expected Functionality (Kim 2014) #236
Comments
Hi @neubig , do L and M represent the length of each word vector and the length of the input sentence, respectively? According to Kim 2014 (http://aclweb.org/anthology/D/D14/D14-1181.pdf), I think Y[i][j] should be the dot product of filter F[i] and the matrix slice X[:,j:j+O], not X[:j,j+O].
@MalcolmSun Thanks for the catch. I had just gotten my slice notation wrong. I've fixed the original post.
Hi @neubig , I am attempting to modify the 1D convolution operation. To the best of my knowledge, 1D convolution is between two 1D sequences; the convolution in Kim 2014 looks more like a 2D convolution operation from computer vision. So I decided to implement a 2D convolution function which supports padding and stride. What do you think about it?
Hi @MalcolmSun , yes, we've had some discussions about this offline, and basically the current understanding is this:
@MalcolmSun @neubig Just to give my understanding: I think you should not consider the embedding dimension as the "second dimension" of the convolution. To me it is more like the number of channels that you have in an image (R, G, B, alpha, ...), or like a "depth", and you only "shift" the filter along one dimension: the time dimension. So
etc...
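To make the channels interpretation concrete, here is a minimal NumPy sketch (not DyNet code; the function name, shapes, and the (channels, time) layout are illustrative assumptions). The filter spans all channels at once and is shifted along the time axis only:

```python
import numpy as np

def conv1d_channels(X, F):
    """1D convolution over time, treating the embedding dimension as channels.

    X: input of shape (channels, T), e.g. (embedding_dim, sentence_length)
    F: a single filter of shape (channels, width)

    The filter covers every channel, so it is only shifted along the time
    axis; the output is a 1D vector of length T - width + 1.
    """
    channels, T = X.shape
    _, width = F.shape
    out = np.empty(T - width + 1)
    for j in range(T - width + 1):
        # Dot the whole filter with the (channels, width) window at time j.
        out[j] = np.sum(F * X[:, j:j + width])
    return out
```

With, say, a (2, 3) input and a (2, 2) filter of ones, each output entry is just the sum of a full 2x2 window, and the output has length 3 - 2 + 1 = 2.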
@pmichel31415 Thinking about channels is a good idea. Just putting this out there, but can we make convolutions general, where the filter has the same number of dimensions as the input, and convolutions are shifted along all dimensions where the filter's size is less than the input's? In the case of channels, the filter width along the "channel" dimension is equal to the number of channels, so we don't do any shifting there. It would be great if we could think about things this way and not have to worry about naming our dimensions at all.
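The generalized scheme described above can be sketched as follows (a toy NumPy reference implementation under my own assumptions, not a proposal for the DyNet API): the filter has the same rank as the input, and any dimension where the filter matches the input exactly simply collapses to size 1 in the output, i.e. no shifting happens there.

```python
import itertools
import numpy as np

def convnd(X, F):
    """Generalized convolution: F has the same rank as X and is shifted
    along every dimension where it is smaller than X.

    A dimension where F's size equals X's size (e.g. a channel dimension)
    gets no shifting, so the corresponding output dimension has size 1.
    """
    out_shape = tuple(xs - fs + 1 for xs, fs in zip(X.shape, F.shape))
    out = np.empty(out_shape)
    for idx in itertools.product(*(range(n) for n in out_shape)):
        # Slice out the window of X aligned with F at this output position.
        window = tuple(slice(i, i + fs) for i, fs in zip(idx, F.shape))
        out[idx] = np.sum(F * X[window])
    return out
```

Applied to a (2, 3) input with a (2, 2) filter, this gives a (1, 2) output: the first dimension is fully covered by the filter (channel-like, no shifting), and the filter slides only along the second.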
I like the idea of having it be essentially an n-dimensional convolution, but it still might be a good idea to use it to provide conventional conv2d and conv1d operators for ease of use. I think we'd have to do this manually, though, in the same way you implemented conv1d but covering all of the dimensions, as I don't think Eigen supports anything higher than conv2d+channels.
I have the Kim algorithm implemented in Keras, PyTorch, Torch7, and TensorFlow. I'm interested in implementing it in DyNet as well, once this issue is resolved. If it's helpful as a reference for what other frameworks do, the code is here: https://github.com/dpressel/baseline/tree/master/classify
1D convolutions have been deprecated, since everything should be handled by conv2d: #388
(Edited for clarity)
We were taking a look at the 1D convolutions, and it looks like they aren't doing what we would expect according to Kim 2014: http://aclweb.org/anthology/D/D14/D14-1181.pdf
For example, for conv1d, we would expect:
On the other hand, what we have now is:
In other words, filters are run row-wise over the input instead of being calculated with respect to the whole input.
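The contrast between the two behaviors can be sketched in NumPy (an illustrative toy, not the DyNet source; function names and the (filters, rows, width) filter layout are my assumptions, with Y[i][j] = <F[i], X[:, j:j+O]> taken from the discussion above):

```python
import numpy as np

def conv1d_expected(X, F, O):
    """Expected (Kim 2014) behavior: each filter F[i] is a full (rows, O)
    matrix and Y[i][j] is its dot product with the slice X[:, j:j+O]."""
    rows, T = X.shape
    n_filters = F.shape[0]
    Y = np.empty((n_filters, T - O + 1))
    for i in range(n_filters):
        for j in range(T - O + 1):
            Y[i, j] = np.sum(F[i] * X[:, j:j + O])
    return Y

def conv1d_rowwise(X, F, O):
    """Behavior described in the issue: each row of the filter is run over
    the matching row of the input independently, so no interaction across
    rows (embedding dimensions) ever happens."""
    rows, T = X.shape
    n_filters = F.shape[0]
    Y = np.empty((n_filters, rows, T - O + 1))
    for i in range(n_filters):
        for r in range(rows):
            for j in range(T - O + 1):
                Y[i, r, j] = np.dot(F[i, r], X[r, j:j + O])
    return Y
```

Note that the expected output is what you get by summing the row-wise output over its row axis, which is exactly the cross-row mixing the row-wise version omits.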