Does not work with CPU: Grouped Convolution #1

Open
SchernHe opened this issue Mar 12, 2021 · 4 comments

SchernHe commented Mar 12, 2021

Hey, first of all, thanks for your work - pretty fast :)

I just wanted to test your repository and noticed that the code fails for inference on CPU due to the grouped convolution.

Code:

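import tensorflow as tf

# NFNet is the model class from this repository; `variant` holds the variant name.
model = NFNet(num_classes=1000, variant=variant)
model.build((None, 320, 320, 3))
model.load_weights(f"{variant}_NFNet/{variant}_NFNet")

# Run a single all-zero image through the model on the CPU.
test_image = tf.zeros(shape=(1, 320, 320, 3), dtype=tf.float32)
model(test_image)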

Error:
UnimplementedError: The Conv2D op currently does not support grouped convolutions on the CPU. A grouped convolution was attempted to be run because the input depth of 256 does not match the filter input depth of 128 [Op:Conv2D] (https://github.com/tensorflow/tensorflow/blob/669993ebe8534eac877eec61225925cff737eac2/tensorflow/core/kernels/conv_ops.cc#L160)

I already started debugging, and as far as I can see, the error occurs in conv1 of the second block when calling WSConv2D(). There, the inputs have shape (1, 68, 120, 256), while the weights have shape (3, 3, 128, 256).

I am not that familiar with grouped convolutions or NFNets in general, so I thought you might already know how to solve this, if it is possible at all.

Edit:
Is it possible that the filters are already divided by the number of groups (in this case 2) and the inputs are not? See here
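
To make the shape mismatch concrete, here is a minimal sketch of how the group count follows from the two shapes quoted above. It assumes the filter layout is (height, width, input depth per group, output depth), which is what the error message implies; `infer_groups` is a hypothetical helper, not part of the repository or of TensorFlow.

```python
def infer_groups(input_depth, filter_shape):
    """Return the number of groups implied by an input depth and a filter shape.

    filter_shape is assumed to be (height, width, in_depth_per_group, out_depth).
    """
    filter_in_depth = filter_shape[2]
    out_depth = filter_shape[3]
    if input_depth % filter_in_depth != 0:
        raise ValueError("input depth must be a multiple of the filter input depth")
    groups = input_depth // filter_in_depth
    if out_depth % groups != 0:
        raise ValueError("output depth must be a multiple of the group count")
    return groups


# Shapes from the failing conv1 call: inputs (1, 68, 120, 256), weights (3, 3, 128, 256).
print(infer_groups(256, (3, 3, 128, 256)))  # prints 2
```

With 256 input channels and a filter input depth of 128, the op is treated as a grouped convolution with 2 groups, which is the case the CPU kernel rejects.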

Sicily-F commented Apr 2, 2021

Hi there, can I just ask: were you testing on your own data or on ImageNet?

abhay-7 commented May 19, 2021

Hey, were you able to find a workaround to run it on CPU?

SchernHe (Author) commented

Hey, sorry for the late response, @Sicily-F.

I ran everything on custom data. I was also able to make it run on the CPU via @tf.function(experimental_compile=True) (Issue 29005), but it was not practical in terms of performance/runtime.
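
For anyone looking for a different route: a grouped convolution can also be emulated by splitting the input channels and the filters into groups, running an ordinary convolution per group, and concatenating the results. This is not what was tried above and has not been benchmarked here; it is only a minimal sketch, assuming NHWC inputs and the filter layout from the error message. `grouped_conv2d_cpu` is a hypothetical name, not a function from the repository.

```python
import tensorflow as tf


def grouped_conv2d_cpu(x, kernel, groups, strides=1, padding="SAME"):
    """Emulate a grouped convolution with one regular conv2d per group.

    x:      NHWC tensor with C input channels.
    kernel: tensor of shape (h, w, C // groups, out_channels).
    """
    # Each group sees a contiguous slice of the input channels ...
    x_groups = tf.split(x, groups, axis=-1)
    # ... and owns a contiguous slice of the output filters.
    k_groups = tf.split(kernel, groups, axis=-1)
    outputs = [
        tf.nn.conv2d(xg, kg, strides=strides, padding=padding)
        for xg, kg in zip(x_groups, k_groups)
    ]
    return tf.concat(outputs, axis=-1)


# Shapes taken from the failing call reported in this issue.
x = tf.zeros((1, 68, 120, 256), dtype=tf.float32)
kernel = tf.zeros((3, 3, 128, 256), dtype=tf.float32)
y = grouped_conv2d_cpu(x, kernel, groups=2)
print(y.shape)  # (1, 68, 120, 256)
```

Each per-group call has matching input and filter depths (128 and 128), so it goes through the regular Conv2D path rather than the grouped one.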

Sicily-F commented

OK, I'll give that a try.
I mostly use the ImageDataGenerator from Keras, so I'll see if I can get a workaround working with it.
