Hi,
I found that at the line below:
CMSIS-NN/Source/NNSupportFunctions/arm_nn_mat_mul_kernel_s16.c
Line 93 in d071e9f
we use read_and_pad to expand the weight values from q7_t to q15_t, together with a pair of __PKHxx instructions to reorder the values from (a0, a2, a1, a3) back to (a0, a1, a2, a3).
My question is: why do we need these two PKHxx operations? It seems we could keep the (a0, a2, a1, a3) order as long as we process the input in the same way (I noticed that the 1x1 conv2d performs a similar operation without __PKHxx). Dropping them would save two instructions and so reduce the inference time.
Regards,
Crist
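For reference, here is a minimal scalar sketch of the byte-reordering pattern described above, assuming little-endian data. sxtb16/pkhbt/pkhtb are plain-C stand-ins for the __SXTB16/__PKHBT/__PKHTB intrinsics (modeled only for a shift of 16), and all variable names are illustrative rather than taken from the CMSIS-NN source:

```c
#include <stdint.h>
#include <stdio.h>

/* Model of __SXTB16: sign-extend bytes 0 and 2 of a word into two 16-bit lanes. */
static uint32_t sxtb16(uint32_t x)
{
    uint32_t lo = (uint32_t)(int32_t)(int8_t)(x & 0xFF) & 0xFFFF;
    uint32_t hi = (uint32_t)(int32_t)(int8_t)((x >> 16) & 0xFF) & 0xFFFF;
    return lo | (hi << 16);
}

/* Model of __PKHBT: bottom half of x, top half of (y << 16). */
static uint32_t pkhbt(uint32_t x, uint32_t y)
{
    return (x & 0xFFFF) | ((y << 16) & 0xFFFF0000u);
}

/* Model of __PKHTB: top half of x, bottom half of (y >> 16). */
static uint32_t pkhtb(uint32_t x, uint32_t y)
{
    return (x & 0xFFFF0000u) | ((y >> 16) & 0xFFFF);
}

int main(void)
{
    /* Four packed q7_t weights a0..a3, a0 in the least significant byte:
     * a0 = -1, a1 = 2, a2 = -3, a3 = 4. */
    uint32_t packed = 0x04FD02FF;

    uint32_t even = sxtb16(packed);                           /* lanes (a0, a2) */
    uint32_t odd  = sxtb16((packed >> 8) | (packed << 24));   /* ROR 8: lanes (a1, a3) */

    /* Without PKH, the expanded q15_t lanes sit in (a0, a2 | a1, a3) order.
     * The two PKH instructions restore (a0, a1 | a2, a3): */
    uint32_t out1 = pkhbt(even, odd);   /* lanes (a0, a1) */
    uint32_t out2 = pkhtb(odd, even);   /* lanes (a2, a3) */

    printf("even=0x%08X odd=0x%08X out1=0x%08X out2=0x%08X\n",
           even, odd, out1, out2);
    return 0;
}
```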
Hi @CristXu ,
Thanks for your comments!
You are right that we do additional ordering in some places.
We are looking into this now to see where we can get rid of PKHTB/PKHBT.
Thanks,
Måns
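To illustrate why dropping the PKH pair should be safe: an SMLAD-style dot product is invariant under any permutation applied to both operands, so keeping weights and input in the same interleaved (a0, a2, a1, a3) order does not change the accumulated sum. A minimal plain-C sketch, where smlad() models the __SMLAD intrinsic and the data values are made up for illustration:

```c
#include <stdint.h>
#include <stdio.h>

/* Model of __SMLAD: acc + (bottom lanes multiplied) + (top lanes multiplied). */
static int32_t smlad(uint32_t x, uint32_t y, int32_t acc)
{
    int16_t x0 = (int16_t)(x & 0xFFFF), x1 = (int16_t)(x >> 16);
    int16_t y0 = (int16_t)(y & 0xFFFF), y1 = (int16_t)(y >> 16);
    return acc + x0 * y0 + x1 * y1;
}

/* Pack two int16 lanes into one word, lo in the bottom half. */
static uint32_t pack16(int16_t lo, int16_t hi)
{
    return (uint32_t)(uint16_t)lo | ((uint32_t)(uint16_t)hi << 16);
}

int main(void)
{
    int16_t w[4] = {-1, 2, -3, 4};  /* weights a0..a3 */
    int16_t x[4] = {5, -6, 7, -8};  /* inputs         */

    /* Natural order: lane pairs (a0, a1) and (a2, a3). */
    int32_t s1 = smlad(pack16(w[0], w[1]), pack16(x[0], x[1]), 0);
    s1 = smlad(pack16(w[2], w[3]), pack16(x[2], x[3]), s1);

    /* Interleaved order as left by SXTB16: (a0, a2) and (a1, a3),
     * applied to BOTH operands. */
    int32_t s2 = smlad(pack16(w[0], w[2]), pack16(x[0], x[2]), 0);
    s2 = smlad(pack16(w[1], w[3]), pack16(x[1], x[3]), s2);

    printf("natural=%d interleaved=%d\n", s1, s2);  /* identical sums */
    return 0;
}
```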