ghost changed the title from "Question about FIR-like using inner product" to "Question about FIR using inner product" on Jun 13, 2021
Hi Roman, sorry to ask something that may be naïve, but I have added a small class to the FIR benchmark in jatinchowdhury18's repo. You can find the issue here: New promising benchmark using Fastor C++

Fastor outperforms the other `inner_product` implementations except with small kernel sizes, and I'm sure that I'm not using Fastor correctly. In the main processing loop (over the sample buffer), I can't call `Fastor::inner` directly with a subview like this:

where `N` is the templated FIR order, `h` is the tensor of FIR coefficients (the impulse response), `z` is a double-buffered state tensor that holds the delayed samples (the z^-N terms of the FIR equation), and `buffer` is the sample buffer that receives the discrete convolution result.

I need to cast the subview like this to allow compilation:

Even though this method outperforms the others for kernel sizes > 32 (the benchmark uses power-of-2 sizes), I'm pretty sure that the assignment operator in the main loop is a bottleneck for smaller kernel sizes.

Why can't I call `Fastor::inner` directly with the subview? What is wrong with my code?

Thank you very much for your answer and your time!
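For context, here is a minimal sketch of the double-buffer FIR scheme described above, written with `std::inner_product` from the standard library rather than Fastor. It is not the benchmark code: the struct name `FirDoubleBuffer`, the member `pos`, and the `process` signature are hypothetical names chosen for illustration. The only assumption carried over from the description is that `z` holds every sample twice, so a contiguous window of `N` samples is always available for the inner product.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

// Hypothetical illustration (not the benchmark class): an N-tap FIR filter
// whose state is stored twice, so the inner product always reads a
// contiguous window and never has to wrap around.
template <std::size_t N>
struct FirDoubleBuffer {
    std::array<float, N> h{};      // impulse response; h[0] weights the newest sample
    std::array<float, 2 * N> z{};  // delay line, each sample written at pos and pos + N
    std::size_t pos = 0;           // index of the newest sample in the first half

    // Filters numSamples samples in place.
    void process(float* buffer, std::size_t numSamples) {
        for (std::size_t n = 0; n < numSamples; ++n) {
            // Step backwards so that z[pos .. pos + N - 1] runs newest to oldest.
            pos = (pos == 0) ? N - 1 : pos - 1;
            z[pos] = buffer[n];
            z[pos + N] = buffer[n];
            // y[n] = sum over k of h[k] * x[n - k]
            buffer[n] = std::inner_product(h.begin(), h.end(),
                                           z.begin() + pos, 0.0f);
        }
    }
};
```

The window `z[pos .. pos + N - 1]` plays the role of the subview in the question: with Fastor, this is the slice of `z` that would be passed to `Fastor::inner` together with `h`.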