Commit

PERF: VectorNeighborhoodInnerProduct pixel retrieval out of inner loop
Moved the retrieval of neighbor pixels out of the inner `for` loops of both
`VectorNeighborhoodInnerProduct<TImage>::operator()` overloads.

A reduction of more than 8 % of the duration of `itkSyNPointSetRegistrationTest`
was observed, running a Visual Studio 2019 Release build.
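
The pattern behind this change can be sketched in standalone form. This is only an illustration under stated assumptions: `Neighborhood`, `Pixel`, `InnerProductBefore` and `InnerProductAfter` are hypothetical stand-ins, not ITK types or functions, and the bounds-checked `GetPixel` merely imitates a lookup that costs something on every call.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not ITK types.
constexpr std::size_t VectorDimension = 3;
using Pixel = std::array<double, VectorDimension>;

// Hypothetical neighborhood whose lookup does some work on every call
// (here: a bounds-checked access).
struct Neighborhood
{
  std::vector<Pixel> pixels;

  const Pixel &
  GetPixel(std::size_t i) const
  {
    return pixels.at(i);
  }
};

// Before: the neighbor pixel is retrieved once per vector component,
// so GetPixel runs VectorDimension times for every neighbor.
Pixel
InnerProductBefore(const Neighborhood & it, const std::vector<double> & weights)
{
  assert(weights.size() <= it.pixels.size());
  Pixel sum{};
  for (std::size_t i = 0; i < weights.size(); ++i)
  {
    for (std::size_t j = 0; j < VectorDimension; ++j)
    {
      sum[j] += weights[i] * it.GetPixel(i)[j];
    }
  }
  return sum;
}

// After: the neighbor pixel is retrieved once per neighbor and reused
// for all components in the inner loop.
Pixel
InnerProductAfter(const Neighborhood & it, const std::vector<double> & weights)
{
  assert(weights.size() <= it.pixels.size());
  Pixel sum{};
  for (std::size_t i = 0; i < weights.size(); ++i)
  {
    const auto & neighborPixel = it.GetPixel(i);

    for (std::size_t j = 0; j < VectorDimension; ++j)
    {
      sum[j] += weights[i] * neighborPixel[j];
    }
  }
  return sum;
}
```

Both functions compute the same sums; only the number of lookups differs. Binding the result to `const auto &` is valid whether the accessor returns a reference or a value, because a temporary bound to a const reference has its lifetime extended to that of the reference.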
N-Dekker authored and dzenanz committed Nov 1, 2022
1 parent 59d12d8 commit eb4c9b6
Showing 1 changed file with 6 additions and 2 deletions.
@@ -43,9 +43,11 @@ VectorNeighborhoodInnerProduct<TImage>::operator()(const std::slice &
   const auto stride = static_cast<unsigned int>(s.stride());
   for (unsigned int i = start; o_it < op_end; i += stride, ++o_it)
   {
+    const auto & neighborPixel = it.GetPixel(i);
+
     for (j = 0; j < VectorDimension; ++j)
     {
-      sum[j] += *o_it * (it.GetPixel(i))[j];
+      sum[j] += *o_it * neighborPixel[j];
     }
   }

@@ -75,9 +77,11 @@ VectorNeighborhoodInnerProduct<TImage>::operator()(const std::slice & s,
   const auto stride = static_cast<unsigned int>(s.stride());
   for (unsigned int i = start; o_it < op_end; i += stride, ++o_it)
   {
+    const auto & neighborPixel = it[i];
+
     for (j = 0; j < VectorDimension; ++j)
     {
-      sum[j] += *o_it * it[i][j];
+      sum[j] += *o_it * neighborPixel[j];
     }
   }
