Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

itkParallelSparseFieldLevelSetImageFilterTest read/write race with GetPixel() / SetPixel() #4708

Open
seanm opened this issue Jun 3, 2024 · 3 comments
Labels
type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances

Comments

@seanm
Copy link
Contributor

seanm commented Jun 3, 2024

Running the itkParallelSparseFieldLevelSetImageFilterTest test with TSan:

WARNING: ThreadSanitizer: data race (pid=68287)
  Read of size 1 at 0x00010b524516 by thread T12:
    #0 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::ThreadedUpdateActiveLayerValues(double const&, itk::SparseFieldLayer<itk::ParallelSparseFieldLevelSetNode<itk::Index<3u>>>*, itk::SparseFieldLayer<itk::ParallelSparseFieldLevelSetNode<itk::Index<3u>>>*, unsigned int) itkParallelSparseFieldLevelSetImageFilter.hxx:1556 (ITKLevelSetsTestDriver:arm64+0x1002f7048)
    #1 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::ThreadedApplyUpdate(double const&, unsigned int) itkParallelSparseFieldLevelSetImageFilter.hxx:1374 (ITKLevelSetsTestDriver:arm64+0x1002c5bac)
    #2 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::Iterate()::'lambda1'(unsigned long)::operator()(unsigned long) const itkParallelSparseFieldLevelSetImageFilter.hxx:1210 (ITKLevelSetsTestDriver:arm64+0x1002e91cc)
    #3 decltype(std::declval<itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::Iterate()::'lambda1'(unsigned long)&>()(std::declval<unsigned long>())) std::__1::__invoke[abi:ue170006]<itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::Iterate()::'lambda1'(unsigned long)&, unsigned long>(itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::Iterate()::'lambda1'(unsigned long)&, unsigned long&&) invoke.h:340 (ITKLevelSetsTestDriver:arm64+0x1002e9134)
    <snip>
    
  Previous write of size 1 at 0x00010b524516 by thread T3:
    #0 itk::Image<signed char, 3u>::SetPixel(itk::Index<3u> const&, signed char const&) itkImage.h:211 (ITKLevelSetsTestDriver:arm64+0x1000978ac)
    #1 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::ThreadedUpdateActiveLayerValues(double const&, itk::SparseFieldLayer<itk::ParallelSparseFieldLevelSetNode<itk::Index<3u>>>*, itk::SparseFieldLayer<itk::ParallelSparseFieldLevelSetNode<itk::Index<3u>>>*, unsigned int) itkParallelSparseFieldLevelSetImageFilter.hxx:1583 (ITKLevelSetsTestDriver:arm64+0x1002f72a4)
    #2 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::ThreadedApplyUpdate(double const&, unsigned int) itkParallelSparseFieldLevelSetImageFilter.hxx:1374 (ITKLevelSetsTestDriver:arm64+0x1002c5bac)
    #3 itk::ParallelSparseFieldLevelSetImageFilter<itk::Image<float, 3u>, itk::Image<float, 3u>>::Iterate()::'lambda1'(unsigned long)::operator()(unsigned long) const itkParallelSparseFieldLevelSetImageFilter.hxx:1210 (ITKLevelSetsTestDriver:arm64+0x1002e91cc)
    <snip>

We see from the above that one thread is writing to 0x00010b524516 and another is simultaneously reading from the same address.

Here is the relevant code, NB the fire emojis:

    else if (new_value < LOWER_ACTIVE_THRESHOLD)
    {
      // This index will move DOWN into a negative (inside) layer.
      // First check for active layer neighbors moving in the opposite direction
      flag = false;
      for (unsigned int i = 0; i < Neighbor_Size; ++i)
      {
        // READ🔥🔥🔥🔥
        if (m_StatusImage->GetPixel(centerIndex + m_NeighborList.GetNeighborhoodOffset(i)) == m_StatusActiveChangingUp)
        {
          flag = true;
          break;
        }
      }
      if (flag)
      {
        ++layerIt;
        continue;
      }

      rms_change_accumulator += itk::Math::sqr(static_cast<float>(new_value - centerValue));
      // update the value of the pixel
      m_OutputImage->SetPixel(centerIndex, new_value);

      // Now remove this index from the active list.
      release_node = layerIt.GetPointer();
      ++layerIt;

      m_Data[ThreadId].m_Layers[0]->Unlink(release_node);
      m_Data[ThreadId].m_ZHistogram[release_node->m_Index[m_SplitAxis]] =
        m_Data[ThreadId].m_ZHistogram[release_node->m_Index[m_SplitAxis]] - 1;

      // now add release_node to status down list
      DownList->PushFront(release_node);

      // WRITE🔥🔥🔥🔥
      m_StatusImage->SetPixel(centerIndex, m_StatusActiveChangingDown);
    }

To me, this indeed looks like a buggy race. But I don't know this code at all.

@seanm seanm added the type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances label Jun 3, 2024
@seanm
Copy link
Contributor Author

seanm commented Jun 3, 2024

@blowekamp maybe this is similar to #3031 ?

@blowekamp
Copy link
Member

The other was a logic bug in the algorithm, which required a good amount of effort in figuring out details of the method. This is a different algorithm.

I could look into the SLIC TSan issues, as I contributed that class.

@seanm
Copy link
Contributor Author

seanm commented Jun 4, 2024

I could look into the SLIC TSan issues, as I contributed that class.

I triaged them last night, and will write up a ticket for them shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:Bug Inconsistencies or issues which will cause an incorrect result under some or all circumstances
Projects
None yet
Development

No branches or pull requests

2 participants