MinDistanceLabel does not enforce minimum distance, leading to over-segmentation #2006

Closed
iimog opened this issue Nov 15, 2023 · 1 comment
Labels
bug An issue with an existing feature

Comments

Contributor

iimog commented Nov 15, 2023

Description

The minimum distance passed to starfish.morphology.Filter.MinDistanceLabel is not enforced. In particular, if multiple local maxima have identical values, all of them are retained, regardless of the distance between them. This leads to surprising over-segmentation, where perfectly contiguous masks are split into multiple components (see example below).

Steps/Code to Reproduce

Using this example mask (saved as mask.png):
[image: mask]

import numpy as np
import skimage as ski
import matplotlib.pyplot as plt
from starfish import ImageStack
from starfish.morphology import Binarize, Filter

# Load the image, transform it to a BinaryMaskCollection and apply the MinDistanceLabel
img = ImageStack.from_numpy(np.expand_dims(ski.util.img_as_float32(ski.io.imread("mask.png")), (0,1,2)))
binarized = Binarize.ThresholdBinarize(.5).run(img)
masks = Filter.MinDistanceLabel(120, 120).run(binarized)

print("Nuclei found:", len(list(masks.masks())))
plt.imshow(masks.to_label_image().xarray.squeeze(),interpolation="nearest")

Expected Results

Two clearly separated nuclei.
[image: properseg]

Actual Results

Four detected nuclei, the top one separated into three distinct areas (the small line between the top and bottom has its own class).
[image: overseg]

Cause and possible solution

The cause of this problem is that the internally called skimage.feature.peak_local_max is not passed an appropriate value for min_distance, so the default value of 1 is used:

local_maximum: np.ndarray = peak_local_max(
    distance,
    exclude_border=self._exclude_border,
    footprint=footprint,
    labels=np.asarray(mask),
)

Setting the footprint according to the minimum distance ensures that there can be only one maximum value in the respective area, but if that value occurs multiple times, all occurrences are returned as local maxima and thus used as seeds in the watershed. This is clearly undesirable, as it leads to over-segmentation like the case shown above.
As I'm currently working with 2D data only, changing the starfish code above to pass min_distance=self._minimum_distance_xy fixes this problem for me. However, for 3D data, a more sophisticated solution might be desirable.
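
The effect can be sketched in isolation, without starfish. The example below is only an illustration under stated assumptions: the rectangular toy mask, the value of `minimum_distance_xy`, the box-shaped footprint of size `2 * minimum_distance_xy + 1`, and the `count_regions` helper are all made up for this sketch and are not taken from the starfish source. It seeds a watershed from the local maxima of a distance transform, once with the footprint only and once with `min_distance` passed as well.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

# One contiguous rectangular blob. Its distance transform has a short ridge
# of pixels that all share the same maximum value (tied local maxima).
mask = np.zeros((9, 12), dtype=bool)
mask[2:7, 2:10] = True
distance = ndi.distance_transform_edt(mask)

# Hypothetical minimum distance and footprint, chosen for this toy example.
minimum_distance_xy = 5
footprint = np.ones((2 * minimum_distance_xy + 1,) * 2, dtype=bool)


def count_regions(**peak_kwargs):
    """Seed a watershed from the local maxima and count the resulting regions."""
    peaks = peak_local_max(
        distance,
        footprint=footprint,
        exclude_border=False,
        **peak_kwargs,
    )
    # Give every detected maximum its own marker id.
    markers = np.zeros(mask.shape, dtype=np.int32)
    for seed_id, (row, col) in enumerate(peaks, start=1):
        markers[row, col] = seed_id
    labels = watershed(-distance, markers, mask=mask)
    return len(np.unique(labels[mask]))


# Footprint only: min_distance keeps its default of 1, so tied maxima survive.
regions_footprint_only = count_regions()
# Footprint plus min_distance: tied maxima closer than min_distance are dropped.
regions_with_min_distance = count_regions(min_distance=minimum_distance_xy)

print("regions, footprint only:   ", regions_footprint_only)
print("regions, with min_distance:", regions_with_min_distance)
```

With the footprint alone, the single blob should be split into several regions, one per tied maximum; with `min_distance` set, it should stay in one piece. On a ridge longer than `min_distance`, more than one seed can still survive, so this does not cover every case.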

Versions

Linux-5.15.0-84-generic-x86_64-with-glibc2.35
Python 3.9.16 | packaged by conda-forge | (main, Feb 1 2023, 21:39:03)
[GCC 11.3.0]
NumPy 1.26.0
SciPy 1.11.3
scikit-image 0.18.3
pandas 1.3.2
sklearn 0.24.2
xarray 0.19.0
sympy 1.5.1
starfish 0.2.2+41.g0e668d12

iimog added the bug label Nov 15, 2023
Collaborator

berl commented Nov 16, 2023

@iimog this looks like a good candidate for a bugfix PR! Please make one with your change and reference this issue. I agree a 3D version may require more work, but it would be great to solve this problem first anyway.

iimog added a commit to BioMeDS/starfish that referenced this issue Nov 17, 2023
otherwise multiple local maxima are returned even within the distance if
their values are identical. This led to lots of over-segmentation. In
particular, multiple perfectly contiguous nuclei were split in half.

Related spacetx#2006
berl pushed a commit that referenced this issue Dec 1, 2023
* Add min_distance parameter to peak_local_max call

otherwise multiple local maxima are returned even within the distance if
their values are identical. This led to lots of over-segmentation. In
particular, multiple perfectly contiguous nuclei were split in half.

Related #2006

* Fix iss test

adjust assumption on number of detected cells

* Fix docker build

the docker build was failing, because botocore needs urllib3 to be in a
certain version range. Pinning urllib3 to that range fixes the issue.
@berl berl closed this as completed Jan 25, 2024