Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch extractor can go (fully) out of bounds #710

Closed
measty opened this issue Sep 2, 2023 · 0 comments · Fixed by #712 or #761
Closed

Patch extractor can go (fully) out of bounds #710

measty opened this issue Sep 2, 2023 · 0 comments · Fixed by #712 or #761

Comments

@measty
Copy link
Collaborator

measty commented Sep 2, 2023

  • TIA Toolbox version: 1.4
  • Python version: ALL
  • Operating System: ALL

Description

I think the logic for getting the coordinate list in patch extraction isn't quite right.

The following code:

output_x_end = (
np.ceil(image_shape[0] / patch_output_shape[0]) * patch_output_shape[0]
)
output_x_list = np.arange(0, int(output_x_end), stride_shape[0])
output_y_end = (
np.ceil(image_shape[1] / patch_output_shape[1]) * patch_output_shape[1]
)
output_y_list = np.arange(0, int(output_y_end), stride_shape[1])

Only works if stride is the same as the patch size. If it isn't, its possible that some patch locations are generated that are entirely outside the slide.

What I Did

An example:

wsi_shape = [43668, 14634]
coords = PatchExtractor.get_coordinates(
            image_shape=wsi_shape,
            patch_input_shape=(256,256),
            stride_shape=(128,128),
            input_within_bound=False,
        )
np.max(coords, axis=0) # gives array([43648, 14720, 43904, 14976])

note there are patches with top-left y coord of 14720, but the slide dimension is 14634. That means there are patches which lie wholly outside the slide bounds, which should not be happening. (input_within_bounds=False just means we allow patches that overlap with the slide boundary)

This also raises another discussion point. WSIreader.read_rect (or read bounds) will happily read a patch that is entirely outside the bounds of the slide, and will do so silently. Its designed to safely pad regions that overlap the edge of the slide, and that is fine, but I think in most cases, if your code is ending up reading patches from entirely outside the slide, theres something wrong somewhere and it would be good to know that its happening. So we could have a discussion on what the behaviour for this case should be.

potential solution

I think the code should look like:

        output_x_end = (
            np.ceil(image_shape[0] / stride_shape[0]) * stride_shape[0]
        )
        output_x_list = np.arange(0, int(output_x_end), stride_shape[0])
        output_y_end = (
            np.ceil(image_shape[1] / stride_shape[1]) * stride_shape[1]
        )
        output_y_list = np.arange(0, int(output_y_end), stride_shape[1])

which gives in the above example:

wsi_shape = [43668, 14634]
np.max(coords, axis=0) # gives array([43648, 14592, 43904, 14848])

which correctly has all patches with at least some overlap with the slide.
This removes output_shape from those equations entirely, which I can't see a problem with but i'm also not 100% sure why output_shape was there in the first place so want to make sure i'm not missing anything.

@measty measty linked a pull request Sep 7, 2023 that will close this issue
shaneahmed pushed a commit that referenced this issue Sep 13, 2023
Fix #710 

Fix the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide.

- Adds a test that would flag this situation.
shaneahmed added a commit that referenced this issue Dec 15, 2023
## 1.5.0 (2023-12-15)

### Major Updates and Feature Improvements

- Adds the bokeh visualization tool. #684
  - The tool allows a user to launch a server on their machine to visualise whole slide images, overlay the results of deep learning algorithms or to select a patch from whole slide image and run TIAToolbox deep learning engines.
  - This tool powers the TIA demos server. For details please see https://tiademos.dcs.warwick.ac.uk/.
- Extends Annotation to Support Init from WKB #639
- Adds `IOConfig` for NuClick in `pretrained_model.yaml` #709
- Adds functions to save the TIAToolbox Engine outputs to Zarr and AnnotationStore files. #724
- Adds Support for QuPath Annotation Imports #721

### Changes to API

- Adds `model.to(device)` and `model.load_model_from_file()` functionality to make it compatible with PyTorch API. #733
- Replaces `pretrained` with `weights` to make the engines compatible with the new PyTorch API. #621
- Adds support for high-level imports for various utility functions and classes such as `WSIReader`, `PatchPredictor` and `imread` #606, #607,
- Adds `tiatoolbox.typing` for type hints. #619
- Fixes incorrect file size saved by `save_tiles`, issue with certain WSIs raised by @TomastpPereira
- TissueMasker transform now returns mask instead of a list. #748
  - Fixes #732

### Bug Fixes and Other Changes

- Fixes `pixman` incompability error on Colab #601
- Removes `shapely.speedups`. The module no longer has any affect in Shapely >=2.0. #622
- Fixes errors in the slidegraph example notebook #608
- Fixes bugs in WSI Registration #645, #670, #693
- Fixes the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. #712
  - Fixes #710
- Fixes #738 raised by @xiachenrui

### Development related changes

- Replaces `flake8` and `isort` with `ruff` #625, #666
- Adds `mypy` checks to `root` and `utils` package. This will be rolled out in phases to other modules. #723
- Adds a module to detect file types using magic number/signatures #616
- Uses `poetry` for version updates instead of `bump2version`. #638
- Removes `setup.cfg` and uses `pyproject.toml` for project configurations.
- Reduces runtime for some unit tests e.g., #627, #630, #631, #629
- Reuses models and datasets in tests on GitHub actions by utilising cache #641, #644
- Set up parallel tests locally #671

**Full Changelog:** v1.4.0...v1.5.0

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: mostafajahanifar <74412979+mostafajahanifar@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: DavidBAEpstein <David.Epstein@warwick.ac.uk>
Co-authored-by: David Epstein <22086916+DavidBAEpstein@users.noreply.github.com>
Co-authored-by: Ruqayya Awan <18444369+ruqayya@users.noreply.github.com>
Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com>
Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>
Co-authored-by: adamshephard <adam.shephard@warwick.ac.uk>
Co-authored-by: Abdol <a@fkrtech.com>
Co-authored-by: Jiaqi-Lv <60471431+Jiaqi-Lv@users.noreply.github.com>
Co-authored-by: Abishek <abishekraj6797@gmail.com>
Co-authored-by: Dmitrii Blaginin <blaginin@mbp.lan>
shaneahmed added a commit that referenced this issue Dec 15, 2023
## 1.5.0 (2023-12-15)

### Major Updates and Feature Improvements

- Adds the bokeh visualization tool. #684
  - The tool allows a user to launch a server on their machine to visualise whole slide images, overlay the results of deep learning algorithms or to select a patch from whole slide image and run TIAToolbox deep learning engines.
  - This tool powers the TIA demos server. For details please see https://tiademos.dcs.warwick.ac.uk/.
- Extends Annotation to Support Init from WKB #639
- Adds `IOConfig` for NuClick in `pretrained_model.yaml` #709
- Adds functions to save the TIAToolbox Engine outputs to Zarr and AnnotationStore files. #724
- Adds Support for QuPath Annotation Imports #721

### Changes to API

- Adds `model.to(device)` and `model.load_model_from_file()` functionality to make it compatible with PyTorch API. #733
- Replaces `pretrained` with `weights` to make the engines compatible with the new PyTorch API. #621
- Adds support for high-level imports for various utility functions and classes such as `WSIReader`, `PatchPredictor` and `imread` #606, #607,
- Adds `tiatoolbox.typing` for type hints. #619
- Fixes incorrect file size saved by `save_tiles`, issue with certain WSIs raised by @TomastpPereira
- TissueMasker transform now returns mask instead of a list. #748
  - Fixes #732

### Bug Fixes and Other Changes

- Fixes `pixman` incompability error on Colab #601
- Removes `shapely.speedups`. The module no longer has any affect in Shapely >=2.0. #622
- Fixes errors in the slidegraph example notebook #608
- Fixes bugs in WSI Registration #645, #670, #693
- Fixes the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. #712
  - Fixes #710
- Fixes #738 raised by @xiachenrui

### Development related changes

- Replaces `flake8` and `isort` with `ruff` #625, #666
- Adds `mypy` checks to `root` and `utils` package. This will be rolled out in phases to other modules. #723
- Adds a module to detect file types using magic number/signatures #616
- Uses `poetry` for version updates instead of `bump2version`. #638
- Removes `setup.cfg` and uses `pyproject.toml` for project configurations.
- Reduces runtime for some unit tests e.g., #627, #630, #631, #629
- Reuses models and datasets in tests on GitHub actions by utilising cache #641, #644
- Set up parallel tests locally #671

**Full Changelog:** v1.4.0...v1.5.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant