Patch extractor can go (fully) out of bounds #710

measty · 2023-09-02T01:20:33Z

TIA Toolbox version: 1.4
Python version: ALL
Operating System: ALL

Description

I think the logic for getting the coordinate list in patch extraction isn't quite right.

The following code:

tiatoolbox/tiatoolbox/tools/patchextraction.py

Lines 455 to 462 in eb49f66

    
           output_x_end = ( 
        
               np.ceil(image_shape[0] / patch_output_shape[0]) * patch_output_shape[0] 
        
           ) 
        
           output_x_list = np.arange(0, int(output_x_end), stride_shape[0]) 
        
           output_y_end = ( 
        
               np.ceil(image_shape[1] / patch_output_shape[1]) * patch_output_shape[1] 
        
           ) 
        
           output_y_list = np.arange(0, int(output_y_end), stride_shape[1])

Only works if stride is the same as the patch size. If it isn't, its possible that some patch locations are generated that are entirely outside the slide.

What I Did

An example:

wsi_shape = [43668, 14634]
coords = PatchExtractor.get_coordinates(
            image_shape=wsi_shape,
            patch_input_shape=(256,256),
            stride_shape=(128,128),
            input_within_bound=False,
        )
np.max(coords, axis=0) # gives array([43648, 14720, 43904, 14976])

note there are patches with top-left y coord of 14720, but the slide dimension is 14634. That means there are patches which lie wholly outside the slide bounds, which should not be happening. (input_within_bounds=False just means we allow patches that overlap with the slide boundary)

This also raises another discussion point. WSIreader.read_rect (or read bounds) will happily read a patch that is entirely outside the bounds of the slide, and will do so silently. Its designed to safely pad regions that overlap the edge of the slide, and that is fine, but I think in most cases, if your code is ending up reading patches from entirely outside the slide, theres something wrong somewhere and it would be good to know that its happening. So we could have a discussion on what the behaviour for this case should be.

potential solution

I think the code should look like:

        output_x_end = (
            np.ceil(image_shape[0] / stride_shape[0]) * stride_shape[0]
        )
        output_x_list = np.arange(0, int(output_x_end), stride_shape[0])
        output_y_end = (
            np.ceil(image_shape[1] / stride_shape[1]) * stride_shape[1]
        )
        output_y_list = np.arange(0, int(output_y_end), stride_shape[1])

which gives in the above example:

wsi_shape = [43668, 14634]
np.max(coords, axis=0) # gives array([43648, 14592, 43904, 14848])

which correctly has all patches with at least some overlap with the slide.
This removes output_shape from those equations entirely, which I can't see a problem with but i'm also not 100% sure why output_shape was there in the first place so want to make sure i'm not missing anything.

The text was updated successfully, but these errors were encountered:

Fix #710 Fix the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. - Adds a test that would flag this situation.

@TomastpPereira

## 1.5.0 (2023-12-15) ### Major Updates and Feature Improvements - Adds the bokeh visualization tool. #684 - The tool allows a user to launch a server on their machine to visualise whole slide images, overlay the results of deep learning algorithms or to select a patch from whole slide image and run TIAToolbox deep learning engines. - This tool powers the TIA demos server. For details please see https://tiademos.dcs.warwick.ac.uk/. - Extends Annotation to Support Init from WKB #639 - Adds `IOConfig` for NuClick in `pretrained_model.yaml` #709 - Adds functions to save the TIAToolbox Engine outputs to Zarr and AnnotationStore files. #724 - Adds Support for QuPath Annotation Imports #721 ### Changes to API - Adds `model.to(device)` and `model.load_model_from_file()` functionality to make it compatible with PyTorch API. #733 - Replaces `pretrained` with `weights` to make the engines compatible with the new PyTorch API. #621 - Adds support for high-level imports for various utility functions and classes such as `WSIReader`, `PatchPredictor` and `imread` #606, #607, - Adds `tiatoolbox.typing` for type hints. #619 - Fixes incorrect file size saved by `save_tiles`, issue with certain WSIs raised by @TomastpPereira - TissueMasker transform now returns mask instead of a list. #748 - Fixes #732 ### Bug Fixes and Other Changes - Fixes `pixman` incompability error on Colab #601 - Removes `shapely.speedups`. The module no longer has any affect in Shapely >=2.0. #622 - Fixes errors in the slidegraph example notebook #608 - Fixes bugs in WSI Registration #645, #670, #693 - Fixes the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. #712 - Fixes #710 - Fixes #738 raised by @xiachenrui ### Development related changes - Replaces `flake8` and `isort` with `ruff` #625, #666 - Adds `mypy` checks to `root` and `utils` package. This will be rolled out in phases to other modules. #723 - Adds a module to detect file types using magic number/signatures #616 - Uses `poetry` for version updates instead of `bump2version`. #638 - Removes `setup.cfg` and uses `pyproject.toml` for project configurations. - Reduces runtime for some unit tests e.g., #627, #630, #631, #629 - Reuses models and datasets in tests on GitHub actions by utilising cache #641, #644 - Set up parallel tests locally #671 **Full Changelog:** v1.4.0...v1.5.0 --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: mostafajahanifar <74412979+mostafajahanifar@users.noreply.github.com> Co-authored-by: John Pocock <John-P@users.noreply.github.com> Co-authored-by: DavidBAEpstein <David.Epstein@warwick.ac.uk> Co-authored-by: David Epstein <22086916+DavidBAEpstein@users.noreply.github.com> Co-authored-by: Ruqayya Awan <18444369+ruqayya@users.noreply.github.com> Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com> Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com> Co-authored-by: adamshephard <adam.shephard@warwick.ac.uk> Co-authored-by: Abdol <a@fkrtech.com> Co-authored-by: Jiaqi-Lv <60471431+Jiaqi-Lv@users.noreply.github.com> Co-authored-by: Abishek <abishekraj6797@gmail.com> Co-authored-by: Dmitrii Blaginin <blaginin@mbp.lan>

@TomastpPereira

## 1.5.0 (2023-12-15) ### Major Updates and Feature Improvements - Adds the bokeh visualization tool. #684 - The tool allows a user to launch a server on their machine to visualise whole slide images, overlay the results of deep learning algorithms or to select a patch from whole slide image and run TIAToolbox deep learning engines. - This tool powers the TIA demos server. For details please see https://tiademos.dcs.warwick.ac.uk/. - Extends Annotation to Support Init from WKB #639 - Adds `IOConfig` for NuClick in `pretrained_model.yaml` #709 - Adds functions to save the TIAToolbox Engine outputs to Zarr and AnnotationStore files. #724 - Adds Support for QuPath Annotation Imports #721 ### Changes to API - Adds `model.to(device)` and `model.load_model_from_file()` functionality to make it compatible with PyTorch API. #733 - Replaces `pretrained` with `weights` to make the engines compatible with the new PyTorch API. #621 - Adds support for high-level imports for various utility functions and classes such as `WSIReader`, `PatchPredictor` and `imread` #606, #607, - Adds `tiatoolbox.typing` for type hints. #619 - Fixes incorrect file size saved by `save_tiles`, issue with certain WSIs raised by @TomastpPereira - TissueMasker transform now returns mask instead of a list. #748 - Fixes #732 ### Bug Fixes and Other Changes - Fixes `pixman` incompability error on Colab #601 - Removes `shapely.speedups`. The module no longer has any affect in Shapely >=2.0. #622 - Fixes errors in the slidegraph example notebook #608 - Fixes bugs in WSI Registration #645, #670, #693 - Fixes the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. #712 - Fixes #710 - Fixes #738 raised by @xiachenrui ### Development related changes - Replaces `flake8` and `isort` with `ruff` #625, #666 - Adds `mypy` checks to `root` and `utils` package. This will be rolled out in phases to other modules. #723 - Adds a module to detect file types using magic number/signatures #616 - Uses `poetry` for version updates instead of `bump2version`. #638 - Removes `setup.cfg` and uses `pyproject.toml` for project configurations. - Reduces runtime for some unit tests e.g., #627, #630, #631, #629 - Reuses models and datasets in tests on GitHub actions by utilising cache #641, #644 - Set up parallel tests locally #671 **Full Changelog:** v1.4.0...v1.5.0

measty mentioned this issue Sep 7, 2023

🐛 Fix patch_coords #712

Merged

measty linked a pull request Sep 7, 2023 that will close this issue

🐛 Fix patch_coords #712

Merged

shaneahmed closed this as completed in #712 Sep 13, 2023

shaneahmed pushed a commit that referenced this issue Sep 13, 2023

🐛 Fix patch_coords (#712)

795607d

Fix #710 Fix the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. - Adds a test that would flag this situation.

shaneahmed mentioned this issue Dec 15, 2023

🔖 Release 1.5.0 #757

Merged

shaneahmed mentioned this issue Dec 15, 2023

🔖 Release 1.5.0 #761

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Patch extractor can go (fully) out of bounds #710

Patch extractor can go (fully) out of bounds #710

measty commented Sep 2, 2023 •

edited

Loading

Patch extractor can go (fully) out of bounds #710

Patch extractor can go (fully) out of bounds #710

Comments

measty commented Sep 2, 2023 • edited Loading

Description

What I Did

potential solution

measty commented Sep 2, 2023 •

edited

Loading