🆕 Integrate Foundation Models Available VIA `timm`: `UNI`, `Prov-GigaPath`, `H-optimus-0` #856

GeorgeBatch · 2024-09-02T15:54:11Z

Making a pull request as discussed in issue #855

Copied from the issue:

I think it would be useful to integrate pre-trained foundation models from other labs into tiatoolbox.models.architecture.vanilla.py.

Currently, the _get_architecture() function allows the use of models from torchvision.models.

However, another function, _get_timm_architecture(), could be added to incorporate foundation models that are available through timm with weights hosted on the HuggingFace Hub. All the timm models I have used require users to sign a licence agreement with the authors, so the licensing question seems to resolve itself: users cannot obtain the model weights through Tiatoolbox alone without first having their access request approved by the authors.
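A minimal sketch of what such a lookup could look like, for illustration only. The helper names (`TIMM_MODEL_KWARGS`, `timm_create_kwargs`, `get_timm_architecture`) are hypothetical, and the Hub identifiers and keyword arguments are assumptions taken from the public model cards rather than from this pull request, so they should be checked against each card.

```python
from typing import Any

# Assumed identifiers and keyword arguments (from the public model cards);
# verify them against each card before relying on them.
TIMM_MODEL_KWARGS: dict[str, dict[str, Any]] = {
    "UNI": {
        "model_name": "hf-hub:MahmoodLab/UNI",
        "init_values": 1e-5,
        "dynamic_img_size": True,
    },
    "prov-gigapath": {"model_name": "hf_hub:prov-gigapath/prov-gigapath"},
    "H-optimus-0": {
        "model_name": "hf-hub:bioptimus/H-optimus-0",
        "init_values": 1e-5,
        "dynamic_img_size": False,
    },
}


def timm_create_kwargs(arch_name: str, *, pretrained: bool) -> dict[str, Any]:
    """Return keyword arguments for `timm.create_model` for a known backbone."""
    if arch_name not in TIMM_MODEL_KWARGS:
        msg = (
            f"Backbone {arch_name!r} is not supported; "
            f"choose from {sorted(TIMM_MODEL_KWARGS)}."
        )
        raise ValueError(msg)
    # Copy the registry entry so callers cannot mutate the shared defaults.
    return {**TIMM_MODEL_KWARGS[arch_name], "pretrained": pretrained}


def get_timm_architecture(arch_name: str, *, pretrained: bool = True):
    """Build the model; gated weights need prior approval on the HuggingFace Hub."""
    import timm  # imported lazily so the lookup above works without timm installed

    return timm.create_model(**timm_create_kwargs(arch_name, pretrained=pretrained))
```

Keeping the name-to-arguments mapping separate from the `timm.create_model` call means the supported names can be validated and tested without downloading any weights.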


GeorgeBatch · 2024-09-02T15:57:26Z

Only added UNI and Prov-GigaPath for now. Will add more after initial comments.

I do not like that TimmBackbone pretty much repeats the CNNBackbone. The only difference is the absence of global average pooling in TimmBackbone.

GeorgeBatch · 2024-09-03T06:46:34Z

I found this file for testing: https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/tests/models/test_arch_vanilla.py

There might be a problem regarding memory and compute resources when running some of the larger feature extractors through GitHub actions, e.g. Prov-GigaPath needs a considerable amount of memory just to be loaded.
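To put a rough number on the memory concern, the weight footprint can be estimated from the parameter count. The counts below are approximate assumptions (roughly 0.3 billion parameters for a ViT-L encoder such as UNI and roughly 1.1 billion for the Prov-GigaPath tile encoder), not figures from this thread.

```python
def weights_memory_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Memory needed just to hold the weights, in GiB (float32 by default)."""
    return num_params * bytes_per_param / 2**30


# Approximate, assumed parameter counts; check the model cards for exact values.
APPROX_PARAMS = {"UNI": 300_000_000, "Prov-GigaPath": 1_100_000_000}

for name, num_params in APPROX_PARAMS.items():
    print(f"{name}: ~{weights_memory_gib(num_params):.1f} GiB in float32")
```

Building the architecture with randomly initialised weights avoids the download but still allocates these tensors, so the estimate applies either way.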


shaneahmed · 2024-09-03T11:52:57Z

> I found this file for testing: https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/tests/models/test_arch_vanilla.py
>
> There might be a problem regarding memory and compute resources when running some of the larger feature extractors through GitHub actions, e.g. Prov-GigaPath needs a considerable amount of memory just to be loaded.

As this is just testing the functionality and not loading weights, I hope this would work.

shaneahmed · 2024-09-03T11:55:45Z

> Only added UNI and Prov-GigaPath for now. Will add more after initial comments.
>
> I do not like that TimmBackbone pretty much repeats the CNNBackbone. The only difference is the absence of global average pooling in TimmBackbone.

In that case, you can inherit from CNNBackbone and define only the functions that change, as we do for PatchPredictor here:

tiatoolbox/tiatoolbox/models/engine/patch_predictor.py

Line 25 in e4e6f22

class PatchPredictor(EngineABC):

PatchPredictor uses all the functionality of EngineABC other than what it defines explicitly.
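A minimal sketch of this inheritance pattern. The classes are simplified, hypothetical stand-ins, not the actual tiatoolbox `CNNBackbone` or `TimmBackbone`; only the idea of inheriting `infer_batch` while overriding the part that differs is taken from the discussion.

```python
class BaseBackbone:
    """Hypothetical stand-in for a CNN feature extractor."""

    def __init__(self, backbone: str) -> None:
        self.backbone = backbone

    def forward(self, patch: list[float]) -> float:
        # Stand-in for "feature map followed by global average pooling".
        return sum(patch) / len(patch)

    @staticmethod
    def infer_batch(model: "BaseBackbone", batch: list[list[float]]) -> list:
        # Shared inference loop: written once, inherited by every subclass.
        return [model.forward(patch) for patch in batch]


class TimmLikeBackbone(BaseBackbone):
    """Hypothetical subclass that overrides only the step that differs."""

    def forward(self, patch: list[float]) -> list[float]:
        # No pooling here; the features are returned unchanged.
        return list(patch)


batch = [[1.0, 3.0], [2.0, 6.0]]
pooled = BaseBackbone.infer_batch(BaseBackbone("resnet50"), batch)
unpooled = TimmLikeBackbone.infer_batch(TimmLikeBackbone("UNI"), batch)
```

The subclass does not redefine `infer_batch`, so any later fix to the shared loop applies to both classes.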


shaneahmed · 2024-09-20T10:57:01Z

I have updated this branch to make sure that tests pass on Ubuntu-24 before we merge it with develop.


- remove explicit assert statement for `timm` version
- add `timm` version into the if statement for prov-gigapath
- add comment about `timm` version for timm-based models

Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
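A version condition of this kind could be sketched as follows, using only the standard library. The helper names are hypothetical, and the minimum version `1.0.3` is an assumption that should be confirmed against the Prov-GigaPath model card; it is not stated in this thread.

```python
import re


def version_tuple(version: str, width: int = 3) -> tuple[int, ...]:
    """Parse the leading numeric components of a version string."""
    parts: list[int] = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at suffixes such as "dev0"
        parts.append(int(match.group()))
    # Pad so that "1.0" and "1.0.0" compare as equal.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare versions numerically rather than as strings."""
    return version_tuple(installed) >= version_tuple(minimum)


MIN_TIMM_FOR_GIGAPATH = "1.0.3"  # assumed value; confirm against the model card
```

In practice, `importlib.metadata.version("timm")` returns the installed version string, and `packaging.version.Version` handles pre-release ordering more rigorously than this simplified parser.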


GeorgeBatch · 2024-10-29T09:28:25Z

I think we should add a notebook that shows how to:

- Extract and save metadata from your slides
- Make and save thumbnails and masks for the WSIs
- Extract features using the previously saved masks with either CNNBackbone or TimmBackbone

This is precisely what I wanted to know how to do, and there was no end-to-end example notebook.

Alternatively, we can add code for saving a mask into the Masking Notebook, which currently does not show how to save the masks:
https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/examples/03-tissue-masking.ipynb

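import numpy as np
from tiatoolbox.utils.misc import imwrite
from tiatoolbox.wsicore.wsireader import WSIReader

wsi = WSIReader.open(slide_path)
# Tissue mask computed at 1.25x objective power
mask = wsi.tissue_mask(resolution=1.25, units="power")
mask_thumbnail = mask.slide_thumbnail(resolution=1.25, units="power")
mask_thumbnail_path = f"{slide_name}_mask.png"
imwrite(mask_thumbnail_path, np.uint8(mask_thumbnail * 255))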

We could also add model = TimmBackbone("UNI") as an alternative to model = CNNBackbone("resnet50") in the Slide Graph pipeline, since it already contains the feature extraction code. Note that the Slide Graph notebooks assume the masks have already been saved.
https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/examples/inference-pipelines/slide-graph.ipynb

GeorgeBatch · 2024-10-30T07:36:14Z

There are three more families of models that could be integrated, but each requires its own file to be created: Hibou (B and L), Phikon (v1 and v2), and Virchow (v1 and v2).

Should I add those files and call them from _get_timm_architecture()?

Hibou requires trusting remote code when creating the model, which I do not really like.


review-notebook-app · 2024-11-04T17:27:25Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.


GeorgeBatch · 2024-11-05T09:39:59Z

@shaneahmed, I think this branch is ready to be merged into develop.

I added code for saving a mask to the Masking Notebook, which previously did not show how to save the masks:
https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/examples/03-tissue-masking.ipynb

I also added a comment showing that TimmBackbone could be used as an alternative:

model = CNNBackbone("resnet50")  # TimmBackbone(backbone="UNI", pretrained=True)

in both slide graph notebooks:
https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/examples/inference-pipelines/slide-graph.ipynb
https://github.com/TissueImageAnalytics/tiatoolbox/blob/develop/examples/full-pipelines/slide-graph.ipynb

shaneahmed

Thanks, @GeorgeBatch. This would be very helpful.

@GeorgeBatch

## TIAToolbox v1.6.0 (2024-12-12)

### Major Updates and Feature Improvements

- **Foundation Models Support via `timm` API** (#856, contributed by @GeorgeBatch)
  - Introduced `TimmBackbone` for running additional PyTorch Image Models.
  - Tested models include `UNI`, `Prov-GigaPath`, and `H-optimus-0`.
  - Added an example notebook demonstrating feature extraction with foundation models.
  - `timm` added as a dependency.
- **Performance Enhancements with `torch.compile`** (#716)
  - Improved performance on newer GPUs using `torch.compile`.
- **Multichannel Input Support in `WSIReader`** (#742)
- **AnnotationStore Filtering for Patch Extraction** (#822)
- **Python 3.12 Support**
- **Deprecation of Python 3.8 Support**
- **CLI Response Time Improvements** (#795)

### API Changes

- **Device Specification Update** (#882)
  - Replaced `has_gpu` with `device` for specifying GPU or CPU usage, aligning with PyTorch's `Model.to()` functionality.
- **Windows Compatibility Enhancement** (#769)
  - Replaced `POWER` with explicit multiplication.

### Bug Fixes and Other Changes

- **TIFFWSIReader Bound Reading Adjustment** (#777)
  - Fixed `read_bound` to use adjusted bounds.
  - Reduced code complexity in `WSIReader` (#814).
- **Annotation Rendering Fixes** (#813)
  - Corrected rendering of annotations with holes.
- **Non-Tiled TIFF Support in `WSIReader`** (#807, contributed by @GeorgeBatch)
- **HoVer-Net Documentation Update** (#751)
  - Corrected class output information.
- **Citation File Fix for `cffconvert`** (#869, contributed by @Alon-Alexander)
- **Bokeh Compatibility Updates**
  - Updated `bokeh_app` for compatibility with `bokeh>=3.5.0`.
  - Switched from `size` to `radius` for `bokeh>3.4.0` compatibility (#796).
- **JSON Extraction Fixes** (#772)
  - Restructured SQL expression construction for JSON properties with dots in keys.
- **VahadaneExtractor Warning** (#871)
  - Added warning due to changes in `scikit-learn>0.23.0` dictionary learning (#382).
- **PatchExtractor Error Message Refinement** (#883)
- **Immutable Output Fix in `WSIReader`** (#850)

### Development-Related Changes

- **Mypy Checks Added**
  - Applied to `utils`, `tools`, `data`, `annotation`, and `cli/common`.
- **ReadTheDocs PDF Build Deprecation**
- **Formatter Update**
  - Replaced `black` with `ruff-format`.
- **Dependency Removal**
  - Removed `jinja2`.
- **Test Environment Update**
  - Updated to `Ubuntu 24.04`.
- **Conda Environment Workflow Update**
  - Implemented `micromamba` setup.
- **Codecov Reporting Fix** (#811)

**Full Changelog:** v1.5.1...v1.6.0

---------

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Adam Shephard <39619155+adamshephard@users.noreply.github.com>
Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com>
Co-authored-by: Mostafa Jahanifar <74412979+mostafajahanifar@users.noreply.github.com>
Co-authored-by: Simon Graham <20071401+simongraham@users.noreply.github.com>
Co-authored-by: Abdol A <u2271662@live.warwick.ac.uk>
Co-authored-by: Jiaqi-Lv <60471431+Jiaqi-Lv@users.noreply.github.com>
Co-authored-by: Dmitrii Blaginin <blaginin@mbp.lan>
Co-authored-by: behnazelhaminia <30952176+behnazelhaminia@users.noreply.github.com>
Co-authored-by: George Batchkala <46561186+GeorgeBatch@users.noreply.github.com>
Co-authored-by: vqdang <24943262+vqdang@users.noreply.github.com>
Co-authored-by: Jiaqi Lv <lvjiaqi9@gmail.com>
Co-authored-by: Alon Alexander <alon008@gmail.com>


GeorgeBatch and others added 2 commits September 2, 2024 16:50

add _get_timm_architecture() and TimmBackbone

5bf228f

[pre-commit.ci] auto fixes from pre-commit.com hooks

1bc3424

for more information, see https://pre-commit.ci

shaneahmed reviewed Sep 3, 2024

tiatoolbox/models/architecture/vanilla.py (outdated)

GeorgeBatch added 2 commits September 3, 2024 14:53

inherit TimmBackbone from CNNBackbone

f1ec821

allows to reuse the `infer_batch` method of `CNNBackbone`

update link: change GitHub to HuggingDace for UNI model

d823c1e

shaneahmed reviewed Sep 3, 2024

tiatoolbox/models/architecture/vanilla.py (outdated)

shaneahmed changed the title ~~Integrate foundation models available through timm: UNI, Virchow, Hibou, H-optimus-0, etc.~~ 🆕 Integrate Foundation Models Available VIA timm: UNI, Virchow, Hibou, H-optimus-0 Sep 3, 2024

shaneahmed added this to the Release v1.6.0 milestone Sep 18, 2024

shaneahmed added the enhancement New feature or request label Sep 18, 2024

shaneahmed linked an issue Sep 20, 2024 that may be closed by this pull request

Integrate foundation models available through timm: UNI, Virchow, Hibou, H-optimus-0, etc. #855

Closed

Merge branch 'develop' into enhance-add-timm-feature-extractors

7aa3ca3

shaneahmed reviewed Sep 20, 2024

tiatoolbox/models/architecture/vanilla.py (outdated)

shaneahmed reviewed Sep 20, 2024

tiatoolbox/models/architecture/vanilla.py (outdated)

GeorgeBatch and others added 4 commits September 24, 2024 16:41

Merge branch 'TissueImageAnalytics:develop' into enhance-add-timm-feature-extractors

934bd9a

[pre-commit.ci] auto fixes from pre-commit.com hooks

2c9a47f

for more information, see https://pre-commit.ci

remove explicit assert statement re prov-gigapath version of timm

6ec9ca1

was not removed while accepting suggested changes

shaneahmed reviewed Sep 27, 2024

tiatoolbox/models/architecture/vanilla.py (outdated)

GeorgeBatch and others added 7 commits October 16, 2024 14:15

Merge branch 'TissueImageAnalytics:develop' into enhance-add-timm-feature-extractors

5445ff5

remove unused arguments in ; fix formatting

142eaf6

remove unused arguments from docstring of _get_timm_architecture

36126c5

improve error message in _get_timm_architecture()

1b666b8

simplify inheretance code of TimmBackbone

9c494a2

add TimmModel class inheriting from CNNModel

5b5f674

add tests for TimmModel - fail because backbone argument in not specified during the init of super(), which is CNNModel

7f51fec

pre-commit-ci bot and others added 7 commits November 4, 2024 17:27

[pre-commit.ci] auto fixes from pre-commit.com hooks

961a9ba

for more information, see https://pre-commit.ci

fix typo

18070c5

[pre-commit.ci] auto fixes from pre-commit.com hooks

c270d7d

for more information, see https://pre-commit.ci

add Google Colab link

ad9053d

[pre-commit.ci] auto fixes from pre-commit.com hooks

9eebbb4

for more information, see https://pre-commit.ci

add TimmBackbone example as a comment under `model = CNNBackbone("resnet50")`

dabf9a9

add TimmBackbone example as a comment under model = CNNBackbone("resnet50") into the full pipelines version of slide-graph

5c93e9a

GeorgeBatch force-pushed the enhance-add-timm-feature-extractors branch from 4a7ad04 to 5c93e9a Compare November 5, 2024 07:42

pre-commit-ci bot and others added 5 commits November 5, 2024 07:43

[pre-commit.ci] auto fixes from pre-commit.com hooks

1ab884e

for more information, see https://pre-commit.ci

Attempt to fix ruff error: commented code (TimmBackbone)

276da9c

[pre-commit.ci] auto fixes from pre-commit.com hooks

eba8d16

for more information, see https://pre-commit.ci

Attempt to fix ruff error: commented code (TimmBackbone) - inference pipelines

6450d01

[pre-commit.ci] auto fixes from pre-commit.com hooks

ce2ef64

for more information, see https://pre-commit.ci

shaneahmed requested review from mostafajahanifar and measty November 8, 2024 14:51

shaneahmed added 3 commits November 15, 2024 14:02

[skip ci] 📝 Update 03-tissue-masking.ipynb

3631b34

Merge branch 'develop' into enhance-add-timm-feature-extractors

6f476c1

[skip ci] 📝 Update the jupyter notebooks.

c159182

shaneahmed approved these changes Nov 15, 2024


shaneahmed merged commit c980eec into TissueImageAnalytics:develop Nov 15, 2024
3 checks passed

GeorgeBatch deleted the enhance-add-timm-feature-extractors branch November 15, 2024 21:16

shaneahmed mentioned this pull request Dec 10, 2024

🔖 Release 1.6.0 #895

Merged

shaneahmed mentioned this pull request Dec 12, 2024

🔖 Release 1.6.0 #898

Merged
