✅ Reuse models and datasets in tests #641
# Conflicts: # tests/models/test_arch_mapde.py # tests/models/test_arch_micronet.py # tests/models/test_patch_predictor.py # tiatoolbox/models/architecture/__init__.py
Codecov Report
| Coverage Diff | develop | #641 | +/- |
|---|---|---|---|
| Coverage | 99.78% | 99.78% | |
| Files | 65 | 65 | |
| Lines | 7072 | 7076 | +4 |
| Branches | 1395 | 1397 | +2 |
| Hits | 7057 | 7061 | +4 |
| Misses | 7 | 7 | |
| Partials | 8 | 8 | |
@shaneahmed, I reviewed the PR with Ruff and fixed the error. Regarding the execution time, the first time you run the code, it doesn't have any effect as it will be downloading files into the cache. However, the second time, you will see a speedup. On my local machine with high-speed internet, the difference is 13.75 minutes vs 10.89 minutes. However, if I use 4G, the difference becomes more noticeable: 1 hour 10 minutes vs 36 minutes.
Thanks @blaginin
Adds the ability to run tests across several workers with [pytest-xdist](https://github.com/pytest-dev/pytest-xdist), significantly improving processing time. For example, on M1 Max (no CUDA), processing time dropped **from 14 minutes to 4 minutes 💨💨💨.**

<img width="1614" alt="image" src="https://github.com/TissueImageAnalytics/tiatoolbox/assets/19199204/fbb607b0-3bf1-48c3-b14a-be4acf2b1ec3">

However, this optimization comes at a cost. Previously, tests depended on serial execution. For example, segmentation and prediction methods used to rely on "output" as a folder to store intermediate results. If many functions modified this folder at the same time, the result would be unpredictable. To address this, I made some tweaks alongside #641 and #673 so that functions will not depend on each other.

If we merge this pull request, we will need to start checking that new tests are ready for parallel execution.

**Depends on #641 and #673**
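The shared "output" folder problem described above can be illustrated with a minimal sketch. It assumes pytest's built-in `tmp_path` fixture, which gives each test its own directory; `run_segmentation_stub` is a hypothetical stand-in for a method that writes intermediate results and is not a TIAToolbox function.

```python
from pathlib import Path


def run_segmentation_stub(save_dir: Path) -> Path:
    """Hypothetical stand-in for a method that writes intermediate results."""
    save_dir.mkdir(parents=True, exist_ok=True)
    result = save_dir / "result.txt"
    result.write_text("done")
    return result


def test_outputs_are_isolated(tmp_path: Path) -> None:
    """Write into a per-test directory instead of a shared "output" folder."""
    # tmp_path is unique per test, so parallel workers never touch the same files.
    result = run_segmentation_stub(tmp_path / "output")
    assert result.read_text() == "done"
```

With pytest-xdist installed, a suite written this way can be run with `pytest -n auto`, since no test relies on files left behind by another.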
## 1.5.0 (2023-12-15)

### Major Updates and Feature Improvements

- Adds the bokeh visualization tool. #684
  - The tool allows a user to launch a server on their machine to visualise whole slide images, overlay the results of deep learning algorithms or to select a patch from whole slide image and run TIAToolbox deep learning engines.
  - This tool powers the TIA demos server. For details please see https://tiademos.dcs.warwick.ac.uk/.
- Extends Annotation to Support Init from WKB #639
- Adds `IOConfig` for NuClick in `pretrained_model.yaml` #709
- Adds functions to save the TIAToolbox Engine outputs to Zarr and AnnotationStore files. #724
- Adds Support for QuPath Annotation Imports #721

### Changes to API

- Adds `model.to(device)` and `model.load_model_from_file()` functionality to make it compatible with PyTorch API. #733
- Replaces `pretrained` with `weights` to make the engines compatible with the new PyTorch API. #621
- Adds support for high-level imports for various utility functions and classes such as `WSIReader`, `PatchPredictor` and `imread` #606, #607
- Adds `tiatoolbox.typing` for type hints. #619
- Fixes incorrect file size saved by `save_tiles`, issue with certain WSIs raised by @TomastpPereira
- TissueMasker transform now returns mask instead of a list. #748
- Fixes #732

### Bug Fixes and Other Changes

- Fixes `pixman` incompatibility error on Colab #601
- Removes `shapely.speedups`. The module no longer has any effect in Shapely >=2.0. #622
- Fixes errors in the slidegraph example notebook #608
- Fixes bugs in WSI Registration #645, #670, #693
- Fixes the situation where PatchExtractor.get_coords() can return patch coords which lie fully outside the bounds of a slide. #712
- Fixes #710
- Fixes #738 raised by @xiachenrui

### Development related changes

- Replaces `flake8` and `isort` with `ruff` #625, #666
- Adds `mypy` checks to `root` and `utils` package. This will be rolled out in phases to other modules. #723
- Adds a module to detect file types using magic number/signatures #616
- Uses `poetry` for version updates instead of `bump2version`. #638
- Removes `setup.cfg` and uses `pyproject.toml` for project configurations.
- Reduces runtime for some unit tests e.g., #627, #630, #631, #629
- Reuses models and datasets in tests on GitHub actions by utilising cache #641, #644
- Set up parallel tests locally #671

**Full Changelog:** v1.4.0...v1.5.0

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: mostafajahanifar <74412979+mostafajahanifar@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: DavidBAEpstein <David.Epstein@warwick.ac.uk>
Co-authored-by: David Epstein <22086916+DavidBAEpstein@users.noreply.github.com>
Co-authored-by: Ruqayya Awan <18444369+ruqayya@users.noreply.github.com>
Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com>
Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>
Co-authored-by: adamshephard <adam.shephard@warwick.ac.uk>
Co-authored-by: Abdol <a@fkrtech.com>
Co-authored-by: Jiaqi-Lv <60471431+Jiaqi-Lv@users.noreply.github.com>
Co-authored-by: Abishek <abishekraj6797@gmail.com>
Co-authored-by: Dmitrii Blaginin <blaginin@mbp.lan>
This PR addresses part of issue #603 by improving how model weights and datasets are handled in tests. Previously, weights for each model were downloaded to `f"{tmp_path}/weights.pth"` and replaced each other. The only available dataset, Kather, was downloaded into `TIATOOLBOX_HOME`, but the entire home directory was deleted during test execution. This approach has several drawbacks: […] `f"{tmp_path}/weights.pth"` file.

Now, all weights are downloaded to `TIATOOLBOX_HOME` and cached between runs. Storing more data in memory (+8 GB) makes tests much faster, especially if you're not in Coventry or have slow internet. In rare cases where you specifically need to test behaviour without cache, you can set `TIATOOLBOX_HOME` to a temporary directory.
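The download-once-and-reuse behaviour described above can be sketched roughly as follows. This is an illustration only, not the code from this PR: `fetch_cached` and its `download` callback are hypothetical names, and the cache directory is passed in explicitly rather than read from TIAToolbox's settings.

```python
from pathlib import Path
from typing import Callable


def fetch_cached(name: str, cache_dir: Path, download: Callable[[Path], None]) -> Path:
    """Return the cached file, calling `download` only when it is missing.

    `name` should be unique per model or dataset so that files do not
    overwrite each other, unlike a single shared "weights.pth".
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # First run: fetch the file into the cache.
        download(target)
    # Later runs: reuse what is already on disk.
    return target
```

Passing a temporary directory as `cache_dir` reproduces the no-cache case, because every run then starts from an empty folder.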