This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Enabling DeepSMILE on large encoded datasets #637

Merged
merged 41 commits into from
Jan 25, 2022

Conversation

vale-salvatelli
Contributor

@vale-salvatelli vale-salvatelli commented Jan 19, 2022

This PR contains two changes needed to run DeepSMILE on a large dataset when using the InnerEye SSL checkpoint (or any other encoder that produces high-dimensional features):

  • an option to encode in chunks (this prevents an OOM error while performing the encoding)
  • an option to load the cached encoded dataset on the CPU (this prevents an OOM error when loading from the cache)

It also changes how PNG images are loaded throughout the histopathology pipeline to make loading faster (see https://hi-ml.readthedocs.io/en/latest/loading_images.html).

Please follow the guidelines for PRs contained here. Checklist:

  • [x] Ensure that your PR is small, and implements one change.
  • [x] Add unit tests for all functions that you introduced or modified.
  • [x] Run PyCharm's code cleanup tools on your Python files.
  • Link the correct GitHub issue for tracking.
  • [x] Update the Changelog file: Describe your change in terms of
    Added/Changed/Removed/... in the "Upcoming" section.
  • When merging your PR, replace the default merge message with a description of your PR,
    and if needed a motivation why that change was required.

@vale-salvatelli vale-salvatelli changed the title Enabling DeepSMILE on large datasets WIP: Enabling DeepSMILE on large datasets Jan 19, 2022
@vale-salvatelli vale-salvatelli changed the title WIP: Enabling DeepSMILE on large datasets Enabling DeepSMILE on large datasets Jan 19, 2022
@vale-salvatelli vale-salvatelli changed the title Enabling DeepSMILE on large datasets Enabling DeepSMILE on large encoded datasets Jan 19, 2022
Contributor

@ant0nsc ant0nsc left a comment

Looking good, just a few minor comments

InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/deep_learning_config.py Outdated
vale-salvatelli and others added 4 commits January 20, 2022 09:45
…a.py

Co-authored-by: Anton Schwaighofer <antonsc@microsoft.com>
…a.py

Co-authored-by: Anton Schwaighofer <antonsc@microsoft.com>
Co-authored-by: Anton Schwaighofer <antonsc@microsoft.com>
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
InnerEye/ML/Histopathology/datamodules/base_module.py Outdated
CHANGELOG.md Outdated
@vale-salvatelli vale-salvatelli merged commit fb258d5 into main Jan 25, 2022
@vale-salvatelli vale-salvatelli deleted the vsalva/chunk_encoding branch January 25, 2022 10:31
3 participants