Hyperparameter Optimization Pipelines and Supporting Tools #74

Draft · wants to merge 97 commits into main

Commits (97)
1997939
hyper parameter pipeline
GabrielBG0 May 17, 2024
6500cbe
Merge branch 'main' of https://github.com/discovery-unicamp/Minerva i…
GabrielBG0 May 17, 2024
bb67860
Merge branch 'main' of https://github.com/discovery-unicamp/Minerva i…
GabrielBG0 May 18, 2024
2dce907
Merge branch 'main' of https://github.com/discovery-unicamp/Minerva i…
GabrielBG0 May 19, 2024
21bb29e
Merge branch '48-hyper-parameter-optimization' of https://github.com/…
GabrielBG0 May 25, 2024
885aa66
hyper parameter in contruction
GabrielBG0 May 25, 2024
b0830d3
hps
GabrielBG0 May 28, 2024
0d6b4a1
changes
GabrielBG0 Jun 28, 2024
f388b49
Merge branch 'main' of https://github.com/discovery-unicamp/Minerva i…
GabrielBG0 Jul 4, 2024
bd69fc7
idk bro
GabrielBG0 Jul 4, 2024
e7e06b8
corrections and adaptations for ray
GabrielBG0 Jul 12, 2024
2d04f68
todos
GabrielBG0 Jul 12, 2024
ba0bb34
fix?
GabrielBG0 Jul 15, 2024
20a1d97
fix??
GabrielBG0 Jul 17, 2024
755e532
now its fixed (?)
GabrielBG0 Jul 17, 2024
728af1a
its fixed now 100% no fail, lets gooooooo
GabrielBG0 Jul 17, 2024
964c510
remove debug prints
GabrielBG0 Jul 17, 2024
18a09a6
exposing search configs
GabrielBG0 Jul 25, 2024
6519871
configs
GabrielBG0 Jul 25, 2024
67a607d
changes to setr and hyperserch pipeline
GabrielBG0 Aug 4, 2024
72415e4
add state loader to setr
GabrielBG0 Aug 18, 2024
09a253e
Merge branch 'main' of github.com:GabrielBG0/Minerva into 48-hyper-pa…
GabrielBG0 Aug 18, 2024
0b49f6b
add freeze option to backbone
GabrielBG0 Aug 27, 2024
c45e9c1
random flip
GabrielBG0 Aug 28, 2024
9e2fdef
takeout prints
GabrielBG0 Aug 28, 2024
83f3959
fix flip problem
GabrielBG0 Aug 28, 2024
3e2bf41
chore: Refactor hyperparameter search pipeline configuration
GabrielBG0 Aug 28, 2024
1fb61af
chore: Update hyperparameter search pipeline configuration with brack…
GabrielBG0 Aug 29, 2024
e482267
correcting space bug
GabrielBG0 Aug 30, 2024
6f64ca4
attempt to correct storage bug
GabrielBG0 Aug 31, 2024
020df73
feature: Add Custom Callbacks module and hypersearch callback
GabrielBG0 Sep 1, 2024
a39f3c5
feat: Add TrainerReportKeepOnlyLastCallback for saving and reporting …
GabrielBG0 Sep 1, 2024
1f41372
fix hyperserch callbacks
GabrielBG0 Sep 2, 2024
d0d453f
fix torch version
GabrielBG0 Sep 16, 2024
d85e2eb
Update torch version to 2.1.0a0+4136153
GabrielBG0 Sep 16, 2024
07219f8
Add missing imports for minerva package
GabrielBG0 Sep 16, 2024
373ada0
trying to fix for singularity container
GabrielBG0 Sep 17, 2024
030ee5d
Refactor hyperparameter search pipeline
GabrielBG0 Sep 23, 2024
7cde1bc
Refactor model creation in hyperparameter search pipeline
GabrielBG0 Sep 23, 2024
e21ba8c
change name hypersearch_pipeline to ray_hypersearch_pipeline
GabrielBG0 Sep 23, 2024
d2bd7a7
add comments
GabrielBG0 Sep 23, 2024
2c5b835
Merge branch 'main' of https://github.com/discovery-unicamp/Minerva i…
GabrielBG0 Sep 24, 2024
b1663de
add hyperopt support, add hyper opt to dependencies, cleanup ray search
GabrielBG0 Sep 24, 2024
5e34f15
fix bugs with hyperopt and setr
GabrielBG0 Sep 24, 2024
4e7d9b4
normalize transform
GabrielBG0 Sep 28, 2024
1234346
correct hyperopt pipeline
GabrielBG0 Sep 28, 2024
0ef95b9
add max_epochs parameter
GabrielBG0 Sep 30, 2024
13713ce
save on interval
GabrielBG0 Oct 1, 2024
18a60b0
add trial stopper
GabrielBG0 Oct 6, 2024
1348dc8
corrections to trial stop
GabrielBG0 Oct 7, 2024
dd25d45
debugging configs
GabrielBG0 Oct 7, 2024
23b1dfc
fixed errors
GabrielBG0 Oct 7, 2024
d493efb
remove prints
GabrielBG0 Oct 7, 2024
ec64c5a
add parameters to search
GabrielBG0 Oct 7, 2024
d5ff299
add interpolate post_embeddings functionality
GabrielBG0 Oct 30, 2024
d5c62a9
cropped metrics
GabrielBG0 Oct 30, 2024
b386d8d
test manual opt
GabrielBG0 Oct 30, 2024
efbce40
Crop transformer
GabrielBG0 Oct 30, 2024
0eb0118
Merge branch '48-hyper-parameter-optimization' of https://github.com/…
GabrielBG0 Oct 30, 2024
3bd7fb3
original res SetR
GabrielBG0 Oct 31, 2024
dc2cc03
Add head_lr_factor parameter to SETR_PUP for flexible learning rate a…
GabrielBG0 Oct 31, 2024
d2f468c
fiz setr
GabrielBG0 Oct 31, 2024
6f42039
fix optimizer
GabrielBG0 Nov 1, 2024
0e8def2
Enhance Padding transform with flexible padding modes and parameters
GabrielBG0 Nov 1, 2024
78b6fd2
Refactor SETR_PUP to support multiple optimizers based on head_lr_factor
GabrielBG0 Nov 1, 2024
b36f1a9
Refactor Padding transform to ensure consistent dimensionality and en…
GabrielBG0 Nov 1, 2024
3080d32
Add RandomResize transform and refactor RandomFlip to use numpy's def…
GabrielBG0 Nov 1, 2024
fa41ed8
Refactor RandomFlip and RandomResize transforms to improve functional…
GabrielBG0 Nov 1, 2024
dfd6b7d
Enhance Crop and Resize transforms with additional parameters for imp…
GabrielBG0 Nov 2, 2024
d4bc8fc
Implement _PatchInferencer class for patch-based inference; enhance S…
GabrielBG0 Nov 2, 2024
cf4cdce
Enhance _PatchInferencer class with additional input shape handling, …
GabrielBG0 Nov 3, 2024
d0b5d08
Refactor _PatchInferencer class to support multi-dimensional offsets …
GabrielBG0 Nov 3, 2024
91549ce
Refactor _PatchInferencer to accept offsets as a list of tuples; fix …
GabrielBG0 Nov 3, 2024
4ecd6b2
Refactor PatchInferencer to utilize _PatchInferencer for improved pat…
GabrielBG0 Nov 3, 2024
fb0d412
Rename _PatchInferencer to PatchInferencerEngine for clarity; update …
GabrielBG0 Nov 3, 2024
5f0d3ac
Refactor PatchInferencerEngine to improve padding handling and ref_sh…
GabrielBG0 Nov 3, 2024
4fd8ad8
Enhance PatchInferencerEngine to support return_tuple parameter for f…
GabrielBG0 Nov 3, 2024
e196a13
Fix typo in assertion error message for axis dimension validation in …
GabrielBG0 Nov 7, 2024
e347ec6
Refactor SETR_PUP to streamline metric computation; unify loss calcul…
GabrielBG0 Nov 11, 2024
8ee0ed7
Remove CroppedMetric from cropped_metric.py and implement it in trans…
GabrielBG0 Nov 11, 2024
0d96ce5
Rename ResizeMetric class to ResizedMetric for improved clarity and c…
GabrielBG0 Nov 11, 2024
2e68956
Add opencv-python dependency for enhanced image processing capabilities
GabrielBG0 Nov 11, 2024
53c1a76
Enable saving run status by default in hyperparameter search pipelines
GabrielBG0 Nov 11, 2024
cd03451
Fix dtype check in ClassRatioCrop and optimize Resize class for aspec…
GabrielBG0 Nov 11, 2024
2393932
Refactor type hint for weight_function and add base padding computati…
GabrielBG0 Nov 11, 2024
44b367f
Fix padding computation in PatchInferencerEngine to use correct paddi…
GabrielBG0 Nov 11, 2024
5db1803
Update padding documentation in PatchInferencerEngine to clarify expe…
GabrielBG0 Nov 11, 2024
8c9268e
Remove opencv-python dependency from pyproject.toml
GabrielBG0 Nov 11, 2024
f5ae031
Fix padding computation in PatchInferencerEngine and improve dtype ch…
GabrielBG0 Nov 12, 2024
78fdd78
Fix variable naming and padding logic in PatchInferencerEngine for im…
GabrielBG0 Nov 12, 2024
b4018be
Refactor slice computation in PatchInferencerEngine to remove unused …
GabrielBG0 Nov 12, 2024
a0078dd
Remove debug print statement and ensure tensor type conversion in Pat…
GabrielBG0 Nov 12, 2024
e2257a7
Fix tensor type conversion in ResizedMetric to ensure proper handling…
GabrielBG0 Nov 12, 2024
f1f0d66
Fix tensor type conversion in ResizedMetric to handle LongTensor form…
GabrielBG0 Nov 12, 2024
9100bd5
Update dependencies in pyproject.toml to specify exact versions for l…
GabrielBG0 Nov 13, 2024
7e63fd5
Fix tensor type conversion in ResizedMetric to correctly check for Lo…
GabrielBG0 Nov 15, 2024
e4131f6
Refactor tensor type conversion in ResizedMetric to handle LongTensor…
GabrielBG0 Nov 15, 2024
Files changed
10 changes: 10 additions & 0 deletions minerva/__init__.py
@@ -0,0 +1,10 @@

import minerva
import minerva.analysis
import minerva.callbacks
import minerva.data
import minerva.losses
import minerva.models
import minerva.pipelines
import minerva.transforms
import minerva.utils
191 changes: 191 additions & 0 deletions minerva/analysis/metrics/transformed_metrics.py
@@ -0,0 +1,191 @@
import warnings
from typing import Optional

import torch
from torchmetrics import Metric


class CroppedMetric(Metric):
"""Wraps a torchmetrics Metric so it is computed on a center crop of the
predictions and targets."""
def __init__(
self,
target_h_size: int,
target_w_size: int,
metric: Metric,
dist_sync_on_step: bool = False,
):
"""
Initializes a new instance of CroppedMetric.

Parameters
----------
target_h_size: int
The target height size.
target_w_size: int
The target width size.
dist_sync_on_step: bool, optional
Whether to synchronize metric state across processes at each step.
Defaults to False.
"""
super().__init__(dist_sync_on_step=dist_sync_on_step)
self.metric = metric
self.target_h_size = target_h_size
self.target_w_size = target_w_size

def update(self, preds: torch.Tensor, target: torch.Tensor):
"""
Updates the metric state with the predictions and targets.

Parameters
----------
preds: torch.Tensor
The predicted tensor.
target: torch.Tensor
The target tensor.
"""

preds = self.crop(preds)
target = self.crop(target)
self.metric.update(preds, target)

def compute(self) -> float:
"""
Computes the cropped metric.

Returns:
float: The cropped metric.
"""
return self.metric.compute()

def crop(self, x: torch.Tensor) -> torch.Tensor:
"""crops the input tensor to the target size.

Parameters
----------
x : torch.Tensor
The input tensor.

Returns
-------
torch.Tensor
The cropped tensor.
"""
h, w = x.shape[-2:]
start_h = (h - self.target_h_size) // 2
start_w = (w - self.target_w_size) // 2
end_h = start_h + self.target_h_size
end_w = start_w + self.target_w_size

return x[..., start_h:end_h, start_w:end_w]
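
A minimal usage sketch for CroppedMetric, assuming torchmetrics' JaccardIndex as the wrapped metric; the shapes and class count below are illustrative, not part of the PR:

```python
import torch
from torchmetrics import JaccardIndex

# Evaluate IoU only on the 512x512 center crop of padded outputs.
base = JaccardIndex(task="multiclass", num_classes=6)
metric = CroppedMetric(target_h_size=512, target_w_size=512, metric=base)

preds = torch.randint(0, 6, (4, 768, 768))   # e.g. padded predictions
target = torch.randint(0, 6, (4, 768, 768))  # e.g. padded labels
metric.update(preds, target)
print(metric.compute())
```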


class ResizedMetric(Metric):
"""Wraps a torchmetrics Metric so it is computed on resized versions of the
predictions and targets."""
def __init__(
self,
target_h_size: Optional[int],
target_w_size: Optional[int],
metric: Metric,
keep_aspect_ratio: bool = False,
dist_sync_on_step: bool = False,
):
"""
Initializes a new instance of ResizeMetric.

Parameters
----------
target_h_size: int
The target height size.
target_w_size: int
The target width size.
dist_sync_on_step: bool, optional
Whether to synchronize metric state across processes at each step.
Defaults to False.
"""
super().__init__(dist_sync_on_step=dist_sync_on_step)

if target_h_size is None and target_w_size is None:
raise ValueError(
"At least one of target_h_size or target_w_size must be provided."
)

if (
target_h_size is not None and target_w_size is None
) and keep_aspect_ratio is False:
warnings.warn(
"A target_w_size is not provided, but keep_aspect_ratio is set to False. keep_aspect_ratio will be set to True. If you want to resize the image to a specific width, please provide a target_w_size."
)
keep_aspect_ratio = True

if (
target_w_size is not None and target_h_size is None
) and keep_aspect_ratio is False:
warnings.warn(
"A target_h_size is not provided, but keep_aspect_ratio is set to False. keep_aspect_ratio will be set to True. If you want to resize the image to a specific height, please provide a target_h_size."
)
keep_aspect_ratio = True

self.metric = metric
self.target_h_size = target_h_size
self.target_w_size = target_w_size
self.keep_aspect_ratio = keep_aspect_ratio

def update(self, preds: torch.Tensor, target: torch.Tensor):
"""
Updates the metric state with the predictions and targets.

Parameters
----------
preds: torch.Tensor
The predicted tensor.
target: torch.Tensor
The target tensor.
"""

preds = self.resize(preds)
target = self.resize(target)
self.metric.update(preds, target)

def compute(self) -> float:
"""
Computes the resized metric.

Returns:
float: The resized metric.
"""
return self.metric.compute()

def resize(self, x: torch.Tensor) -> torch.Tensor:
"""Resizes the input tensor to the target size.

Parameters
----------
x : torch.Tensor
The input tensor.

Returns
-------
torch.Tensor
The resized tensor.
"""
h, w = x.shape[-2:]

target_h_size = self.target_h_size
target_w_size = self.target_w_size
if self.keep_aspect_ratio:
if self.target_h_size is None:
scale = target_w_size / w
target_h_size = int(h * scale)
elif self.target_w_size is None:
scale = target_h_size / h
target_w_size = int(w * scale)
# Integer label tensors cannot be interpolated directly: convert to uint8
# first and cast back to long afterwards.
type_convert = False
if "LongTensor" in x.type():
x = x.to(torch.uint8)
type_convert = True

resized = torch.nn.functional.interpolate(x, size=(target_h_size, target_w_size))
return resized.to(torch.long) if type_convert else resized
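
ResizedMetric follows the same wrapping pattern; a hedged sketch, again assuming a multiclass JaccardIndex and illustrative shapes, showing the keep_aspect_ratio path where only the height is fixed:

```python
import torch
from torchmetrics import JaccardIndex

base = JaccardIndex(task="multiclass", num_classes=6)
# Height is fixed at 500; the width is derived from the input aspect ratio.
metric = ResizedMetric(target_h_size=500, target_w_size=None,
                       metric=base, keep_aspect_ratio=True)

preds = torch.randint(0, 6, (4, 1, 1000, 1400))   # resized to 500x700
target = torch.randint(0, 6, (4, 1, 1000, 1400))
metric.update(preds, target)
print(metric.compute())
```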
108 changes: 108 additions & 0 deletions minerva/callbacks/HyperSearchCallbacks.py
@@ -0,0 +1,108 @@
import os
import shutil
import tempfile
from pathlib import Path

import lightning.pytorch as L
from ray import train
from ray._private.usage.usage_lib import TagKey, record_extra_usage_tag
from ray.train import Checkpoint


class TrainerReportOnIntervalCallback(L.Callback):
"""Lightning callback that reports metrics to the Ray Train session at every
epoch and attaches a checkpoint every `interval` epochs."""

CHECKPOINT_NAME = "checkpoint.ckpt"

def __init__(self, interval: int = 1) -> None:
super().__init__()
self.trial_name = train.get_context().get_trial_name()
self.local_rank = train.get_context().get_local_rank()
self.tmpdir_prefix = Path(tempfile.gettempdir(), self.trial_name).as_posix()
self.interval = interval
self.step = 0
if os.path.isdir(self.tmpdir_prefix) and self.local_rank == 0:
shutil.rmtree(self.tmpdir_prefix)

record_extra_usage_tag(TagKey.TRAIN_LIGHTNING_RAYTRAINREPORTCALLBACK, "1")

def on_train_epoch_end(
self, trainer: L.Trainer, pl_module: L.LightningModule
) -> None:

# Fetch metrics
metrics = trainer.callback_metrics
metrics = {k: v.item() for k, v in metrics.items()}

# (Optional) Add customized metrics
metrics["epoch"] = trainer.current_epoch
metrics["step"] = trainer.global_step

tmpdir = Path(self.tmpdir_prefix, str(trainer.current_epoch)).as_posix()
os.makedirs(tmpdir, exist_ok=True)

if self.step % self.interval == 0:

# Save checkpoint to local
ckpt_path = Path(tmpdir, self.CHECKPOINT_NAME).as_posix()
trainer.save_checkpoint(ckpt_path, weights_only=False)

# Report to train session
checkpoint = Checkpoint.from_directory(tmpdir)
train.report(metrics=metrics, checkpoint=checkpoint)
else:
train.report(metrics=metrics)

# Add a barrier to ensure all workers finished reporting here
trainer.strategy.barrier()

if self.local_rank == 0:
shutil.rmtree(tmpdir)

self.step += 1
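
A hedged sketch of how this callback might be attached inside a Ray-launched Lightning training function; `build_model`, `train_loader`, and the config keys are hypothetical stand-ins, and the Ray strategy/environment plumbing a real run needs is elided:

```python
import lightning.pytorch as L

def train_func(config: dict) -> None:
    # Runs inside a Ray Train/Tune session, so train.get_context() is valid.
    model = build_model(config)          # hypothetical model factory
    trainer = L.Trainer(
        max_epochs=config["max_epochs"],
        enable_checkpointing=False,      # checkpoints are reported via Ray
        callbacks=[TrainerReportOnIntervalCallback(interval=5)],
    )
    trainer.fit(model, train_loader)     # hypothetical dataloader
```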


class TrainerReportKeepOnlyLastCallback(L.Callback):
"""Lightning callback that reports metrics at every epoch and keeps only the
most recently saved checkpoint on disk."""

CHECKPOINT_NAME = "checkpoint.ckpt"

def __init__(self) -> None:
super().__init__()
self.trial_name = train.get_context().get_trial_name()
self.local_rank = train.get_context().get_local_rank()
self.tmpdir_prefix = Path(tempfile.gettempdir(), self.trial_name).as_posix()
if os.path.isdir(self.tmpdir_prefix) and self.local_rank == 0:
shutil.rmtree(self.tmpdir_prefix)

record_extra_usage_tag(TagKey.TRAIN_LIGHTNING_RAYTRAINREPORTCALLBACK, "1")

def on_train_epoch_end(
self, trainer: L.Trainer, pl_module: L.LightningModule
) -> None:
# Fetch metrics
metrics = trainer.callback_metrics
metrics = {k: v.item() for k, v in metrics.items()}

# (Optional) Add customized metrics
metrics["epoch"] = trainer.current_epoch
metrics["step"] = trainer.global_step

tmpdir = Path(self.tmpdir_prefix, "last").as_posix()

# Delete the previous checkpoint, then recreate an empty directory for
# the new one.
if os.path.isdir(tmpdir):
shutil.rmtree(tmpdir)
os.makedirs(tmpdir, exist_ok=True)

# Save checkpoint to local
ckpt_path = Path(tmpdir, self.CHECKPOINT_NAME).as_posix()
trainer.save_checkpoint(ckpt_path, weights_only=False)

# Report to train session
checkpoint = Checkpoint.from_directory(tmpdir)
train.report(metrics=metrics, checkpoint=checkpoint)

# Add a barrier to ensure all workers finished reporting here
trainer.strategy.barrier()

if self.local_rank == 0:
shutil.rmtree(tmpdir)
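
After tuning completes, the single retained checkpoint can be recovered from the result grid; a sketch assuming a ray.tune.Tuner configured elsewhere and a "val_loss" metric, both assumptions:

```python
import os

results = tuner.fit()  # `tuner` is a ray.tune.Tuner configured elsewhere
best = results.get_best_result(metric="val_loss", mode="min")
with best.checkpoint.as_directory() as ckpt_dir:
    ckpt_path = os.path.join(ckpt_dir, "checkpoint.ckpt")
    # e.g. MyModel.load_from_checkpoint(ckpt_path)  # hypothetical model class
```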
Empty file added minerva/callbacks/__init__.py
Empty file.
43 changes: 34 additions & 9 deletions minerva/data/datasets/supervised_dataset.py
@@ -1,4 +1,4 @@
-from typing import List, Optional, Tuple
+from typing import Any, List, Optional, Tuple

import numpy as np

@@ -15,7 +15,7 @@ class SupervisedReconstructionDataset(SimpleDataset):
Usually, both input and target data have the same shape.

This dataset is useful for supervised tasks such as image reconstruction,
-segmantic segmentation, and object detection, where the input data is the
+semantic segmentation, and object detection, where the input data is the
original data and the target is a mask or a segmentation map.

Examples
@@ -45,7 +45,12 @@ class SupervisedReconstructionDataset(SimpleDataset):
```
"""

-def __init__(self, readers: List[_Reader], transforms: Optional[_Transform] = None):
+def __init__(
+self,
+readers: List[_Reader],
+transforms: Optional[_Transform] = None,
+support_context_transforms: bool = False,
+):
"""A simple dataset class for supervised reconstruction tasks.

Parameters
@@ -62,12 +67,13 @@ def __init__(self, readers: List[_Reader], transforms: Optional[_Transform] = No
AssertionError: If the number of readers is not exactly 2.
"""
super().__init__(readers, transforms)
+self.support_context_transforms = support_context_transforms

assert (
len(self.readers) == 2
), "SupervisedReconstructionDataset requires exactly 2 readers"

-def __getitem__(self, index: int) -> Tuple[np.ndarray, np.ndarray]:
+def __getitem__(self, index: int) -> Tuple[Any, Any]:
"""Load data from sources and apply specified transforms. The same
transform is applied to both input and target data.

@@ -78,10 +84,29 @@ def __getitem__(self, index: int) -> Tuple[np.ndarray, np.ndarray]:

Returns
-------
-Tuple[np.ndarray, np.ndarray]
-A tuple containing two numpy arrays representing the data.
+Tuple[Any, Any]
+A tuple containing two elements: the input data and the target data.

"""
-data = super().__getitem__(index)
-
-return (data[0], data[1])
+if not self.support_context_transforms:
+data = super().__getitem__(index)
+
+return (data[0], data[1])
+else:
+data = []
+
+# For each reader and its matching transform, read the sample and apply
+# the transform, then collect the result.
+for reader, transform in zip(reversed(self.readers), self.transforms):
+sample = reader[index]
+# Apply the transform if it is not None
+if transform is not None:
+sample = transform(sample)
+data.append(sample)
+# Return the tuple of transformed data, or a single sample if
+# return_single is True and there is only one reader.
+if self.return_single:
+return data[1]
+else:
+return tuple(reversed(data))
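
A sketch of the intended call pattern under the default setting; the reader class and paths are hypothetical, not part of the PR:

```python
# Any pair of minerva readers works: readers[0] yields inputs, readers[1] targets.
image_reader = TiffReader("data/images")   # hypothetical reader
label_reader = TiffReader("data/labels")   # hypothetical reader

dataset = SupervisedReconstructionDataset(
    readers=[image_reader, label_reader],
    transforms=my_transform,               # applied to both input and target
)
x, y = dataset[0]
```

With support_context_transforms=True, `transforms` is instead treated as a per-reader sequence, zipped against the readers in reverse order as shown in the diff above.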