Merge back 2.2 (#4098)
* update for releases 2.2.0rc0

* Fix Classification explain forward issue (#3867)

Fix bug

* Fix e2e code error (#3871)

* Update test_cli.py

* Update tests/e2e/cli/test_cli.py

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Update test_cli.py

* Update test_cli.py

---------

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Add documentation about configurable input size (#3870)

* add docs about configurable input size

* update api usecase and fix bug

* Fix zero-shot e2e (#3876)

Fix

* Fix DeiT for multi-label classification (#3881)

Remove init_args

* Fix Semi-SL for ViT accuracy drop (#3883)

Remove init_args

* Update docs for 2.2 (#3884)

Update docs

* Fix mean and scale for segmentation task (#3885)

fix mean and scale

* Update MAPI in 2.2 (#3889)

* Bump MAPI

* Update exportable code requirements

* Improve Semi-SL for LiteHRNet (small-medium case) (#3891)

* change drop pixels value

* go safe, change only tested models

* minor

* Improve h-cls for eff models (#3893)

* Update step size for eff v2

* Update effb0 recipe

* Fix maskrcnn swin nncf acc drop (#3900)

update maskrcnn swint model type to transformer

* Add keypoint detection recipe for single object cases (#3903)

* add rtmpose_tiny for single obj

* add rtmpose_tiny for single obj

* modify test subset name

* fix unit test

* update recipe with reset

* Improve acc drop of efficientnetv2 for h-label cls (#3907)

* Add warmup_iters for effv2

* Update max_epochs

* Fix pretrained weight cached dir for timm (#3909)

* Fix pretrained_weight for timm

* Fix unit-test

* Fix keypoint detection single obj recipe (#3915)

* add rtmpose_tiny for single obj

* modify test subset name

* fix unit test

* property for pck

* Fix cached dir for timm & hugging-face (#3914)

* Fix cached dir

* Pretrained weight download unit-test

* Fix pre-commit

* Fix wrong template id mapping for anomaly (#3916)

* Update script to allow setting otx version using env. variable (#3913)

* Fix Datamodule creation for OV in AutoConfigurator (#3920)

Fix datamodule for ov

* Update tpp file for 2.2.0 (#3921)

* Fix names for ignored scope [HOT-FIX, 2.2.0] (#3924)

fix names for ignored scope

* Fix classification rt_info (#3922)

* Restore output_raw_scores for classification

* Add uts

* Fix linter

* Update label info (#3925)

add label info to init

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Fix binary classification metric task (#3928)

* Fix binary classification

* Add unit-tests

* Improve MaskRCNN SwinT NNCF (#3929)

* ignore heads and disable smooth quant

* add activations_range_estimator_params

* update changelog

* Fix get_item for Chained Tasks in Classification (#3931)

* Fix Task Chain

* Add multi-label case as well

* Add multi-label case as well2

* Add H-label case

* Correct Keyerror for h-label cls in label_groups for dm_label_categories using label's id/key (#3932)

Modify label_groups for dm_label_categories with id/key of label

* Remove datumaro attribute id from tiling, add subset names (#3933)

* remove datumaro attribute id from tiling

* add subset names

* Fix soft predictions for Semantic Segmentation (#3934)

fix soft preds

* Update STFPM config (#3935)

* Add missing pretrained weights when creating a docker image (#3938)

* Fix pre-trained weight downloader

* Remove if condition for pretrained weight download

* Change default option 'full' to 'base' in otx install (#3937)

* Change option full to base for otx install

* Fix wrong code

* Fix issue

* Fix docs

* Fix auto adapt batch size in Converter (#3939)

* Enable auto adapt batch size into converter

* Fix wrong

* Fix hpo converter (#3940)

* save best hp after hpo

* add test

* Fix tiling XAI out of range (#3943)

- Fix tile merge XAI out of range

* enable model export (#3952)

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Move templates from OTX1.X to OTX2.X (#3951)

* add otx1.6 templates

* added new models

* delete entrypoints and nncf cfg

* updated some hyperparams

* fix for rtmdet_tiny

* updated converter

* Update classification templates

* Update det, r-det, vpm

* Update template.yaml

* changed warmup value in train.yaml

---------

Co-authored-by: Kang, Harim <harim.kang@intel.com>
Co-authored-by: Kim, Sungchul <sungchul.kim@intel.com>

* Add missing tile recipes and various tile recipe changes  (#3942)

* add missing tile recipes

* Fix tiling XAI out of range (#3943)

- Fix tile merge XAI out of range

* update xai tile merge

* update rtdetr

* update tile recipes

* update rtdetr tile postprocess

* update rtdetr recipes and tile recipes

* update tile recipes

* fix rtdetr unittest

* update recipes

* refactor tile unit test

* address pr reviews

* remove unnecessary files

* update color channel

* fix image channel passing

* include tiling in cli integration test

* remove transform_bbox

---------

Co-authored-by: Vladislav Sovrasov <sovrasov.vlad@gmail.com>

* Support ImageFromBytes (#3948)

* add image_from_bytes

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* refactor code

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* allow empty anomalous masks

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

---------

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Change categories mapping logic (#3946)

* change pre-filtering logic

* Update src/otx/core/data/pre_filtering.py

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

---------

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Update for 2.2.0rc1 (#3956)

* Include Geti arrow dataset subset names (#3962)

* restricted number of output masks by tiling

* add geti subset name

* update num of max pred

* Include full image with anno in case there's no tile in tile dataset (#3964)

* include full image with anno in case there's no tile in dataset

* update test

* Add type checker in converter for callable functions (optimizer, scheduler) (#3968)

Fix converter callable functions (optimizer, scheduler)

* Update for 2.2.0rc2 (#3969)

update for 2.2.0rc2

* Fix config converter for tiling (#3973)

fix config converter for tiling

* Update for 2.2.0rc3 (#3975)

* Change semantic segmentation to consider bbox only annotations. (#3996)

* segmentation consider bbox only annotations

* add unit test

* add unit test

* update fixture

* use name attribute

* revert tox file

* update for 2.2.0rc4

---------

Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>

* Relieve memory usage criteria on batch size 2 during adaptive_bs (#4009)

* relieve memory usage criteria on batch size 2 during adaptive_bs

* update unit test

* update unit test

* Remove background label from RT Info for segmentation task (#4011)

* remove background from rt_info

* provide another solution

* fix unit test

* Fix num_trials calculation on dataset length less than num_class (#4014)

Fix balanced sampler

* Fix out_features in HierarchicalCBAMClsHead (#4016)

Fix out_features

* Fix empty anno (#4010)

* Refactor mask_target_single function to handle unsupported ground truth mask types and provide warnings for missing ground truth masks

* Refactor bbox_overlaps function to handle unsupported ground truth mask types and provide warnings for missing ground truth masks

* Refactor export script to export multiple directories

* Refactor test_bbox_overlaps_2d to handle mismatched batch dimensions of bboxes

* Refactor bbox_overlaps function error exception

* update changelog

---------

Co-authored-by: Harim Kang <harim.kang@intel.com>

* Update for release 2.2.0rc5 (#4015)

* Prevent using too low confidence thresholds in detection (#4018)

Prevent writing too low confidence thresholds to MAPI configuration

* Update for release 2.2.0rc6 (#4027)

* Update pre-merge workflow (#4032)

* Update HPO interface (#4035)

* update hpo interface

* update unit test

* update CHANGELOG.md

* Enable keypoint detection training through config conversion (#4034)

enable keypoint det config converter

* Update for release 2.2.0rc7 (#4036)

update for release 2.2.0rc7

* Fix multilabel_accuracy of MixedHLabelAccuracy (#4042)

* Fix metric for multi-label

* Fix1

* Add CHANGELOG

* Update for release 2.2.0rc8 (#4043)

* Fix wrong indices setting in HLabelInfo (#4044)

* Fix wrong indices setting in label_info

* Add unit-test & update for releases

* Add legacy template LiteHRNet_18 template (#4049)

added legacy template

* Model templates: rename model_status value 'DISCONTINUED' to 'OBSOLETE' (#4051)

rename 'DISCONTINUED' to 'OBSOLETE' in model templates

* Enable export of feature vectors for semantic segmentation task (#4055)

* Upgrade MAPI in 2.2 (#4052)

* Update MRCNN model export to include feature vector and saliency map (#4056)

* Fix applying model's hparams when loading model from checkpoint (#4057)

* Update anomaly transforms (#4059)

* Update transforms

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Update transforms

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Update changelog

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Update __init__.py

---------

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>

* Bump onnx to 1.17.0 to omit CVE-2024-5187 (#4063)

* Fix incorrect all_groups order configuration in HLabelInfo (#4067)

* Fix all_labels

* Update CHANGELOG

* label_groups change

* Fix wrong model name in converter & template (#4082)

* Fix wrong

* Update CHANGELOG

* RTMDet Inst Seg Explain Mode for 2.2 (#4083)

* Explain mode for RTMDet Inst Seg

* Update changelog

* reformat changelog

* Fix rtdetr recipes (#4079)

* Fix recipes

* Update CHANGELOG

* Enable adaptive_bs with Efficientnet-V2-L model template (#4085)

Enable adaptive_bs with Efficientnet-V2-L model

* Add Keypoint Detection legacy template (#4094)

added rtmpose_template

* fix template

* Revert the old workaround for detection confidence threshold (#4096)

Revert the old workaround

* OTX RC 2.2 version up (#4099)

* Update changelog

* OTX version up

* Fix linter

* fix linter

* Add dummy XAI to RTDETR (export mode) & disable strong aug (#4106)

* Implement warning for unsupported explain mode in DETR model and update transform probabilities to zero in RTDETR recipes

* update changelog

* Update photometric distortion probability in RTDETR recipes

* Fix task chain for Det -> Cls / Seg (#4105)

* fix linter

* return recipe back

* added roi extraction for multi-class classification dataset

* fix linter

* add same logic to semantic seg

* added test for OTXDataset

* add clip and raise an error when coordinates are invalid.

* rewrite value error

* minor change to CHANGELOG

* fix linter

* fix diffusion

* fix tiling

* Disable tiling classifier toggle in configurable parameters (#4107)

* Disable tiling classifier toggle in configurable parameters

* Update changelog

* fix RTDETR

* fix test with augs

* switch off the IS for test_augs

* remove FilterAnnotations for RTMdet

* Update keypoint detection template (#4114)

* added default template

* update field

* quick fix for rtmdet

* minor update

* minor fix

---------

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: Kim, Sungchul <sungchul.kim@intel.com>
Co-authored-by: Vladislav Sovrasov <sovrasov.vlad@gmail.com>
Co-authored-by: Sooah Lee <sooah.lee@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Wonju Lee <wonju.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
Co-authored-by: Leonardo Lai <leonardo.lai@intel.com>
12 people authored Nov 14, 2024
1 parent ac2393f commit 45e79b6
Showing 29 changed files with 776 additions and 102 deletions.
20 changes: 18 additions & 2 deletions CHANGELOG.md
@@ -94,6 +94,10 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/3788>)
 - Add diffusion task
   (<https://github.com/openvinotoolkit/training_extensions/pull/3875>)
+- Revert the old workaround for detection confidence threshold
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4096>)
+- Add Keypoint Detection legacy template
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4094>)

 ### Enhancements

@@ -125,6 +129,8 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4009>)
 - Remove background label from RT Info for segmentation task
   (<https://github.com/openvinotoolkit/training_extensions/pull/4011>)
+- Enable export of the feature vectors for semantic segmentation task
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4055>)
 - Prevent using too low confidence thresholds in detection
   (<https://github.com/openvinotoolkit/training_extensions/pull/4018>)
 - Update HPO interface
@@ -162,8 +168,6 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4049>)
 - Model templates: rename model_status value 'DISCONTINUED' to 'OBSOLETE'
   (<https://github.com/openvinotoolkit/training_extensions/pull/4051>)
-- Enable export of feature vectors for semantic segmentation task
-  (<https://github.com/openvinotoolkit/training_extensions/pull/4055>)
 - Update MRCNN model export to include feature vector and saliency map
   (<https://github.com/openvinotoolkit/training_extensions/pull/4056>)
 - Upgrade MAPI in 2.2
@@ -172,6 +176,18 @@ All notable changes to this project will be documented in this file.
   (<https://github.com/openvinotoolkit/training_extensions/pull/4057>)
 - Fix incorrect all_groups order configuration in HLabelInfo
   (<https://github.com/openvinotoolkit/training_extensions/pull/4067>)
+- Fix RTDETR recipes
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4079>)
+- Fix wrong model name in converter & template
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4082>)
+- Fix RTMDet Inst Explain Mode
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4083>)
+- Fix RTDETR Explain Mode
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4106>)
+- Fix classification and semantic segmentation tasks, when ROI provided for images
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4105>)
+- Disable tiling classifier toggle in configurable parameters
+  (<https://github.com/openvinotoolkit/training_extensions/pull/4107>)

 ## \[v2.1.0\]

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -95,7 +95,7 @@ base = [
     "timm==1.0.3",
     "openvino==2024.4",
     "openvino-dev==2024.4",
-    "openvino-model-api==0.2.4",
+    "openvino-model-api==0.2.5",
     "onnx==1.17.0",
     "onnxconverter-common==1.14.0",
     "nncf==2.13.0",
17 changes: 12 additions & 5 deletions src/otx/algo/detection/detectors/detection_transformer.py
@@ -5,6 +5,7 @@

 from __future__ import annotations

+import warnings
 from typing import Any

 import numpy as np
@@ -95,16 +96,22 @@ def export(
         explain_mode: bool = False,
     ) -> dict[str, Any] | tuple[list[Any], list[Any], list[Any]]:
         """Exports the model."""
-        if explain_mode:
-            msg = "Explain mode is not supported for DETR models yet."
-            raise NotImplementedError(msg)
-
-        return self.postprocess(
+        results = self.postprocess(
             self._forward_features(batch_inputs),
             [meta["img_shape"] for meta in batch_img_metas],
             deploy_mode=True,
         )

+        if explain_mode:
+            # TODO(Eugene): Implement explain mode for DETR model.
+            warnings.warn("Explain mode is not supported for DETR model. Return dummy values.", stacklevel=2)
+            xai_output = {
+                "feature_vector": torch.zeros(1, 1),
+                "saliency_map": torch.zeros(1),
+            }
+            results.update(xai_output)  # type: ignore[union-attr]
+        return results
+
     def postprocess(
         self,
         outputs: dict[str, Tensor],
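The `export` change in `detection_transformer.py` swaps a hard `NotImplementedError` for a warning plus placeholder XAI outputs. The following is a minimal, framework-free sketch of that pattern. The function name `export_with_dummy_xai` and the plain-list placeholders are hypothetical stand-ins for the real method and its `torch.zeros` tensors; they are not part of the OTX API.

```python
import warnings
from typing import Any, Dict


def export_with_dummy_xai(results: Dict[str, Any], explain_mode: bool = False) -> Dict[str, Any]:
    """Return a copy of `results`, adding placeholder XAI entries in explain mode."""
    results = dict(results)  # copy so the caller's dict is left untouched
    if explain_mode:
        # Warn instead of raising, so the call still returns a result.
        warnings.warn("Explain mode is not supported. Returning dummy values.", stacklevel=2)
        # Hypothetical placeholders; the commit uses torch.zeros(1, 1) and torch.zeros(1).
        results.update({"feature_vector": [[0.0]], "saliency_map": [0.0]})
    return results
```

Callers that request explain mode then receive the same keys as for models with real XAI support, with values that carry no information.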
2 changes: 1 addition & 1 deletion src/otx/algo/utils/xai_utils.py
@@ -225,7 +225,7 @@ def _get_image_data_name(
     subset = datamodule.subsets[subset_name]
     item = subset.dm_subset[img_id]
     img = item.media_as(Image)
-    img_data, _ = subset._get_img_data_and_shape(img)  # noqa: SLF001
+    img_data, _, _ = subset._get_img_data_and_shape(img)  # noqa: SLF001
     image_save_name = "".join([char if char.isalnum() else "_" for char in item.id])
     return img_data, image_save_name

2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/anomaly.py
@@ -79,7 +79,7 @@ def _get_item_impl(
         datumaro_item = self.dm_subset[index]
         img = datumaro_item.media_as(Image)
         # returns image in RGB format if self.image_color_channel is RGB
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         label = self._get_label(datumaro_item)

59 changes: 48 additions & 11 deletions src/otx/core/data/dataset/base.py
@@ -8,7 +8,7 @@
 from abc import abstractmethod
 from collections.abc import Iterable
 from contextlib import contextmanager
-from typing import TYPE_CHECKING, Callable, Generic, Iterator, List, Union
+from typing import TYPE_CHECKING, Any, Callable, Generic, Iterator, List, Union

 import cv2
 import numpy as np
@@ -92,6 +92,7 @@ def __init__(
         self.image_color_channel = image_color_channel
         self.stack_images = stack_images
         self.to_tv_image = to_tv_image
+
         if self.dm_subset.categories():
             self.label_info = LabelInfo.from_dm_label_groups(self.dm_subset.categories()[AnnotationType.label])
         else:
@@ -141,11 +142,30 @@ def __getitem__(self, index: int) -> T_OTXDataEntity:
         msg = f"Reach the maximum refetch number ({self.max_refetch})"
         raise RuntimeError(msg)

-    def _get_img_data_and_shape(self, img: Image) -> tuple[np.ndarray, tuple[int, int]]:
-        key = img.path if isinstance(img, ImageFromFile) else id(img)
+    def _get_img_data_and_shape(
+        self,
+        img: Image,
+        roi: dict[str, Any] | None = None,
+    ) -> tuple[np.ndarray, tuple[int, int], dict[str, Any] | None]:
+        """Get image data and shape.
+
+        This method is used to get image data and shape from Datumaro image object.
+        If ROI is provided, the image data is extracted from the ROI.
+
+        Args:
+            img (Image): Image object from Datumaro.
+            roi (dict[str, Any] | None, Optional): Region of interest.
+                Represented by dict with coordinates and some meta information.

-        if (img_data := self.mem_cache_handler.get(key=key)[0]) is not None:
-            return img_data, img_data.shape[:2]
+        Returns:
+            The image data, shape, and ROI meta information
+        """
+        key = img.path if isinstance(img, ImageFromFile) else id(img)
+        roi_meta = None
+        # check if the image is already in the cache
+        img_data, roi_meta = self.mem_cache_handler.get(key=key)
+        if img_data is not None:
+            return img_data, img_data.shape[:2], roi_meta

         with image_decode_context():
             img_data = (
@@ -158,11 +178,28 @@ def _get_img_data_and_shape(self, img: Image) -> tuple[np.ndarray, tuple[int, in
             msg = "Cannot get image data"
             raise RuntimeError(msg)

-        img_data = self._cache_img(key=key, img_data=img_data.astype(np.uint8))
+        if roi and isinstance(roi, dict):
+            # extract ROI from image
+            shape = roi["shape"]
+            h, w = img_data.shape[:2]
+            x1, y1, x2, y2 = (
+                int(np.clip(np.trunc(shape["x1"] * w), 0, w)),
+                int(np.clip(np.trunc(shape["y1"] * h), 0, h)),
+                int(np.clip(np.ceil(shape["x2"] * w), 0, w)),
+                int(np.clip(np.ceil(shape["y2"] * h), 0, h)),
+            )
+            if (x2 - x1) * (y2 - y1) <= 0:
+                msg = f"ROI has zero or negative area. ROI coordinates: {x1}, {y1}, {x2}, {y2}"
+                raise ValueError(msg)
+
+            img_data = img_data[y1:y2, x1:x2]
+            roi_meta = {"x1": x1, "y1": y1, "x2": x2, "y2": y2, "orig_image_shape": (h, w)}
+
+        img_data = self._cache_img(key=key, img_data=img_data.astype(np.uint8), meta=roi_meta)

-        return img_data, img_data.shape[:2]
+        return img_data, img_data.shape[:2], roi_meta

-    def _cache_img(self, key: str | int, img_data: np.ndarray) -> np.ndarray:
+    def _cache_img(self, key: str | int, img_data: np.ndarray, meta: dict[str, Any] | None = None) -> np.ndarray:
         """Cache an image after resizing.

         If there is available space in the memory pool, the input image is cached.
@@ -182,14 +219,14 @@ def _cache_img(self, key: str | int, img_data: np.ndarray) -> np.ndarray:
             return img_data

         if self.mem_cache_img_max_size is None:
-            self.mem_cache_handler.put(key=key, data=img_data, meta=None)
+            self.mem_cache_handler.put(key=key, data=img_data, meta=meta)
             return img_data

         height, width = img_data.shape[:2]
         max_height, max_width = self.mem_cache_img_max_size

         if height <= max_height and width <= max_width:
-            self.mem_cache_handler.put(key=key, data=img_data, meta=None)
+            self.mem_cache_handler.put(key=key, data=img_data, meta=meta)
             return img_data

         # Preserve the image size ratio and fit to max_height or max_width
@@ -206,7 +243,7 @@ def _cache_img(self, key: str | int, img_data: np.ndarray) -> np.ndarray:
         self.mem_cache_handler.put(
             key=key,
             data=resized_img,
-            meta=None,
+            meta=meta,
         )
         return resized_img

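The ROI handling added to `_get_img_data_and_shape` multiplies the ROI coordinates by the image width and height, which suggests they are stored as fractions of the image size. The sketch below reproduces only the coordinate arithmetic with the standard library, under that assumption. `roi_to_pixel_box` is a hypothetical helper name, and `math.trunc`/`math.ceil` with a manual clamp stand in for the `np.trunc`, `np.ceil`, and `np.clip` calls in the diff.

```python
import math
from typing import Dict, Tuple


def roi_to_pixel_box(shape: Dict[str, float], height: int, width: int) -> Tuple[int, int, int, int]:
    """Convert a normalized ROI (x1, y1, x2, y2) into a pixel box clamped to the image."""

    def clamp(value: int, upper: int) -> int:
        return max(0, min(value, upper))

    # Truncate the top-left corner and ceil the bottom-right corner,
    # so rounding does not shrink an in-range box.
    x1 = clamp(math.trunc(shape["x1"] * width), width)
    y1 = clamp(math.trunc(shape["y1"] * height), height)
    x2 = clamp(math.ceil(shape["x2"] * width), width)
    y2 = clamp(math.ceil(shape["y2"] * height), height)
    if (x2 - x1) * (y2 - y1) <= 0:
        msg = f"ROI has zero or negative area. ROI coordinates: {x1}, {y1}, {x2}, {y2}"
        raise ValueError(msg)
    return x1, y1, x2, y2


# A 100x200 (height x width) image with an ROI covering the lower middle half.
print(roi_to_pixel_box({"x1": 0.25, "y1": 0.5, "x2": 0.75, "y2": 1.0}, height=100, width=200))
```

The returned box can then be used as `img_data[y1:y2, x1:x2]`, which is how the diff crops the image before caching it together with the ROI metadata.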
28 changes: 14 additions & 14 deletions src/otx/core/data/dataset/classification.py
@@ -32,18 +32,18 @@ class OTXMulticlassClsDataset(OTXDataset[MulticlassClsDataEntity]):
     def _get_item_impl(self, index: int) -> MulticlassClsDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        roi = item.attributes.get("roi", None)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img, roi)
+        if roi:
+            # extract labels from ROI
+            labels_ids = [
+                label["label"]["_id"] for label in roi["labels"] if label["label"]["domain"] == "CLASSIFICATION"
+            ]
+            label_anns = [self.label_info.label_names.index(label_id) for label_id in labels_ids]
+        else:
+            # extract labels from annotations
+            label_anns = [ann.label for ann in item.annotations if isinstance(ann, Label)]

-        label_anns = []
-        for ann in item.annotations:
-            if isinstance(ann, Label):
-                label_anns.append(ann)
-            else:
-                # If the annotation is not Label, it should be converted to Label.
-                # For Chained Task: Detection (Bbox) -> Classification (Label)
-                label = Label(label=ann.label)
-                if label not in label_anns:
-                    label_anns.append(label)
         if len(label_anns) > 1:
             msg = f"Multi-class Classification can't use the multi-label, currently len(labels) = {len(label_anns)}"
             raise ValueError(msg)
@@ -56,7 +56,7 @@ def _get_item_impl(self, index: int) -> MulticlassClsDataEntity | None:
                 ori_shape=img_shape,
                 image_color_channel=self.image_color_channel,
             ),
-            labels=torch.as_tensor([ann.label for ann in label_anns]),
+            labels=torch.as_tensor(label_anns),
         )

         return self._apply_transforms(entity)
@@ -78,7 +78,7 @@ def _get_item_impl(self, index: int) -> MultilabelClsDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []  # This should be assigned form item
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         label_anns = []
         for ann in item.annotations:
@@ -195,7 +195,7 @@ def _get_item_impl(self, index: int) -> HlabelClsDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []  # This should be assigned form item
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         label_anns = []
         for ann in item.annotations:
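In the multi-class dataset, labels are now taken from the ROI attribute when one is present, keeping only entries whose domain is `CLASSIFICATION` and mapping each label ID to its index in the label name list. The sketch below isolates that lookup. The ROI dictionary layout is inferred from the keys the diff reads, while `labels_from_roi` and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def labels_from_roi(roi: Dict[str, Any], label_names: List[str]) -> List[int]:
    """Map the classification label IDs stored in an ROI to indices into `label_names`."""
    label_ids = [
        entry["label"]["_id"]
        for entry in roi["labels"]
        if entry["label"]["domain"] == "CLASSIFICATION"
    ]
    # list.index raises ValueError when an ID is not among the known label names.
    return [label_names.index(label_id) for label_id in label_ids]


# Hypothetical ROI from a detection -> classification task chain.
roi = {
    "labels": [
        {"label": {"_id": "cat", "domain": "CLASSIFICATION"}},
        {"label": {"_id": "box", "domain": "DETECTION"}},
    ],
}
print(labels_from_roi(roi, ["dog", "cat"]))
```

Because the result is already a list of integer indices, it can be passed to `torch.as_tensor` directly, which matches the simplified `labels=` argument in the diff.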
2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/detection.py
@@ -26,7 +26,7 @@ def _get_item_impl(self, index: int) -> DetDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []  # This should be assigned form item
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         bbox_anns = [ann for ann in item.annotations if isinstance(ann, Bbox)]

2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/diffusion.py
@@ -22,7 +22,7 @@ def _get_item_impl(self, idx: int) -> DiffusionDataEntity | None:
         item = self.dm_subset[idx]
         caption = item.annotations[0].caption
         img = item.media_as(Image)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)
         entity = DiffusionDataEntity(
             image=img_data,
             img_info=ImageInfo(
2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/instance_segmentation.py
@@ -40,7 +40,7 @@ def _get_item_impl(self, index: int) -> InstanceSegDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         gt_bboxes, gt_labels, gt_masks, gt_polygons = [], [], [], []

2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/keypoint_detection.py
@@ -86,7 +86,7 @@ def _get_item_impl(self, index: int) -> KeypointDetDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []  # This should be assigned form item
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         bbox_anns = [ann for ann in item.annotations if isinstance(ann, Bbox)]
         bboxes = (
2 changes: 1 addition & 1 deletion src/otx/core/data/dataset/object_detection_3d.py
@@ -58,7 +58,7 @@ def __init__(
     def _get_item_impl(self, index: int) -> Det3DDataEntity | None:
         entity = self.dm_subset[index]
         image = entity.media_as(Image)
-        image, ori_img_shape = self._get_img_data_and_shape(image)
+        image, ori_img_shape, _ = self._get_img_data_and_shape(image)
         calib = self.get_calib_from_file(entity.attributes["calib_path"])
         annotations_copy = deepcopy(entity.annotations)
         datumaro_kitti_format = [obj.attributes for obj in annotations_copy]
9 changes: 7 additions & 2 deletions src/otx/core/data/dataset/segmentation.py
@@ -203,9 +203,14 @@ def _get_item_impl(self, index: int) -> SegDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(Image)
         ignored_labels: list[int] = []
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        roi = item.attributes.get("roi", None)
+        img_data, img_shape, roi_meta = self._get_img_data_and_shape(img, roi)
         if item.annotations:
-            extracted_mask = _extract_class_mask(item=item, img_shape=img_shape, ignore_index=self.ignore_index)
+            ori_shape = roi_meta["orig_image_shape"] if roi_meta else img_shape
+            extracted_mask = _extract_class_mask(item=item, img_shape=ori_shape, ignore_index=self.ignore_index)
+            if roi_meta:
+                extracted_mask = extracted_mask[roi_meta["y1"] : roi_meta["y2"], roi_meta["x1"] : roi_meta["x2"]]
+
             masks = tv_tensors.Mask(extracted_mask[None])
         else:
             # semi-supervised learning, unlabeled dataset
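For semantic segmentation, the diff builds the class mask at the original image size (`orig_image_shape`) and then crops it with the same pixel box used for the image, so that mask and cropped image stay aligned. A small sketch of that crop on a nested-list mask follows. `crop_mask_to_roi` is a hypothetical helper, and nested lists stand in for the NumPy array sliced in the diff.

```python
from typing import Dict, List, Optional


def crop_mask_to_roi(mask: List[List[int]], roi_meta: Optional[Dict[str, int]]) -> List[List[int]]:
    """Crop a full-size mask to the ROI pixel box; return it unchanged without an ROI."""
    if not roi_meta:
        return mask
    rows = mask[roi_meta["y1"] : roi_meta["y2"]]
    return [row[roi_meta["x1"] : roi_meta["x2"]] for row in rows]


# 4x4 mask whose values encode their position (row * 4 + column).
full_mask = [[row * 4 + col for col in range(4)] for row in range(4)]
print(crop_mask_to_roi(full_mask, {"x1": 1, "y1": 2, "x2": 3, "y2": 4}))
```

Building the mask at the cropped shape instead would place annotations relative to the wrong origin, which appears to be the reason the diff passes the original shape to `_extract_class_mask` first.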
6 changes: 3 additions & 3 deletions src/otx/core/data/dataset/tile.py
@@ -414,7 +414,7 @@ def _get_item_impl(self, index: int) -> TileDetDataEntity: # type: ignore[overr
         """
         item = self.dm_subset[index]
         img = item.media_as(Image)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         bbox_anns = [ann for ann in item.annotations if isinstance(ann, Bbox)]

@@ -505,7 +505,7 @@ def _get_item_impl(self, index: int) -> TileInstSegDataEntity: # type: ignore[o
         """
         item = self.dm_subset[index]
         img = item.media_as(Image)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         gt_bboxes, gt_labels, gt_masks, gt_polygons = [], [], [], []

@@ -607,7 +607,7 @@ def _get_item_impl(self, index: int) -> TileSegDataEntity: # type: ignore[overr
         """
         item = self.dm_subset[index]
         img = item.media_as(Image)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         extracted_mask = _extract_class_mask(item=item, img_shape=img_shape, ignore_index=self.ignore_index)
         masks = tv_tensors.Mask(extracted_mask[None])
4 changes: 2 additions & 2 deletions src/otx/core/data/dataset/visual_prompting.py
@@ -79,7 +79,7 @@ def __init__(
     def _get_item_impl(self, index: int) -> VisualPromptingDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(dmImage)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         gt_bboxes, gt_points = [], []
         gt_masks = defaultdict(list)
@@ -229,7 +229,7 @@ def __init__(
     def _get_item_impl(self, index: int) -> ZeroShotVisualPromptingDataEntity | None:
         item = self.dm_subset[index]
         img = item.media_as(dmImage)
-        img_data, img_shape = self._get_img_data_and_shape(img)
+        img_data, img_shape, _ = self._get_img_data_and_shape(img)

         prompts: list[ZeroShotPromptType] = []
         gt_masks: list[tvMask] = []
1 change: 1 addition & 0 deletions src/otx/core/data/transform_libs/torchvision.py
@@ -2650,6 +2650,7 @@ def forward(self, *_inputs: T_OTXDataEntity) -> T_OTXDataEntity | None:
         if not keep.any() and self.keep_empty:
             return self.convert(inputs)

+        keep = list(keep)
         keys = ("bboxes", "labels", "masks", "polygons")
         for key in keys:
             if hasattr(inputs, key):