Merge back 2.2.0.rc3 to develop (#3963)
* update for release 2.2.0rc0

* Fix Classification explain forward issue (#3867)

Fix bug

* Fix e2e code error (#3871)

* Update test_cli.py

* Update tests/e2e/cli/test_cli.py

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Update test_cli.py

* Update test_cli.py

---------

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Add documentation about configurable input size (#3870)

* add docs about configurable input size

* update api usecase and fix bug

* Fix zero-shot e2e (#3876)

Fix

* Fix DeiT for multi-label classification (#3881)

Remove init_args

* Fix Semi-SL for ViT accuracy drop (#3883)

Remove init_args

* Update docs for 2.2 (#3884)

Update docs

* Fix mean and scale for segmentation task (#3885)

fix mean and scale

* Update MAPI in 2.2 (#3889)

* Bump MAPI

* Update exportable code requirements

* Improve Semi-SL for LiteHRNet (small-medium case) (#3891)

* change drop pixels value

* go safe, change only tested models

* minor

* Improve h-cls for eff models (#3893)

* Update step size for eff v2

* Update effb0 recipe

* Fix maskrcnn swin nncf acc drop (#3900)

update maskrcnn swint model type to transformer

* Add keypoint detection recipe for single object cases (#3903)

* add rtmpose_tiny for single obj

* add rtmpose_tiny for single obj

* modify test subset name

* fix unit test

* update recipe with reset

* Improve acc drop of efficientnetv2 for h-label cls (#3907)

* Add warmup_iters for effv2

* Update max_epochs

* Fix pretrained weight cached dir for timm (#3909)

* Fix pretrained_weight for timm

* Fix unit-test

* Fix keypoint detection single obj recipe (#3915)

* add rtmpose_tiny for single obj

* modify test subset name

* fix unit test

* property for pck

* Fix cached dir for timm & hugging-face (#3914)

* Fix cached dir

* Pretrained weight download unit-test

* Fix pre-commit

* Fix wrong template id mapping for anomaly (#3916)

* Update script to allow setting otx version using env. variable (#3913)

* Fix Datamodule creation for OV in AutoConfigurator (#3920)

Fix datamodule for ov

* Update tpp file for 2.2.0 (#3921)

* Fix names for ignored scope [HOT-FIX, 2.2.0] (#3924)

fix names for ignored scope

* Fix classification rt_info (#3922)

* Restore output_raw_scores for classification

* Add uts

* Fix linter

* Update label info (#3925)

add label info to init

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Fix binary classification metric task (#3928)

* Fix binary classification

* Add unit-tests

* Improve MaskRCNN SwinT NNCF (#3929)

* ignore heads and disable smooth quant

* add activations_range_estimator_params

* update changelog

* Fix get_item for Chained Tasks in Classification (#3931)

* Fix Task Chain

* Add multi-label case as well

* Add multi-label case as well2

* Add H-label case

* Correct KeyError for h-label cls in label_groups for dm_label_categories using label's id/key (#3932)

Modify label_groups for dm_label_categories with id/key of label

* Remove datumaro attribute id from tiling, add subset names (#3933)

* remove datumaro attribute id from tiling

* add subset names

* Fix soft predictions for Semantic Segmentation (#3934)

fix soft preds

* Update STFPM config (#3935)

* Add missing pretrained weights when creating a docker image (#3938)

* Fix pre-trained weight downloader

* Remove if condition for pretrained weight download

* Change default option 'full' to 'base' in otx install (#3937)

* Change option full to base for otx install

* Fix wrong code

* Fix issue

* Fix docs

* Fix auto adapt batch size in Converter (#3939)

* Enable auto adapt batch size into converter

* Fix wrong

* Fix hpo converter (#3940)

* save best hp after hpo

* add test

* Fix tiling XAI out of range (#3943)

- Fix tile merge XAI out of range

* enable model export (#3952)

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Move templates from OTX1.X to OTX2.X (#3951)

* add otx1.6 templates

* added new models

* delete entrypoints and nncf cfg

* updated some hyperparams

* fix for rtmdet_tiny

* updated converter

* Update classification templates

* Update det, r-det, vpm

* Update template.yaml

* changed warmup value in train.yaml

---------

Co-authored-by: Kang, Harim <harim.kang@intel.com>
Co-authored-by: Kim, Sungchul <sungchul.kim@intel.com>

* Add missing tile recipes and various tile recipe changes  (#3942)

* add missing tile recipes

* Fix tiling XAI out of range (#3943)

- Fix tile merge XAI out of range

* update xai tile merge

* update rtdetr

* update tile recipes

* update rtdetr tile postprocess

* update rtdetr recipes and tile recipes

* update tile recipes

* fix rtdetr unittest

* update recipes

* refactor tile unit test

* address pr reviews

* remove unnecessary files

* update color channel

* fix image channel passing

* include tiling in cli integration test

* remove transform_bbox

---------

Co-authored-by: Vladislav Sovrasov <sovrasov.vlad@gmail.com>

* Support ImageFromBytes (#3948)

* add image_from_bytes

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* refactor code

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* allow empty anomalous masks

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

---------

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>

* Change categories mapping logic (#3946)

* change pre-filtering logic

* Update src/otx/core/data/pre_filtering.py

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

---------

Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>

* Update for 2.2.0rc1 (#3956)

* Include Geti arrow dataset subset names (#3962)

* restricted number of output masks by tiling

* add geti subset name

* update num of max pred

* Include full image with anno in case there's no tile in tile dataset (#3964)

* include full image with anno in case there's no tile in dataset

* update test

* Add type checker in converter for callable functions (optimizer, scheduler) (#3968)

Fix converter callable functions (optimizer, scheduler)

* Update for 2.2.0rc2 (#3969)

update for 2.2.0rc2

* Update CHANGELOG.md

Co-authored-by: Kim, Sungchul <sungchul.kim@intel.com>

* fix semantic seg tests

* fix detection tiling

* Update test_tiling.py

* Update test_tiling.py

* fix unit test

---------

Signed-off-by: Ashwin Vaidya <ashwinnitinvaidya@gmail.com>
Co-authored-by: Yunchu Lee <yunchu.lee@intel.com>
Co-authored-by: Harim Kang <harim.kang@intel.com>
Co-authored-by: Emily Chun <emily.chun@intel.com>
Co-authored-by: Eunwoo Shin <eunwoo.shin@intel.com>
Co-authored-by: Kim, Sungchul <sungchul.kim@intel.com>
Co-authored-by: Vladislav Sovrasov <sovrasov.vlad@gmail.com>
Co-authored-by: Sooah Lee <sooah.lee@intel.com>
Co-authored-by: Eugene Liu <eugene.liu@intel.com>
Co-authored-by: Wonju Lee <wonju.lee@intel.com>
Co-authored-by: Ashwin Vaidya <ashwin.vaidya@intel.com>
11 people authored Sep 25, 2024
1 parent 0e2ab6c commit 3249377
Showing 162 changed files with 9,954 additions and 884 deletions.
26 changes: 22 additions & 4 deletions CHANGELOG.md
@@ -2,7 +2,7 @@

All notable changes to this project will be documented in this file.

## \[unreleased\]
## \[2.2.0\]

### New features

@@ -45,15 +45,31 @@ All notable changes to this project will be documented in this file.
(<https://github.com/openvinotoolkit/training_extensions/pull/3769>)
- Refactoring `ConvModule` by removing `conv_cfg`, `norm_cfg`, and `act_cfg`
(<https://github.com/openvinotoolkit/training_extensions/pull/3783>, <https://github.com/openvinotoolkit/training_extensions/pull/3816>, <https://github.com/openvinotoolkit/training_extensions/pull/3809>)
- Support ImageFromBytes
(<https://github.com/openvinotoolkit/training_extensions/pull/3948>)
- Enable model export
(<https://github.com/openvinotoolkit/training_extensions/pull/3952>)
- Move templates from OTX1.X to OTX2.X
(<https://github.com/openvinotoolkit/training_extensions/pull/3951>)
- Include Geti arrow dataset subset names
(<https://github.com/openvinotoolkit/training_extensions/pull/3962>)
- Include full image with anno in case there's no tile in tile dataset
(<https://github.com/openvinotoolkit/training_extensions/pull/3964>)
- Add type checker in converter for callable functions (optimizer, scheduler)
(<https://github.com/openvinotoolkit/training_extensions/pull/3968>)

### Bug fixes

- Fix Combined Dataloader & unlabeled warmup loss in Semi-SL
(https://github.com/openvinotoolkit/training_extensions/pull/3723)
(<https://github.com/openvinotoolkit/training_extensions/pull/3723>)
- Revert #3579 to fix issues with replacing coco_instance with a different format in some dataset
(https://github.com/openvinotoolkit/training_extensions/pull/3753)
(<https://github.com/openvinotoolkit/training_extensions/pull/3753>)
- Add num_devices in Engine for multi-gpu training
(https://github.com/openvinotoolkit/training_extensions/pull/3778)
(<https://github.com/openvinotoolkit/training_extensions/pull/3778>)
- Add missing tile recipes and various tile recipe changes
(<https://github.com/openvinotoolkit/training_extensions/pull/3942>)
- Change categories mapping logic
(<https://github.com/openvinotoolkit/training_extensions/pull/3946>)

## \[v2.1.0\]

@@ -191,6 +207,8 @@ All notable changes to this project will be documented in this file.
(<https://github.com/openvinotoolkit/training_extensions/pull/3684>)
- Fix MaskRCNN SwinT NNCF Accuracy Drop
(<https://github.com/openvinotoolkit/training_extensions/pull/3685>)
- Fix MaskRCNN SwinT NNCF Accuracy Drop By Adding More PTQ Configs
(<https://github.com/openvinotoolkit/training_extensions/pull/3929>)

### Known issues

97 changes: 29 additions & 68 deletions README.md
@@ -166,83 +166,44 @@ In addition to the examples above, please refer to the documentation for tutorials

---

## Updates

### v2.1.0 (3Q24)

> _**NOTES**_
>
> OpenVINO™ Training Extensions, version 2.1.0 does not include the latest functional and security updates. OpenVINO™ Training Extensions, version 2.2.0 is targeted to be released in September 2024 and will include additional functional and security updates. Customers should update to the latest version as it becomes available.
## Updates - v2.2.0 (3Q24)

### New features

- Add a flag to enable OV inference on dGPU
- Add early stopping with warmup. Remove mandatory background label in semantic segmentation task
- RTMDet-tiny enablement for detection task
- Add data_format validation and update in OTXDataModule
- Add torchvision.MaskRCNN
- Add Semi-SL for Multi-class Classification (EfficientNet-B0)
- Decoupling mmaction for action classification (MoviNet, X3D)
- Add Semi-SL Algorithms for mv3-large, effnet-v2, deit-tiny, dino-v2
- RTMDet-tiny enablement for detection task (export/optimize)
- Enable ruff & ruff-format into otx/algo/classification/backbones
- Add TV MaskRCNN Tile Recipe
- Add rotated det OV recipe
- Add RT-DETR model for Object Detection
- Add Multi-Label & H-label Classification with torchvision models
- Add Hugging-Face Model Wrapper for Classification
- Add LoRA finetuning capability for ViT Architectures
- Add Hugging-Face Model Wrapper for Object Detection
- Add Hugging-Face Model Wrapper for Semantic Segmentation
- Enable torch.compile to work with classification
- Add `otx benchmark` subcommand
- Add RTMPose for Keypoint Detection Task
- Add Semi-SL MeanTeacher algorithm for Semantic Segmentation
- Update head and h-label format for hierarchical label classification
- Support configurable input size

### Enhancements

- Change load_stat_dict to on_load_checkpoint
- Add try - except to keep running the remaining tests
- Update instance_segmentation.py to resolve conflict with 2.0.0
- Update XPU install
- Sync rgb order between torch and ov inference of action classification task
- Make Perf test available to load previous Perf test to skip training stage
- Reenable e2e classification XAI tests
- Remove action detection task support
- Increase readability of pickling error log during HPO & fix minor bug
- Update RTMDet checkpoint url
- Refactor Torchvision Model for Classification Semi-SL
- Add coverage omit mm-related code
- Add docs semi-sl part
- Refactor docs design & Add contents
- Add execution example of auto batch size in docs
- Add Semi-SL for cls Benchmark Test
- Move value to device before logging for metric
- Add .codecov.yaml
- Update benchmark tool for otx2.1
- Collect pretrained weight binary files in one place
- Minimize compiled dependency files
- Update README & CODEOWNERS
- Update Engine's docstring & CLI --help outputs
- Align integration test to exportable code interface update for release branch
- Refactor exporter for anomaly task and fix a bug with exportable code
- Update pandas version constraint
- Include more models to export test into test_otx_e2e
- Move assigning tasks to Models from Engine to Anomaly Model Classes
- Refactoring detection modules
- Reimplement of ViT Architecture following TIMM
- Enable to override data configurations
- Enable to use input_size at transforms in recipe
- Enable to use polygon and bitmap mask as prompt inputs for zero-shot learning
- Refactoring `ConvModule` by removing `conv_cfg`, `norm_cfg`, and `act_cfg`
- Support ImageFromBytes
- enable model export
- Move templates from OTX1.X to OTX2.X
- Include Geti arrow dataset subset names
- Include full image with anno in case there's no tile in tile dataset
- Add type checker in converter for callable functions (optimizer, scheduler)

### Bug fixes

- Fix conflicts between develop and 2.0.0
- Fix polygon mask
- Fix vpm intg test error
- Fix anomaly
- Bug fix in Semantic Segmentation + enable DINOV2 export in ONNX
- Fix some export issues. Remove EXPORTABLE_CODE as export parameter.
- Fix `load_from_checkpoint` to apply original model's hparams
- Fix `load_from_checkpoint` args to apply original model's hparams
- Fix zero-shot `learn` for ov model
- Various fixes for XAI in 2.1
- Fix tests to work in a mm-free environment
- Fix a bug in benchmark code
- Update exportable code dependency & fix a bug
- Fix getting wrong shape during resizing
- Fix detection prediction outputs
- Fix RTMDet PTQ performance
- Fix segmentation fault on VPM PTQ
- Fix NNCF MaskRCNN-Eff accuracy drop
- Fix optimize with Semi-SL data pipeline
- Fix MaskRCNN SwinT NNCF Accuracy Drop
- Fix Combined Dataloader & unlabeled warmup loss in Semi-SL
- Revert #3579 to fix issues with replacing coco_instance with a different format in some dataset
- Add num_devices in Engine for multi-gpu training
- Add missing tile recipes and various tile recipe changes
- Change categories mapping logic

### Known issues

6 changes: 4 additions & 2 deletions docker/build.sh
@@ -1,7 +1,9 @@
#!/bin/bash
# shellcheck disable=SC2154
# shellcheck disable=SC2154,SC2035,SC2046

OTX_VERSION=$(python -c 'import otx; print(otx.__version__)')
if [ "$OTX_VERSION" == "" ]; then
    OTX_VERSION=$(python -c 'import otx; print(otx.__version__)')
fi
THIS_DIR=$(dirname "$0")

echo "Build OTX ${OTX_VERSION} CUDA Docker image..."
4 changes: 0 additions & 4 deletions docker/download_pretrained_weights.py
@@ -32,10 +32,6 @@ def download_all() -> None:
msg = f"Skip {config_path} since it is not a PyTorch config."
logger.warning(msg)
continue
if "anomaly_" in str(config_path) or "dino_v2" in str(config_path) or "h_label_cls" in str(config_path):
msg = f"Skip {config_path} since those models show errors on instantiation."
logger.warning(msg)
continue

config = OmegaConf.load(config_path)
init_model = next(iter(partial_instantiate_class(config.model)))
@@ -0,0 +1,116 @@
Configurable Input Size
=======================

The Configurable Input Size feature allows users to adjust the input resolution of their deep learning models
to balance between training and inference speed and model performance.
This flexibility enables users to tailor the input size to their specific needs without manually altering
the data pipeline configurations.

To utilize this feature, simply specify the desired input size as an argument during the train command.
Additionally, OTX ensures compatibility with models trained on non-default input sizes by automatically adjusting
the data pipeline to match the input size in other engine entry points.

Usage example:

.. code-block::

    $ otx train \
        --config ... \

.. tab-set::

    .. tab-item:: API 1

        .. code-block:: python

            from otx.algo.detection.yolox import YOLOXS
            from otx.core.data.module import OTXDataModule
            from otx.engine import Engine

            input_size = (512, 512)
            model = YOLOXS(label_info=5, input_size=input_size)  # should be tuple[int, int]
            datamodule = OTXDataModule(..., input_size=input_size)
            engine = Engine(model=model, datamodule=datamodule)
            engine.train()

    .. tab-item:: API 2

        .. code-block:: python

            from otx.core.data.module import OTXDataModule
            from otx.engine import Engine

            datamodule = OTXDataModule(..., input_size=(512, 512))
            engine = Engine(model="yolox_s", datamodule=datamodule)  # model input size will be aligned with the datamodule input size
            engine.train()

    .. tab-item:: CLI

        .. code-block:: bash

            (otx) ...$ otx train ... --data.input_size 512

.. _adaptive-input-size:

Adaptive Input Size
-------------------

The Adaptive Input Size feature intelligently determines an optimal input size for the model
by analyzing the dataset's statistics.
It operates in two distinct modes: "auto" and "downscale".
In "auto" mode, the input size may increase or decrease based on the dataset's characteristics.
In "downscale" mode, the input size will either decrease or remain unchanged, ensuring that the model training or inference speed deosn't drop.


To activate this feature, use the following command with the desired mode:

.. tab-set::

    .. tab-item:: API

        .. code-block:: python

            from otx.algo.detection.yolox import YOLOXS
            from otx.core.data.module import OTXDataModule
            from otx.engine import Engine

            datamodule = OTXDataModule(
                ...
                adaptive_input_size="auto",  # auto or downscale
                input_size_multiplier=YOLOXS.input_size_multiplier,  # should set the input_size_multiplier of the model
            )
            model = YOLOXS(label_info=5, input_size=datamodule.input_size)
            engine = Engine(model=model, datamodule=datamodule)
            engine.train()

    .. tab-item:: CLI

        .. code-block:: bash

            (otx) ...$ otx train ... --data.adaptive_input_size "auto | downscale"

The adaptive process includes the following steps:

1. OTX computes robust statistics from the input dataset.

2. The initial input size is set based on the typical large image size within the dataset.

3. (Optional) The input size may be further refined to account for the sizes of objects present in the dataset.
The model's minimum recognizable object size, typically ranging from 16x16 to 32x32 pixels, serves as a reference to
proportionally adjust the input size relative to the average small object size observed in the dataset.
For instance, if objects are generally 64x64 pixels in a 512x512 image, the input size would be adjusted
to 256x256 to maintain detectability (a sketch of this calculation follows the list of steps below).

Adjustments are subject to the following constraints:

* If the recalculated input size exceeds the maximum image size determined in the previous step, it will be capped at that maximum size.
* If the recalculated input size falls below the minimum threshold defined by MIN_DETECTION_INPUT_SIZE, the input size will be scaled up. This is done by increasing the smaller dimension (width or height) to MIN_DETECTION_INPUT_SIZE while maintaining the aspect ratio, ensuring that the model's minimum criteria for object detection are met.

4. (downscale only) Any scale-up beyond the default model input size is restricted.
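
To make the refinement in step 3 and its constraints concrete, here is a minimal illustrative sketch.
The helper name and the exact constant values are assumptions for illustration only, not the actual OTX implementation.

.. code-block:: python

    # Illustrative sketch only: the constants and helper name are assumptions.
    MIN_RECOGNIZABLE_OBJECT_SIZE = 32  # px, assumed smallest object size the model can detect
    MIN_DETECTION_INPUT_SIZE = 256     # px, assumed lower bound for the final input size

    def refine_input_size(base_input_size: int, avg_small_object_size: float) -> int:
        """Scale the input so small objects map to the model's minimum detectable size."""
        scale = MIN_RECOGNIZABLE_OBJECT_SIZE / avg_small_object_size
        refined = int(base_input_size * scale)
        # Constraint: never exceed the maximum size determined in step 2.
        refined = min(refined, base_input_size)
        # Constraint: never fall below the minimum detection input size.
        return max(refined, MIN_DETECTION_INPUT_SIZE)

    # Example from the text: 64x64 objects in a 512x512 image -> a 256x256 input.
    print(refine_input_size(512, 64))  # 256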


.. Note::
Opting for a smaller input size can be advantageous for datasets with lower-resolution images or larger objects,
as it may improve speed with minimal impact on model accuracy. However, it is important to consider that selecting
a smaller input size could affect model performance depending on the task, model architecture, and dataset
properties.
8 changes: 7 additions & 1 deletion docs/source/guide/explanation/additional_features/hpo.rst
@@ -143,10 +143,16 @@ Here is explanation of all HPO configuration.

- **mode** (*str*, *default='max'*) - Optimization mode for the metric. It determines whether the metric should be maximized or minimized. The possible values are 'max' and 'min', respectively.

- **num_workers** (*int*, *default=1*) How many trials will be executed in parallel.
- **num_trials** (*int*, *default=None*) The number of training trials to perform during HPO. If not provided, the number of trials will be determined based on the expected time ratio. Defaults to None.

- **num_workers** (*int*, *default=None*) The number of trials that will be run concurrently.

- **expected_time_ratio** (*int*, *default=4*) How much time to spend on HPO, expressed as a multiple of the time for a single training run.

- **metric_name** (*str*, *default=None*) The name of the performance metric to be optimized during HPO. If not specified, the metric will be selected based on the configured callbacks. Defaults to None.

- **adapt_bs_search_space_max_val** (*Literal["None", "Safe", "Full"]*, *default="None"*) Whether to execute `Auto-adapt batch size` prior to HPO. This step finds the maximum batch size value, which then serves as the upper limit for the batch size search space during HPO. For further information on `Auto-adapt batch size`, please refer to the `Auto-configuration` documentation. Defaults to "None".

- **maximum_resource** (*int*, *default=None*) - Maximum number of training epochs for each trial. When the number of training epochs reaches this value, the trial stops training.

- **minimum_resource** (*int*, *default=None*) - Minimum number of training epochs for each trial. Each trial will run at least this many epochs, even if the performance of the model is not improving.
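
As a usage sketch, the options above can be bundled into an ``HpoConfig`` and passed to ``Engine.train``.
The import path and exact keyword names below are assumptions inferred from the fields listed on this page,
so double-check them against the API reference before use.

.. code-block:: python

    from otx.core.config.hpo import HpoConfig  # assumed import path
    from otx.engine import Engine

    hpo_config = HpoConfig(
        metric_name="val/accuracy",            # metric to optimize
        mode="max",                            # maximize the metric
        num_trials=10,                         # number of training trials
        expected_time_ratio=4,                 # HPO budget relative to a single training run
        maximum_resource=20,                   # upper bound on epochs per trial
        adapt_bs_search_space_max_val="Safe",  # run Auto-adapt batch size before HPO
    )

    engine = Engine(model="yolox_s", data_root="path/to/dataset")  # hypothetical model/data arguments
    engine.train(run_hpo=True, hpo_config=hpo_config)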
@@ -14,3 +14,4 @@ Additional Features
fast_data_loading
tiling
class_incremental_sampler
configurable_input_size

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/guide/explanation/algorithms/action/index.rst
@@ -6,4 +6,3 @@ Action Recognition


action_classification
action_detection
4 changes: 2 additions & 2 deletions docs/source/guide/get_started/cli_commands.rst
@@ -339,11 +339,11 @@ The results will be saved in ``./otx-workspace/`` folder by default. The output
    (otx) ...$ otx train --model <model-class-path-or-name> --task <task-type> --data_root <dataset-root>

For example, if you want to use the ``otx.algo.detection.atss.ATSS`` model class, you can train it as shown below.
For example, if you want to use the ``otx.algo.classification.torchvision_model.TVModelForMulticlassCls`` model class, you can train it as shown below.

.. code-block:: shell

    (otx) ...$ otx train --model otx.algo.detection.atss.ATSS --model.variant mobilenetv2 --task DETECTION ...
    (otx) ...$ otx train --model otx.algo.classification.torchvision_model.TVModelForMulticlassCls --model.backbone mobilenet_v3_small ...

.. note::

    You also can visualize the training using ``Tensorboard`` as these logs are located in ``<work_dir>/tensorboard``.