
How to convert a YOLO-NAS model to int8? #2598

Closed · Answered by alexsu52
MinGiSa asked this question in Q&A


Hi @MinGiSa,

NNCF does not support quantization of custom PyTorch modules with weights. For example, "yolo_nas_l" has an NDFLHeads module that calls the conv2d function with self.proj_conv in its forward method, which cannot be quantized automatically: https://github.com/Deci-AI/super-gradients/blob/7067736cb9062245aa4f118d91b03bf8de898ef7/src/super_gradients/training/models/detection_models/yolo_nas/dfl_heads.py#L210.
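
Roughly, the problematic pattern looks like the following minimal sketch (an illustrative stand-in, not the actual super-gradients code): the weight lives in a plain tensor attribute rather than in an nn.Conv2d module, so there is no module for NNCF to replace with a quantized counterpart.

    import torch
    import torch.nn.functional as F

    class HeadLike(torch.nn.Module):
        """Illustrative stand-in for the NDFLHeads projection."""

        def __init__(self, reg_max: int = 16):
            super().__init__()
            # The projection weight is a buffer, not an nn.Conv2d module.
            proj = torch.linspace(0, reg_max, reg_max + 1).view(1, reg_max + 1, 1, 1)
            self.register_buffer("proj_conv", proj)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # A functional conv2d call with an attribute weight; NNCF cannot
            # insert quantizers for this automatically.
            return F.conv2d(x, weight=self.proj_conv)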

Taking into account that this module is the head of the model, I would recommend ignoring it to preserve the accuracy of the quantized model:

    quantized_model = nncf.quantize(
        torchModel,
        calibrationDataset,
        preset=nncf.QuantizationPreset.MIXED,
        # Assumption: match the head's nodes by regex; adjust the pattern to
        # the node names NNCF reports for your traced model.
        ignored_scope=nncf.IgnoredScope(patterns=[".*NDFLHeads.*"]),
    )
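
For completeness, calibrationDataset in the snippet above is an nncf.Dataset built from a representative data loader. A minimal sketch, assuming a hypothetical loader of dummy 640×640 images (substitute a few hundred real detection images in practice):

    import nncf
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from super_gradients.training import models

    # FP32 model to be quantized ("coco" pretrained weights assumed here).
    torchModel = models.get("yolo_nas_l", pretrained_weights="coco").eval()

    # Placeholder calibration data; replace with real preprocessed images.
    images = torch.randn(8, 3, 640, 640)
    dataLoader = DataLoader(TensorDataset(images), batch_size=2)

    def transform_fn(batch):
        # TensorDataset yields one-element tuples; return just the image
        # tensor, which nncf.quantize feeds directly to the model.
        return batch[0]

    calibrationDataset = nncf.Dataset(dataLoader, transform_fn)

The quantized_model returned by nncf.quantize can then be exported, for example via torch.onnx.export or OpenVINO's ov.convert_model, to obtain an int8 artifact for deployment.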

Answer selected by MinGiSa