Conversation
if hasattr(module, 'weight_bit'):
    delattr(module, 'weight_bit')
if hasattr(module, 'activation_bit'):
    delattr(module, 'activation_bit')
How do you choose these attributes? Do different quantization algorithms have different subsets of these attributes? Is it possible that a new quantization algorithm has more attributes?
This function is only for QAT for now; other quantization algorithms might have other attributes, and we would need to define another function to handle them.
then it is better to put it in QAT_Quantizer
BTW, why only support export for QAT?
Because QAT is enough for linear quantization simulation. Binarized quantization like BNN needs no calibration, and users can save its weights as usual.
Could you refer to the model export implemented in pruners, and try to align the export feature of pruners and quantizers?
Have aligned the export_model() function in pruners and quantizers.
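For reference, a rough illustration of the aligned calls (the mask_path / calibration_path parameter names are assumptions based on the snippets quoted in this review, not a definitive API):

    # Hypothetical side-by-side of the aligned interfaces: the pruner saves a mask file,
    # the quantizer saves calibration parameters, everything else matches.
    pruner.export_model(model_path='pruned.pth', mask_path='mask.pth',
                        onnx_path='pruned.onnx', input_shape=(1, 1, 28, 28))
    quantizer.export_model(model_path='quantized.pth', calibration_path='calibration.pth',
                           onnx_path='quantized.onnx', input_shape=(1, 1, 28, 28))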
assert model_path is not None, 'model_path must be specified'
self._unwrap_model()
calibration_config = {}
support_op = [torch.nn.Conv2d, torch.nn.Linear, torch.nn.ReLU6]
Why ReLU6?
Because activation layers also need their bit width set in inference frameworks like TensorRT, we have to choose activation layers that we can support during calibration. I have tested ReLU6 in some examples and it is fully supported by TensorRT, but I am not sure whether other activation ops are supported.
what types of ops does QAT support in our current implementation?
Removed the support_op constraint in export_model().
layer.module.register_buffer('ema_decay', torch.Tensor([0.99]))
layer.module.register_buffer('tracked_min_biased', torch.zeros(1))
layer.module.register_buffer('tracked_min', torch.zeros(1))
layer.module.register_buffer('tracked_max_biased', torch.zeros(1))
layer.module.register_buffer('tracked_max', torch.zeros(1))

def del_simulated_attr(self, module):
-> _del_simulated_attr
Modified.
if hasattr(module, 'old_weight'):
    delattr(module, 'old_weight')
if hasattr(module, 'ema_decay'):
    delattr(module, 'ema_decay')
if hasattr(module, 'tracked_min_biased'):
    delattr(module, 'tracked_min_biased')
if hasattr(module, 'tracked_max_biased'):
    delattr(module, 'tracked_max_biased')
if hasattr(module, 'tracked_min'):
    delattr(module, 'tracked_min')
if hasattr(module, 'tracked_max'):
    delattr(module, 'tracked_max')
if hasattr(module, 'scale'):
    delattr(module, 'scale')
if hasattr(module, 'zero_point'):
    delattr(module, 'zero_point')
if hasattr(module, 'weight_bit'):
    delattr(module, 'weight_bit')
if hasattr(module, 'activation_bit'):
    delattr(module, 'activation_bit')
Suggest using a for loop:

to_del = ['old_weight', 'ema_decay', ...]
for each in to_del:
    if hasattr(module, each):
        delattr(module, each)
Good point! Modified.
device = torch.device('cpu')
input_data = torch.Tensor(*input_shape)
torch.onnx.export(self.bound_model, input_data.to(device), onnx_path)
logger.info('Model in onnx with input shape %s saved to %s', input_data.shape, onnx_path)
Have you tested the ONNX export? It would also be better to write a test for this feature.
Has tested the export_model function, including PyTorch state_dict() and ONNX export, in three algorithms.
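A minimal sketch of what such a test could check (file names, input shape, and assertions are illustrative only, not the actual unit test):

    import os
    import torch

    def check_export(quantizer, tmp_dir):
        # Hypothetical helper: export all three artifacts, then verify the files exist
        # and the saved state_dict loads back.
        model_path = os.path.join(tmp_dir, 'model.pth')
        calib_path = os.path.join(tmp_dir, 'calibration.pth')
        onnx_path = os.path.join(tmp_dir, 'model.onnx')
        quantizer.export_model(model_path, calib_path, onnx_path, input_shape=(1, 1, 28, 28))
        assert os.path.exists(model_path) and os.path.exists(onnx_path)
        assert len(torch.load(model_path)) > 0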
if "weight" in config.get("quant_types", []): | ||
layer.module.register_buffer('weight_bit', torch.zeros(1)) | ||
|
||
def del_simulated_attr(self, module): |
-> _del_simulated_attr
Modified.
torch.save(self.bound_model.state_dict(), model_path)
logger.info('Model state_dict saved to %s', model_path)
if calibration_path is not None:
    logger.info('No calibration config will be saved because no calibration data in BNN quantizer')
I think we should export the bit number even if they are all 1 bit, because the speedup module needs this information to use 1 bit; the speedup module does not know you are using BNNQuantizer.
Makes sense. Have added.
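For illustration, the exported calibration_config for a binarized model could then look roughly like this (layer names are hypothetical):

    # Hypothetical result: every layer binarized by BNNQuantizer records weight_bit = 1,
    # so the speedup module knows the precision without knowing which quantizer was used.
    calibration_config = {
        'conv1': {'weight_bit': 1},
        'fc1': {'weight_bit': 1},
    }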
for name, module in self.bound_model.named_modules():
    if hasattr(module, 'weight_bit'):
        calibration_config[name] = {}
        calibration_config[name]['weight_bit'] = int(module.weight_bit)
so this quantizer does not calibrate activation?
In our current implementation of Dorefa, it does not quantize activations, so we don't need to calibrate them.
Could you double-check the paper? Does it mention how to calibrate activations?
After discussion, we reached an agreement that the refactor of Dorefa should start after a survey and will be done in another PR. What's more, a unit test related to export_model() has been added to the code.
device = torch.device('cpu')
input_data = torch.Tensor(*input_shape)
torch.onnx.export(self.bound_model, input_data.to(device), onnx_path)
logger.info('Model in onnx with input shape %s saved to %s', input_data.shape, onnx_path)
Since other implementations return a dict, would it be better to add return {} here? Also, all export_model() implementations seem to have mostly the same logic; can we use the implementation in QAT_Quantizer for all Quantizers, or just specify how to construct calibration_config in each Quantizer?
Useful suggestions! Have modified.
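A rough sketch of the shared implementation this suggests (the export_calibration_config hook name is an assumption for illustration; bound_model and _unwrap_model come from the existing base class):

    import torch

    class Quantizer:
        # Partial sketch of the base class: only the shared export path is shown.
        def export_model(self, model_path, calibration_path=None, onnx_path=None,
                         input_shape=None, device=None):
            assert model_path is not None, 'model_path must be specified'
            self._unwrap_model()
            calibration_config = self.export_calibration_config()  # per-algorithm hook (assumed name)
            torch.save(self.bound_model.state_dict(), model_path)
            if calibration_path is not None and calibration_config:
                torch.save(calibration_config, calibration_path)
            if onnx_path is not None:
                assert input_shape is not None, 'input_shape is needed to trace the model for onnx'
                input_data = torch.Tensor(*input_shape)
                torch.onnx.export(self.bound_model, input_data.to(device or torch.device('cpu')), onnx_path)
            return calibration_config

        def export_calibration_config(self):
            # Default: nothing to calibrate; QAT_Quantizer, DoReFaQuantizer, BNNQuantizer override this.
            return {}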
for config, quantize_algorithm in zip(config_set, quantize_algorithm_set):
    model = TorchModel()
    model.relu = torch.nn.ReLU()
what is this line used for?
This follows the code of test_torch_QAT_quantizer() in the unit tests. Changing the relu op type to ReLU makes it match the type in config_list.
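For illustration, the kind of config entry being matched might look roughly like this (bit widths and op list are assumptions modelled on the QAT examples, not the actual test config):

    # Hypothetical config_list entry: layers are selected by op type, so a module
    # created with torch.nn.ReLU() matches 'ReLU' in op_types.
    config_list = [{
        'quant_types': ['weight', 'output'],
        'quant_bits': {'weight': 8, 'output': 8},
        'op_types': ['Conv2d', 'Linear', 'ReLU'],
    }]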
@linbinskn, looks good! Please update the doc accordingly.
Add an export_model function for the quantization algorithm QAT. Users can save quantized weights and calibration parameters to a specific path. What's more, this function will be a prerequisite for #3356 (support mixed-precision quantization speed-up using TensorRT).
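A minimal usage sketch of the feature described above (paths and input shape are illustrative; the constructor arguments follow the usual QAT examples and may differ slightly):

    # Hypothetical end-to-end usage: wrap the model, fine-tune with QAT, then export
    # quantized weights, calibration parameters, and an optional ONNX model.
    quantizer = QAT_Quantizer(model, config_list, optimizer)
    quantizer.compress()
    # ... QAT fine-tuning loop runs here ...
    calibration_config = quantizer.export_model(
        model_path='qat_model.pth',              # quantized weights (state_dict)
        calibration_path='qat_calibration.pth',  # scale / zero_point / tracked min-max per layer
        onnx_path='qat_model.onnx',              # optional ONNX export
        input_shape=(1, 1, 28, 28),              # needed to trace the model for ONNX
    )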