Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PTQ][OV] FP8 implementation #2283

Merged

Conversation

nikita-malininn
Copy link
Collaborator

@nikita-malininn nikita-malininn commented Nov 22, 2023

Changes

  • Added FP8 implementation
  • Added Mode parameter

Reason for changes

  • New FP8 implementation

Related tickets

  • 119805

Tests

  • tests/openvino/native/quantization/test_graphs.py
  • tests/openvino/native/test_model_transformer.py

On top of openvinotoolkit/openvino#21034 - Merged

@github-actions github-actions bot added NNCF PT Pull requests that updates NNCF PyTorch experimental NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF ONNX Pull requests that updates NNCF ONNX NNCF PTQ Pull requests that updates NNCF PTQ labels Nov 22, 2023
@openvino-nncf-ci openvino-nncf-ci added the API Public API-impacting changes label Nov 22, 2023
Copy link

codecov bot commented Nov 22, 2023

Codecov Report

Merging #2283 (d6dedc2) into develop (0c389c3) will decrease coverage by 0.15%.
The diff coverage is 67.04%.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #2283      +/-   ##
===========================================
- Coverage    90.74%   90.60%   -0.15%     
===========================================
  Files          489      489              
  Lines        43975    44196     +221     
===========================================
+ Hits         39906    40044     +138     
- Misses        4069     4152      +83     
Files Coverage Δ
nncf/__init__.py 97.14% <100.00%> (+0.08%) ⬆️
nncf/common/hardware/config.py 94.40% <100.00%> (ø)
nncf/common/quantization/initialization/range.py 95.52% <100.00%> (ø)
...common/quantization/quantizer_propagation/graph.py 89.36% <100.00%> (ø)
...ommon/quantization/quantizer_propagation/solver.py 93.64% <100.00%> (ø)
nncf/common/quantization/quantizer_setup.py 91.42% <100.00%> (ø)
nncf/common/quantization/structs.py 96.12% <100.00%> (ø)
nncf/openvino/graph/metatypes/groups.py 100.00% <100.00%> (ø)
...ncf/openvino/graph/metatypes/openvino_metatypes.py 99.43% <100.00%> (+<0.01%) ⬆️
nncf/openvino/quantization/backend_parameters.py 100.00% <100.00%> (ø)
... and 28 more

... and 2 files with indirect coverage changes

Flag Coverage Δ
COMMON 45.07% <53.90%> (+0.07%) ⬆️
ONNX 33.88% <31.06%> (-0.05%) ⬇️
OPENVINO 38.79% <46.59%> (-0.04%) ⬇️
TENSORFLOW 29.84% <24.62%> (-0.07%) ⬇️
TORCH 62.61% <29.92%> (-0.12%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
common 93.35% <100.00%> (ø)
torch 93.22% <84.61%> (-0.02%) ⬇️
tensorflow 94.01% <87.50%> (-0.02%) ⬇️
onnx 96.91% <84.61%> (-0.09%) ⬇️
openvino 90.84% <48.57%> (-1.19%) ⬇️
ptq 88.37% <76.66%> (-0.34%) ⬇️

@nikita-malininn nikita-malininn marked this pull request as ready for review November 23, 2023 14:45
@nikita-malininn nikita-malininn requested a review from a team as a code owner November 23, 2023 14:45
@l-bat
Copy link
Collaborator

l-bat commented Nov 24, 2023

@KodiaqQ Are you planning to add a new mode parameter to the documentation?

@nikita-malininn
Copy link
Collaborator Author

@KodiaqQ Are you planning to add a new mode parameter to the documentation?

Good catch, thanks. I'll add it.
But first, we need to discuss this parameter first.

Copy link
Contributor

@alexsu52 alexsu52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some general comments:

  • FBC and BC support FakeConvert operations?
  • What test plan do you propose for FP8 quantization?

tests/openvino/tools/calibrate.py Outdated Show resolved Hide resolved
nncf/quantization/fake_quantize.py Outdated Show resolved Hide resolved
@github-actions github-actions bot removed the documentation Improvements or additions to documentation label Dec 7, 2023
:return: Parameters of the FakeConvert layer.
"""

destination_type_maximum = {"HF8": 448, "BF8": 57344}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@alexsu52 alexsu52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this PR requires additional validation. I would suggest to run ptq conformance test and DLB calibrate test on small model scope.

nncf/quantization/algorithms/min_max/algorithm.py Outdated Show resolved Hide resolved
@alexsu52
Copy link
Contributor

The code ov_quantized_model = nncf.quantize(ov_model, calibration_dataset, mode=nncf.QuantizationMode.FP8_E5M2) produces the following warnings:
image

What do you think about printing only warning about the fact that FP8 is experimental option and raise runtime error if user specified parameters is not compatible with FP8 mode?

@nikita-malininn
Copy link
Collaborator Author

nikita-malininn commented Dec 12, 2023

The code ov_quantized_model = nncf.quantize(ov_model, calibration_dataset, mode=nncf.QuantizationMode.FP8_E5M2) produces the following warnings: image

What do you think about printing only warning about the fact that FP8 is experimental option and raise runtime error if user specified parameters is not compatible with FP8 mode?

Ok, I'll remove these warnings and add RuntimeError instead of the redefining provided options.
upd., Removed warnings & added errors.
upd 2., As a result, we must define almost all MinMax options properly to run optimization with any mode.

@nikita-malininn
Copy link
Collaborator Author

@alexsu52, please, review.

Copy link
Contributor

@alexsu52 alexsu52 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @KodiaqQ please merge it after passing additional validation.

@@ -154,6 +169,7 @@ def get_start_nodes_for_activation_path_tracing(nncf_graph: NNCFGraph) -> List[N

:param nncf_graph: NNCFGraph to get the start nodes.
:return: List of NNCFNodes to use as start nodes for activation path tracing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo

@nikita-malininn
Copy link
Collaborator Author

LGTM, @KodiaqQ please merge it after passing additional validation.

Conformance validation run - 229 (manual).
Observed errors were fixed with the latest commit (for per-channel activations aka depth-wise layers).
Is it enough, @alexsu52?

@nikita-malininn nikita-malininn merged commit 5f2c20e into openvinotoolkit:develop Dec 15, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API Public API-impacting changes experimental NNCF Common Pull request that updates NNCF Common NNCF ONNX Pull requests that updates NNCF ONNX NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF PT Pull requests that updates NNCF PyTorch NNCF PTQ Pull requests that updates NNCF PTQ NNCF TF Pull requests that updates NNCF TensorFlow
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants