Fix WOQ int8 unpack weight #1393

changwangss · 2024-03-19T07:42:35Z

Type of Change

Fixes the remaining int8 issue from #1375.

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior that is triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

github-actions · 2024-03-19T07:43:26Z

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow

Check ID	Status
format-scan (pylint)	success	✅
format-scan (bandit)	success	✅
format-scan (cloc)	success	✅
format-scan (cpplint)	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Optimize Unit Test workflow

Check ID	Status
optimize-unit-test-baseline	success	✅
optimize-unit-test-PR-test	success	✅
Genreate-OptimizeUT-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, tests/CI/test_quantization.py, tests/CI/test_weight_only.py.

🟢 NeuralChat Unit Test

Check ID	Status
neuralchat-unit-test-baseline	success	✅
neuralchat-unit-test-PR-test	success	✅
Generate-NeuralChat-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Engine Unit Test workflow

Check ID	Status
engine-unit-test-baseline	success	✅
engine-unit-test-PR-test	success	✅
Genreate-Engine-Report	success	✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Chat Bot Test workflow

Check ID	Status	Error details
call-inference-llama-2-7b-chat-hf / inference test	success		✅
call-inference-mpt-7b-chat / inference test	success		✅

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

Thank you for your contribution! 💜

Note
This comment is automatically generated and refreshes every 180 seconds for 360 minutes. If you have any other questions, contact VincyZhang or XuehaoSun for help.

Signed-off-by: changwangss <chang1.wang@intel.com>

Signed-off-by: Wang, Chang <chang1.wang@intel.com>

changwangss requested a review from PenghuiCheng as a code owner March 19, 2024 07:42

fix int8 unpack weight

a1d95dc

Signed-off-by: changwangss <chang1.wang@intel.com>

changwangss force-pushed the wangchang/int8 branch from a2d9b86 to a1d95dc Compare March 19, 2024 09:12

changwangss added 2 commits March 19, 2024 03:42

fix reshape issue

6702e45

Signed-off-by: changwangss <chang1.wang@intel.com>

Update test_weight_only.py

92786b1

Signed-off-by: Wang, Chang <chang1.wang@intel.com>

kevinintel requested a review from a32543254 March 19, 2024 13:31

kevinintel approved these changes Mar 19, 2024

VincyZhang merged commit edede40 into main Mar 20, 2024
18 checks passed

VincyZhang deleted the wangchang/int8 branch March 20, 2024 00:54
