This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Fix WOQ int8 unpack weight #1393

Merged
merged 3 commits into from
Mar 20, 2024

@changwangss changwangss commented Mar 19, 2024

Type of Change

Fixes the remaining int8 issue from #1375.

Description

detail description
JIRA ticket: xxx

Expected Behavior & Potential Risk

the expected behavior that is triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

github-actions bot commented Mar 19, 2024

⚡ Required checks status: All passing 🟢

Groups summary

🟢 Format Scan Tests workflow
| Check ID | Status | Error details |
| --- | --- | --- |
| format-scan (pylint) | success | |
| format-scan (bandit) | success | |
| format-scan (cloc) | success | |
| format-scan (cpplint) | success | |

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Optimize Unit Test workflow
| Check ID | Status | Error details |
| --- | --- | --- |
| optimize-unit-test-baseline | success | |
| optimize-unit-test-PR-test | success | |
| Genreate-OptimizeUT-Report | success | |

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py, tests/CI/test_quantization.py, tests/CI/test_weight_only.py.

🟢 NeuralChat Unit Test
| Check ID | Status | Error details |
| --- | --- | --- |
| neuralchat-unit-test-baseline | success | |
| neuralchat-unit-test-PR-test | success | |
| Generate-NeuralChat-Report | success | |

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Engine Unit Test workflow
| Check ID | Status | Error details |
| --- | --- | --- |
| engine-unit-test-baseline | success | |
| engine-unit-test-PR-test | success | |
| Genreate-Engine-Report | success | |

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.

🟢 Chat Bot Test workflow
| Check ID | Status | Error details |
| --- | --- | --- |
| call-inference-llama-2-7b-chat-hf / inference test | success | |
| call-inference-mpt-7b-chat / inference test | success | |

These checks are required after the changes to intel_extension_for_transformers/transformers/llm/quantization/nn/modules.py, intel_extension_for_transformers/transformers/llm/quantization/utils.py, intel_extension_for_transformers/transformers/modeling/modeling_auto.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and is refreshed every 180 seconds for 360 minutes. If you have any other questions, contact VincyZhang or XuehaoSun for help.

Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: changwangss <chang1.wang@intel.com>
Signed-off-by: Wang, Chang <chang1.wang@intel.com>
@kevinintel kevinintel requested a review from a32543254 March 19, 2024 13:31
@VincyZhang VincyZhang merged commit edede40 into main Mar 20, 2024
18 checks passed
@VincyZhang VincyZhang deleted the wangchang/int8 branch March 20, 2024 00:54
3 participants