Enable text-generation with new API #318

changwangss · 2023-09-14T11:00:54Z

Type of Change

    from transformers import AutoTokenizer
    from intel_extension_for_transformers.transformers import (
        AutoModelForCausalLM,
        BitsAndBytesConfig,
        MixedPrecisionConfig,
        SmoothQuantConfig,
        WeightOnlyQuantConfig,
    )

    # model_name_or_path is a placeholder: a Hugging Face model ID or a local checkpoint path.
    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path)

    # SmoothQuant: pass one of the two, tokenizer or calib_func
    sq_config = SmoothQuantConfig(tokenizer=tokenizer)
    q_model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, quantization_config=sq_config
    )

    # Weight-only quantization
    woq_config = WeightOnlyQuantConfig()
    woq_model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, quantization_config=woq_config
    )

    # Mixed precision
    mp_config = MixedPrecisionConfig()
    amp_model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, quantization_config=mp_config
    )

    # bitsandbytes
    bab_config = BitsAndBytesConfig()
    bab_model = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, quantization_config=bab_config
    )
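The snippet above passes four different config objects through the same `quantization_config` argument. As a rough illustration of that pattern only, here is a minimal, self-contained sketch of how a loader might pick a code path from the config's type. Every name in it (`SmoothQuantSketch`, `WeightOnlySketch`, `MixedPrecisionSketch`, `select_path`) is hypothetical and is not taken from this PR or from the library. The rule that SmoothQuant needs at least one of `tokenizer` or `calib_func` is an assumption based on the comment in the snippet.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class SmoothQuantSketch:
    """Hypothetical stand-in for a smooth-quant config (not the real class)."""

    tokenizer: Optional[Any] = None
    calib_func: Optional[Callable[[Any], None]] = None

    def __post_init__(self) -> None:
        # Assumption: calibration needs some data source, so at least one
        # of tokenizer or calib_func has to be supplied.
        if self.tokenizer is None and self.calib_func is None:
            raise ValueError("provide either tokenizer or calib_func")


@dataclass
class WeightOnlySketch:
    """Hypothetical stand-in for a weight-only quantization config."""


@dataclass
class MixedPrecisionSketch:
    """Hypothetical stand-in for a mixed-precision config."""


def select_path(quantization_config: Optional[Any] = None) -> str:
    """Return the name of the code path chosen for the given config object."""
    if quantization_config is None:
        return "float"  # no config: load the model unchanged
    if isinstance(quantization_config, SmoothQuantSketch):
        return "smooth-quant"
    if isinstance(quantization_config, WeightOnlySketch):
        return "weight-only"
    if isinstance(quantization_config, MixedPrecisionSketch):
        return "mixed-precision"
    raise TypeError(
        f"unsupported quantization_config: {type(quantization_config).__name__}"
    )
```

With this shape, the call site stays identical for every mode and only the config object changes, which is the property the usage above relies on.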

Description

detail description
JIRA ticket: https://jira.devtools.intel.com/browse/NLPTOOLKIU-878

Expected Behavior & Potential Risk

the expected behavior triggered by this PR

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

examples/huggingface/pytorch/text-generation/quantization/run_generation.py

intel_extension_for_transformers/neural_chat/config.py

intel_extension_for_transformers/transformers/utils/quantization_config.py

changwangss · 2023-09-15T11:11:58Z

Conflicts with #297; need to align with @PenghuiCheng.

hshen14 · 2023-09-15T15:02:01Z

> Conflicts with #297; need to align with @PenghuiCheng.

What's the conflict?

Signed-off-by: changwangss <chang1.wang@intel.com>

intel_extension_for_transformers/transformers/utils/utility.py

Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>

examples/huggingface/pytorch/text-generation/quantization/README.md

changwangss added the draft label Sep 14, 2023

hshen14 requested review from XinyuYe-Intel and lvliang-intel September 14, 2023 11:05

hshen14 reviewed Sep 14, 2023

examples/huggingface/pytorch/text-generation/quantization/run_generation.py (outdated, resolved)

hshen14 reviewed Sep 14, 2023

intel_extension_for_transformers/neural_chat/config.py (outdated, resolved)

hshen14 approved these changes Sep 14, 2023

kevinintel approved these changes Sep 14, 2023

changwangss changed the title ~~Enable text-generation with NeuralChat API~~ Enable text-generation with new API Sep 15, 2023

XinyuYe-Intel approved these changes Sep 15, 2023

ftian1 reviewed Sep 15, 2023

examples/huggingface/pytorch/text-generation/quantization/run_generation.py (resolved)

ftian1 reviewed Sep 15, 2023

intel_extension_for_transformers/transformers/utils/quantization_config.py (resolved)

hshen14 approved these changes Sep 15, 2023

VincyZhang added the ITREX v1.2 label Sep 15, 2023

changwangss requested a review from PenghuiCheng as a code owner September 15, 2023 11:04

changwangss added 7 commits September 17, 2023 21:03

enable text-generation with NeuralChat API

6068d06

Signed-off-by: changwangss <chang1.wang@intel.com>

fix wrong typing and hide import

fccc16a

Signed-off-by: changwangss <chang1.wang@intel.com>

improve import check

4e989a5

rebase main

40d20eb

Signed-off-by: changwangss <chang1.wang@intel.com>

remove the outdated code

2dab421

Signed-off-by: changwangss <chang1.wang@intel.com>

update order

7ba2aed

improve sqconfig and add ut

205f8ec

Signed-off-by: changwangss <chang1.wang@intel.com>

changwangss force-pushed the wangchang/neuralchat branch from e8757be to 205f8ec September 18, 2023 04:05