
[LoftQConfig + LoraConfig] throws size matmul mismatch error #1240

Closed
3 of 4 tasks
SoundProvider opened this issue Dec 8, 2023 · 8 comments
SoundProvider commented Dec 8, 2023

System Info

  • docker image: pytorch/pytorch:2.1.1-cuda12.1-cudnn8-devel
  • pip list
Package                 Version
----------------------- ------------
accelerate              0.25.0
aiohttp                 3.9.1
aiosignal               1.3.1
asttokens               2.0.5
astunparse              1.6.3
async-timeout           4.0.3
attrs                   23.1.0
backcall                0.2.0
beautifulsoup4          4.12.2
blessed                 1.20.0
boltons                 23.0.0
Brotli                  1.0.9
certifi                 2023.7.22
cffi                    1.15.1
chardet                 4.0.0
charset-normalizer      2.0.4
click                   8.1.7
comm                    0.2.0
conda                   23.9.0
conda-build             3.27.0
conda-content-trust     0.2.0
conda_index             0.3.0
conda-libmamba-solver   23.7.0
conda-package-handling  2.2.0
conda_package_streaming 0.9.0
cryptography            41.0.3
datasets                2.15.0
debugpy                 1.8.0
decorator               5.1.1
dill                    0.3.7
dnspython               2.4.2
evaluate                0.4.1
exceptiongroup          1.0.4
executing               0.8.3
expecttest              0.1.6
filelock                3.9.0
frozenlist              1.4.0
fsspec                  2023.10.0
gmpy2                   2.1.2
gpustat                 1.1.1
huggingface-hub         0.19.4
hypothesis              6.88.4
idna                    3.4
ipykernel               6.27.1
ipython                 8.15.0
jedi                    0.18.1
Jinja2                  3.1.2
joblib                  1.3.2
jsonpatch               1.32
jsonpointer             2.1
jupyter_client          8.6.0
jupyter_core            5.5.0
libarchive-c            2.9
libmambapy              1.5.1
MarkupSafe              2.1.1
matplotlib-inline       0.1.6
mkl-fft                 1.3.8
mkl-random              1.2.4
mkl-service             2.4.0
more-itertools          8.12.0
mpmath                  1.3.0
multidict               6.0.4
multiprocess            0.70.15
nest-asyncio            1.5.8
networkx                3.1
numpy                   1.26.0
nvidia-ml-py            12.535.133
packaging               23.1
pandas                  2.1.3
parso                   0.8.3
peft                    0.7.0
pexpect                 4.8.0
pickleshare             0.7.5
Pillow                  10.0.1
pip                     23.3
pkginfo                 1.9.6
platformdirs            4.1.0
pluggy                  1.0.0
prompt-toolkit          3.0.36
protobuf                4.25.1
psutil                  5.9.0
ptyprocess              0.7.0
pure-eval               0.2.2
pyarrow                 14.0.1
pyarrow-hotfix          0.6
pycosat                 0.6.6
pycparser               2.21
Pygments                2.15.1
pynvml                  11.5.0
pyOpenSSL               23.2.0
PySocks                 1.7.1
python-dateutil         2.8.2
python-etcd             0.4.5
pytz                    2023.3.post1
PyYAML                  6.0.1
pyzmq                   25.1.2
regex                   2023.10.3
requests                2.31.0
responses               0.18.0
ruamel.yaml             0.17.21
ruamel.yaml.clib        0.2.6
safetensors             0.4.1
scikit-learn            1.3.2
scipy                   1.11.4
sentencepiece           0.1.99
setuptools              68.0.0
six                     1.16.0
sortedcontainers        2.4.0
soupsieve               2.5
stack-data              0.2.0
sympy                   1.11.1
threadpoolctl           3.2.0
tokenizers              0.15.0
tomli                   2.0.1
toolz                   0.12.0
torch                   2.1.1
torchaudio              2.1.1
torchelastic            0.2.2
torchvision             0.16.1
tornado                 6.4
tqdm                    4.65.0
traitlets               5.7.1
transformers            4.35.2
triton                  2.1.0
truststore              0.8.0
types-dataclasses       0.6.6
typing_extensions       4.7.1
tzdata                  2023.3
urllib3                 1.26.18
wcwidth                 0.2.5
wheel                   0.41.2
xxhash                  3.4.1
yarl                    1.9.4
zstandard               0.19.0

Who can help?

@pac

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

I'm testing PEFT Lora Initialization options.

from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained(...)  # don't quantize here
loftq_config = LoftQConfig(loftq_bits=4, ...)           # set 4bit quantization
lora_config = LoraConfig(..., init_lora_weights="loftq", loftq_config=loftq_config)
peft_model = get_peft_model(base_model, lora_config)

The script I'm testing is the official Hugging Face run_clm.py script, with only the LoRA config section below added. Nothing else was added to or removed from the original file.

embedding_size = model.get_input_embeddings().weight.shape[0]
if len(tokenizer) > embedding_size:
    model.resize_token_embeddings(len(tokenizer))

##################################################
if model_args.is_lora:
    print("[*] loading lora config")
    loftq_config = LoftQConfig(loftq_bits=8)     
    lora_config = LoraConfig(
        task_type = TaskType.CAUSAL_LM, # TaskType: CAUSAL_LM, SEQ_CLS,,,
        inference_mode = False,
        # r = 4, 
        # lora_alpha = 8, 
        r = 8,
        lora_alpha = 16,
        lora_dropout = 0.1,
        target_modules = [ # for EXAONE v2.0
            "c_attn",
            "c_proj",
            "c_fc"
            # "out_proj",
            # "c_fc_0",
            # "c_fc_1",
            # "c_proj",
        ],
        init_lora_weights="loftq",
        loftq_config=loftq_config
    )
    
    # lora_config = AdaLoraConfig(
    #     peft_type="ADALORA",
    #     task_type="CAUSAL_LM",
    #     r=8,
    #     lora_alpha=16,
    #     target_modules=[
    #         "c_attn", 
    #         "c_proj", 
    #         "c_fc"
    #     ],
    #     lora_dropout=0.1,
    #     )
    
    print("[*] lora_config")
    print(lora_config)
    
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()
##################################################

# Preprocessing the datasets.
# First we tokenize all the texts.
if training_args.do_train:
    column_names = list(raw_datasets["train"].features)
else:
    column_names = list(raw_datasets["validation"].features)
text_column_name = "text" if "text" in column_names else column_names[0]

Here is the command I use to run it:

python -u run_clm_lora.py \
    --model_name_or_path gpt2 \
    --dataset_name wikitext   \
    --dataset_config_name wikitext-2-raw-v1 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --do_train \
    --output_dir /tmp/test-clm \
    --num_train_epochs 5 \
    --overwrite_output_dir \
    --trust_remote_code False \
    --is_lora True

Expected behavior

I expected LoftQ initialization to succeed.

BenjaminBossan (Member)

Thanks for reporting. Do you get an error like this?

quantized_weight, max_abs, shape = quantizer.quantize_block(res)

UnboundLocalError: local variable 'quantizer' referenced before assignment

This is because of the bug mentioned here:

#1150 (comment)

If I change this line:

if not is_bnb_4bit_available():

to:

if not is_bnb_4bit_available() or num_bits == 8:

I get some progress with your example, but unfortunately encounter another issue:

return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x768 and 2304x8)
Uncaught exception. Entering post mortem debugging
Running 'cont' or 'step' will restart the program
~/anaconda3/envs/peft/lib/python3.10/site-packages/torch/nn/modules/linear.py(114)forward()
-> return F.linear(input, self.weight, self.bias)

A similar thing happens when I try to use 4-bit instead of 8-bit (remember to send the model to cuda).

Interestingly, it works for me when using a different architecture (bloomz-560m), both with 4-bit and 8-bit (when applying the fix above). Therefore, I suspect it's somehow related to the model architecture (we had some issues with gpt2 in the past).

Ping @yxli2123

yxli2123 (Contributor) commented Dec 8, 2023

Hi, gpt2 uses Conv1D (from transformers) instead of nn.Linear(). I'm not sure if this is the reason. We haven't tested it on GPT-2 yet.
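
For illustration, a minimal check (not from the thread, assuming a stock gpt2 checkpoint): transformers' Conv1D stores its weight as (in_features, out_features), the transpose of nn.Linear's (out_features, in_features) layout, which is consistent with the (1024x768 and 2304x8) mismatch above if the LoftQ/LoRA path assumes nn.Linear's layout.

# Minimal sketch: transformers' Conv1D keeps its weight transposed relative to nn.Linear.
import torch.nn as nn
from transformers import AutoModelForCausalLM
from transformers.pytorch_utils import Conv1D

model = AutoModelForCausalLM.from_pretrained("gpt2")
c_attn = model.transformer.h[0].attn.c_attn

print(isinstance(c_attn, Conv1D))          # True
print(c_attn.weight.shape)                 # torch.Size([768, 2304]) -- (in, out)
print(nn.Linear(768, 2304).weight.shape)   # torch.Size([2304, 768]) -- (out, in)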

SoundProvider (Author)

Hi, gpt2 uses Conv1D (from transformers) instead of nn.Linear(). I'm not sure if this is the reason. We haven't tested it on GPT-2 yet.

I guess this is the reason. I tested on bloomz-560m as @BenjaminBossan mentioned and it worked just fine!
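
For reference, a minimal sketch of the working bloomz-560m setup described above; the target modules and hyperparameters here are illustrative assumptions rather than values from the thread, and the 8-bit path assumes the "or num_bits == 8" fix mentioned earlier is applied.

# Sketch of the working non-GPT-2 case; module names and hyperparameters are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("bigscience/bloomz-560m")  # unquantized
loftq_config = LoftQConfig(loftq_bits=8)  # 8-bit requires the fix mentioned above
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query_key_value", "dense", "dense_h_to_4h", "dense_4h_to_h"],
    init_lora_weights="loftq",
    loftq_config=loftq_config,
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()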

BenjaminBossan (Member)

I think the question has been answered; if something new comes up, feel free to re-open.

adampauls commented Jan 9, 2024

I get a similar error on meta-llama/Llama-2-7b-chat-hf. My LoraConfig is

LoraConfig(peft_type=<PeftType.LORA: 'LORA'>, auto_mapping=None, base_model_name_or_path=None, revision=None, task_type=None, inference_mode=False, r=8, target_modules=['o_proj', 'up_proj', 'gate_proj', 'k_proj', 'down_proj', 'q_proj', 'v_proj'], lora_alpha=8, lora_dropout=0.0, fan_in_fan_out=False, bias='none', modules_to_save=None, init_lora_weights='loftq', layers_to_transform=None, layers_pattern=None, rank_pattern={}, alpha_pattern={}, megatron_config=None, megatron_core='megatron.core', loftq_config={'loftq_bits': 4, 'loftq_iter': 1})

and my quantization config is

BitsAndBytesConfig {
  "bnb_4bit_compute_dtype": "bfloat16",
  "bnb_4bit_quant_type": "nf4",
  "bnb_4bit_use_double_quant": true,
  "llm_int8_enable_fp32_cpu_offload": false,
  "llm_int8_has_fp16_weight": false,
  "llm_int8_skip_modules": null,
  "llm_int8_threshold": 6.0,
  "load_in_4bit": true,
  "load_in_8bit": false,
  "quant_method": "bitsandbytes"
}

The error is:

File "/home/nonroot/precog/.venv/lib/python3.11/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (24576x4096 and 1x1)

Has it been tested with Llama 2? I would have assumed so. I'm on peft 0.7.1 and transformers 4.36.2.

adampauls

It's also happening with 01-ai/Yi-6B-Chat, so perhaps this is something I'm doing wrong. Any ideas what it could be?

adampauls

I see the issue. I was quantizing the model at load time with AutoModelForCausalLM.from_pretrained(quantization_config=quantization_config), but LoftQ has to take the unquantized model and quantize it itself. It might be worth adding a warning to make this clear to the user.
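
A minimal sketch of the distinction, reusing the checkpoint and target modules from the LoraConfig quoted above (everything else here is assumed):

from transformers import AutoModelForCausalLM
from peft import LoftQConfig, LoraConfig, get_peft_model

model_id = "meta-llama/Llama-2-7b-chat-hf"

# What fails with LoftQ: loading an already-quantized model, e.g.
#   AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config)
# LoftQ needs the full-precision weights so it can quantize them itself.

base_model = AutoModelForCausalLM.from_pretrained(model_id)  # no quantization_config
lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    init_lora_weights="loftq",
    loftq_config=LoftQConfig(loftq_bits=4, loftq_iter=1),
)
peft_model = get_peft_model(base_model, lora_config)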

BenjaminBossan (Member)

It might be worth adding a warning to make this clear to the user.

We have documented this here, here, and here. Is there anywhere else you looked where this info could be added?
