mplug-owl3-7b-chat fine-tuning document #1969
Image fine-tuning: the format of the custom dataset is as follows (single image, multiple images, and no image):
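The original format block was not preserved in this copy. A minimal sketch of what such a JSONL file typically looks like for ms-swift custom datasets (the field names and file contents here are illustrative assumptions; verify against the swift documentation):

```jsonl
{"query": "Describe the image.", "response": "A cat sitting on a sofa.", "images": ["cat.jpg"]}
{"query": "What changed between the two images?", "response": "The lamp was turned on.", "images": ["before.jpg", "after.jpg"]}
{"query": "What is the capital of France?", "response": "Paris."}
```

Each line is one training sample; the `images` field is omitted for text-only samples.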
Fine-tuning script:
If you want to use a custom dataset, simply specify it as follows:
--dataset train.jsonl \
--val_dataset val.jsonl \
Here is the inference script for after fine-tuning; we run inference on the automatically split validation set:
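The inference command itself was lost from this copy. A hedged sketch of the usual ms-swift pattern (the checkpoint path is a placeholder, not an actual output of this run):

```shell
# Sketch only: replace the checkpoint directory with your actual run output.
CUDA_VISIBLE_DEVICES=0 swift infer \
    --ckpt_dir output/mplug-owl3-7b-chat/vx-xxx/checkpoint-xxx \
    --load_dataset_config true
```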
Video fine-tuning: the format of the custom dataset is as follows:
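The original video dataset format block is also missing here. A hedged JSONL sketch under the same assumptions as the image case (field names are assumptions; verify against the swift documentation):

```jsonl
{"query": "Describe the video.", "response": "A person rides a bike along a beach.", "videos": ["demo.mp4"]}
```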
Fine-tuning script:
# ModelScope
CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 swift sft \
--model_type mplug-owl3-7b-chat \
--model_id_or_path iic/mPLUG-Owl3-7B-240728 \
--sft_type lora \
--dataset video-chatgpt \
--deepspeed default-zero2 \
--output_dir output \
--num_train_epochs 5
Hi, I tried the image fine-tuning example code and found that during training the model does not actually use the image or media_offset; data_collator seems to drop these two values, so the model never uses the images in the forward pass.
Let me fix it.
Fixed.
Hi, thanks for the fix.
Indeed, I tried it too; batch_size = 2 doesn't work.
Right, batch_size=2 is not supported, and I don't know how to support it either. I tried padding, but it throws an error inside the owl3 code.
I modified the code to support batched training, but I'm not sure about the quality. I'm currently comparing my code against batch size = 1 on the training set you provided; if it looks good, I'll follow up with you.
The traceback from the batch_size=2 failure:
[rank0]: Original Traceback (most recent call last):
[rank0]:   File "/home/project/tools/anaconda3/envs/owl3/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 309, in _worker_loop
[rank0]:     data = fetcher.fetch(index)  # type: ignore[possibly-undefined]
[rank0]:   File "/home/project/tools/anaconda3/envs/owl3/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 55, in fetch
[rank0]:     return self.collate_fn(data)
[rank0]:   File "/home/project/ruohangxu/ms-swift/swift/llm/utils/template.py", line 3318, in data_collator
[rank0]:     res['media_offset'] = torch.concat(media_offset)
[rank0]: RuntimeError: Sizes of tensors must match except in dimension 0. Expected size 122 but got size 124 for tensor number 1 in the list.
So what exactly is that padding token, i.e. the media_offset padding you used during training?
If the results look good, feel free to submit a PR.
Do you have experiment results for the script you provided? Could you share results on the coco-en-mini#20000 dataset for comparison?
The script I mean is:
CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 swift sft \
    --model_type mplug-owl3-7b-chat \
    --model_id_or_path iic/mPLUG-Owl3-7B-240728 \
    --sft_type lora \
    --dataset coco-en-mini#20000 \
    --deepspeed default-zero2 \
    --output_dir output \
    --num_train_epochs 5
So the padding is [0, 0] when there is an image, and all [0, -1000000] (any negative value works here) when there is no image; is that right?
Does the paper actually mention [0, 0] padding? I didn't see it. And even when I use [0, 0] padding with an image present, it still throws an error; I've already tried it.
What I do now is repeat the last token up to the padded length, i.e. the max length in the batch. The code runs, but the results are very poor. So I suspect the authors did not pad by duplicating the last element of the sequence in the original training. I've now tried both negative-value padding and zero padding, and both throw errors. We'll have to ask the authors.
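A hypothetical illustration of the collator issue being debated above: per-sample media_offset sequences have different lengths (e.g. 122 vs 124 in the traceback), so they cannot be concatenated directly. The sketch below pads every sample to the batch max length with a sentinel pair such as [0, -1000000], mirroring the padding the paper discussion suggests; it is not the actual ms-swift or mPLUG-Owl3 code, and whether the model accepts this sentinel is exactly the open question in this thread.

```python
def pad_media_offsets(offsets, pad_value):
    """Pad each per-sample media_offset list to the longest length in the batch.

    offsets: list of per-sample lists of [image_index, offset] pairs.
    pad_value: sentinel pair appended to shorter samples (an assumption here).
    """
    max_len = max(len(sample) for sample in offsets)
    return [sample + [pad_value] * (max_len - len(sample)) for sample in offsets]

# Mimic the 122-vs-124 size mismatch from the traceback above.
batch = [[[0, 0]] * 122, [[0, 0]] * 124]
padded = pad_media_offsets(batch, pad_value=[0, -1000000])
assert all(len(sample) == 124 for sample in padded)
```

After padding, the per-sample lists can be stacked into one batch tensor; the unresolved part is how the model should be told to ignore the sentinel entries.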
PR: #2100
Found some bugs when doing full-parameter fine-tuning.
Solved!
#2172 (comment) |
Model:
Usually, fine-tuning a multimodal large model involves using a custom dataset. Here we demonstrate a runnable demo.
Fine-tuning dataset:
Before starting the fine-tuning, please make sure your environment is properly prepared.
git clone https://github.com/modelscope/ms-swift.git
cd swift
pip install -e .[llm]
pip install decord icecream
Inference
Results
GPU Memory: