Training on 500 samples has no effect at all
#6339
Replies: 1 comment
- It looks like the dataset is too small, which is causing underfitting.
I don't know why, but even though the training data should be sufficient, the fine-tuned model performs poorly and the training seems to have had almost no effect at all. Any advice would be appreciated!
System Info
llamafactory version: 0.9.1.dev0
Below are some of my parameters and the dataset contents.
Below is my training loss curve:
Training script:
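(The parameter screenshots and the training script itself are not reproduced here. Purely for orientation, a WebUI LoRA run matching the log below roughly corresponds to a LLaMA-Factory YAML config along these lines; values that appear in the log are taken from it, everything else is a hypothetical placeholder, not the author's actual script.)

```yaml
# Sketch only -- reconstructed from the log below, not the original script.
model_name_or_path: /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f
stage: sft
do_train: true
finetuning_type: lora
lora_target: all                      # log shows q/k/v/o/gate/up/down_proj
dataset: data_no_history              # from "Loading dataset data_no_history.json"
template: qwen2_vl
per_device_train_batch_size: 2        # from the log
gradient_accumulation_steps: 8        # effective batch size 16
num_train_epochs: 3.0
learning_rate: 5.0e-5                 # initial lr in the log is ~4.96e-5 (cosine)
bf16: true
val_size: 0.1                         # 450 train / 50 eval examples
output_dir: saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22
```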
Log:
```
训练完毕。
[INFO|2024-12-15 22:34:50] parser.py:355 >> Process rank: 0, device: cuda:0, n_gpu: 1, distributed training: False, compute dtype: torch.bfloat16
[INFO|2024-12-15 22:34:50] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:34:50] configuration_utils.py:746 >> Model config Qwen2VLConfig { "_name_or_path": "/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f", "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2024-12-15 22:34:50] image_processing_base.py:373 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/preprocessor_config.json
[INFO|2024-12-15 22:34:50] image_processing_base.py:373 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/preprocessor_config.json
[INFO|2024-12-15 22:34:50] image_processing_base.py:429 >> Image processor Qwen2VLImageProcessor { "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.48145466, 0.4578275, 0.40821073 ], "image_processor_type": "Qwen2VLImageProcessor", "image_std": [ 0.26862954, 0.26130258, 0.27577711 ], "max_pixels": 12845056, "merge_size": 2, "min_pixels": 3136, "patch_size": 14, "processor_class": "Qwen2VLProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "max_pixels": 12845056, "min_pixels": 3136 }, "temporal_patch_size": 2 }
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file vocab.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file merges.txt
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file added_tokens.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file special_tokens_map.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2209 >> loading file tokenizer_config.json
[INFO|2024-12-15 22:34:50] tokenization_utils_base.py:2475 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|2024-12-15 22:34:51] processing_utils.py:755 >> Processor Qwen2VLProcessor:
image_processor: Qwen2VLImageProcessor { "do_convert_rgb": true, "do_normalize": true, "do_rescale": true, "do_resize": true, "image_mean": [ 0.48145466, 0.4578275, 0.40821073 ], "image_processor_type": "Qwen2VLImageProcessor", "image_std": [ 0.26862954, 0.26130258, 0.27577711 ], "max_pixels": 12845056, "merge_size": 2, "min_pixels": 3136, "patch_size": 14, "processor_class": "Qwen2VLProcessor", "resample": 3, "rescale_factor": 0.00392156862745098, "size": { "max_pixels": 12845056, "min_pixels": 3136 }, "temporal_patch_size": 2 }
tokenizer: Qwen2TokenizerFast(name_or_path='/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f', vocab_size=151643, model_max_length=32768, is_fast=True, padding_side='left', truncation_side='right', special_tokens={'eos_token': '<|im_end|>', 'pad_token': '<|endoftext|>', 'additional_special_tokens': ['<|im_start|>', '<|im_end|>', '<|object_ref_start|>', '<|object_ref_end|>', '<|box_start|>', '<|box_end|>', '<|quad_start|>', '<|quad_end|>', '<|vision_start|>', '<|vision_end|>', '<|vision_pad|>', '<|image_pad|>', '<|video_pad|>']}, clean_up_tokenization_spaces=False), added_tokens_decoder={ 151643: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151644: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151645: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151646: AddedToken("<|object_ref_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151647: AddedToken("<|object_ref_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151648: AddedToken("<|box_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151649: AddedToken("<|box_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151650: AddedToken("<|quad_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151651: AddedToken("<|quad_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151652: AddedToken("<|vision_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151653: AddedToken("<|vision_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151654: AddedToken("<|vision_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151655: AddedToken("<|image_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True), 151656: AddedToken("<|video_pad|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
}
{ "processor_class": "Qwen2VLProcessor" }
[INFO|2024-12-15 22:34:51] logging.py:157 >> Replace eos token: <|im_end|>
[INFO|2024-12-15 22:34:51] logging.py:157 >> Loading dataset data_no_history.json...
[INFO|2024-12-15 22:34:55] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:34:55] configuration_utils.py:746 >> Model config Qwen2VLConfig { "_name_or_path": "/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f", "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:34:55] modeling_utils.py:3934 >> loading weights file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/model.safetensors.index.json
[INFO|2024-12-15 22:34:55] modeling_utils.py:1670 >> Instantiating Qwen2VLForConditionalGeneration model under default dtype torch.bfloat16.
[INFO|2024-12-15 22:34:55] configuration_utils.py:1096 >> Generate config GenerationConfig { "bos_token_id": 151643, "eos_token_id": 151645 }
[INFO|2024-12-15 22:34:55] modeling_utils.py:1670 >> Instantiating Qwen2VisionTransformerPretrainedModel model under default dtype torch.bfloat16.
[WARNING|2024-12-15 22:34:55] logging.py:168 >> Qwen2VLRotaryEmbedding can now be fully parameterized by passing the model config through the config argument. All other arguments will be removed in v4.46
[INFO|2024-12-15 22:35:00] modeling_utils.py:4800 >> All model checkpoint weights were used when initializing Qwen2VLForConditionalGeneration.
[INFO|2024-12-15 22:35:00] modeling_utils.py:4808 >> All the weights of Qwen2VLForConditionalGeneration were initialized from the model checkpoint at /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f. If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2VLForConditionalGeneration for predictions without further training.
[INFO|2024-12-15 22:35:00] configuration_utils.py:1049 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/generation_config.json
[INFO|2024-12-15 22:35:00] configuration_utils.py:1096 >> Generate config GenerationConfig { "bos_token_id": 151643, "do_sample": true, "eos_token_id": [ 151645, 151643 ], "pad_token_id": 151643, "temperature": 0.01, "top_k": 1, "top_p": 0.001 }
[INFO|2024-12-15 22:35:00] logging.py:157 >> Gradient checkpointing enabled.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Upcasting trainable params to float32.
[INFO|2024-12-15 22:35:00] logging.py:157 >> Fine-tuning method: LoRA
[INFO|2024-12-15 22:35:00] logging.py:157 >> Found linear modules: gate_proj,up_proj,q_proj,v_proj,o_proj,down_proj,k_proj
[INFO|2024-12-15 22:35:01] logging.py:157 >> trainable params: 20,185,088 || all params: 8,311,560,704 || trainable%: 0.2429
[INFO|2024-12-15 22:35:01] trainer.py:698 >> Using auto half precision backend
[INFO|2024-12-15 22:35:01] trainer.py:2313 >> ***** Running training *****
[INFO|2024-12-15 22:35:01] trainer.py:2314 >> Num examples = 450
[INFO|2024-12-15 22:35:01] trainer.py:2315 >> Num Epochs = 3
[INFO|2024-12-15 22:35:01] trainer.py:2316 >> Instantaneous batch size per device = 2
[INFO|2024-12-15 22:35:01] trainer.py:2319 >> Total train batch size (w. parallel, distributed & accumulation) = 16
[INFO|2024-12-15 22:35:01] trainer.py:2320 >> Gradient Accumulation steps = 8
[INFO|2024-12-15 22:35:01] trainer.py:2321 >> Total optimization steps = 84
[INFO|2024-12-15 22:35:01] trainer.py:2322 >> Number of trainable parameters = 20,185,088
[INFO|2024-12-15 22:35:58] logging.py:157 >> {'loss': 0.6811, 'learning_rate': 4.9564e-05, 'epoch': 0.18}
[INFO|2024-12-15 22:36:49] logging.py:157 >> {'loss': 0.6575, 'learning_rate': 4.8272e-05, 'epoch': 0.36}
[INFO|2024-12-15 22:37:45] logging.py:157 >> {'loss': 0.6249, 'learning_rate': 4.6168e-05, 'epoch': 0.53}
[INFO|2024-12-15 22:38:37] logging.py:157 >> {'loss': 0.5894, 'learning_rate': 4.3326e-05, 'epoch': 0.71}
[INFO|2024-12-15 22:39:30] logging.py:157 >> {'loss': 0.6103, 'learning_rate': 3.9846e-05, 'epoch': 0.89}
[INFO|2024-12-15 22:40:16] logging.py:157 >> {'loss': 0.5776, 'learning_rate': 3.5847e-05, 'epoch': 1.07}
[INFO|2024-12-15 22:41:05] logging.py:157 >> {'loss': 0.5526, 'learning_rate': 3.1470e-05, 'epoch': 1.24}
[INFO|2024-12-15 22:41:55] logging.py:157 >> {'loss': 0.5321, 'learning_rate': 2.6868e-05, 'epoch': 1.42}
[INFO|2024-12-15 22:42:51] logging.py:157 >> {'loss': 0.5318, 'learning_rate': 2.2201e-05, 'epoch': 1.60}
[INFO|2024-12-15 22:43:43] logging.py:157 >> {'loss': 0.5415, 'learning_rate': 1.7631e-05, 'epoch': 1.78}
[INFO|2024-12-15 22:44:39] logging.py:157 >> {'loss': 0.5456, 'learning_rate': 1.3318e-05, 'epoch': 1.96}
[INFO|2024-12-15 22:45:35] logging.py:157 >> {'loss': 0.5419, 'learning_rate': 9.4128e-06, 'epoch': 2.13}
[INFO|2024-12-15 22:46:25] logging.py:157 >> {'loss': 0.5615, 'learning_rate': 6.0507e-06, 'epoch': 2.31}
[INFO|2024-12-15 22:47:09] logging.py:157 >> {'loss': 0.5684, 'learning_rate': 3.3494e-06, 'epoch': 2.49}
[INFO|2024-12-15 22:48:09] logging.py:157 >> {'loss': 0.4966, 'learning_rate': 1.4029e-06, 'epoch': 2.67}
[INFO|2024-12-15 22:48:58] logging.py:157 >> {'loss': 0.5051, 'learning_rate': 2.7923e-07, 'epoch': 2.84}
[INFO|2024-12-15 22:49:40] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84
[INFO|2024-12-15 22:49:40] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:49:40] configuration_utils.py:746 >> Model config Qwen2VLConfig { "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/tokenizer_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/special_tokens_map.json
[INFO|2024-12-15 22:49:41] image_processing_base.py:258 >> Image processor saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/preprocessor_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/tokenizer_config.json
[INFO|2024-12-15 22:49:41] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/special_tokens_map.json
[INFO|2024-12-15 22:49:42] processing_utils.py:541 >> chat template saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/checkpoint-84/chat_template.json
[INFO|2024-12-15 22:49:42] trainer.py:2584 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
[INFO|2024-12-15 22:49:42] image_processing_base.py:258 >> Image processor saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/preprocessor_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/tokenizer_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/special_tokens_map.json
[INFO|2024-12-15 22:49:42] processing_utils.py:541 >> chat template saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/chat_template.json
[INFO|2024-12-15 22:49:42] trainer.py:3801 >> Saving model checkpoint to saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22
[INFO|2024-12-15 22:49:42] configuration_utils.py:677 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/config.json
[INFO|2024-12-15 22:49:42] configuration_utils.py:746 >> Model config Qwen2VLConfig { "architectures": [ "Qwen2VLForConditionalGeneration" ], "attention_dropout": 0.0, "bos_token_id": 151643, "eos_token_id": 151645, "hidden_act": "silu", "hidden_size": 3584, "image_token_id": 151655, "initializer_range": 0.02, "intermediate_size": 18944, "max_position_embeddings": 32768, "max_window_layers": 28, "model_type": "qwen2_vl", "num_attention_heads": 28, "num_hidden_layers": 28, "num_key_value_heads": 4, "rms_norm_eps": 1e-06, "rope_scaling": { "mrope_section": [ 16, 24, 24 ], "rope_type": "default", "type": "default" }, "rope_theta": 1000000.0, "sliding_window": 32768, "tie_word_embeddings": false, "torch_dtype": "bfloat16", "transformers_version": "4.46.1", "use_cache": true, "use_sliding_window": false, "video_token_id": 151656, "vision_config": { "in_chans": 3, "model_type": "qwen2_vl", "spatial_patch_size": 14 }, "vision_end_token_id": 151653, "vision_start_token_id": 151652, "vision_token_id": 151654, "vocab_size": 152064 }
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2646 >> tokenizer config file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/tokenizer_config.json
[INFO|2024-12-15 22:49:42] tokenization_utils_base.py:2655 >> Special tokens file saved in saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22/special_tokens_map.json
[WARNING|2024-12-15 22:49:42] logging.py:162 >> No metric eval_loss to plot.
[WARNING|2024-12-15 22:49:42] logging.py:162 >> No metric eval_accuracy to plot.
[INFO|2024-12-15 22:49:42] trainer.py:4117 >> ***** Running Evaluation *****
[INFO|2024-12-15 22:49:42] trainer.py:4119 >> Num examples = 50
[INFO|2024-12-15 22:49:42] trainer.py:4122 >> Batch size = 2
[INFO|2024-12-15 22:49:51] modelcard.py:449 >> Dropping the following result as it does not have all the necessary fields: {'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
[INFO|modeling_utils.py:4808] 2024-12-15 22:51:50,813 >> All the weights of Qwen2VLForConditionalGeneration were initialized from the model checkpoint at /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f.
If your task is similar to the task the model of the checkpoint was trained on, you can already use Qwen2VLForConditionalGeneration for predictions without further training.
[INFO|configuration_utils.py:1049] 2024-12-15 22:51:50,817 >> loading configuration file /root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f/generation_config.json
[INFO|configuration_utils.py:1096] 2024-12-15 22:51:50,817 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"temperature": 0.01,
"top_k": 1,
"top_p": 0.001
}
[INFO|2024-12-15 22:51:50] llamafactory.model.model_utils.attention:157 >> Using torch SDPA for faster training and inference.
[INFO|2024-12-15 22:51:51] llamafactory.model.adapter:157 >> Merged 1 adapter(s).
[INFO|2024-12-15 22:51:51] llamafactory.model.adapter:157 >> Loaded adapter(s): saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22
[INFO|2024-12-15 22:51:51] llamafactory.model.loader:157 >> all params: 8,291,375,616
```
I'm sure I selected the trained checkpoint path and loaded the model together with the checkpoint via the huggingface backend, but the results are exactly the same as without any training. I then tried the bundled identity.json (left unmodified, only for testing), again selected the trained checkpoint path and loaded the model and checkpoint via the huggingface backend, but the result is still the same: the model keeps answering that it is the Qwen model. Result of training on the bundled identity dataset:
What could be going wrong here? Is it a dataset problem, a hyperparameter problem, or am I simply using it the wrong way? I'm a beginner, thanks in advance 🙏
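(For reference, not part of the original post: one way to check whether the LoRA checkpoint actually changes the model's behaviour, independently of the WebUI, is to load the base model and the adapter directly and compare generations. The sketch below assumes the paths from the log above; the probe question and generation settings are hypothetical.)

```python
# Minimal sketch: generate once with the base Qwen2-VL model and once with the
# trained LoRA adapter attached, to confirm the adapter changes the output.
import torch
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration
from peft import PeftModel

BASE = "/root/autodl-tmp/snapshots/51c47430f97dd7c74aa1fa6825e68a813478097f"
ADAPTER = "saves/Qwen2-VL-7B-Instruct/lora/train_2024-12-15-22-27-22"

processor = AutoProcessor.from_pretrained(BASE)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    BASE, torch_dtype=torch.bfloat16, device_map="auto"
)
device = model.device

def ask(m, question: str) -> str:
    # Text-only probe; for the image samples, pass images to the processor too.
    messages = [{"role": "user", "content": [{"type": "text", "text": question}]}]
    prompt = processor.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = processor(text=[prompt], return_tensors="pt").to(device)
    out = m.generate(**inputs, max_new_tokens=64, do_sample=False)
    new_tokens = out[:, inputs["input_ids"].shape[1]:]
    return processor.batch_decode(new_tokens, skip_special_tokens=True)[0]

question = "你是谁?"  # hypothetical probe, matching the identity.json test
print("base    :", ask(model, question))

model = PeftModel.from_pretrained(model, ADAPTER)  # attach the trained LoRA weights
print("adapter :", ask(model, question))
```

If the two outputs are identical, the adapter is not being applied at inference time; if they differ but the answers are still unsatisfactory, the issue is more likely the amount or quality of the training data, as suggested in the reply above.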