
Optimize finetune.py for mixed-precision training. #33

Merged
merged 1 commit into zhangfaen:main on Jan 24, 2025

Conversation

mellivoraPKU
Contributor

1. Optimize finetune.py to use mixed-precision training (a rough sketch of such a step is shown below).
2. Switch the distributed training in finetune_distributed.py to DeepSpeed, and enable mixed-precision training there as well (also sketched below).
3. Update requirements.txt to the latest versions.
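A minimal sketch of what a bfloat16 mixed-precision training step in finetune.py could look like, assuming a CUDA device and a Hugging Face-style model whose output exposes .loss; the repo's actual training loop may differ:

import torch
from torch.amp import autocast

def train_step(model, inputs, labels, optimizer):
    # Run the forward pass under autocast so matmuls execute in bfloat16.
    with autocast(device_type='cuda', dtype=torch.bfloat16):
        outputs = model(**inputs, labels=labels)
        loss = outputs.loss
    # bfloat16 keeps float32's exponent range, so no GradScaler is needed.
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.detach()

And a rough sketch of the DeepSpeed path for finetune_distributed.py, assuming an in-code config dict; the real script may instead load a JSON config and add ZeRO or scheduler settings:

import deepspeed

def build_engine(model):
    # Hypothetical config dict; only illustrates enabling bf16 mixed precision.
    ds_config = {
        "train_micro_batch_size_per_gpu": 1,
        "gradient_accumulation_steps": 1,
        "bf16": {"enabled": True},
    }
    engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )
    return engine, optimizer

def train_step_distributed(engine, inputs, labels):
    loss = engine(**inputs, labels=labels).loss
    engine.backward(loss)  # DeepSpeed handles loss scaling and accumulation
    engine.step()          # optimizer step, then gradients are zeroed
    return loss.detach()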

finetune.py Outdated
outputs = model(**inputs, labels=labels)

# Automatic Mixed Precision (AMP) context manager for efficient training.
# Inside this context, `torch.is_autocast_enabled()` returns True; outside the context, it returns False.
Owner

@zhangfaen Jan 24, 2025


Inside this context, torch.is_autocast_enabled() returns True; outside the context, it returns False.

This line should be changed to the following:

An AMP context manager can be nested inside another AMP context manager; the behavior is shown by the code snippet below.

# import torch
# from torch.amp import autocast
#
# with autocast(device_type='cuda', dtype=torch.bfloat16):
#     print(torch.is_autocast_enabled())  # True
#
#     with autocast(device_type='cuda', dtype=torch.bfloat16):
#         print(torch.is_autocast_enabled())  # True
#
#     with autocast(device_type='cuda', dtype=torch.bfloat16, enabled=False):
#         print(torch.is_autocast_enabled())  # False
#
#     print(torch.is_autocast_enabled())  # True

@zhangfaen merged commit 986ff95 into zhangfaen:main on Jan 24, 2025