Fix #4132.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
  - Enhanced backend selection for distributed training, allowing flexible use of NCCL or Gloo based on availability.
- **Bug Fixes**
  - Corrected indentation for improved code clarity.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
## Summary
Support CPU parallel training in the PyTorch backend.
## Detailed Description
PyTorch does support Gloo for distributed training, but the following lines appear to limit the backend to NCCL:
deepmd-kit/deepmd/pt/entrypoints/main.py
Lines 109 to 110 in 63e4a25
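The fix described above can be sketched as follows. This is a minimal illustration of backend selection for `torch.distributed`, not the exact DeePMD-kit code; the helper name `select_backend` is hypothetical, and it assumes only the standard `torch.distributed` availability checks.

```python
import torch
import torch.distributed as dist


def select_backend() -> str:
    """Pick a distributed backend: NCCL when CUDA and NCCL are usable,
    otherwise fall back to Gloo so CPU-only parallel training still works.

    Hypothetical helper for illustration; DeePMD-kit's actual logic may differ.
    """
    if torch.cuda.is_available() and dist.is_nccl_available():
        return "nccl"
    return "gloo"


if __name__ == "__main__":
    # On a CPU-only machine this prints "gloo"; on a CUDA machine with
    # NCCL support it prints "nccl".
    print(select_backend())
```

The selected string would then be passed to `dist.init_process_group(backend=...)` instead of a hard-coded `"nccl"`, which is what restricts training to GPU nodes.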
## Further Information, Files, and Links
No response