-
Notifications
You must be signed in to change notification settings - Fork 863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add omp in env for torchrun and update doc #2320
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2320 +/- ##
==========================================
+ Coverage 69.37% 69.39% +0.02%
==========================================
Files 77 77
Lines 3438 3441 +3
Branches 57 57
==========================================
+ Hits 2385 2388 +3
Misses 1050 1050
Partials 3 3
... and 1 file with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Li for the updates. Few comments:
-
Will be good to move the Download_models.py one level up as a common file after Pippy deferred init #2310 is merged.
-
For the Large model best practices, we should expand and provide more details for model packaging, threads and other configs and troubleshooting tips. This can be done as a separate PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lxning added some minor suggestions.
@@ -1,2 +1,2 @@ | |||
transformers | |||
deepspeed | |||
transformers==4.28.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason for pinning the version numbers? Can we do >= instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The version combination of DS and transformer is not stable. Some combination does not work. So far this combination works for all the models we have tested.
frontend/server/src/main/java/org/pytorch/serve/wlm/WorkerLifeCycle.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left few comments please check
frontend/archive/src/main/java/org/pytorch/serve/archive/model/ModelConfig.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For OMP_NUM_THREADS, please add documentation for how user can set this value
Description
Please read our CONTRIBUTING.md prior to creating your first pull request.
Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes #(issue)
Type of change
Please delete options that are not relevant.
Feature/Issue validation/testing
Please describe the Unit or Integration tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Logs for Test A
Test B
Logs for Test B
Checklist: