Update docs #588
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Force-pushed from 3aba5aa to a4d257c.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Thank you!
We should add two sections to this guide (a rough sketch of what they could show follows below):
- How to enable Sequence Parallelism
- How to enable Pipeline Parallelism
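
For reference, a minimal sketch of what those sections might show, assuming the `tensor_parallel_size` / `pipeline_parallel_size` arguments on `NeuronTrainingArguments` (the exact names, and the way sequence parallelism is toggled, should be checked against the distributed training docs):

```python
# Hypothetical sketch of the distributed-training knobs in optimum-neuron.
# The tensor_parallel_size / pipeline_parallel_size argument names are assumptions
# here and should be verified against the current distributed-training guide.
from optimum.neuron import NeuronTrainingArguments

training_args = NeuronTrainingArguments(
    output_dir="output",
    bf16=True,
    per_device_train_batch_size=1,
    tensor_parallel_size=8,    # shard attention/MLP weights across 8 Neuron cores;
                               # sequence parallelism is typically enabled alongside TP
    pipeline_parallel_size=4,  # split the layer stack into 4 pipeline stages
)

# The training script would then be launched across the Neuron cores with torchrun,
# e.g. torchrun --nproc_per_node=32 train.py
```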
Let's do it in another PR to keep this one small. The follow-up PR is here: #657
TBD if we keep recommending running notebooks on EC2 instances!
Yes, same, let's handle that in another PR.
- Mistral models, such as [Mistral 7b (`mistralai/Mistral-7B-Instruct-v0.3`)](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)
- Llama-2 models, such as [Llama-2 7b (`meta-llama/Llama-2-7b-hf`)](https://huggingface.co/meta-llama/Llama-2-7b-hf)

And many others!
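
For illustration, here is a rough sketch of how one of the listed checkpoints could be plugged into a fine-tuning script. The dataset, the pad-token workaround, and the default training arguments below are placeholder choices, not something prescribed by this PR:

```python
# Sketch: fine-tuning a listed checkpoint with optimum-neuron's NeuronTrainer.
# The dataset and hyperparameters below are placeholders for illustration only.
from datasets import load_dataset
from optimum.neuron import NeuronTrainer, NeuronTrainingArguments
from transformers import AutoModelForCausalLM, AutoTokenizer, DataCollatorForLanguageModeling

model_id = "meta-llama/Llama-2-7b-hf"  # gated checkpoint, requires accepting the license
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# Placeholder dataset: any tokenized causal-LM dataset works here.
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

def tokenize(example):
    return tokenizer(example["instruction"], truncation=True, max_length=512)

train_dataset = dataset.map(tokenize, remove_columns=dataset.column_names)
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)  # labels copied from input_ids

training_args = NeuronTrainingArguments(output_dir="llama2-7b-dolly", bf16=True)

trainer = NeuronTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=collator,
    tokenizer=tokenizer,
)
trainer.train()
```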
WDYT of adding a section in https://huggingface.co/docs/optimum-neuron/en/package_reference/supported_models for the models we support for training, and referencing it here?
Why not, but it is a bit unclear, because:
- We do not know which architectures are supported for regular training specifically.
- The supported architectures for distributed training are known, so we can add those.
Overall LGTM!
LGTM except for a few nits, thanks for the cleanup!