Add contribution guide for Neuron exporter #461
Conversation
JingyaHuang commented on Feb 4, 2024
- Exporter guide
- Example for ESM
- Next step: open a good first issue
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for the pull-request. This applies only to models exported using the tracing API. I haven't yet found a good way to reflect that in the documentation, but we need to make sure someone does not try to contribute an unsupported causal model that way.
@dacorvo Added a note to clarify that the contribution guide does not apply to causal task models supported through transformers_neuronx!
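For context, the tracing-based export path this guide covers looks roughly like the sketch below. This is only an illustration, assuming the `NeuronModelForFeatureExtraction.from_pretrained(..., export=True, ...)` entry point; the checkpoint and compilation shapes are placeholders, not taken from this PR.

```python
# Sketch of a tracing-based export (the path covered by the contribution guide).
# Causal LMs served through transformers_neuronx follow a different path and are out of scope.
# The checkpoint and static shapes below are illustrative placeholders.
from optimum.neuron import NeuronModelForFeatureExtraction

neuron_model = NeuronModelForFeatureExtraction.from_pretrained(
    "facebook/esm2_t6_8M_UR50D",  # placeholder ESM checkpoint
    export=True,                  # trace and compile the model for Neuron
    batch_size=1,                 # the tracing API requires static input shapes
    sequence_length=128,
)
neuron_model.save_pretrained("esm_neuron/")
```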
Here we take the support of [ESM models](https://huggingface.co/docs/transformers/model_doc/esm#esm) as an example. Let's create an `EsmNeuronConfig` class in `optimum/exporters/neuron/model_configs.py`.
When an Esm model interprets as a text encoder, we are able to inherit from the middle-end class [`TextEncoderNeuronConfig`](https://github.com/huggingface/optimum-neuron/blob/v0.0.18/optimum/exporters/neuron/config.py#L36).
Not sure about the phrasing here. You mean "when an ESM model acts as a text encoder", because it can act as something else?
Theoretically, it could act as a decoder as well according to the doc of ESM. But I did not see seq2seq modeling implemented in transformers yet, although there seem to be some placeholders for it.
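For reference, here is a minimal sketch of what such an `EsmNeuronConfig` could look like in `optimum/exporters/neuron/model_configs.py`, inheriting from `TextEncoderNeuronConfig` as discussed in the quoted excerpt above. The registered tasks, the BERT-like normalized config, and the validation tolerance are assumptions for illustration, not the exact code merged in this PR.

```python
# Illustrative sketch only: registered tasks and attribute values are assumptions.
from optimum.exporters.neuron.config import TextEncoderNeuronConfig
from optimum.exporters.tasks import TasksManager
from optimum.utils import NormalizedConfigManager

# In model_configs.py this register decorator is normally already defined at module level.
register_in_tasks_manager = TasksManager.create_register("neuron")


@register_in_tasks_manager("esm", *["feature-extraction", "fill-mask", "text-classification"])
class EsmNeuronConfig(TextEncoderNeuronConfig):
    # Reuse a BERT-like normalized config, assuming ESM exposes a similar encoder layout.
    NORMALIZED_CONFIG_CLASS = NormalizedConfigManager.get_normalized_config_class("bert")
    ATOL_FOR_VALIDATION = 1e-3  # tolerance when validating the traced model against the original

    @property
    def inputs(self):
        # Inputs fed to the traced encoder; ESM has no token_type_ids by default (assumption).
        return ["input_ids", "attention_mask"]
```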
Co-authored-by: David Corvoysier <david@huggingface.co>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>