fix(pipelines): support passing decoder model + tokenizer #319
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
We also remove driver reinstallation step
786e975 to 3d8e2e6
@@ -123,7 +123,7 @@ def load_pipeline(
             model, export=export, **compiler_args, **input_shapes, **hub_kwargs, **kwargs
         )
     # uses neuron model
-    elif isinstance(model, NeuronBaseModel):
+    elif isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM)):
QQ: why is NeuronModelForCausalLM not a subclass of NeuronBaseModel?
Because NeuronBaseModel is based on JIT models and implements the corresponding conversion logic. NeuronModelForCausalLM is a subclass of NeuronDecoderModel that uses transformers-neuronx models instead.
We could refactor to add a common class to both though, since the latest subclasses of NeuronBaseModel are overriding pretty much all its methods now.
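To make the refactoring idea concrete, here is a minimal sketch using stand-in classes only. Nothing below is the actual library code: the class bodies are empty placeholders, and `CommonNeuronModel` and `is_neuron_model` are hypothetical names invented for illustration. It only shows how a shared parent would let the pipeline check a single class.

```python
# Stand-in classes mirroring the hierarchy discussed in this thread.
# CommonNeuronModel is a hypothetical shared parent, not an existing class.


class CommonNeuronModel:
    """Hypothetical common parent for both model families."""


class NeuronBaseModel(CommonNeuronModel):
    """Placeholder for the JIT-based models."""


class NeuronDecoderModel(CommonNeuronModel):
    """Placeholder for the transformers-neuronx based models."""


class NeuronModelForCausalLM(NeuronDecoderModel):
    """Placeholder for the causal LM decoder model."""


def is_neuron_model(model) -> bool:
    """Hypothetical helper: one isinstance check covers both families."""
    return isinstance(model, CommonNeuronModel)
```

With such a parent in place, the tuple in the `isinstance` call would no longer be needed, at the cost of touching the base class that every subclass depends on.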
Wouldn't it make more sense to use NeuronDecoderModel here?
I guess it would be good to refactor it if possible. But also it's not pressing and can be postponed to when we have more bandwidth!
I will wait until @JingyaHuang's pull request on T5 is merged, as it adds a new subclass, and any change I make to the base class before that will result in nightmarish conflicts.
LGTM, thanks for the fix!
When a neuron model is passed explicitly to a pipeline, we check the model class. This PR modifies the check to accept not only NeuronBaseModel but also NeuronModelForCausalLM.
Update: cherry-picked @glegendre01's modifications to the GitHub workflows to reactivate the INF1 CI.
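For reference, the effect of the modified class check can be sketched in isolation. The classes below are empty stand-ins that only reproduce the inheritance relations described in the review thread (NeuronModelForCausalLM derives from NeuronDecoderModel, not from NeuronBaseModel); `accepted_before` and `accepted_after` are hypothetical helper names, not functions from the code base.

```python
# Empty stand-ins reproducing the inheritance described in the review thread.


class NeuronBaseModel:
    """Placeholder for the JIT-based models."""


class NeuronDecoderModel:
    """Placeholder for the transformers-neuronx based models."""


class NeuronModelForCausalLM(NeuronDecoderModel):
    """Placeholder for the causal LM decoder model."""


def accepted_before(model) -> bool:
    # Check as it was before the change: a single class.
    return isinstance(model, NeuronBaseModel)


def accepted_after(model) -> bool:
    # Check after the change: isinstance accepts a tuple of classes
    # and returns True if the object matches any of them.
    return isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM))


decoder = NeuronModelForCausalLM()
print(accepted_before(decoder), accepted_after(decoder))
```

Checking against NeuronDecoderModel instead, as suggested in the review, would accept the same decoder instance plus any other subclass of NeuronDecoderModel.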