fix(pipelines): support passing decoder model + tokenizer #319

Merged
dacorvo merged 2 commits into main from fix_pipeline_load_model
Nov 14, 2023

Collaborator

@dacorvo dacorvo commented Nov 13, 2023

When a Neuron model is passed explicitly to a pipeline, we check the model class. This PR modifies that check to accept not only NeuronBaseModel but also NeuronModelForCausalLM.

Update: cherry-picked @glegendre01's modifications to the GitHub workflows to reactivate the INF1 CI.

@@ -123,7 +123,7 @@ def load_pipeline(
model, export=export, **compiler_args, **input_shapes, **hub_kwargs, **kwargs
)
# uses neuron model
-elif isinstance(model, NeuronBaseModel):
+elif isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM)):
Member

QQ: why is NeuronModelForCausalLM not a subclass of NeuronBaseModel?

Collaborator Author

Because NeuronBaseModel is based on JIT models and implements the corresponding conversion logic. NeuronModelForCausalLM is a subclass of NeuronDecoderModel that uses transformers-neuronx models instead.
We could refactor to introduce a common base class for both, though, since the most recent subclasses of NeuronBaseModel now override pretty much all of its methods.
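To make the point above concrete, here is a minimal sketch of why the original check rejected decoder models and how the tuple form of `isinstance` accepts both families. The classes below are empty stand-ins that only mirror the hierarchy described in this thread; they are not the real optimum-neuron implementations, and the helper names `check_before` and `check_after` are hypothetical.

```python
# Empty stand-in classes mirroring the hierarchy described in this thread.
# These are placeholders for illustration, not the real library classes.
class NeuronBaseModel:
    """Placeholder for the JIT-based model family."""


class NeuronDecoderModel:
    """Placeholder for the transformers-neuronx based family."""


class NeuronModelForCausalLM(NeuronDecoderModel):
    """Placeholder decoder model; it does not inherit from NeuronBaseModel."""


def check_before(model):
    # Hypothetical helper: the check as it was before this PR.
    return isinstance(model, NeuronBaseModel)


def check_after(model):
    # Hypothetical helper: the check after this PR.
    # isinstance accepts a tuple and returns True if any class matches.
    return isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM))


decoder = NeuronModelForCausalLM()
print(check_before(decoder))  # False: unrelated class hierarchy
print(check_after(decoder))   # True: matched by the second tuple entry
```

Models from the JIT-based family pass both checks, so the change should only widen what is accepted.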

Collaborator

Wouldn't it make more sense to use NeuronDecoderModel here?
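As a rough illustration of what this suggestion would change, the sketch below compares checking against the leaf class with checking against its parent. All classes are empty stand-ins, and `OtherDecoderModel` is a purely hypothetical sibling subclass invented for the example; whether such a class exists in the library is an assumption.

```python
# Empty stand-ins for illustration only; not the real library classes.
class NeuronBaseModel:
    pass


class NeuronDecoderModel:
    pass


class NeuronModelForCausalLM(NeuronDecoderModel):
    pass


class OtherDecoderModel(NeuronDecoderModel):
    """Hypothetical sibling decoder subclass, invented for this example."""


def accepts_leaf(model):
    # Check as merged in this PR: names the concrete decoder class.
    return isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM))


def accepts_parent(model):
    # Suggested alternative: names the parent class, so every decoder
    # subclass is accepted without touching the check again.
    return isinstance(model, (NeuronBaseModel, NeuronDecoderModel))


other = OtherDecoderModel()
print(accepts_leaf(other))    # False
print(accepts_parent(other))  # True
```

The trade-off is that the parent-class check is more permissive: it would also accept decoder subclasses that a given pipeline task may not support.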

Member

I guess it would be good to refactor it if possible. But also it's not pressing and can be postponed to when we have more bandwidth!

Collaborator Author

I will wait until @JingyaHuang's pull request on T5 is merged: it adds a new subclass, and any change I make to the base class before then will result in nightmarish conflicts.

Collaborator

@JingyaHuang JingyaHuang left a comment

LGTM, thanks for the fix!

@dacorvo dacorvo merged commit d47741f into main Nov 14, 2023
7 of 9 checks passed
@dacorvo dacorvo deleted the fix_pipeline_load_model branch November 14, 2023 12:28
@dacorvo dacorvo mentioned this pull request Nov 14, 2023
5 participants