
fix(pipelines): support passing decoder model + tokenizer #319

Merged 2 commits on Nov 14, 2023
76 changes: 1 addition & 75 deletions .github/workflows/test_inf1.yml
@@ -15,65 +15,14 @@ concurrency:
  cancel-in-progress: true

jobs:
  start-runner:
    name: Start self-hosted EC2 runner
    runs-on: ubuntu-latest
    env:
      AWS_REGION: us-east-1
      EC2_AMI_ID: ${{ vars.INFERENTIA_AMI_ID }}
      EC2_INSTANCE_TYPE: inf1.6xlarge
      EC2_SUBNET_ID: subnet-859322b4,subnet-b7533b96,subnet-47cfad21,subnet-a396b2ad,subnet-06576a4b,subnet-df0f6180
      EC2_SECURITY_GROUP: sg-0bb210cd3ec725a13
      EC2_IAM_ROLE: optimum-ec2-github-actions-role
    outputs:
      label: ${{ steps.start-ec2-runner.outputs.label }}
      ec2-instance-id: ${{ steps.start-ec2-runner.outputs.ec2-instance-id }}
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Start EC2 runner
        id: start-ec2-runner
        uses: philschmid/philschmid-ec2-github-runner@main
        with:
          mode: start
          github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
          ec2-image-id: ${{ env.EC2_AMI_ID }}
          ec2-instance-type: ${{ env.EC2_INSTANCE_TYPE }}
          subnet-id: ${{ env.EC2_SUBNET_ID }}
          security-group-id: ${{ env.EC2_SECURITY_GROUP }}
          iam-role-name: ${{ env.EC2_IAM_ROLE }}
          aws-resource-tags: > # optional, requires additional permissions
            [
              {"Key": "Name", "Value": "ec2-optimum-github-runner"},
              {"Key": "GitHubRepository", "Value": "${{ github.repository }}"}
            ]
  do-the-job:
    name: Run INF1 tests
    needs: start-runner # required to start the main job when the runner is ready
    runs-on: ${{ needs.start-runner.outputs.label }} # run the job on the newly created runner
    runs-on: [self-hosted, 4-aws-inf1, 24-cpu, ci]
    env:
      AWS_REGION: us-east-1
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Re-install neuron driver
        run: |
          sudo apt remove aws-neuronx-dkms -y
          sudo apt remove aws-neuronx-tools -y
          . /etc/os-release
          sudo tee /etc/apt/sources.list.d/neuron.list > /dev/null <<EOF
          deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main
          EOF
          wget -qO - https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | sudo apt-key add -
          sudo apt-get update -y
          sudo apt-get install linux-headers-$(uname -r) -y
          sudo apt-get install aws-neuronx-dkms=2.* -y
          sudo apt-get install aws-neuronx-tools=2.* -y
          export PATH=/opt/aws/neuron/bin:$PATH
      - name: Install python dependencies
        run: |
          sudo apt install python3.8-venv -y
@@ -88,26 +37,3 @@ jobs:
        run: |
          source aws_neuron_venv_pytorch/bin/activate
          HF_TOKEN_OPTIMUM_NEURON_CI=${{ secrets.HF_TOKEN_OPTIMUM_NEURON_CI }} pytest -m is_inferentia_test tests
  stop-runner:
    name: Stop self-hosted EC2 runner
    needs:
      - start-runner # required to get output from the start-runner job
      - do-the-job # required to wait when the main job is done
    runs-on: ubuntu-latest
    env:
      AWS_REGION: us-east-1
    if: ${{ always() }} # required to stop the runner even if the error happened in the previous jobs
    steps:
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}
      - name: Stop EC2 runner
        uses: philschmid/philschmid-ec2-github-runner@main
        with:
          mode: stop
          github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
          label: ${{ needs.start-runner.outputs.label }}
          ec2-instance-id: ${{ needs.start-runner.outputs.ec2-instance-id }}
2 changes: 1 addition & 1 deletion optimum/neuron/pipelines/transformers/base.py
@@ -123,7 +123,7 @@ def load_pipeline(
            model, export=export, **compiler_args, **input_shapes, **hub_kwargs, **kwargs
        )
    # uses neuron model
    elif isinstance(model, NeuronBaseModel):
    elif isinstance(model, (NeuronBaseModel, NeuronModelForCausalLM)):
Member: QQ: why is NeuronModelForCausalLM not a subclass of NeuronBaseModel?

Collaborator (Author): Because NeuronBaseModel is based on JIT models and implements the corresponding conversion logic. NeuronModelForCausalLM is a subclass of NeuronDecoderModel, which uses transformers-neuronx models instead. We could refactor to add a common parent class for both, though, since the latest subclasses of NeuronBaseModel now override pretty much all of its methods.

Collaborator: Wouldn't it make more sense to use NeuronDecoderModel here?

Member: I guess it would be good to refactor it if possible. But it's not pressing and can be postponed to when we have more bandwidth!

Collaborator (Author): I will wait until @JingyaHuang's pull request on T5 is merged, as it adds a new subclass, and any change I make to the base class before that would result in nightmarish conflicts.

    if tokenizer is None and load_tokenizer:
        for preprocessor in model.preprocessors:
            if isinstance(preprocessor, (PreTrainedTokenizer, PreTrainedTokenizerFast)):
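The review thread turns on the fact that NeuronModelForCausalLM does not inherit from NeuronBaseModel, which is why the fix passes a tuple of classes to isinstance rather than a single type. A minimal sketch of that dispatch pattern, using hypothetical stand-in classes rather than the real optimum.neuron hierarchy:

```python
# Stand-ins for the real optimum.neuron classes; only the inheritance
# shape matters here, the names are illustrative.
class TracedModel:                  # plays the role of NeuronBaseModel (JIT/traced)
    pass

class DecoderModel:                 # plays the role of NeuronDecoderModel (transformers-neuronx)
    pass

class CausalLMModel(DecoderModel):  # plays the role of NeuronModelForCausalLM
    pass

def accepts_preloaded_model(model):
    # A single-class check against TracedModel would reject CausalLMModel,
    # since the two classes share no common ancestor; passing a tuple makes
    # isinstance accept an instance of either class.
    return isinstance(model, (TracedModel, CausalLMModel))

print(accepts_preloaded_model(CausalLMModel()))  # True
print(accepts_preloaded_model(TracedModel()))    # True
print(accepts_preloaded_model(object()))         # False
```

Once a common ancestor exists, as the refactor discussed in the thread would introduce, the tuple collapses back to a single-class check.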
10 changes: 10 additions & 0 deletions tests/pipelines/test_decoder_pipelines.py
@@ -1,6 +1,7 @@
import itertools

import pytest
from transformers import AutoTokenizer

from optimum.neuron import NeuronModelForCausalLM
from optimum.neuron.pipelines import pipeline
@@ -54,6 +55,15 @@ def test_load_no_parameters(inf_decoder_path):
    _test_generation(p)


@is_inferentia_test
@requires_neuronx
def test_from_model_and_tokenizer(inf_decoder_path):
    m = NeuronModelForCausalLM.from_pretrained(inf_decoder_path)
    t = AutoTokenizer.from_pretrained(inf_decoder_path)
    p = pipeline("text-generation", model=m, tokenizer=t)
    _test_generation(p)


@is_inferentia_test
@requires_neuronx
def test_error_already_exported(inf_decoder_path):
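The new test passes a tokenizer explicitly; when it is omitted, load_pipeline falls back to scanning model.preprocessors for one, as the base.py hunk above shows. A sketch of that fallback with hypothetical stand-in classes (the real code checks for PreTrainedTokenizer / PreTrainedTokenizerFast):

```python
class FeatureExtractor:  # hypothetical non-tokenizer preprocessor
    pass

class Tokenizer:  # stand-in for PreTrainedTokenizer / PreTrainedTokenizerFast
    pass

def find_tokenizer(preprocessors, tokenizer=None, load_tokenizer=True):
    # Mirrors the fallback in load_pipeline: only search the model's
    # preprocessors when no tokenizer was passed explicitly and tokenizer
    # loading is requested; otherwise return what the caller provided.
    if tokenizer is None and load_tokenizer:
        for preprocessor in preprocessors:
            if isinstance(preprocessor, Tokenizer):
                return preprocessor
    return tokenizer

found = find_tokenizer([FeatureExtractor(), Tokenizer()])
print(type(found).__name__)  # Tokenizer
```

An explicitly passed tokenizer always wins, so the discovery loop never overrides a caller-supplied one.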