
Ipex doc #828

Merged · 32 commits · Aug 27, 2024
Changes from 4 commits

Commits:
d7b0fc4
change readme, source/index, source/installation
jiqing-feng Jul 16, 2024
78f7c61
add ipex doc 1st step
jiqing-feng Jul 17, 2024
b531a72
update readme for command line usage
jiqing-feng Jul 17, 2024
90d9000
fix bug for ipex readme
jiqing-feng Jul 17, 2024
b39be97
add export doc
jiqing-feng Jul 17, 2024
a90cb23
update all ipex docs
jiqing-feng Jul 17, 2024
e884158
rm diffusers
jiqing-feng Jul 17, 2024
2100cd9
change register
jiqing-feng Jul 17, 2024
84305bc
Update README.md
jiqing-feng Jul 17, 2024
23f8756
Update docs/source/installation.mdx
jiqing-feng Jul 17, 2024
644d197
fix readme
jiqing-feng Jul 17, 2024
fde311c
fix ipex exporter args comments
jiqing-feng Jul 17, 2024
a9a2c38
extend ipex export explain
jiqing-feng Jul 17, 2024
4368205
fix ipex reference.mdx
jiqing-feng Jul 18, 2024
c31696d
add comments for auto doc
jiqing-feng Jul 18, 2024
c5412da
rm cli export
jiqing-feng Jul 18, 2024
291c73d
Update optimum/commands/export/ipex.py
jiqing-feng Jul 18, 2024
8772c51
rm commit hash in export command
jiqing-feng Jul 18, 2024
39f27dd
rm export
jiqing-feng Jul 22, 2024
1d8fc29
rm jit
jiqing-feng Jul 22, 2024
02fa235
add ipex on doc's docker file
jiqing-feng Jul 26, 2024
4ed2620
indicate that ipex model only supports for cpu and the export format …
jiqing-feng Jul 26, 2024
770d82f
Update docs/source/ipex/inference.mdx
jiqing-feng Jul 29, 2024
0a9ce3d
explain patching
jiqing-feng Jul 29, 2024
378144e
rm ipex reference
jiqing-feng Aug 6, 2024
be7097d
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
21f06cf
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
ae8143a
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
9a25ac7
Update docs/source/index.mdx
echarlaix Aug 26, 2024
52bec25
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
d6153b7
Update docs/source/ipex/models.mdx
echarlaix Aug 26, 2024
8115bf6
Update docs/Dockerfile
echarlaix Aug 26, 2024
12 changes: 11 additions & 1 deletion README.md
@@ -210,8 +210,14 @@ You can find more examples in the [documentation](https://huggingface.co/docs/op


## IPEX
IPEX export can be used through the Optimum command-line interface:
```bash
optimum-cli export ipex -m gpt2 --torch_dtype bfloat16 ipex-gpt2
```

To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. Set `export=True` to load a PyTorch checkpoint, export your model via TorchScript and apply IPEX optimizations: both operator optimization (replacement with custom IPEX operators) and graph-level optimization (such as operator fusion) will be applied to your model.
```diff
import torch
from transformers import AutoTokenizer, pipeline
- from transformers import AutoModelForCausalLM
+ from optimum.intel import IPEXModelForCausalLM
@@ -224,14 +230,18 @@ To load your IPEX model, you can just replace your `AutoModelForXxx` class with
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
results = pipe("He's a dreadful magician and")

+ # You can also use the model exported by Optimum command-line interface
+ exported_model = IPEXModelForCausalLM.from_pretrained("ipex-gpt2")
+ pipe.model = exported_model
+ results = pipe("He's a dreadful magician and")
```
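The `export=True` flow described in the README snippet — load a PyTorch checkpoint, trace it, then apply IPEX optimizations — can be sketched as a plain-Python pipeline. Every name below (`FakeCheckpoint`, `trace_to_graph`, `apply_ipex_optimizations`, `from_pretrained`) is a hypothetical stand-in to show the control flow, not the optimum-intel implementation:

```python
class FakeCheckpoint:
    """Stand-in for a PyTorch checkpoint on disk (hypothetical)."""

    def __init__(self, name):
        self.name = name


def load_checkpoint(name):
    # Stands in for loading the raw transformers checkpoint.
    return FakeCheckpoint(name)


def trace_to_graph(model):
    # Stands in for TorchScript tracing of the eager model.
    return {"graph_of": model.name, "optimized": False}


def apply_ipex_optimizations(graph):
    # Stands in for operator replacement + graph-level fusion.
    return dict(graph, optimized=True)


def from_pretrained(name, export=False):
    model = load_checkpoint(name)
    if export:
        return apply_ipex_optimizations(trace_to_graph(model))
    return model


exported = from_pretrained("gpt2", export=True)
assert exported == {"graph_of": "gpt2", "optimized": True}
```

In the real library, tracing is done with TorchScript and the optimization step swaps in custom IPEX operators and fuses compatible ones.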

For more details, please refer to the [documentation](https://intel.github.io/intel-extension-for-pytorch/#introduction).


## Running the examples

-Check out the [`examples`](https://github.com/huggingface/optimum-intel/tree/main/examples) directory to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.
+Check out the [`examples`](https://github.com/huggingface/optimum-intel/tree/main/examples) and [`notebooks`](https://github.com/huggingface/optimum-intel/tree/main/notebooks) directories to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.

Do not forget to install requirements for every example:

17 changes: 17 additions & 0 deletions docs/source/_toctree.yml
@@ -30,5 +30,22 @@
      title: Tutorials
      isExpanded: false
    title: OpenVINO
  - sections:
    - local: ipex/export
      title: Export
    - local: ipex/inference
      title: Inference
    - local: ipex/optimization
      title: Optimization
    - local: ipex/models
      title: Supported Models
    - local: ipex/reference
      title: Reference
    - sections:
      - local: ipex/tutorials/notebooks
        title: Notebooks
      title: Tutorials
      isExpanded: false
    title: IPEX
  title: Optimum Intel
  isExpanded: false
2 changes: 2 additions & 0 deletions docs/source/index.mdx
@@ -19,6 +19,8 @@ limitations under the License.

🤗 Optimum Intel is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.

[Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) is an open-source library that provides optimizations for both eager mode and graph mode. Compared to eager mode, graph mode in PyTorch normally yields better performance thanks to optimization techniques such as operation fusion.
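The benefit of graph mode hinted at above can be illustrated in plain Python: a compiler that sees both operations at once can fuse them into a single pass over the data, avoiding an intermediate buffer. The functions below are illustrative stand-ins, not IPEX or PyTorch APIs:

```python
def scale_then_shift_eager(xs, a, b):
    # Eager style: each op runs separately, materializing an intermediate list.
    scaled = [a * x for x in xs]      # first kernel
    return [s + b for s in scaled]    # second kernel


def scale_then_shift_fused(xs, a, b):
    # Fused style: one pass, no intermediate buffer — what operation fusion
    # achieves for real tensor kernels in graph mode.
    return [a * x + b for x in xs]


data = [1.0, 2.0, 3.0]
assert scale_then_shift_eager(data, 2.0, 1.0) == scale_then_shift_fused(data, 2.0, 1.0) == [3.0, 5.0, 7.0]
```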

[Intel Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html) is an open-source library enabling the use of the most popular compression techniques, such as quantization, pruning and knowledge distillation. It supports automatic accuracy-driven tuning strategies so that users can easily generate quantized models. Users can apply static, dynamic and quantization-aware training approaches while specifying an expected accuracy criterion. It also supports different weight-pruning techniques, enabling the creation of pruned models that meet a predefined sparsity target.
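As a rough intuition for the quantization that INC automates, the core idea is mapping floats onto a small integer grid with a scale factor; the following is a minimal symmetric int8 sketch, not INC's API:

```python
def quantize(values, num_bits=8):
    # Symmetric quantization: map [-max_abs, max_abs] onto the signed int grid.
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax
    q = [round(v / scale) for v in values]
    return q, scale


def dequantize(q, scale):
    return [x * scale for x in q]


weights = [0.6, -1.0, 0.25]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, restored))
```

INC layers accuracy-driven tuning on top of this idea: it searches quantization configurations and keeps only those whose measured accuracy stays within the user's criterion.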

[OpenVINO](https://docs.openvino.ai) is an open-source toolkit that enables high performance inference capabilities for Intel CPUs, GPUs, and special DL inference accelerators ([see](https://docs.openvino.ai/2024/about-openvino/compatibility-and-support/supported-devices.html) the full list of supported devices). It is supplied with a set of tools to optimize your models with compression techniques such as quantization, pruning and knowledge distillation. Optimum Intel provides a simple interface to optimize your Transformers and Diffusers models, convert them to the OpenVINO Intermediate Representation (IR) format and run inference using OpenVINO Runtime.
3 changes: 2 additions & 1 deletion docs/source/installation.mdx
@@ -22,6 +22,7 @@ To install the latest release of 🤗 Optimum Intel with the corresponding requi
|:-----------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------|
| [Intel Neural Compressor (INC)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html) | `pip install --upgrade --upgrade-strategy eager "optimum[neural-compressor]"`|
| [Intel OpenVINO](https://docs.openvino.ai ) | `pip install --upgrade --upgrade-strategy eager "optimum[openvino]"` |
| [Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) | `pip install --upgrade --upgrade-strategy eager "optimum[ipex]"` |

The `--upgrade-strategy eager` option is needed to ensure `optimum-intel` is upgraded to the latest version.

@@ -42,4 +43,4 @@ or to install from source including dependencies:
python -m pip install "optimum-intel[extras]"@git+https://github.com/huggingface/optimum-intel.git
```

-where `extras` can be one or more of `neural-compressor`, `openvino`, `nncf`.
+where `extras` can be one or more of `ipex`, `neural-compressor`, `openvino`, `nncf`.
126 changes: 126 additions & 0 deletions optimum/commands/export/ipex.py
@@ -0,0 +1,126 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Defines the command line for the export with IPEX."""

import logging
import sys
from pathlib import Path
from typing import TYPE_CHECKING, Optional

from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE

from ...exporters import TasksManager
from ..base import BaseOptimumCLICommand, CommandInfo


logger = logging.getLogger(__name__)


if TYPE_CHECKING:
    from argparse import ArgumentParser, Namespace, _SubParsersAction


def parse_args_ipex(parser: "ArgumentParser"):
    required_group = parser.add_argument_group("Required arguments")
    required_group.add_argument(
        "-m", "--model", type=str, required=True, help="Model ID on huggingface.co or path on disk to load model from."
    )
    required_group.add_argument(
        "output", type=Path, help="Path indicating the directory where to store the generated IPEX model."
    )
    optional_group = parser.add_argument_group("Optional arguments")
    optional_group.add_argument(
        "--task",
        default="auto",
        help=(
            "The task to export the model for. If not specified, the task will be auto-inferred based on the model. Available tasks depend on the model, but are among:"
            f" {str(TasksManager.get_all_tasks())}. For decoder models, use `xxx-with-past` to export the model using past key values in the decoder."
        ),
    )
    optional_group.add_argument(
        "--trust_remote_code",
        action="store_true",
        help=(
            "Allows to use custom code for the modeling hosted in the model repository. This option should only be set for repositories you trust and in which "
            "you have read the code, as it will execute on your local machine arbitrary code present in the model repository."
        ),
    )
    optional_group.add_argument(
        "--library",
        type=str,
        choices=["transformers", "diffusers", "timm", "sentence_transformers"],
        default=None,
        help="The library used to load the model before export. If not provided, will attempt to infer the local checkpoint's library.",
    )
    optional_group.add_argument(
        "--revision", default=None, help="The specific model version to use (a branch name, tag name or commit id)."
    )
    optional_group.add_argument(
        "--token", default=None, help="The token to use as HTTP bearer authorization for remote files."
    )
    optional_group.add_argument(
        "--cache_dir",
        type=str,
        default=HUGGINGFACE_HUB_CACHE,
        help="Path to a directory in which the downloaded model should be cached.",
    )
    optional_group.add_argument(
        "--subfolder",
        type=str,
        default="",
        help="In case the relevant files are located inside a subfolder of the model repository, specify its name here.",
    )
    optional_group.add_argument(
        "--local_files_only", type=bool, default=False, help="Whether to only look at local files and not attempt any download."
    )
    optional_group.add_argument(
        "--force_download", type=bool, default=False, help="Whether to force (re-)downloading the model weights, overriding any cached version."
    )
    optional_group.add_argument(
        "--commit_hash", default=None, help="The specific commit hash to pass to the model loading call."
    )
    optional_group.add_argument(
        "--torch_dtype",
        type=str,
        default="float32",
        help="The dtype to load the model in: one of `float32`, `float16` or `bfloat16`.",
    )


class IPEXExportCommand(BaseOptimumCLICommand):
    COMMAND = CommandInfo(name="ipex", help="Export PyTorch models to IPEX IR.")

    def __init__(
        self,
        subparsers: "_SubParsersAction",
        args: Optional["Namespace"] = None,
        command: Optional["CommandInfo"] = None,
        from_defaults_factory: bool = False,
        parser: Optional["ArgumentParser"] = None,
    ):
        super().__init__(
            subparsers, args=args, command=command, from_defaults_factory=from_defaults_factory, parser=parser
        )
        self.args_string = " ".join(sys.argv[3:])

    @staticmethod
    def parse_args(parser: "ArgumentParser"):
        return parse_args_ipex(parser)

    def run(self):
        import torch

        from optimum.intel.ipex.utils import _HEAD_TO_AUTOMODELS

        if self.args.torch_dtype == "bfloat16":
            torch_dtype = torch.bfloat16
        elif self.args.torch_dtype == "float16":
            torch_dtype = torch.float16
        else:
            torch_dtype = torch.float32

        model_kwargs = {
            "revision": self.args.revision,
            "token": self.args.token,
            "cache_dir": self.args.cache_dir,
            "subfolder": self.args.subfolder,
            "local_files_only": self.args.local_files_only,
            "force_download": self.args.force_download,
            "commit_hash": self.args.commit_hash,
            "torch_dtype": torch_dtype,
            "trust_remote_code": self.args.trust_remote_code,
        }

        task = TasksManager.infer_task_from_model(self.args.model) if self.args.task == "auto" else self.args.task
        if task not in _HEAD_TO_AUTOMODELS:
            raise ValueError(f"{task} is not supported, please choose from {_HEAD_TO_AUTOMODELS}")

        model_class = _HEAD_TO_AUTOMODELS[task]
        model = eval(model_class).from_pretrained(self.args.model, **model_kwargs)
        model.save_pretrained(self.args.output)
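The `if/elif` dtype selection in `run()` above is a small string-to-dtype dispatch; it can equally be written as a table lookup. The sketch below uses placeholder strings in place of real torch dtypes so it needs no imports:

```python
# String-to-dtype table standing in for the if/elif chain in run().
# The values are placeholder strings; the real command maps to torch dtypes.
DTYPES = {"bfloat16": "torch.bfloat16", "float16": "torch.float16"}


def resolve_torch_dtype(name):
    # Anything unrecognized falls back to float32, as in the original chain.
    return DTYPES.get(name, "torch.float32")


assert resolve_torch_dtype("bfloat16") == "torch.bfloat16"
assert resolve_torch_dtype("weird") == "torch.float32"
```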
19 changes: 19 additions & 0 deletions optimum/commands/register/register_ipex.py
@@ -0,0 +1,19 @@
# Copyright 2023 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from ..export import ExportCommand
from ..export.openvino import OVExportCommand


REGISTER_COMMANDS = [(OVExportCommand, ExportCommand)]
4 changes: 2 additions & 2 deletions optimum/commands/register/register_openvino.py
@@ -13,7 +13,7 @@
# limitations under the License.

 from ..export import ExportCommand
-from ..export.openvino import OVExportCommand
+from ..export.ipex import IPEXExportCommand
 
 
-REGISTER_COMMANDS = [(OVExportCommand, ExportCommand)]
+REGISTER_COMMANDS = [(IPEXExportCommand, ExportCommand)]
4 changes: 4 additions & 0 deletions optimum/intel/ipex/utils.py
@@ -14,8 +14,12 @@


_HEAD_TO_AUTOMODELS = {
    "feature-extraction": "IPEXModel",
    "text-generation": "IPEXModelForCausalLM",
    "text-classification": "IPEXModelForSequenceClassification",
    "token-classification": "IPEXModelForTokenClassification",
    "question-answering": "IPEXModelForQuestionAnswering",
    "fill-mask": "IPEXModelForMaskedLM",
    "image-classification": "IPEXModelForImageClassification",
    "audio-classification": "IPEXModelForAudioClassification",
}
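A note on how this table is consumed: the export command looks up the class name for a task and currently materializes it with `eval`. The sketch below shows the same dispatch with explicit validation, using a trimmed copy of the table and returning the class name as a string (in the real code one could then resolve it with `getattr` on the `optimum.intel` module instead of `eval`):

```python
# Trimmed copy of the task-to-class table for illustration.
_HEAD_TO_AUTOMODELS = {
    "feature-extraction": "IPEXModel",
    "text-generation": "IPEXModelForCausalLM",
    "text-classification": "IPEXModelForSequenceClassification",
    "question-answering": "IPEXModelForQuestionAnswering",
}


def resolve_model_class_name(task):
    # Mirrors the validation in IPEXExportCommand.run(): unknown tasks fail
    # loudly with the list of supported choices.
    if task not in _HEAD_TO_AUTOMODELS:
        raise ValueError(f"{task} is not supported, please choose from {sorted(_HEAD_TO_AUTOMODELS)}")
    return _HEAD_TO_AUTOMODELS[task]


assert resolve_model_class_name("text-generation") == "IPEXModelForCausalLM"
```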