
Ipex doc #828

Merged · 32 commits · Aug 27, 2024
Changes from 4 commits

Commits:
d7b0fc4
change readme, source/index, source/installation
jiqing-feng Jul 16, 2024
78f7c61
add ipex doc 1st step
jiqing-feng Jul 17, 2024
b531a72
update readme for command line usage
jiqing-feng Jul 17, 2024
90d9000
fix bug for ipex readme
jiqing-feng Jul 17, 2024
b39be97
add export doc
jiqing-feng Jul 17, 2024
a90cb23
update all ipex docs
jiqing-feng Jul 17, 2024
e884158
rm diffusers
jiqing-feng Jul 17, 2024
2100cd9
change register
jiqing-feng Jul 17, 2024
84305bc
Update README.md
jiqing-feng Jul 17, 2024
23f8756
Update docs/source/installation.mdx
jiqing-feng Jul 17, 2024
644d197
fix readme
jiqing-feng Jul 17, 2024
fde311c
fix ipex exporter args comments
jiqing-feng Jul 17, 2024
a9a2c38
extend ipex export explain
jiqing-feng Jul 17, 2024
4368205
fix ipex reference.mdx
jiqing-feng Jul 18, 2024
c31696d
add comments for auto doc
jiqing-feng Jul 18, 2024
c5412da
rm cli export
jiqing-feng Jul 18, 2024
291c73d
Update optimum/commands/export/ipex.py
jiqing-feng Jul 18, 2024
8772c51
rm commit hash in export command
jiqing-feng Jul 18, 2024
39f27dd
rm export
jiqing-feng Jul 22, 2024
1d8fc29
rm jit
jiqing-feng Jul 22, 2024
02fa235
add ipex on doc's docker file
jiqing-feng Jul 26, 2024
4ed2620
indicate that ipex model only supports for cpu and the export format …
jiqing-feng Jul 26, 2024
770d82f
Update docs/source/ipex/inference.mdx
jiqing-feng Jul 29, 2024
0a9ce3d
explain patching
jiqing-feng Jul 29, 2024
378144e
rm ipex reference
jiqing-feng Aug 6, 2024
be7097d
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
21f06cf
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
ae8143a
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
9a25ac7
Update docs/source/index.mdx
echarlaix Aug 26, 2024
52bec25
Update docs/source/ipex/inference.mdx
echarlaix Aug 26, 2024
d6153b7
Update docs/source/ipex/models.mdx
echarlaix Aug 26, 2024
8115bf6
Update docs/Dockerfile
echarlaix Aug 26, 2024
12 changes: 11 additions & 1 deletion README.md
@@ -210,8 +210,14 @@ You can find more examples in the [documentation](https://huggingface.co/docs/op


## IPEX
IPEX export can be used through the Optimum command-line interface:
```bash
optimum-cli export ipex -m gpt2 --torch_dtype bfloat16 ipex-gpt2
```

To load your IPEX model, you can just replace your `AutoModelForXxx` class with the corresponding `IPEXModelForXxx` class. Set `export=True` to load a PyTorch checkpoint, export your model via TorchScript and apply IPEX optimizations: both operator optimization (replacement with custom IPEX operators) and graph-level optimization (such as operator fusion) will be applied to your model.
```diff
import torch
from transformers import AutoTokenizer, pipeline
- from transformers import AutoModelForCausalLM
+ from optimum.intel import IPEXModelForCausalLM
@@ -224,14 +230,18 @@ To load your IPEX model, you can just replace your `AutoModelForXxx` class with
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
results = pipe("He's a dreadful magician and")

+ # You can also use the model exported by Optimum command-line interface
+ exported_model = IPEXModelForCausalLM.from_pretrained("ipex-gpt2")
+ pipe.model = exported_model
+ results = pipe("He's a dreadful magician and")
```
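The `export=True` flow described in the README snippet — load a PyTorch checkpoint, trace it, then apply IPEX optimizations — can be sketched as a plain-Python pipeline. Every name below (`FakeCheckpoint`, `trace_to_graph`, `apply_ipex_optimizations`, `from_pretrained`) is a hypothetical stand-in to show the control flow, not the optimum-intel implementation:

```python
class FakeCheckpoint:
    """Stand-in for a PyTorch checkpoint on disk (hypothetical)."""

    def __init__(self, name):
        self.name = name


def load_checkpoint(name):
    # Stands in for loading the raw transformers checkpoint.
    return FakeCheckpoint(name)


def trace_to_graph(model):
    # Stands in for TorchScript tracing of the eager model.
    return {"graph_of": model.name, "optimized": False}


def apply_ipex_optimizations(graph):
    # Stands in for operator replacement + graph-level fusion.
    return dict(graph, optimized=True)


def from_pretrained(name, export=False):
    model = load_checkpoint(name)
    if export:
        return apply_ipex_optimizations(trace_to_graph(model))
    return model


exported = from_pretrained("gpt2", export=True)
assert exported == {"graph_of": "gpt2", "optimized": True}
```

In the real library, tracing is done with TorchScript and the optimization step swaps in custom IPEX operators and fuses compatible ones.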

For more details, please refer to the [documentation](https://intel.github.io/intel-extension-for-pytorch/#introduction).


## Running the examples

-Check out the [`examples`](https://github.com/huggingface/optimum-intel/tree/main/examples) directory to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.
+Check out the [`examples`](https://github.com/huggingface/optimum-intel/tree/main/examples) and [`notebooks`](https://github.com/huggingface/optimum-intel/tree/main/notebooks) directories to see how 🤗 Optimum Intel can be used to optimize models and accelerate inference.

Do not forget to install requirements for every example:

17 changes: 17 additions & 0 deletions docs/source/_toctree.yml
@@ -30,5 +30,22 @@
      title: Tutorials
      isExpanded: false
    title: OpenVINO
  - sections:
    - local: ipex/export
      title: Export
    - local: ipex/inference
      title: Inference
    - local: ipex/optimization
      title: Optimization
    - local: ipex/models
      title: Supported Models
    - local: ipex/reference
      title: Reference
    - sections:
      - local: ipex/tutorials/notebooks
        title: Notebooks
      title: Tutorials
      isExpanded: false
    title: IPEX
  title: Optimum Intel
  isExpanded: false
2 changes: 2 additions & 0 deletions docs/source/index.mdx
@@ -19,6 +19,8 @@ limitations under the License.

🤗 Optimum Intel is the interface between the 🤗 Transformers and Diffusers libraries and the different tools and libraries provided by Intel to accelerate end-to-end pipelines on Intel architectures.

[Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) is an open-source library that provides optimizations for both eager mode and graph mode. Compared to eager mode, graph mode in PyTorch normally yields better performance thanks to optimization techniques such as operation fusion.
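The benefit of graph mode hinted at above can be illustrated in plain Python: a compiler that sees both operations at once can fuse them into a single pass over the data, avoiding an intermediate buffer. The functions below are illustrative stand-ins, not IPEX or PyTorch APIs:

```python
def scale_then_shift_eager(xs, a, b):
    # Eager style: each op runs separately, materializing an intermediate list.
    scaled = [a * x for x in xs]      # first kernel
    return [s + b for s in scaled]    # second kernel


def scale_then_shift_fused(xs, a, b):
    # Fused style: one pass, no intermediate buffer — what operation fusion
    # achieves for real tensor kernels in graph mode.
    return [a * x + b for x in xs]


data = [1.0, 2.0, 3.0]
assert scale_then_shift_eager(data, 2.0, 1.0) == scale_then_shift_fused(data, 2.0, 1.0) == [3.0, 5.0, 7.0]
```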

[Intel Neural Compressor](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html) is an open-source library enabling the use of the most popular compression techniques, such as quantization, pruning and knowledge distillation. It supports automatic accuracy-driven tuning strategies so that users can easily generate quantized models. Users can apply static, dynamic and quantization-aware training approaches while specifying an expected accuracy criterion. It also supports different weight-pruning techniques, enabling the creation of pruned models that meet a predefined sparsity target.
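As a rough intuition for the quantization that INC automates, the core idea is mapping floats onto a small integer grid with a scale factor; the following is a minimal symmetric int8 sketch, not INC's API:

```python
def quantize(values, num_bits=8):
    # Symmetric quantization: map [-max_abs, max_abs] onto the signed int grid.
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax
    q = [round(v / scale) for v in values]
    return q, scale


def dequantize(q, scale):
    return [x * scale for x in q]


weights = [0.6, -1.0, 0.25]
q, scale = quantize(weights)
restored = dequantize(q, scale)
# Round-trip error is bounded by half a quantization step.
assert all(abs(w - r) <= scale / 2 for w, r in zip(weights, restored))
```

INC layers accuracy-driven tuning on top of this idea: it searches quantization configurations and keeps only those whose measured accuracy stays within the user's criterion.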

[OpenVINO](https://docs.openvino.ai) is an open-source toolkit that enables high performance inference capabilities for Intel CPUs, GPUs, and special DL inference accelerators ([see](https://docs.openvino.ai/2024/about-openvino/compatibility-and-support/supported-devices.html) the full list of supported devices). It is supplied with a set of tools to optimize your models with compression techniques such as quantization, pruning and knowledge distillation. Optimum Intel provides a simple interface to optimize your Transformers and Diffusers models, convert them to the OpenVINO Intermediate Representation (IR) format and run inference using OpenVINO Runtime.
3 changes: 2 additions & 1 deletion docs/source/installation.mdx
@@ -22,6 +22,7 @@ To install the latest release of 🤗 Optimum Intel with the corresponding requi
|:-----------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------|
| [Intel Neural Compressor (INC)](https://www.intel.com/content/www/us/en/developer/tools/oneapi/neural-compressor.html) | `pip install --upgrade --upgrade-strategy eager "optimum[neural-compressor]"`|
| [Intel OpenVINO](https://docs.openvino.ai ) | `pip install --upgrade --upgrade-strategy eager "optimum[openvino]"` |
| [Intel Extension for PyTorch](https://intel.github.io/intel-extension-for-pytorch/#introduction) | `pip install --upgrade --upgrade-strategy eager "optimum[ipex]"` |

The `--upgrade-strategy eager` option is needed to ensure `optimum-intel` is upgraded to the latest version.

@@ -42,4 +43,4 @@ or to install from source including dependencies:
python -m pip install "optimum-intel[extras]"@git+https://github.com/huggingface/optimum-intel.git
```

-where `extras` can be one or more of `neural-compressor`, `openvino`, `nncf`.
+where `extras` can be one or more of `ipex`, `neural-compressor`, `openvino`, `nncf`.
126 changes: 126 additions & 0 deletions optimum/commands/export/ipex.py
@@ -0,0 +1,126 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Defines the command line for the export with IPEX."""

import logging
import sys
from pathlib import Path
from typing import TYPE_CHECKING, Optional

from huggingface_hub.constants import HUGGINGFACE_HUB_CACHE

from ...exporters import TasksManager
from ..base import BaseOptimumCLICommand, CommandInfo


logger = logging.getLogger(__name__)


if TYPE_CHECKING:
    from argparse import ArgumentParser, Namespace, _SubParsersAction


def parse_args_ipex(parser: "ArgumentParser"):
    required_group = parser.add_argument_group("Required arguments")
    required_group.add_argument(
        "-m", "--model", type=str, required=True, help="Model ID on huggingface.co or path on disk to load model from."
    )
    required_group.add_argument(
        "output", type=Path, help="Path indicating the directory where to store the generated IPEX model."
    )
    optional_group = parser.add_argument_group("Optional arguments")
    optional_group.add_argument(
        "--task",
        default="auto",
        help=(
            "The task to export the model for. If not specified, the task will be auto-inferred based on the model. Available tasks depend on the model, but are among:"
            f" {str(TasksManager.get_all_tasks())}. For decoder models, use `xxx-with-past` to export the model using past key values in the decoder."
        ),
    )
    optional_group.add_argument(
        "--trust_remote_code",
        action="store_true",
        help=(
            "Allows to use custom code for the modeling hosted in the model repository. This option should only be set for repositories you trust and in which "
            "you have read the code, as it will execute on your local machine arbitrary code present in the model repository."
        ),
    )
    optional_group.add_argument(
        "--library",
        type=str,
        choices=["transformers", "diffusers", "timm", "sentence_transformers"],
        default=None,
        help="The library used to load the model before export. If not provided, will attempt to infer the local checkpoint's library.",
    )
    optional_group.add_argument(
        "--revision", default=None, help="The specific model version to use (a branch name, tag name or commit id)."
    )
    optional_group.add_argument(
        "--token", default=None, help="The token to use as HTTP bearer authorization for remote files."
    )
    optional_group.add_argument(
        "--cache_dir",
        type=str,
        default=HUGGINGFACE_HUB_CACHE,
        help="Path to a directory in which the downloaded model should be cached.",
    )
    optional_group.add_argument(
        "--subfolder",
        type=str,
        default="",
        help="In case the relevant files are located inside a subfolder of the model repository, specify its name here.",
    )
    optional_group.add_argument(
        "--local_files_only", type=bool, default=False, help="Whether to only look at local files and not attempt any download."
    )
    optional_group.add_argument(
        "--force_download", type=bool, default=False, help="Whether to force (re-)downloading the model weights, overriding any cached version."
    )
    optional_group.add_argument(
        "--commit_hash", default=None, help="The specific commit hash to pass to the model loading call."
    )
    optional_group.add_argument(
        "--torch_dtype",
        type=str,
        default="float32",
        help="The dtype to load the model in: one of `float32`, `float16` or `bfloat16`.",
    )


class IPEXExportCommand(BaseOptimumCLICommand):
    COMMAND = CommandInfo(name="ipex", help="Export PyTorch models to IPEX IR.")

    def __init__(
        self,
        subparsers: "_SubParsersAction",
        args: Optional["Namespace"] = None,
        command: Optional["CommandInfo"] = None,
        from_defaults_factory: bool = False,
        parser: Optional["ArgumentParser"] = None,
    ):
        super().__init__(
            subparsers, args=args, command=command, from_defaults_factory=from_defaults_factory, parser=parser
        )
        self.args_string = " ".join(sys.argv[3:])

    @staticmethod
    def parse_args(parser: "ArgumentParser"):
        return parse_args_ipex(parser)

    def run(self):
        import torch

        from optimum.intel.ipex.utils import _HEAD_TO_AUTOMODELS

        if self.args.torch_dtype == "bfloat16":
            torch_dtype = torch.bfloat16
        elif self.args.torch_dtype == "float16":
            torch_dtype = torch.float16
        else:
            torch_dtype = torch.float32

        model_kwargs = {
            "revision": self.args.revision,
            "token": self.args.token,
            "cache_dir": self.args.cache_dir,
            "subfolder": self.args.subfolder,
            "local_files_only": self.args.local_files_only,
            "force_download": self.args.force_download,
            "commit_hash": self.args.commit_hash,
            "torch_dtype": torch_dtype,
            "trust_remote_code": self.args.trust_remote_code,
        }

        task = TasksManager.infer_task_from_model(self.args.model) if self.args.task == "auto" else self.args.task
        if task not in _HEAD_TO_AUTOMODELS:
            raise ValueError(f"{task} is not supported, please choose from {_HEAD_TO_AUTOMODELS}")

        model_class = _HEAD_TO_AUTOMODELS[task]
        model = eval(model_class).from_pretrained(self.args.model, **model_kwargs)
        model.save_pretrained(self.args.output)
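The `if/elif` dtype selection in `run()` above is a small string-to-dtype dispatch; it can equally be written as a table lookup. The sketch below uses placeholder strings in place of real torch dtypes so it needs no imports:

```python
# String-to-dtype table standing in for the if/elif chain in run().
# The values are placeholder strings; the real command maps to torch dtypes.
DTYPES = {"bfloat16": "torch.bfloat16", "float16": "torch.float16"}


def resolve_torch_dtype(name):
    # Anything unrecognized falls back to float32, as in the original chain.
    return DTYPES.get(name, "torch.float32")


assert resolve_torch_dtype("bfloat16") == "torch.bfloat16"
assert resolve_torch_dtype("weird") == "torch.float32"
```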
19 changes: 19 additions & 0 deletions optimum/commands/register/register_ipex.py
@@ -0,0 +1,19 @@
# Copyright 2023 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from ..export import ExportCommand
from ..export.openvino import OVExportCommand


REGISTER_COMMANDS = [(OVExportCommand, ExportCommand)]
4 changes: 2 additions & 2 deletions optimum/commands/register/register_openvino.py
@@ -13,7 +13,7 @@
# limitations under the License.

 from ..export import ExportCommand
-from ..export.openvino import OVExportCommand
+from ..export.ipex import IPEXExportCommand
 
 
-REGISTER_COMMANDS = [(OVExportCommand, ExportCommand)]
+REGISTER_COMMANDS = [(IPEXExportCommand, ExportCommand)]
4 changes: 4 additions & 0 deletions optimum/intel/ipex/utils.py
@@ -14,8 +14,12 @@


_HEAD_TO_AUTOMODELS = {
    "feature-extraction": "IPEXModel",
    "text-generation": "IPEXModelForCausalLM",
    "text-classification": "IPEXModelForSequenceClassification",
    "token-classification": "IPEXModelForTokenClassification",
    "question-answering": "IPEXModelForQuestionAnswering",
    "fill-mask": "IPEXModelForMaskedLM",
    "image-classification": "IPEXModelForImageClassification",
    "audio-classification": "IPEXModelForAudioClassification",
}
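A note on how this table is consumed: the export command looks up the class name for a task and currently materializes it with `eval`. The sketch below shows the same dispatch with explicit validation, using a trimmed copy of the table and returning the class name as a string (in the real code one could then resolve it with `getattr` on the `optimum.intel` module instead of `eval`):

```python
# Trimmed copy of the task-to-class table for illustration.
_HEAD_TO_AUTOMODELS = {
    "feature-extraction": "IPEXModel",
    "text-generation": "IPEXModelForCausalLM",
    "text-classification": "IPEXModelForSequenceClassification",
    "question-answering": "IPEXModelForQuestionAnswering",
}


def resolve_model_class_name(task):
    # Mirrors the validation in IPEXExportCommand.run(): unknown tasks fail
    # loudly with the list of supported choices.
    if task not in _HEAD_TO_AUTOMODELS:
        raise ValueError(f"{task} is not supported, please choose from {sorted(_HEAD_TO_AUTOMODELS)}")
    return _HEAD_TO_AUTOMODELS[task]


assert resolve_model_class_name("text-generation") == "IPEXModelForCausalLM"
```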