Swin Transformer inference best known configurations with Intel® Extension for PyTorch.
Use Case | Framework | Model Repo | Branch/Commit/Tag | Optional Patch |
---|---|---|---|---|
Inference | PyTorch | https://github.com/microsoft/Swin-Transformer | main/afeb877fba1139dfbc186276983af2abb02c2196 | - |
- Host has Intel® Data Center GPU Flex Series
- Host has installed the latest Intel® Data Center GPU Flex Series Drivers https://dgpu-docs.intel.com/driver/installation.html
- The following Intel® oneAPI Base Toolkit components are required:
  - Intel® oneAPI DPC++ Compiler (Placeholder DPCPPROOT as its installation path)
  - Intel® oneAPI Math Kernel Library (oneMKL) (Placeholder MKLROOT as its installation path)
  - Intel® oneAPI MPI Library
  - Intel® oneAPI TBB Library

  Follow the instructions on the Intel® oneAPI Base Toolkit Download page to set up the package manager repository.
ImageNet is the recommended dataset; it can be downloaded from https://image-net.org/challenges/LSVRC/2012/2012-downloads.php.
We recommend downloading ILSVRC2012_img_train_t3.tar. To create the training directory, extract it as follows:
mkdir imagenet
cp ILSVRC2012_img_train_t3.tar imagenet/
cd imagenet
tar -xvf ILSVRC2012_img_train_t3.tar --one-top-level=images
find images/ -name '*.tar' -execdir tar -xvf '{}' --one-top-level \;
You will need to reorganize the files in this package into training, testing and validation directories. The easiest way to do that is with the split-folders Python tool.
pip install split-folders
rm images/*.tar
split-folders --ratio .8 .1 .1 --output dataset images
The imagenet/dataset directory is your dataset directory.
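Before pointing DATASET_DIR at this directory, it can help to confirm that the split produced the layout you expect. The following is a minimal sketch, not part of the repository: it assumes split-folders wrote train, val and test subdirectories, each containing one folder per class, and the function name summarize_split is hypothetical.

```python
from pathlib import Path


def summarize_split(dataset_dir):
    """Return {split: (num_classes, num_files)} for each split directory found.

    Assumes the layout <dataset_dir>/<split>/<class>/<image>, which is what
    split-folders is expected to produce with a three-way --ratio.
    """
    summary = {}
    for split in ("train", "val", "test"):
        split_dir = Path(dataset_dir) / split
        if not split_dir.is_dir():
            continue  # split missing: leave it out so the caller can notice
        class_dirs = [d for d in split_dir.iterdir() if d.is_dir()]
        num_files = sum(1 for d in class_dirs for f in d.iterdir() if f.is_file())
        summary[split] = (len(class_dirs), num_files)
    return summary


if __name__ == "__main__":
    for split, (classes, files) in summarize_split("imagenet/dataset").items():
        print(f"{split}: {classes} classes, {files} files")
```

A split that is absent from the printed summary, or one with zero files, usually means the extraction or the split step did not complete.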
Alternatively, you can edit run_model.sh to include the parameter --dummy when it invokes main_no_ddp.py.
This flag makes the script use randomly generated data instead. The output is not meaningful as a model evaluation, but it lets you get up and running quickly without downloading a dataset or running split-folders.
git clone https://github.com/IntelAI/models.git
cd models/models_v2/pytorch/swin-transformer/inference/gpu
- Create virtual environment venv and activate it:
python3 -m venv venv
. ./venv/bin/activate
- Run setup.sh
source ./setup.sh
- Install the latest GPU versions of torch, torchvision and intel_extension_for_pytorch:
python -m pip install torch==<torch_version> torchvision==<torchvision_version> intel-extension-for-pytorch==<ipex_version> --extra-index-url https://pytorch-extension.intel.com/release-whl-aitools/
- Set environment variables for Intel® oneAPI Base Toolkit:
Default installation location {ONEAPI_ROOT} is /opt/intel/oneapi for root account, ${HOME}/intel/oneapi for other accounts
source {ONEAPI_ROOT}/compiler/latest/env/vars.sh
source {ONEAPI_ROOT}/mkl/latest/env/vars.sh
source {ONEAPI_ROOT}/tbb/latest/env/vars.sh
source {ONEAPI_ROOT}/mpi/latest/env/vars.sh
source {ONEAPI_ROOT}/ccl/latest/env/vars.sh
- Set up required environment parameters
Parameter | export command |
---|---|
MULTI_TILE | export MULTI_TILE=False |
PLATFORM | export PLATFORM=Flex (Flex) |
DATASET_DIR | export DATASET_DIR= |
OUTPUT_DIR | export OUTPUT_DIR=$PWD |
BATCH_SIZE (optional) | export BATCH_SIZE=512 |
NUM_ITERATIONS (optional) | export NUM_ITERATIONS=500 |
- Run run_model.sh
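The parameters in the table can be sanity-checked before launching the script. The following is a minimal sketch, not part of the repository: the helper name load_run_config is hypothetical, the fallback values are simply the example values from the table (the script's real defaults may differ), and the rules that MULTI_TILE is True or False and that DATASET_DIR should be non-empty are assumptions. The DATASET_DIR check does not apply if you run with --dummy.

```python
import os


def load_run_config(env=None):
    """Collect the run parameters from environment variables.

    Fallbacks mirror the example values in the table; they are assumptions,
    not necessarily the defaults used by run_model.sh.
    """
    env = os.environ if env is None else env
    config = {
        "MULTI_TILE": env.get("MULTI_TILE", "False"),
        "PLATFORM": env.get("PLATFORM", "Flex"),
        "DATASET_DIR": env.get("DATASET_DIR", ""),
        "OUTPUT_DIR": env.get("OUTPUT_DIR", os.getcwd()),
        "BATCH_SIZE": int(env.get("BATCH_SIZE", "512")),
        "NUM_ITERATIONS": int(env.get("NUM_ITERATIONS", "500")),
    }
    problems = []
    if config["MULTI_TILE"] not in ("True", "False"):
        problems.append("MULTI_TILE should be True or False")
    if not config["DATASET_DIR"]:
        problems.append("DATASET_DIR is empty (fine only when running with --dummy)")
    if config["BATCH_SIZE"] <= 0 or config["NUM_ITERATIONS"] <= 0:
        problems.append("BATCH_SIZE and NUM_ITERATIONS must be positive")
    return config, problems


if __name__ == "__main__":
    config, problems = load_run_config()
    print(config)
    for problem in problems:
        print("warning:", problem)
```

Passing a plain dictionary instead of relying on os.environ makes the check easy to exercise without touching the shell environment.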
Single-tile output typically looks like:
[2023-12-28 07:17:30 swin_base_patch4_window7_224](main_no_ddp.py 383): INFO Latency: 1.199571
[2023-12-28 07:17:30 swin_base_patch4_window7_224](main_no_ddp.py 384): INFO Throughput: 426.819398
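If you want to collect these numbers across runs, the two log lines can be pulled out with a small regular expression. The sketch below is based only on the sample output, so the pattern is an assumption about the log format rather than a documented interface. It also checks an observation: with the example BATCH_SIZE=512, the reported throughput is consistent with batch size divided by latency (512 / 1.199571 ≈ 426.82), which suggests the latency is the time per batch in seconds.

```python
import re

# Pattern inferred from the sample log lines; not a documented format.
METRIC_RE = re.compile(r"INFO (Latency|Throughput): ([0-9.]+)")


def parse_metrics(log_text):
    """Return the last reported value for each metric, keyed by lowercase name."""
    metrics = {}
    for name, value in METRIC_RE.findall(log_text):
        metrics[name.lower()] = float(value)
    return metrics


sample = """\
[2023-12-28 07:17:30 swin_base_patch4_window7_224](main_no_ddp.py 383): INFO Latency: 1.199571
[2023-12-28 07:17:30 swin_base_patch4_window7_224](main_no_ddp.py 384): INFO Throughput: 426.819398
"""

metrics = parse_metrics(sample)
batch_size = 512  # example BATCH_SIZE from the parameter table
implied_throughput = batch_size / metrics["latency"]
print(metrics)
print(round(implied_throughput, 2))
```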
Final results of the inference run can be found in the results.yaml file.
results:
 - key: throughput
   value: 426.8194
   unit: fps
 - key: latency
   value: 1.1995705968359012
   unit: s
 - key: accuracy
   value: 7.535
   unit: loss
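To read these values from a script, a few lines of standard-library Python are enough for this particular shape. The sketch below is an assumption-laden illustration: parse_results is a hypothetical helper that handles only the flat list of key/value/unit entries shown here, and it is not a general YAML parser. For anything more complex, a real YAML library is the safer choice.

```python
def parse_results(text):
    """Parse the flat 'results:' list into {key: (value, unit)}.

    Handles only the simple shape shown in this document.
    """
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "results:":
            continue
        if line.startswith("- "):
            # A leading dash starts a new entry in the list.
            current = {}
            entries.append(current)
            line = line[2:].strip()
        if current is None or ":" not in line:
            continue  # ignore anything outside a list entry
        field, _, value = line.partition(":")
        current[field.strip()] = value.strip()
    return {e["key"]: (float(e["value"]), e["unit"]) for e in entries}


sample = """\
results:
 - key: throughput
   value: 426.8194
   unit: fps
 - key: latency
   value: 1.1995705968359012
   unit: s
 - key: accuracy
   value: 7.535
   unit: loss
"""

results = parse_results(sample)
for key, (value, unit) in results.items():
    print(f"{key}: {value} {unit}")
```

Note that the accuracy entry in this sample carries the unit loss, so the number is a loss value rather than a percentage.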