This document lists the steps to reproduce the PyTorch DLRM tuning zoo result. The original DLRM README is available in the DLRM repository.
Note
Please ensure your machine has more than 370 GB of memory to run DLRM with IPEX version >= 1.10.
PyTorch 1.11 or a higher version is required for the pytorch_fx backend.
# Install dependency
```shell
cd examples/pytorch/recommendation/dlrm/quantization/ptq/ipex
pip install -r requirements.txt
```
# Prepare Dataset
The code supports the Criteo Terabyte Dataset:
- Download the raw data files day_0.gz, ..., day_23.gz and unzip them.
- Specify the location of the unzipped text files day_0, ..., day_23 using --raw-data-file=<path/day> (the day number is appended automatically); see the "Run" commands below.
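The unzip step above can be sketched as follows. `RAW_DIR` and the stand-in files created below are illustration-only assumptions; for real use, point `RAW_DIR` at your download directory and delete the stand-in loop:

```shell
# Sketch (paths are assumptions): unzip day_0.gz ... day_23.gz so that
# --raw-data-file can point at the extracted day_<n> text files.
RAW_DIR="${RAW_DIR:-./criteo_demo}"
mkdir -p "$RAW_DIR"
# Stand-ins for the real (multi-GB) files -- remove this loop for real data:
for i in $(seq 0 23); do echo "demo" | gzip > "${RAW_DIR}/day_${i}.gz"; done
# Unzip all 24 days, keeping the compressed originals (-k):
for i in $(seq 0 23); do gunzip -kf "${RAW_DIR}/day_${i}.gz"; done
```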
Download the DLRM PyTorch weights (tb00_40M.pt, 90 GB) from the MLPerf repo.
# Run
```shell
cd examples/pytorch/recommendation/dlrm/quantization/ptq/ipex
bash run_tuning.sh --input_model="/path/of/pretrained/model" --dataset_location="/path/of/dataset"
bash run_benchmark.sh --input_model="/path/of/pretrained/model" --dataset_location="/path/of/dataset" --mode=accuracy --int8=true
```
This is a tutorial on how to enable the DLRM model with Intel® Neural Compressor.
Intel® Neural Compressor supports two usages:

- The user specifies the fp32 'model', calibration dataset 'q_dataloader', evaluation dataset 'eval_dataloader', and metrics in the tuning.metrics field of the model-specific yaml config file.
- The user specifies the fp32 'model', calibration dataset 'q_dataloader', and a custom 'eval_func' which encapsulates the evaluation dataset and metric by itself.

Here we use the second usage.
In the examples directory there is a conf.yaml. We can remove most of the items and keep only the mandatory items for tuning.
```yaml
model:
  name: dlrm
  framework: pytorch_ipex

device: cpu

quantization:
  calibration:
    sampling_size: 102400

tuning:
  accuracy_criterion:
    relative: 0.01
  exit_policy:
    timeout: 0
  random_seed: 9527
```
Here we set the accuracy target to tolerate a 0.01 relative accuracy loss from the baseline. The default tuning strategy is the basic strategy. timeout: 0 means tuning stops early as soon as a tuning config meets the accuracy target.
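As a concrete illustration of what `relative: 0.01` means, the sketch below checks a candidate int8 metric against the baseline; the AUC numbers are made up, not measured DLRM results:

```python
# Illustration of the relative accuracy_criterion; the baseline AUC value
# below is hypothetical, not a measured DLRM result.
baseline_auc = 0.8025            # assumed FP32 baseline metric
relative_loss = 0.01             # accuracy_criterion.relative in conf.yaml

# A tuned int8 config is accepted when its metric stays within 1% of baseline:
threshold = baseline_auc * (1 - relative_loss)

def meets_criterion(int8_auc):
    return int8_auc >= threshold

print(round(threshold, 6))       # 0.794475
print(meets_criterion(0.7990))   # True  (~0.4% relative loss)
print(meets_criterion(0.7900))   # False (~1.6% relative loss)
```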
Note: Intel® Neural Compressor does NOT support the "mse" tuning strategy for the PyTorch framework.
We need to update dlrm_s_pytorch.py as below:
```python
class DLRM_DataLoader(object):
    """Wrap the DLRM loader so batches are yielded as ((X, lS_o, lS_i), T)."""
    def __init__(self, loader=None):
        self.loader = loader
        self.batch_size = loader.dataset.batch_size

    def __iter__(self):
        for X_test, lS_o_test, lS_i_test, T in self.loader:
            yield (X_test, lS_o_test, lS_i_test), T
```

```python
from neural_compressor.experimental import Quantization, common

def eval_func(model):
    # Run inference in int8 mode when a quantization config is attached.
    args.int8 = False if model.ipex_config_path is None else True
    args.int8_configure = "" \
        if model.ipex_config_path is None else model.ipex_config_path
    with torch.no_grad():
        return inference(
            args,
            model,
            best_acc_test,
            best_auc_test,
            test_ld,
            trace=args.int8
        )

assert args.inference_only, "Please set inference_only in arguments"
quantizer = Quantization("./conf_ipex.yaml")
quantizer.model = common.Model(dlrm)
quantizer.calib_dataloader = DLRM_DataLoader(train_ld)
quantizer.eval_func = eval_func
q_model = quantizer.fit()
q_model.save("/path/to/save/results")
return
```