diff --git a/docs/en_US/CommunitySharings/ModelCompressionComparison.md b/docs/en_US/CommunitySharings/ModelCompressionComparison.md
new file mode 100644
index 0000000000..ba273f9581
--- /dev/null
+++ b/docs/en_US/CommunitySharings/ModelCompressionComparison.md
@@ -0,0 +1,89 @@
+# Comparison of Filter Pruning Algorithms
+
+To provide an initial insight into the performance of various filter pruning algorithms,
+we conduct extensive experiments with various pruning algorithms on several benchmark models and datasets.
+We present the experiment results in this document.
+In addition, we provide friendly instructions for reproducing these experiments to facilitate further contributions to this effort.
+
+## Experiment Setting
+
+The experiments are performed with the following pruners/datasets/models:
+
+* Models: [VGG16, ResNet18, ResNet50](https://github.com/microsoft/nni/tree/master/examples/model_compress/models/cifar10)
+
+* Datasets: CIFAR-10
+
+* Pruners:
+  - These pruners are included:
+    - Pruners with scheduling: `SimulatedAnnealing Pruner`, `NetAdapt Pruner`, `AutoCompress Pruner`.
+      Given the overall sparsity requirement, these pruners can automatically generate a sparsity distribution among different layers.
+    - One-shot pruners: `L1Filter Pruner`, `L2Filter Pruner`, `FPGM Pruner`.
+      The sparsity of each layer is set to the same value as the overall sparsity in this experiment.
+  - Only **filter pruning** performance is compared here.
+
+    For the pruners with scheduling, `L1Filter Pruner` is used as the base algorithm. That is to say, after the sparsity distribution is decided by the scheduling algorithm, `L1Filter Pruner` is used to perform the actual pruning.
+
+  - All the pruners listed above are implemented in [nni](https://github.com/microsoft/nni/tree/master/docs/en_US/Compressor/Overview.md).
+
+## Experiment Result
+
+For each dataset/model/pruner combination, we prune the model to different levels by setting a series of target sparsities for the pruner.
+
+Here we plot both the **Number of Weights - Performance** curve and the **FLOPs - Performance** curve.
+As a reference, we also plot the results reported in the paper [AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates](http://arxiv.org/abs/1907.03141) for VGG16 and ResNet18 on CIFAR-10.
+
+The experiment results are shown in the following figures:
+
+CIFAR-10, VGG16:
+
+![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png)
+
+CIFAR-10, ResNet18:
+
+![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png)
+
+CIFAR-10, ResNet50:
+
+![](../../../examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png)
+
+## Analysis
+
+From the experiment results, we draw the following conclusions:
+
+* Given the constraint on the number of parameters, the pruners with scheduling (`AutoCompress Pruner`, `SimulatedAnnealing Pruner`) perform better than the others when the constraint is strict. However, they show no such advantage in the FLOPs/performance comparison, since only the number-of-parameters constraint is considered in the optimization process;
+* The basic algorithms `L1Filter Pruner`, `L2Filter Pruner` and `FPGM Pruner` perform very similarly in these experiments;
+* `NetAdapt Pruner` cannot achieve a very high compression rate. This is caused by its mechanism of pruning only one layer in each pruning iteration, which leads to unacceptable complexity if the sparsity per iteration is much lower than the overall sparsity constraint.
+
+## Experiments Reproduction
+
+### Implementation Details
+
+* The experiment results are all collected with the default configuration of the pruners in nni, which means that when we call a pruner class in nni, we don't change any default class arguments.
+
+* Both FLOPs and the number of parameters are counted with the [Model FLOPs/Parameters Counter](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/CompressionUtils.md#model-flopsparameters-counter) after [model speedup](https://github.com/microsoft/nni/blob/master/docs/en_US/Compressor/ModelSpeedup.md). This avoids potential issues with counting them on masked models.
+
+* The experiment code can be found [here](https://github.com/microsoft/nni/tree/master/examples/model_compress/auto_pruners_torch.py); a minimal sketch of the pipeline it implements is given below.
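+
+The following sketch (our illustration, not part of the example script: the target sparsity, file paths and the choice of VGG16 are placeholders) condenses the pipeline of a single one-shot pruning experiment, using the same APIs as the example script:
+
+``` python
+import torch
+from nni.compression.torch import L1FilterPruner, ModelSpeedup
+from nni.compression.torch.utils.counter import count_flops_params
+from models.cifar10.vgg import VGG  # model definitions shipped with these examples
+
+device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+model = VGG(depth=16).to(device)  # a trained CIFAR-10 checkpoint is assumed to be loaded here
+
+# one-shot pruning: every Conv2d layer gets the same sparsity as the overall target
+config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]
+pruner = L1FilterPruner(model, config_list)
+model = pruner.compress()  # returns the masked model
+
+# export the masked weights and the masks, then remove the masked filters for real
+pruner.export_model('model_masked.pth', 'mask.pth')
+model = VGG(depth=16).to(device)
+model.load_state_dict(torch.load('model_masked.pth'))
+dummy_input = torch.randn([64, 3, 32, 32]).to(device)
+ModelSpeedup(model, dummy_input, 'mask.pth', device).speedup_model()
+
+# FLOPs and parameters are counted on the speeded-up model
+flops, params = count_flops_params(model, (1, 3, 32, 32))
+```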
+
+### Experiment Result Rendering
+
+* If you follow the practice in the [example](https://github.com/microsoft/nni/tree/master/examples/model_compress/auto_pruners_torch.py), for every single pruning experiment, the experiment result will be saved in JSON format as follows:
+  ``` json
+  {
+      "performance": {"original": 0.9298, "pruned": 0.1, "speedup": 0.1, "finetuned": 0.7746},
+      "params": {"original": 14987722.0, "speedup": 167089.0},
+      "flops": {"original": 314018314.0, "speedup": 38589922.0}
+  }
+  ```
+
+* The experiment results are saved [here](https://github.com/microsoft/nni/tree/master/examples/model_compress/experiment_data).
+You can refer to [analyze](https://github.com/microsoft/nni/tree/master/examples/model_compress/experiment_data/analyze.py) to plot new performance comparison figures.
+
+## Contribution
+
+### TODO Items
+
+* Pruners constrained by FLOPs/latency
+* More pruning algorithms/datasets/models
+
+### Issues
+
+For algorithm implementation & experiment issues, please [create an issue](https://github.com/microsoft/nni/issues/new/).
diff --git a/docs/en_US/CommunitySharings/perf_compare.rst b/docs/en_US/CommunitySharings/perf_compare.rst
index b87fd167c8..2b80ccdc6c 100644
--- a/docs/en_US/CommunitySharings/perf_compare.rst
+++ b/docs/en_US/CommunitySharings/perf_compare.rst
@@ -8,4 +8,5 @@ Performance comparison and analysis can help users decide a proper algorithm (e.
    :maxdepth: 1
 
    Neural Architecture Search Comparison
-   Hyper-parameter Tuning Algorithm Comparsion
\ No newline at end of file
+   Hyper-parameter Tuning Algorithm Comparsion
+   Model Compression Algorithm Comparison
\ No newline at end of file
diff --git a/docs/en_US/Compressor/Overview.md b/docs/en_US/Compressor/Overview.md
index 2d68496545..73298ee8af 100644
--- a/docs/en_US/Compressor/Overview.md
+++ b/docs/en_US/Compressor/Overview.md
@@ -42,6 +42,7 @@ Pruning algorithms compress the original network by removing redundant weights o
 | [SimulatedAnnealing Pruner](https://nni.readthedocs.io/en/latest/Compressor/Pruner.html#simulatedannealing-pruner) | Automatic pruning with a guided heuristic search method, Simulated Annealing algorithm [Reference Paper](https://arxiv.org/abs/1907.03141) |
 | [AutoCompress Pruner](https://nni.readthedocs.io/en/latest/Compressor/Pruner.html#autocompress-pruner) | Automatic pruning by iteratively call SimulatedAnnealing Pruner and ADMM Pruner [Reference Paper](https://arxiv.org/abs/1907.03141) |
 
+You can refer to this [benchmark](https://github.com/microsoft/nni/tree/master/docs/en_US/Benchmark.md) for the performance of these pruners on some benchmark problems.
### Quantization Algorithms diff --git a/examples/model_compress/auto_pruners_torch.py b/examples/model_compress/auto_pruners_torch.py index 9f0678b6f0..33ecfb8f5f 100644 --- a/examples/model_compress/auto_pruners_torch.py +++ b/examples/model_compress/auto_pruners_torch.py @@ -9,78 +9,81 @@ import json import torch from torch.optim.lr_scheduler import StepLR, MultiStepLR -from torchvision import datasets, transforms, models +from torchvision import datasets, transforms from models.mnist.lenet import LeNet from models.cifar10.vgg import VGG -from nni.compression.torch import L1FilterPruner, SimulatedAnnealingPruner, ADMMPruner, NetAdaptPruner, AutoCompressPruner +from models.cifar10.resnet import ResNet18, ResNet50 +from nni.compression.torch import L1FilterPruner, L2FilterPruner, FPGMPruner +from nni.compression.torch import SimulatedAnnealingPruner, ADMMPruner, NetAdaptPruner, AutoCompressPruner from nni.compression.torch import ModelSpeedup +from nni.compression.torch.utils.counter import count_flops_params -def get_data(args): +def get_data(dataset, data_dir, batch_size, test_batch_size): ''' get data ''' kwargs = {'num_workers': 1, 'pin_memory': True} if torch.cuda.is_available() else { } - if args.dataset == 'mnist': + if dataset == 'mnist': train_loader = torch.utils.data.DataLoader( - datasets.MNIST(args.data_dir, train=True, download=True, + datasets.MNIST(data_dir, train=True, download=True, transform=transforms.Compose([ transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,)) ])), - batch_size=args.batch_size, shuffle=True, **kwargs) + batch_size=batch_size, shuffle=True, **kwargs) val_loader = torch.utils.data.DataLoader( - datasets.MNIST(args.data_dir, train=False, + datasets.MNIST(data_dir, train=False, transform=transforms.Compose([ transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,)) ])), - batch_size=args.test_batch_size, shuffle=True, **kwargs) + batch_size=test_batch_size, shuffle=True, **kwargs) criterion = torch.nn.NLLLoss() - elif args.dataset == 'cifar10': + elif dataset == 'cifar10': normalize = transforms.Normalize( (0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)) train_loader = torch.utils.data.DataLoader( - datasets.CIFAR10(args.data_dir, train=True, transform=transforms.Compose([ + datasets.CIFAR10(data_dir, train=True, transform=transforms.Compose([ transforms.RandomHorizontalFlip(), transforms.RandomCrop(32, 4), transforms.ToTensor(), normalize, ]), download=True), - batch_size=args.batch_size, shuffle=True, **kwargs) + batch_size=batch_size, shuffle=True, **kwargs) val_loader = torch.utils.data.DataLoader( - datasets.CIFAR10(args.data_dir, train=False, transform=transforms.Compose([ + datasets.CIFAR10(data_dir, train=False, transform=transforms.Compose([ transforms.ToTensor(), normalize, ])), - batch_size=args.batch_size, shuffle=False, **kwargs) + batch_size=batch_size, shuffle=False, **kwargs) criterion = torch.nn.CrossEntropyLoss() - elif args.dataset == 'imagenet': + elif dataset == 'imagenet': normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) train_loader = torch.utils.data.DataLoader( - datasets.ImageFolder(os.path.join(args.data_dir, 'train'), + datasets.ImageFolder(os.path.join(data_dir, 'train'), transform=transforms.Compose([ transforms.RandomResizedCrop(224), transforms.RandomHorizontalFlip(), transforms.ToTensor(), normalize, ])), - batch_size=args.batch_size, shuffle=True, **kwargs) + batch_size=batch_size, shuffle=True, **kwargs) val_loader = torch.utils.data.DataLoader( - 
datasets.ImageFolder(os.path.join(args.data_dir, 'val'), + datasets.ImageFolder(os.path.join(data_dir, 'val'), transform=transforms.Compose([ transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(), normalize, ])), - batch_size=args.test_batch_size, shuffle=True, **kwargs) + batch_size=test_batch_size, shuffle=True, **kwargs) criterion = torch.nn.CrossEntropyLoss() return train_loader, val_loader, criterion @@ -127,65 +130,91 @@ def test(model, device, criterion, val_loader): return accuracy -def get_trained_model(args, device, train_loader, val_loader, criterion): +def get_trained_model_optimizer(args, device, train_loader, val_loader, criterion): if args.model == 'LeNet': model = LeNet().to(device) - optimizer = torch.optim.Adadelta(model.parameters(), lr=1) - scheduler = StepLR(optimizer, step_size=1, gamma=0.7) - for epoch in range(args.pretrain_epochs): - train(args, model, device, train_loader, - criterion, optimizer, epoch) - scheduler.step() + if args.load_pretrained_model: + model.load_state_dict(torch.load(args.pretrained_model_dir)) + optimizer = torch.optim.Adadelta(model.parameters(), lr=1e-4) + else: + optimizer = torch.optim.Adadelta(model.parameters(), lr=1) + scheduler = StepLR(optimizer, step_size=1, gamma=0.7) elif args.model == 'vgg16': model = VGG(depth=16).to(device) - optimizer = torch.optim.SGD(model.parameters(), lr=0.01, - momentum=0.9, - weight_decay=5e-4) - scheduler = MultiStepLR( - optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1) - for epoch in range(args.pretrain_epochs): - train(args, model, device, train_loader, - criterion, optimizer, epoch) - scheduler.step() + if args.load_pretrained_model: + model.load_state_dict(torch.load(args.pretrained_model_dir)) + optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4) + else: + optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4) + scheduler = MultiStepLR( + optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1) elif args.model == 'resnet18': - model = models.resnet18(pretrained=False, num_classes=10).to(device) - optimizer = torch.optim.SGD(model.parameters(), lr=0.01, - momentum=0.9, - weight_decay=5e-4) - scheduler = MultiStepLR( - optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1) + model = ResNet18().to(device) + if args.load_pretrained_model: + model.load_state_dict(torch.load(args.pretrained_model_dir)) + optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4) + else: + optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4) + scheduler = MultiStepLR( + optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1) + elif args.model == 'resnet50': + model = ResNet50().to(device) + if args.load_pretrained_model: + model.load_state_dict(torch.load(args.pretrained_model_dir)) + optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9, weight_decay=5e-4) + else: + optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4) + scheduler = MultiStepLR( + optimizer, milestones=[int(args.pretrain_epochs*0.5), int(args.pretrain_epochs*0.75)], gamma=0.1) + else: + raise ValueError("model not recognized") + + if not args.load_pretrained_model: + best_acc = 0 + best_epoch = 0 for epoch in range(args.pretrain_epochs): - train(args, model, device, 
train_loader, - criterion, optimizer, epoch) + train(args, model, device, train_loader, criterion, optimizer, epoch) scheduler.step() - elif args.model == 'mobilenet_v2': - model = models.mobilenet_v2(pretrained=True).to(device) - - if args.save_model: - torch.save(model.state_dict(), os.path.join( - args.experiment_data_dir, 'model_trained.pth')) - print('Model trained saved to %s', args.experiment_data_dir) + acc = test(model, device, criterion, val_loader) + if acc > best_acc: + best_acc = acc + best_epoch = epoch + state_dict = model.state_dict() + model.load_state_dict(state_dict) + print('Best acc:', best_acc) + print('Best epoch:', best_epoch) + + if args.save_model: + torch.save(state_dict, os.path.join(args.experiment_data_dir, 'model_trained.pth')) + print('Model trained saved to %s', args.experiment_data_dir) return model, optimizer def get_dummy_input(args, device): if args.dataset == 'mnist': - dummy_input = torch.randn( - [args.test_batch_size, 1, 28, 28]).to(device) + dummy_input = torch.randn([args.test_batch_size, 1, 28, 28]).to(device) elif args.dataset in ['cifar10', 'imagenet']: - dummy_input = torch.randn( - [args.test_batch_size, 3, 32, 32]).to(device) - + dummy_input = torch.randn([args.test_batch_size, 3, 32, 32]).to(device) return dummy_input +def get_input_size(dataset): + if dataset == 'mnist': + input_size = (1, 1, 28, 28) + elif dataset == 'cifar10': + input_size = (1, 3, 32, 32) + elif dataset == 'imagenet': + input_size = (1, 3, 256, 256) + return input_size + + def main(args): # prepare dataset torch.manual_seed(0) device = torch.device("cuda" if torch.cuda.is_available() else "cpu") - train_loader, val_loader, criterion = get_data(args) - model, optimizer = get_trained_model(args, device, train_loader, val_loader, criterion) + train_loader, val_loader, criterion = get_data(args.dataset, args.data_dir, args.batch_size, args.test_batch_size) + model, optimizer = get_trained_model_optimizer(args, device, train_loader, val_loader, criterion) def short_term_fine_tuner(model, epochs=1): for epoch in range(epochs): @@ -198,11 +227,15 @@ def evaluator(model): return test(model, device, criterion, val_loader) # used to save the performance of the original & pruned & finetuned models - result = {} + result = {'flops': {}, 'params': {}, 'performance':{}} + + flops, params = count_flops_params(model, get_input_size(args.dataset)) + result['flops']['original'] = flops + result['params']['original'] = params evaluation_result = evaluator(model) print('Evaluation result (original model): %s' % evaluation_result) - result['original'] = evaluation_result + result['performance']['original'] = evaluation_result # module types to prune, only "Conv2d" supported for channel pruning if args.base_algo in ['l1', 'l2']: @@ -218,6 +251,10 @@ def evaluator(model): if args.pruner == 'L1FilterPruner': pruner = L1FilterPruner(model, config_list) + elif args.pruner == 'L2FilterPruner': + pruner = L2FilterPruner(model, config_list) + elif args.pruner == 'FPGMPruner': + pruner = FPGMPruner(model, config_list) elif args.pruner == 'NetAdaptPruner': pruner = NetAdaptPruner(model, config_list, short_term_fine_tuner=short_term_fine_tuner, evaluator=evaluator, base_algo=args.base_algo, experiment_data_dir=args.experiment_data_dir) @@ -263,99 +300,123 @@ def evaluator(model): experiment_data_dir=args.experiment_data_dir) else: raise ValueError( - "Please use L1FilterPruner, NetAdaptPruner, SimulatedAnnealingPruner, ADMMPruner or AutoCompressPruner in this example.") + "Pruner not supported.") # 
Pruner.compress() returns the masked model # but for AutoCompressPruner, Pruner.compress() returns directly the pruned model - model_masked = pruner.compress() - evaluation_result = evaluator(model_masked) + model = pruner.compress() + evaluation_result = evaluator(model) print('Evaluation result (masked model): %s' % evaluation_result) - result['pruned'] = evaluation_result + result['performance']['pruned'] = evaluation_result if args.save_model: pruner.export_model( os.path.join(args.experiment_data_dir, 'model_masked.pth'), os.path.join(args.experiment_data_dir, 'mask.pth')) print('Masked model saved to %s', args.experiment_data_dir) + # model speed up + if args.speed_up: + if args.pruner != 'AutoCompressPruner': + if args.model == 'LeNet': + model = LeNet().to(device) + elif args.model == 'vgg16': + model = VGG(depth=16).to(device) + elif args.model == 'resnet18': + model = ResNet18().to(device) + elif args.model == 'resnet50': + model = ResNet50().to(device) + + model.load_state_dict(torch.load(os.path.join(args.experiment_data_dir, 'model_masked.pth'))) + masks_file = os.path.join(args.experiment_data_dir, 'mask.pth') + + m_speedup = ModelSpeedup(model, dummy_input, masks_file, device) + m_speedup.speedup_model() + evaluation_result = evaluator(model) + print('Evaluation result (speed up model): %s' % evaluation_result) + result['performance']['speedup'] = evaluation_result + + torch.save(model.state_dict(), os.path.join(args.experiment_data_dir, 'model_speed_up.pth')) + print('Speed up model saved to %s', args.experiment_data_dir) + flops, params = count_flops_params(model, get_input_size(args.dataset)) + result['flops']['speedup'] = flops + result['params']['speedup'] = params + if args.fine_tune: if args.dataset == 'mnist': - optimizer = torch.optim.Adadelta(model_masked.parameters(), lr=1) - scheduler = StepLR(optimizer, step_size=1, gamma=0.7) - for epoch in range(args.fine_tune_epochs): - train(args, model_masked, device, train_loader, criterion, optimizer, epoch) - scheduler.step() - test(model_masked, device, criterion, val_loader) - elif args.dataset == 'cifar10': - optimizer = torch.optim.SGD(model_masked.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4) + optimizer = torch.optim.Adadelta(model.parameters(), lr=1) scheduler = StepLR(optimizer, step_size=1, gamma=0.7) - for epoch in range(args.fine_tune_epochs): - train(args, model_masked, device, train_loader, criterion, optimizer, epoch) - scheduler.step() - test(model_masked, device, criterion, val_loader) - elif args.dataset == 'imagenet': - for epoch in range(args.fine_tune_epochs): - optimizer = torch.optim.SGD(model_masked.parameters(), lr=0.05, momentum=0.9, weight_decay=5e-4) - train(args, model_masked, device, train_loader, criterion, optimizer, epoch) - test(model_masked, device, criterion, val_loader) - - evaluation_result = evaluator(model_masked) - print('Evaluation result (fine tuned): %s' % evaluation_result) - result['finetuned'] = evaluation_result + elif args.dataset == 'cifar10' and args.model == 'vgg16': + optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=5e-4) + scheduler = MultiStepLR( + optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1) + elif args.dataset == 'cifar10' and args.model == 'resnet18': + optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4) + scheduler = MultiStepLR( + optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1) 
+        elif args.dataset == 'cifar10' and args.model == 'resnet50':
+            optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
+            scheduler = MultiStepLR(
+                optimizer, milestones=[int(args.fine_tune_epochs*0.5), int(args.fine_tune_epochs*0.75)], gamma=0.1)
+        best_acc = 0
+        for epoch in range(args.fine_tune_epochs):
+            train(args, model, device, train_loader, criterion, optimizer, epoch)
+            scheduler.step()
+            acc = evaluator(model)
+            if acc > best_acc:
+                best_acc = acc
+                torch.save(model.state_dict(), os.path.join(args.experiment_data_dir, 'model_fine_tuned.pth'))
-        if args.save_model:
-            pruner.export_model(os.path.join(
-                args.experiment_data_dir, 'model_fine_tuned.pth'), os.path.join(args.experiment_data_dir, 'mask.pth'))
-            print('Fined tuned model saved to %s', args.experiment_data_dir)
+        print('Evaluation result (fine-tuned): %s' % best_acc)
+        print('Fine-tuned model saved to %s', args.experiment_data_dir)
+        result['performance']['finetuned'] = best_acc
 
-    # model speed up
-    if args.speed_up and args.pruner != 'AutoCompressPruner':
-        if args.model == 'LeNet':
-            model = LeNet().to(device)
-        elif args.model == 'vgg16':
-            model = VGG(depth=16).to(device)
-        elif args.model == 'resnet18':
-            model = models.resnet18(pretrained=False, num_classes=10).to(device)
-        elif args.model == 'mobilenet_v2':
-            model = models.mobilenet_v2(pretrained=False).to(device)
-
-        model.load_state_dict(torch.load(os.path.join(args.experiment_data_dir, 'model_fine_tuned.pth')))
-        masks_file = os.path.join(args.experiment_data_dir, 'mask.pth')
-
-        m_speedup = ModelSpeedup(model, dummy_input, masks_file, device)
-        m_speedup.speedup_model()
-        evaluation_result = evaluator(model)
-        print('Evaluation result (speed up model): %s' % evaluation_result)
-        result['speedup'] = evaluation_result
-
-        torch.save(model.state_dict(), os.path.join(args.experiment_data_dir, 'model_speed_up.pth'))
-        print('Speed up model saved to %s', args.experiment_data_dir)
-
-    with open(os.path.join(args.experiment_data_dir, 'performance.json'), 'w+') as f:
+    with open(os.path.join(args.experiment_data_dir, 'result.json'), 'w+') as f:
         json.dump(result, f)
 
 
 if __name__ == '__main__':
-    def str2bool(v):
-        if isinstance(v, bool):
-            return v
-        if v.lower() in ('yes', 'true', 't', 'y', '1'):
+    def str2bool(s):
+        if isinstance(s, bool):
+            return s
+        if s.lower() in ('yes', 'true', 't', 'y', '1'):
             return True
-        elif v.lower() in ('no', 'false', 'f', 'n', '0'):
+        if s.lower() in ('no', 'false', 'f', 'n', '0'):
             return False
-        else:
-            raise argparse.ArgumentTypeError('Boolean value expected.')
+        raise argparse.ArgumentTypeError('Boolean value expected.')
 
     parser = argparse.ArgumentParser(description='PyTorch Example for SimulatedAnnealingPruner')
+    # dataset and model
+    parser.add_argument('--dataset', type=str, default='cifar10',
+                        help='dataset to use, mnist, cifar10 or imagenet')
+    parser.add_argument('--data-dir', type=str, default='./data/',
+                        help='dataset directory')
+    parser.add_argument('--model', type=str, default='vgg16',
+                        help='model to use, LeNet, vgg16, resnet18 or resnet50')
+    parser.add_argument('--load-pretrained-model', type=str2bool, default=False,
+                        help='whether to load pretrained model')
+    parser.add_argument('--pretrained-model-dir', type=str, default='./',
+                        help='path to pretrained model')
+    parser.add_argument('--pretrain-epochs', type=int, default=100,
+                        help='number of epochs to pretrain the model')
+    parser.add_argument('--batch-size', type=int, default=64,
+                        help='input batch size for training (default: 64)')
+    parser.add_argument('--test-batch-size', type=int, default=64,
+                        help='input batch size for testing (default: 64)')
+    parser.add_argument('--fine-tune', type=str2bool, default=True,
+                        help='whether to fine-tune the pruned model')
+    parser.add_argument('--fine-tune-epochs', type=int, default=5,
+                        help='epochs to fine tune')
+    parser.add_argument('--experiment-data-dir', type=str, default='./experiment_data',
+                        help='For saving experiment data')
+
+    # pruner
     parser.add_argument('--pruner', type=str, default='SimulatedAnnealingPruner',
-                        help='pruner to use, L1FilterPruner, NetAdaptPruner, SimulatedAnnealingPruner, ADMMPruner or AutoCompressPruner')
+                        help='pruner to use')
     parser.add_argument('--base-algo', type=str, default='l1',
                         help='base pruning algorithm. level, l1 or l2')
-    parser.add_argument('--sparsity', type=float, default=0.3,
-                        help='overall target sparsity')
-    parser.add_argument('--speed-up', type=str2bool, default=False,
-                        help='Whether to speed-up the pruned model')
-
+    parser.add_argument('--sparsity', type=float, default=0.1,
+                        help='target overall sparsity')
     # param for SimulatedAnnealingPruner
     parser.add_argument('--cool-down-rate', type=float, default=0.9,
                         help='cool down rate')
@@ -363,29 +424,16 @@ def str2bool(v):
     parser.add_argument('--sparsity-per-iteration', type=float, default=0.05,
                         help='sparsity_per_iteration of NetAdaptPruner')
 
-    parser.add_argument('--dataset', type=str, default='mnist',
-                        help='dataset to use, mnist, cifar10 or imagenet (default MNIST)')
-    parser.add_argument('--model', type=str, default='LeNet',
-                        help='model to use, LeNet, vgg16, resnet18 or mobilenet_v2')
-    parser.add_argument('--fine-tune', type=str2bool, default=True,
-                        help='whether to fine-tune the pruned model')
-    parser.add_argument('--fine-tune-epochs', type=int, default=10,
-                        help='epochs to fine tune')
-    parser.add_argument('--data-dir', type=str, default='/datasets/',
-                        help='dataset directory')
-    parser.add_argument('--experiment-data-dir', type=str, default='./',
-                        help='For saving experiment data')
+    # speed-up
+    parser.add_argument('--speed-up', type=str2bool, default=False,
+                        help='Whether to speed-up the pruned model')
 
-    parser.add_argument('--batch-size', type=int, default=64,
-                        help='input batch size for training (default: 64)')
-    parser.add_argument('--test-batch-size', type=int, default=64,
-                        help='input batch size for testing (default: 64)')
-    parser.add_argument('--pretrain-epochs', type=int, default=1,
-                        help='number of epochs to pretrain the model')
+    # others
     parser.add_argument('--log-interval', type=int, default=200,
                         help='how many batches to wait before logging training status')
     parser.add_argument('--save-model', type=str2bool, default=True,
                         help='For Saving the current Model')
+
     args = parser.parse_args()
 
     if not os.path.exists(args.experiment_data_dir):
diff --git a/examples/model_compress/comparison_of_pruners/analyze.py b/examples/model_compress/comparison_of_pruners/analyze.py
new file mode 100644
index 0000000000..c7cd13f72a
--- /dev/null
+++ b/examples/model_compress/comparison_of_pruners/analyze.py
@@ -0,0 +1,107 @@
+import argparse
+import json
+import matplotlib.pyplot as plt
+
+
+def plot_performance_comparison(args):
+    # reference data: performance of the original model, and the performance declared in the AutoCompress paper
+    references = {
+        'original':{
+            'cifar10':{
+                'vgg16':{
+                    'performance': 0.9298,
+                    'params':14987722.0,
+                    'flops':314018314.0
+                },
+                'resnet18':{
+                    'performance': 0.9433,
+                    'params':11173962.0,
+                    
'flops':556651530.0 + }, + 'resnet50':{ + 'performance': 0.9488, + 'params':23520842.0, + 'flops':1304694794.0 + } + } + }, + 'AutoCompressPruner':{ + 'cifar10':{ + 'vgg16':{ + 'performance': 0.9321, + 'params':52.2, # times + 'flops':8.8 + }, + 'resnet18':{ + 'performance': 0.9381, + 'params':54.2, # times + 'flops':12.2 + } + } + } + } + + markers = ['v', '^', '<', '1', '2', '3', '4', '8', '*', '+', 'o'] + + with open('cifar10/comparison_result_{}.json'.format(args.model), 'r') as jsonfile: + result = json.load(jsonfile) + + pruners = result.keys() + + performances = {} + flops = {} + params = {} + sparsities = {} + for pruner in pruners: + performances[pruner] = [val['performance'] for val in result[pruner]] + flops[pruner] = [val['flops'] for val in result[pruner]] + params[pruner] = [val['params'] for val in result[pruner]] + sparsities[pruner] = [val['sparsity'] for val in result[pruner]] + + fig, axs = plt.subplots(2, 1, figsize=(8, 10)) + fig.suptitle('Channel Pruning Comparison on {}/CIFAR10'.format(args.model)) + fig.subplots_adjust(hspace=0.5) + + for idx, pruner in enumerate(pruners): + axs[0].scatter(params[pruner], performances[pruner], marker=markers[idx], label=pruner) + axs[1].scatter(flops[pruner], performances[pruner], marker=markers[idx], label=pruner) + + # references + params_original = references['original']['cifar10'][args.model]['params'] + performance_original = references['original']['cifar10'][args.model]['performance'] + axs[0].plot(params_original, performance_original, 'rx', label='original model') + if args.model in ['vgg16', 'resnet18']: + axs[0].plot(params_original/references['AutoCompressPruner']['cifar10'][args.model]['params'], + references['AutoCompressPruner']['cifar10'][args.model]['performance'], + 'bx', label='AutoCompress Paper') + + axs[0].set_title("Performance v.s. Number of Parameters") + axs[0].set_xlabel("Number of Parameters") + axs[0].set_ylabel('Accuracy') + axs[0].legend() + + # references + flops_original = references['original']['cifar10'][args.model]['flops'] + performance_original = references['original']['cifar10'][args.model]['performance'] + axs[1].plot(flops_original, performance_original, 'rx', label='original model') + if args.model in ['vgg16', 'resnet18']: + axs[1].plot(flops_original/references['AutoCompressPruner']['cifar10'][args.model]['flops'], + references['AutoCompressPruner']['cifar10'][args.model]['performance'], + 'bx', label='AutoCompress Paper') + + axs[1].set_title("Performance v.s. 
FLOPs") + axs[1].set_xlabel("FLOPs") + axs[1].set_ylabel('Accuracy') + axs[1].legend() + + plt.savefig('img/performance_comparison_{}.png'.format(args.model)) + plt.close() + + +if __name__ == '__main__': + parser = argparse.ArgumentParser(description='PyTorch MNIST Example') + parser.add_argument('--model', type=str, default='vgg16', + help='vgg16, resnet18 or resnet50') + args = parser.parse_args() + + plot_performance_comparison(args) diff --git a/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet18.json b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet18.json new file mode 100644 index 0000000000..0ef5a6119d --- /dev/null +++ b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet18.json @@ -0,0 +1,392 @@ +{ + "L1FilterPruner": [ + { + "sparsity": 0.1, + "params": 9642085.0, + "flops": 496882684.0, + "performance": 0.9436 + }, + { + "sparsity": 0.2, + "params": 8149126.0, + "flops": 436381222.0, + "performance": 0.9472 + }, + { + "sparsity": 0.3, + "params": 6705269.0, + "flops": 371666312.0, + "performance": 0.9391 + }, + { + "sparsity": 0.4, + "params": 5335138.0, + "flops": 307050934.0, + "performance": 0.9433 + }, + { + "sparsity": 0.5, + "params": 3998122.0, + "flops": 237900244.0, + "performance": 0.9379 + }, + { + "sparsity": 0.6, + "params": 2767325.0, + "flops": 175308326.0, + "performance": 0.9326 + }, + { + "sparsity": 0.7, + "params": 1617817.0, + "flops": 108532198.0, + "performance": 0.928 + }, + { + "sparsity": 0.8, + "params": 801338.0, + "flops": 53808728.0, + "performance": 0.9145 + }, + { + "sparsity": 0.9, + "params": 229372.0, + "flops": 15304972.0, + "performance": 0.8858 + }, + { + "sparsity": 0.95, + "params": 61337.0, + "flops": 4305146.0, + "performance": 0.8441 + }, + { + "sparsity": 0.975, + "params": 17763.0, + "flops": 1561644.0, + "performance": 0.7294 + } + ], + "L2FilterPruner": [ + { + "sparsity": 0.1, + "params": 9680242.0, + "flops": 497492746.0, + "performance": 0.9423 + }, + { + "sparsity": 0.2, + "params": 8137784.0, + "flops": 436199900.0, + "performance": 0.9471 + }, + { + "sparsity": 0.3, + "params": 6702679.0, + "flops": 369733768.0, + "performance": 0.9415 + }, + { + "sparsity": 0.4, + "params": 5330426.0, + "flops": 305512736.0, + "performance": 0.9411 + }, + { + "sparsity": 0.5, + "params": 3961076.0, + "flops": 236467814.0, + "performance": 0.9349 + }, + { + "sparsity": 0.6, + "params": 2776512.0, + "flops": 175872204.0, + "performance": 0.9393 + }, + { + "sparsity": 0.7, + "params": 1622571.0, + "flops": 107994906.0, + "performance": 0.9295 + }, + { + "sparsity": 0.8, + "params": 797075.0, + "flops": 53534414.0, + "performance": 0.9187 + }, + { + "sparsity": 0.9, + "params": 232153.0, + "flops": 15385078.0, + "performance": 0.8838 + }, + { + "sparsity": 0.95, + "params": 58180.0, + "flops": 4510072.0, + "performance": 0.8396 + }, + { + "sparsity": 0.975, + "params": 16836.0, + "flops": 1429752.0, + "performance": 0.7482 + } + ], + "FPGMPruner": [ + { + "sparsity": 0.1, + "params": 9705680.0, + "flops": 497899454.0, + "performance": 0.9443 + }, + { + "sparsity": 0.2, + "params": 8160468.0, + "flops": 436562544.0, + "performance": 0.946 + }, + { + "sparsity": 0.3, + "params": 6710052.0, + "flops": 367960482.0, + "performance": 0.9452 + }, + { + "sparsity": 0.4, + "params": 5334205.0, + "flops": 306166432.0, + "performance": 0.9412 + }, + { + "sparsity": 0.5, + "params": 4007259.0, + "flops": 237702210.0, + "performance": 0.9385 + }, + { + "sparsity": 0.6, + 
"params": 2782236.0, + "flops": 175813620.0, + "performance": 0.9304 + }, + { + "sparsity": 0.7, + "params": 1634603.0, + "flops": 108904676.0, + "performance": 0.9249 + }, + { + "sparsity": 0.8, + "params": 799610.0, + "flops": 53645918.0, + "performance": 0.9203 + }, + { + "sparsity": 0.9, + "params": 233644.0, + "flops": 15408784.0, + "performance": 0.8856 + }, + { + "sparsity": 0.95, + "params": 56518.0, + "flops": 4266910.0, + "performance": 0.83 + }, + { + "sparsity": 0.975, + "params": 17610.0, + "flops": 1441836.0, + "performance": 0.7356 + } + ], + "NetAdaptPruner": [ + { + "sparsity": 0.1, + "params": 11173962.0, + "flops": 556651530.0, + "performance": 0.9474 + }, + { + "sparsity": 0.2, + "params": 10454958.0, + "flops": 545147466.0, + "performance": 0.9482 + }, + { + "sparsity": 0.3, + "params": 9299986.0, + "flops": 526681564.0, + "performance": 0.9469 + }, + { + "sparsity": 0.4, + "params": 8137618.0, + "flops": 508087276.0, + "performance": 0.9451 + }, + { + "sparsity": 0.5, + "params": 6267654.0, + "flops": 478185102.0, + "performance": 0.947 + }, + { + "sparsity": 0.6, + "params": 5277444.0, + "flops": 462341742.0, + "performance": 0.9469 + }, + { + "sparsity": 0.7, + "params": 4854190.0, + "flops": 455580628.0, + "performance": 0.9466 + }, + { + "sparsity": 0.8, + "params": 3531098.0, + "flops": 434411156.0, + "performance": 0.9472 + } + ], + "SimulatedAnnealingPruner": [ + { + "sparsity": 0.1, + "params": 10307424.0, + "flops": 537697098.0, + "performance": 0.942 + }, + { + "sparsity": 0.2, + "params": 9264598.0, + "flops": 513101368.0, + "performance": 0.9456 + }, + { + "sparsity": 0.3, + "params": 7999316.0, + "flops": 489260738.0, + "performance": 0.946 + }, + { + "sparsity": 0.4, + "params": 6996176.0, + "flops": 450768626.0, + "performance": 0.9413 + }, + { + "sparsity": 0.5, + "params": 5412616.0, + "flops": 408698434.0, + "performance": 0.9477 + }, + { + "sparsity": 0.6, + "params": 5106924.0, + "flops": 391735326.0, + "performance": 0.9483 + }, + { + "sparsity": 0.7, + "params": 3032105.0, + "flops": 269777978.0, + "performance": 0.9414 + }, + { + "sparsity": 0.8, + "params": 2423230.0, + "flops": 294783862.0, + "performance": 0.9384 + }, + { + "sparsity": 0.9, + "params": 1151046.0, + "flops": 209639226.0, + "performance": 0.939 + }, + { + "sparsity": 0.95, + "params": 394406.0, + "flops": 108776618.0, + "performance": 0.923 + }, + { + "sparsity": 0.975, + "params": 250649.0, + "flops": 84645050.0, + "performance": 0.917 + } + ], + "AutoCompressPruner": [ + { + "sparsity": 0.1, + "params": 10238286.0, + "flops": 536590794.0, + "performance": 0.9406 + }, + { + "sparsity": 0.2, + "params": 9272049.0, + "flops": 512333916.0, + "performance": 0.9392 + }, + { + "sparsity": 0.3, + "params": 8099915.0, + "flops": 485418056.0, + "performance": 0.9398 + }, + { + "sparsity": 0.4, + "params": 6864547.0, + "flops": 449359492.0, + "performance": 0.9406 + }, + { + "sparsity": 0.5, + "params": 6106994.0, + "flops": 430766432.0, + "performance": 0.9397 + }, + { + "sparsity": 0.6, + "params": 5338096.0, + "flops": 415085278.0, + "performance": 0.9384 + }, + { + "sparsity": 0.7, + "params": 3701330.0, + "flops": 351057878.0, + "performance": 0.938 + }, + { + "sparsity": 0.8, + "params": 2229760.0, + "flops": 269058346.0, + "performance": 0.9388 + }, + { + "sparsity": 0.9, + "params": 1108564.0, + "flops": 189355930.0, + "performance": 0.9348 + }, + { + "sparsity": 0.95, + "params": 616893.0, + "flops": 159314256.0, + "performance": 0.93 + }, + { + "sparsity": 0.975, + "params": 
297368.0, + "flops": 113398292.0, + "performance": 0.9072 + } + ] +} \ No newline at end of file diff --git a/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet50.json b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet50.json new file mode 100644 index 0000000000..dcea274149 --- /dev/null +++ b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_resnet50.json @@ -0,0 +1,356 @@ +{ + "L1FilterPruner": [ + { + "sparsity": 0.1, + "params": 20378141.0, + "flops": 1134740738.0, + "performance": 0.9456 + }, + { + "sparsity": 0.2, + "params": 17286560.0, + "flops": 966734852.0, + "performance": 0.9433 + }, + { + "sparsity": 0.3, + "params": 14403947.0, + "flops": 807114812.0, + "performance": 0.9396 + }, + { + "sparsity": 0.4, + "params": 11558288.0, + "flops": 656314106.0, + "performance": 0.9402 + }, + { + "sparsity": 0.5, + "params": 8826728.0, + "flops": 507965924.0, + "performance": 0.9394 + }, + { + "sparsity": 0.6, + "params": 6319902.0, + "flops": 374211960.0, + "performance": 0.9372 + }, + { + "sparsity": 0.7, + "params": 4063713.0, + "flops": 246788556.0, + "performance": 0.9304 + }, + { + "sparsity": 0.8, + "params": 2120717.0, + "flops": 133614422.0, + "performance": 0.9269 + }, + { + "sparsity": 0.9, + "params": 652524.0, + "flops": 41973714.0, + "performance": 0.9081 + }, + { + "sparsity": 0.95, + "params": 195468.0, + "flops": 13732020.0, + "performance": 0.8723 + }, + { + "sparsity": 0.975, + "params": 58054.0, + "flops": 4268104.0, + "performance": 0.7941 + } + ], + "L2FilterPruner": [ + { + "sparsity": 0.1, + "params": 20378141.0, + "flops": 1134740738.0, + "performance": 0.9442 + }, + { + "sparsity": 0.2, + "params": 17275244.0, + "flops": 966400928.0, + "performance": 0.9463 + }, + { + "sparsity": 0.3, + "params": 14415409.0, + "flops": 807710914.0, + "performance": 0.9367 + }, + { + "sparsity": 0.4, + "params": 11564310.0, + "flops": 656653008.0, + "performance": 0.9391 + }, + { + "sparsity": 0.5, + "params": 8843266.0, + "flops": 508086256.0, + "performance": 0.9381 + }, + { + "sparsity": 0.6, + "params": 6316815.0, + "flops": 373882614.0, + "performance": 0.9368 + }, + { + "sparsity": 0.7, + "params": 4054272.0, + "flops": 246477678.0, + "performance": 0.935 + }, + { + "sparsity": 0.8, + "params": 2129321.0, + "flops": 134527520.0, + "performance": 0.9275 + }, + { + "sparsity": 0.9, + "params": 667500.0, + "flops": 42927060.0, + "performance": 0.9129 + }, + { + "sparsity": 0.95, + "params": 192464.0, + "flops": 13669430.0, + "performance": 0.8757 + }, + { + "sparsity": 0.975, + "params": 58250.0, + "flops": 4365620.0, + "performance": 0.7978 + } + ], + "FPGMPruner": [ + { + "sparsity": 0.1, + "params": 20401570.0, + "flops": 1135114552.0, + "performance": 0.9438 + }, + { + "sparsity": 0.2, + "params": 17321414.0, + "flops": 967137398.0, + "performance": 0.9427 + }, + { + "sparsity": 0.3, + "params": 14418221.0, + "flops": 807755756.0, + "performance": 0.9422 + }, + { + "sparsity": 0.4, + "params": 11565000.0, + "flops": 655412124.0, + "performance": 0.9403 + }, + { + "sparsity": 0.5, + "params": 8829840.0, + "flops": 506715294.0, + "performance": 0.9355 + }, + { + "sparsity": 0.6, + "params": 6308085.0, + "flops": 374231682.0, + "performance": 0.9359 + }, + { + "sparsity": 0.7, + "params": 4054237.0, + "flops": 246511714.0, + "performance": 0.9285 + }, + { + "sparsity": 0.8, + "params": 2134187.0, + "flops": 134456366.0, + "performance": 0.9275 + }, + { + "sparsity": 0.9, + "params": 665931.0, + 
"flops": 42859752.0, + "performance": 0.9083 + }, + { + "sparsity": 0.95, + "params": 191590.0, + "flops": 13641052.0, + "performance": 0.8762 + }, + { + "sparsity": 0.975, + "params": 57767.0, + "flops": 4350074.0, + "performance": 0.789 + } + ], + "NetAdaptPruner": [ + { + "sparsity": 0.1, + "params": 22348970.0, + "flops": 1275701258.0, + "performance": 0.9404 + }, + { + "sparsity": 0.2, + "params": 21177162.0, + "flops": 1256952330.0, + "performance": 0.9445 + }, + { + "sparsity": 0.3, + "params": 18407434.0, + "flops": 1212636682.0, + "performance": 0.9433 + }, + { + "sparsity": 0.4, + "params": 16061284.0, + "flops": 1175098282.0, + "performance": 0.9401 + } + ], + "SimulatedAnnealingPruner": [ + { + "sparsity": 0.1, + "params": 20551755.0, + "flops": 1230145122.0, + "performance": 0.9438 + }, + { + "sparsity": 0.2, + "params": 17766048.0, + "flops": 1159924128.0, + "performance": 0.9432 + }, + { + "sparsity": 0.3, + "params": 15105146.0, + "flops": 1094478662.0, + "performance": 0.943 + }, + { + "sparsity": 0.4, + "params": 12378092.0, + "flops": 1008801158.0, + "performance": 0.9398 + }, + { + "sparsity": 0.5, + "params": 9890487.0, + "flops": 911941770.0, + "performance": 0.9426 + }, + { + "sparsity": 0.6, + "params": 7638262.0, + "flops": 831218770.0, + "performance": 0.9412 + }, + { + "sparsity": 0.7, + "params": 5469936.0, + "flops": 691881792.0, + "performance": 0.9405 + }, + { + "sparsity": 0.8, + "params": 3668951.0, + "flops": 580850666.0, + "performance": 0.941 + }, + { + "sparsity": 0.9, + "params": 1765284.0, + "flops": 389162310.0, + "performance": 0.9294 + } + ], + "AutoCompressPruner": [ + { + "sparsity": 0.1, + "params": 20660299.0, + "flops": 1228508590.0, + "performance": 0.9337 + }, + { + "sparsity": 0.2, + "params": 17940465.0, + "flops": 1152868146.0, + "performance": 0.9326 + }, + { + "sparsity": 0.3, + "params": 15335831.0, + "flops": 1084996094.0, + "performance": 0.9348 + }, + { + "sparsity": 0.4, + "params": 12821408.0, + "flops": 991305524.0, + "performance": 0.936 + }, + { + "sparsity": 0.5, + "params": 10695425.0, + "flops": 919638860.0, + "performance": 0.9349 + }, + { + "sparsity": 0.6, + "params": 8536821.0, + "flops": 802011678.0, + "performance": 0.9339 + }, + { + "sparsity": 0.7, + "params": 7276898.0, + "flops": 744248114.0, + "performance": 0.9337 + }, + { + "sparsity": 0.8, + "params": 5557721.0, + "flops": 643881710.0, + "performance": 0.9323 + }, + { + "sparsity": 0.9, + "params": 3925140.0, + "flops": 512545272.0, + "performance": 0.9304 + }, + { + "sparsity": 0.95, + "params": 2867004.0, + "flops": 365184762.0, + "performance": 0.9263 + }, + { + "sparsity": 0.975, + "params": 1773257.0, + "flops": 229320266.0, + "performance": 0.9175 + } + ] +} \ No newline at end of file diff --git a/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_vgg16.json b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_vgg16.json new file mode 100644 index 0000000000..9e476488c1 --- /dev/null +++ b/examples/model_compress/comparison_of_pruners/cifar10/comparison_result_vgg16.json @@ -0,0 +1,392 @@ +{ + "L1FilterPruner": [ + { + "sparsity": 0.1, + "params": 12187336.0, + "flops": 256252606.0, + "performance": 0.9344 + }, + { + "sparsity": 0.2, + "params": 9660216.0, + "flops": 203049930.0, + "performance": 0.9371 + }, + { + "sparsity": 0.3, + "params": 7435417.0, + "flops": 155477470.0, + "performance": 0.9341 + }, + { + "sparsity": 0.4, + "params": 5493954.0, + "flops": 114721578.0, + "performance": 0.9317 + }, + { + 
"sparsity": 0.5, + "params": 3820010.0, + "flops": 79155722.0, + "performance": 0.9309 + }, + { + "sparsity": 0.6, + "params": 2478632.0, + "flops": 51618494.0, + "performance": 0.9229 + }, + { + "sparsity": 0.7, + "params": 1420600.0, + "flops": 29455306.0, + "performance": 0.9031 + }, + { + "sparsity": 0.8, + "params": 658553.0, + "flops": 13290974.0, + "performance": 0.8756 + }, + { + "sparsity": 0.9, + "params": 186178.0, + "flops": 3574570.0, + "performance": 0.8145 + }, + { + "sparsity": 0.95, + "params": 58680.0, + "flops": 1050570.0, + "performance": 0.6983 + }, + { + "sparsity": 0.975, + "params": 23408.0, + "flops": 329918.0, + "performance": 0.5573 + } + ], + "L2FilterPruner": [ + { + "sparsity": 0.1, + "params": 12187336.0, + "flops": 256252606.0, + "performance": 0.9357 + }, + { + "sparsity": 0.2, + "params": 9660216.0, + "flops": 203049930.0, + "performance": 0.9355 + }, + { + "sparsity": 0.3, + "params": 7435417.0, + "flops": 155477470.0, + "performance": 0.9337 + }, + { + "sparsity": 0.4, + "params": 5493954.0, + "flops": 114721578.0, + "performance": 0.9308 + }, + { + "sparsity": 0.5, + "params": 3820010.0, + "flops": 79155722.0, + "performance": 0.9285 + }, + { + "sparsity": 0.6, + "params": 2478632.0, + "flops": 51618494.0, + "performance": 0.9208 + }, + { + "sparsity": 0.7, + "params": 1420600.0, + "flops": 29455306.0, + "performance": 0.909 + }, + { + "sparsity": 0.8, + "params": 658553.0, + "flops": 13290974.0, + "performance": 0.8698 + }, + { + "sparsity": 0.9, + "params": 186178.0, + "flops": 3574570.0, + "performance": 0.8203 + }, + { + "sparsity": 0.95, + "params": 58680.0, + "flops": 1050570.0, + "performance": 0.7063 + }, + { + "sparsity": 0.975, + "params": 23408.0, + "flops": 329918.0, + "performance": 0.5455 + } + ], + "FPGMPruner": [ + { + "sparsity": 0.1, + "params": 12187336.0, + "flops": 256252606.0, + "performance": 0.937 + }, + { + "sparsity": 0.2, + "params": 9660216.0, + "flops": 203049930.0, + "performance": 0.936 + }, + { + "sparsity": 0.3, + "params": 7435417.0, + "flops": 155477470.0, + "performance": 0.9359 + }, + { + "sparsity": 0.4, + "params": 5493954.0, + "flops": 114721578.0, + "performance": 0.9302 + }, + { + "sparsity": 0.5, + "params": 3820010.0, + "flops": 79155722.0, + "performance": 0.9233 + }, + { + "sparsity": 0.6, + "params": 2478632.0, + "flops": 51618494.0, + "performance": 0.922 + }, + { + "sparsity": 0.7, + "params": 1420600.0, + "flops": 29455306.0, + "performance": 0.9022 + }, + { + "sparsity": 0.8, + "params": 658553.0, + "flops": 13290974.0, + "performance": 0.8794 + }, + { + "sparsity": 0.9, + "params": 186178.0, + "flops": 3574570.0, + "performance": 0.8276 + }, + { + "sparsity": 0.95, + "params": 58680.0, + "flops": 1050570.0, + "performance": 0.6967 + }, + { + "sparsity": 0.975, + "params": 23408.0, + "flops": 329918.0, + "performance": 0.3683 + } + ], + "NetAdaptPruner": [ + { + "sparsity": 0.1, + "params": 13492098.0, + "flops": 308484330.0, + "performance": 0.9376 + }, + { + "sparsity": 0.2, + "params": 11998408.0, + "flops": 297641410.0, + "performance": 0.9374 + }, + { + "sparsity": 0.3, + "params": 10504344.0, + "flops": 281928834.0, + "performance": 0.9369 + }, + { + "sparsity": 0.4, + "params": 8263221.0, + "flops": 272964342.0, + "performance": 0.9382 + }, + { + "sparsity": 0.5, + "params": 6769885.0, + "flops": 249070966.0, + "performance": 0.9388 + }, + { + "sparsity": 0.6, + "params": 6022137.0, + "flops": 237106998.0, + "performance": 0.9383 + }, + { + "sparsity": 0.7, + "params": 4526754.0, + "flops": 
222152490.0, + "performance": 0.936 + }, + { + "sparsity": 0.8, + "params": 3032759.0, + "flops": 162401210.0, + "performance": 0.9362 + } + ], + "SimulatedAnnealingPruner": [ + { + "sparsity": 0.1, + "params": 12691704.0, + "flops": 301467870.0, + "performance": 0.9366 + }, + { + "sparsity": 0.2, + "params": 10318461.0, + "flops": 275724450.0, + "performance": 0.9362 + }, + { + "sparsity": 0.3, + "params": 8217127.0, + "flops": 246321046.0, + "performance": 0.9371 + }, + { + "sparsity": 0.4, + "params": 6458368.0, + "flops": 232948294.0, + "performance": 0.9378 + }, + { + "sparsity": 0.5, + "params": 4973079.0, + "flops": 217675254.0, + "performance": 0.9362 + }, + { + "sparsity": 0.6, + "params": 3131526.0, + "flops": 151576878.0, + "performance": 0.9347 + }, + { + "sparsity": 0.7, + "params": 1891036.0, + "flops": 76575574.0, + "performance": 0.9289 + }, + { + "sparsity": 0.8, + "params": 1170751.0, + "flops": 107532322.0, + "performance": 0.9325 + }, + { + "sparsity": 0.9, + "params": 365978.0, + "flops": 46241354.0, + "performance": 0.9167 + }, + { + "sparsity": 0.95, + "params": 167089.0, + "flops": 38589922.0, + "performance": 0.7746 + }, + { + "sparsity": 0.975, + "params": 96779.0, + "flops": 26838230.0, + "performance": 0.1 + } + ], + "AutoCompressPruner": [ + { + "sparsity": 0.1, + "params": 12460277.0, + "flops": 290311730.0, + "performance": 0.9352 + }, + { + "sparsity": 0.2, + "params": 10138147.0, + "flops": 269180938.0, + "performance": 0.9324 + }, + { + "sparsity": 0.3, + "params": 8033350.0, + "flops": 241789714.0, + "performance": 0.9357 + }, + { + "sparsity": 0.4, + "params": 6105156.0, + "flops": 213573294.0, + "performance": 0.9367 + }, + { + "sparsity": 0.5, + "params": 4372604.0, + "flops": 185826362.0, + "performance": 0.9387 + }, + { + "sparsity": 0.6, + "params": 3029629.0, + "flops": 166285498.0, + "performance": 0.9334 + }, + { + "sparsity": 0.7, + "params": 1897060.0, + "flops": 134897806.0, + "performance": 0.9359 + }, + { + "sparsity": 0.8, + "params": 1145509.0, + "flops": 111766450.0, + "performance": 0.9334 + }, + { + "sparsity": 0.9, + "params": 362546.0, + "flops": 50777246.0, + "performance": 0.9261 + }, + { + "sparsity": 0.95, + "params": 149735.0, + "flops": 39201770.0, + "performance": 0.8924 + }, + { + "sparsity": 0.975, + "params": 45378.0, + "flops": 13213974.0, + "performance": 0.8193 + } + ] +} \ No newline at end of file diff --git a/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png b/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png new file mode 100644 index 0000000000..87a99e85bd Binary files /dev/null and b/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet18.png differ diff --git a/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png b/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png new file mode 100644 index 0000000000..7214a368b0 Binary files /dev/null and b/examples/model_compress/comparison_of_pruners/img/performance_comparison_resnet50.png differ diff --git a/examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png b/examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png new file mode 100644 index 0000000000..93930561b3 Binary files /dev/null and b/examples/model_compress/comparison_of_pruners/img/performance_comparison_vgg16.png differ diff --git a/examples/model_compress/models/cifar10/resnet.py 
b/examples/model_compress/models/cifar10/resnet.py new file mode 100644 index 0000000000..386ff8321c --- /dev/null +++ b/examples/model_compress/models/cifar10/resnet.py @@ -0,0 +1,115 @@ +import torch +import torch.nn as nn +import torch.nn.functional as F + + +class BasicBlock(nn.Module): + expansion = 1 + + def __init__(self, in_planes, planes, stride=1): + super(BasicBlock, self).__init__() + self.conv1 = nn.Conv2d( + in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False) + self.bn1 = nn.BatchNorm2d(planes) + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, + stride=1, padding=1, bias=False) + self.bn2 = nn.BatchNorm2d(planes) + + self.shortcut = nn.Sequential() + if stride != 1 or in_planes != self.expansion*planes: + self.shortcut = nn.Sequential( + nn.Conv2d(in_planes, self.expansion*planes, + kernel_size=1, stride=stride, bias=False), + nn.BatchNorm2d(self.expansion*planes) + ) + + def forward(self, x): + out = F.relu(self.bn1(self.conv1(x))) + out = self.bn2(self.conv2(out)) + out += self.shortcut(x) + out = F.relu(out) + return out + + +class Bottleneck(nn.Module): + expansion = 4 + + def __init__(self, in_planes, planes, stride=1): + super(Bottleneck, self).__init__() + self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=1, bias=False) + self.bn1 = nn.BatchNorm2d(planes) + self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, + stride=stride, padding=1, bias=False) + self.bn2 = nn.BatchNorm2d(planes) + self.conv3 = nn.Conv2d(planes, self.expansion * + planes, kernel_size=1, bias=False) + self.bn3 = nn.BatchNorm2d(self.expansion*planes) + + self.shortcut = nn.Sequential() + if stride != 1 or in_planes != self.expansion*planes: + self.shortcut = nn.Sequential( + nn.Conv2d(in_planes, self.expansion*planes, + kernel_size=1, stride=stride, bias=False), + nn.BatchNorm2d(self.expansion*planes) + ) + + def forward(self, x): + out = F.relu(self.bn1(self.conv1(x))) + out = F.relu(self.bn2(self.conv2(out))) + out = self.bn3(self.conv3(out)) + out += self.shortcut(x) + out = F.relu(out) + return out + + +class ResNet(nn.Module): + def __init__(self, block, num_blocks, num_classes=10): + super(ResNet, self).__init__() + self.in_planes = 64 + # this layer is different from torchvision.resnet18() since this model adopted for Cifar10 + self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False) + self.bn1 = nn.BatchNorm2d(64) + self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1) + self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2) + self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2) + self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2) + self.linear = nn.Linear(512*block.expansion, num_classes) + + def _make_layer(self, block, planes, num_blocks, stride): + strides = [stride] + [1]*(num_blocks-1) + layers = [] + for stride in strides: + layers.append(block(self.in_planes, planes, stride)) + self.in_planes = planes * block.expansion + return nn.Sequential(*layers) + + def forward(self, x): + out = F.relu(self.bn1(self.conv1(x))) + out = self.layer1(out) + out = self.layer2(out) + out = self.layer3(out) + out = self.layer4(out) + out = F.avg_pool2d(out, 4) + out = out.view(out.size(0), -1) + out = self.linear(out) + return out + + +def ResNet18(): + return ResNet(BasicBlock, [2, 2, 2, 2]) + + +def ResNet34(): + return ResNet(BasicBlock, [3, 4, 6, 3]) + + +def ResNet50(): + return ResNet(Bottleneck, [3, 4, 6, 3]) + + +def ResNet101(): + return ResNet(Bottleneck, [3, 4, 23, 3]) + + +def 
ResNet152(): + return ResNet(Bottleneck, [3, 8, 36, 3])
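+
+
+if __name__ == '__main__':
+    # Illustrative smoke test (our addition, not part of the training pipeline):
+    # check that the CIFAR-10 variants build and emit logits of the expected shape.
+    for net in (ResNet18(), ResNet50()):
+        out = net(torch.randn(2, 3, 32, 32))
+        assert out.shape == (2, 10)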