
Dev compression speedup #1999

Merged
merged 36 commits into from
Feb 10, 2020

Conversation

QuanluZhang
Contributor

@QuanluZhang QuanluZhang commented Feb 5, 2020

TODO:

  1. replace print with logging

QuanluZhang and others added 30 commits January 1, 2020 12:03
* add data parallel proposal

* fix mask_weight bug

* add slim pruner support and example

* fix typo

* fix typo

* fix setattr error

* fix buffer update

* rename instrument_layer and prunerLayerWrapper

* fix pylint

* update reverse traversal

* add wrap and unwrap

* add register_buffer API

* update docstring

* update docstring

* add quantizer support

* fix typo

* update MeanActivationPruner, weight_rank_filter_pruner and example
@QuanluZhang QuanluZhang requested a review from chicm-ms February 6, 2020 07:15
@QuanluZhang QuanluZhang marked this pull request as ready for review February 6, 2020 07:15
apply_comp = ApplyCompression(model, masks_file)
apply_comp.compress()

class ApplyCompression(Pruner):
Contributor

why subclass Pruner?

Contributor Author

This class is for validating the correctness of ModelSpeedup, i.e., comparing the inference result from ModelSpeedup with that from ApplyCompression. ApplyCompression simply applies the masks; implementation-wise, it can be seen as a simple pruner.
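For reference, a minimal sketch of that validation, assuming `ApplyCompression(model, masks_file).compress()` as added in this PR and a `ModelSpeedup(model, dummy_input, masks_file).speedup_model()` entry point; import paths are omitted because they depend on the package layout:

```python
import copy
import torch

def check_speedup_consistency(model, dummy_input, masks_file, atol=1e-5):
    """Compare the output of a model with masks applied (ApplyCompression)
    against the output of the physically shrunk model (ModelSpeedup)."""
    # Reference result: apply the masks in place, keep the original graph.
    masked_model = copy.deepcopy(model)
    ApplyCompression(masked_model, masks_file).compress()
    masked_model.eval()

    # Speedup result: replace masked layers with smaller ones.
    sped_model = copy.deepcopy(model)
    ModelSpeedup(sped_model, dummy_input, masks_file).speedup_model()
    sped_model.eval()

    with torch.no_grad():
        ref_out = masked_model(dummy_input)
        new_out = sped_model(dummy_input)
    # Assumes the model returns a single tensor.
    return torch.allclose(ref_out, new_out, atol=atol)
```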

Contributor

I think it can simply be a method that goes through all the masks and applies each one to the corresponding weight, because there is no need to hook the forward method the way a pruner does, and all calc_mask does here is a dict lookup. To sum up, it is not a pruner, and subclassing Pruner makes it look weird. A simple function would do exactly the same thing and make much more sense?
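Something like the following would cover that; the mask-dict layout (layer name mapped to a dict with a 'weight' and optionally a 'bias' tensor) is an assumption about what the masks file contains:

```python
import torch

def apply_masks(model, masks):
    """Multiply each layer's weight (and bias, if masked) by its mask,
    in place, without wrapping modules or hooking forward."""
    name_to_module = dict(model.named_modules())
    with torch.no_grad():
        for layer_name, mask in masks.items():
            module = name_to_module[layer_name]
            module.weight.mul_(mask['weight'])
            if 'bias' in mask and module.bias is not None:
                module.bias.mul_(mask['bias'])
```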

Contributor Author

Good suggestion, let me fix it in a follow-up PR.

@scarlett2018 scarlett2018 added this to the 2020 Jan - 1.4 candidate milestone Feb 7, 2020
@QuanluZhang QuanluZhang changed the base branch from dev-pruner-dataparallel to master February 8, 2020 09:33
parser.add_argument("--model_checkpoint", type=str, default=None, help="the path of checkpointed model")
args = parser.parse_args()

if args.example_name == 'slim':
Contributor

It seems the *_speedup functions share a lot of the same code; maybe they can be put in one method, using dynamic module import based on the parser arguments?
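One possible shape for that refactor, purely illustrative (the `<example_name>_speedup` module naming and the shared `main()` entry point are assumptions):

```python
import importlib

def run_speedup_example(example_name, masks_file, model_checkpoint=None):
    """Dispatch to the per-model speedup code by module name instead of
    branching on args.example_name in one long if/elif chain."""
    module = importlib.import_module(f'{example_name}_speedup')
    module.main(masks_file=masks_file, model_checkpoint=model_checkpoint)

# e.g. run_speedup_example(args.example_name, args.masks_file, args.model_checkpoint)
# (the argument names mirror the parser flags and are illustrative)
```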

print('mask elapsed time: ', time.time() - start)
return
else:
#print("model before: ", model)
Contributor

Maybe remove this, or replace it with logger.debug and use a flag to set the logging level.
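A minimal sketch of that, assuming a hypothetical `--debug` flag and with `model` standing for whatever the example script has built at that point:

```python
import argparse
import logging

logger = logging.getLogger('speedup_example')

parser = argparse.ArgumentParser()
parser.add_argument('--debug', action='store_true', help='enable debug-level logging')
# ... the existing arguments such as --model_checkpoint stay unchanged ...
args = parser.parse_args()

logging.basicConfig(level=logging.DEBUG if args.debug else logging.INFO)

# Replaces the commented-out print("model before: ", model);
# `model` is the model object built earlier in the example script.
logger.debug('model before: %s', model)
```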

Contributor Author

yes, will do it

@QuanluZhang
Contributor Author

@Cjkkkk your comments are very helpful, will fix them in the next PR soon.

@QuanluZhang QuanluZhang merged commit eab0da1 into microsoft:master Feb 10, 2020
@QuanluZhang QuanluZhang linked an issue Feb 10, 2020 that may be closed by this pull request
Successfully merging this pull request may close these issues.

Why doesn't the model occupy less space after compression?