Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Observer Restructure: Remove Observers, calibration, and applying frozen steps from lifecycle #189

Merged
merged 28 commits into from
Oct 31, 2024

Conversation

dsikka
Copy link
Contributor

@dsikka dsikka commented Oct 14, 2024

Summary

  • Remove Observers from compressed tensors as well as the init, calibration and frozen steps responsible for initializing, applying, and removing the observers
  • Updates the forward pass to no longer explicitly apply calibration; expects hooks to calibrate inputs and outputs as part of the forward pass
  • apply_quantization_status will only be able to apply initialize_module_for_quantization and compress_quantized_weights. The FROZEN and CALIBRATION statuses are still valid QuantizationStatus values but compressed tensors will no longer be able run calibration or "apply" frozen statuses (i.e. remove observers)
  • Tests have been updated such that they can mock calibration for per token, per tensor, group and channel quantization
  • No longer wrap the forward pass for attention layers as this was done purely for calibration; the kv cache object has been moved to llm-compressor

Once landed, we'll need to land llm-compressor updates to add in Observers right after. Both PRs should be reviewed together: https://github.com/vllm-project/llm-compressor/pull/837/files#diff-b9472770e3291ffd5f7e0adbc074a5b2f83ca00e8dc937e230dcc365f4e2f954

Testing

  • Tested w4a16, quantized kv cache, and int8 w8a8 with the llm-compressor changes

@dsikka dsikka changed the title [WIP] Remove Observers [WIP] Observer Restructure: Remove Observers Oct 14, 2024
Base automatically changed from update-init to main October 18, 2024 15:15
@dsikka dsikka marked this pull request as ready for review October 22, 2024 14:20
@dsikka dsikka changed the title [WIP] Observer Restructure: Remove Observers Observer Restructure: Remove Observers, calibration, and applying frozen steps from lifecycle Oct 22, 2024
Copy link
Contributor

@kylesayrs kylesayrs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks like a clean cut, I'd like to do a little more testing before I approve

Copy link
Member

@rahul-tuli rahul-tuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes me happy!

@dsikka dsikka merged commit 2b79056 into main Oct 31, 2024
1 check passed
@dsikka dsikka deleted the remove-observers branch October 31, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants