Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up observer defaulting logic, better error message #200

Merged
merged 2 commits into from
Nov 1, 2024

Conversation

kylesayrs
Copy link
Contributor

@kylesayrs kylesayrs commented Oct 31, 2024

Purpose

  • Better error message when attempting to load an unregistered class instance
  • Avoid annoying users with configs which have observer=memoryless for dynamic quantization

Changes

  • Clean up observer defaulting logic
  • Return the type name in the case that it is not found in the list of aliases
  • Condition warning on if observer != memoryless

Testing

Registry Error

from llmcompressor.observers import Observer
Observer.load_from_registry("asdf", quantization_args=None)
- KeyError: "Unable to find None registered under type <class 'llmcompressor.observers.base.Observer'>.\nRegistered values for <class 'llmcompressor.observers.base.Observer'>: ['minmax', 'mse']\nRegistered aliases for <class 'llmcompressor.observers.base.Observer'>: []"
+ KeyError: "Unable to find asdf registered under type <class 'llmcompressor.observers.base.Observer'>.\nRegistered values for <class 'llmcompressor.observers.base.Observer'>: ['minmax', 'mse']\nRegistered aliases for <class 'llmcompressor.observers.base.Observer'>: []" 

Observer Warning

from compressed_tensors.quantization import QuantizationArgs
args = QuantizationArgs(dynamic=True, observer="minmax-whatever")
>>> /home/ksayers/compressed-tensors/src/compressed_tensors/quantization/quant_args.py:213: UserWarning: No observer is used for dynamic quantization, setting to None
  warnings.warn(
from compressed_tensors.quantization import QuantizationArgs
args = QuantizationArgs(dynamic=True, observer="memoryless")
>>> 

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs self-assigned this Oct 31, 2024
@kylesayrs kylesayrs requested review from dsikka and horheynm October 31, 2024 20:33
@kylesayrs kylesayrs assigned rahul-tuli and unassigned rahul-tuli Oct 31, 2024
@kylesayrs kylesayrs requested a review from rahul-tuli October 31, 2024 20:33
Copy link
Contributor

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this LGTM. could you verify this works as expected trying our e2e test or example case with dynamic per token quant?

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
@kylesayrs kylesayrs requested a review from mgoin November 1, 2024 16:06
Copy link
Contributor

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending we've ran through dynamic examples for a new config (with "memoryless") and new config

@kylesayrs
Copy link
Contributor Author

Ran examples/quantization_w8a8_fp8/llama3_example.py without issue

<|begin_of_text|>Hello my name is Mirella, I am a 25-year-old Brazilian who has been living in the UK for
grep -B8 -A4 'observer' llm-compressor/output/config.json 
    "config_groups": {
      "group_0": {
        "input_activations": {
          "actorder": null,
          "block_structure": null,
          "dynamic": true,
          "group_size": null,
          "num_bits": 8,
          "observer": null,
          "observer_kwargs": {},
          "strategy": "token",
          "symmetric": true,
          "type": "float"
        },
--
          "Linear"
        ],
        "weights": {
          "actorder": null,
          "block_structure": null,
          "dynamic": false,
          "group_size": null,
          "num_bits": 8,
          "observer": "minmax",
          "observer_kwargs": {},
          "strategy": "channel",
          "symmetric": true,
          "type": "float"
        }

@kylesayrs
Copy link
Contributor Author

Test in PR description shows that warning does not trigger for observer=memoryless

Copy link
Member

@rahul-tuli rahul-tuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@kylesayrs kylesayrs merged commit 37df2dd into main Nov 1, 2024
1 check passed
@kylesayrs kylesayrs deleted the kylesayrs/improve-registry-error branch November 1, 2024 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants