mit-han-lab / smoothquant Public

Issues: mit-han-lab/smoothquant

66 Open 25 Closed

Issues list

error when getting the scales for my model

#100 opened Dec 19, 2024 by rib12316

How to use this for custom model?

#98 opened Dec 7, 2024 by siddagra

How to visualize activation outliers

#97 opened Nov 3, 2024 by harleyszhang

The upper and lower bounds seem not to be 8 bits in some cases

#96 opened Oct 30, 2024 by zhangyu68

Why only 4 layers?

#95 opened Sep 7, 2024 by VincentXWD

Support for Qwen2

#94 opened Jul 31, 2024 by JiaXinLI98

How to quantize the out_proj and fc2 module in OPT model family

#93 opened Jul 30, 2024 by yanchenmochen

How to quantize llama3?

#92 opened Jul 22, 2024 by jpyo0803

export_int8_model.py size issue

#91 opened Jul 11, 2024 by ljhyeok123

Quantize other models

#90 opened Jul 9, 2024 by AlexMa0

best Alpha value for Qwen 1.5 72B

#89 opened Jun 26, 2024 by Riskin1999

how to draw this result directly? is there any script?

#88 opened Jun 5, 2024 by foreverpiano

Huggingface_Hub Issue

#87 opened May 23, 2024 by faize5

Can SmoothQuant be used on ViT models?

#86 opened Apr 24, 2024 by n9s8a

Can Stable Diffusion be supported?

#85 opened Apr 11, 2024 by songh11

Inquiry about Int8 BMM overflow

#84 opened Apr 9, 2024 by luzai

Error when running smoothquant_opt_real_int8_demo.ipynb

#83 opened Apr 2, 2024 by kaijun924

how to use model.generate with smoothquant models

#82 opened Mar 31, 2024 by Hao-YunDeng

Which versions of the transformers and datasets packages do we need for this repo?

#81 opened Mar 28, 2024 by ghost

adjust activations

#80 opened Mar 28, 2024 by muzi0111

Question: why not need explicit scaling for activation X

#79 opened Mar 18, 2024 by ghost

Weight migration for Llama?

#77 opened Mar 14, 2024 by atyshka

Question about code

#76 opened Mar 6, 2024 by Lucky-Lance

How Can I Peft the Smoothquanted LLM?

#75 opened Mar 5, 2024 by LameloBally

Can I reproduce SmoothQuant on CPU only since I see that torch-int8 requires a GPU, and I am only interested in inference on the CPU?

#73 opened Feb 7, 2024 by WCSY-YG
