qwopqwop200 / GPTQ-for-LLaMa Public

Notifications You must be signed in to change notification settings
Fork 461
Star 3k

Code
Issues 61
Pull requests 1
Actions
Projects
Security
Insights

Additional navigation options

Code
Issues
Pull requests
Actions
Projects
Security
Insights

Issues: qwopqwop200/GPTQ-for-LLaMa

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

61 Open 157 Closed

Author

Filter by author

Label

Filter by label

Use alt + click/return to exclude labels

or ⇧ + click/return for logical OR

Projects

Filter by project

Milestones

Filter by milestone

Assignee

Filter by who’s assigned

Assigned to nobody

Sort

Sort by

Newest Oldest Most commented Least commented Recently updated Least recently updated Best match

Most reactions

Issues list

Total parameters are less after quantization

#291 opened Dec 21, 2024 by ZN1010

add support for minicpm

#289 opened Jul 4, 2024 by LDLINGLINGLING

GPTQ vs bitsandbytes

#288 opened Apr 5, 2024 by iaoxuesheng

Error when load GPTQ model

#287 opened Feb 12, 2024 by KyrieCui

datasets.utils.info_utils.ExpectedMoreSplits: {'validation'}

#286 opened Jan 10, 2024 by SDcodehub

Syntax changed in triton.testing.do_bench() causing error when running llama_inference.py

#285 opened Dec 10, 2023 by prasanna

Support Mistral.

#284 opened Oct 14, 2023 by nbollman

error: block with no terminator, has llvm.cond_br %5624, ^bb2, ^bb3

#283 opened Sep 19, 2023 by Hukongtao

neox.py needs to add "import math"

#282 opened Aug 14, 2023 by StudyingShao

LoRa and diff with bitsandbytes

#281 opened Aug 3, 2023 by RonanKMcGovern

Transformers broke again (AttributeError: 'GPTQ' object has no attribute 'inp1')

#280 opened Jul 29, 2023 by EyeDeck

Can i quantize HF version of llama model

#279 opened Jul 26, 2023 by akanyaani

Would GPTQ be able to support LLaMa2?

#278 opened Jul 26, 2023 by moonlightian

Why does the model quantization prompt KILLED at the end?

#277 opened Jul 16, 2023 by g558800

Help: Quantized llama-7b model with custom prompt format produces only gibberish

#276 opened Jul 15, 2023 by Glavin001

High PPL when groupsize != -1 for OPT model after replace linear layer with quantlinear.

#275 opened Jul 6, 2023 by hyx1999

Issue with GPTQ

#274 opened Jul 5, 2023 by d0lphin

Could not obtain official perplexity using bloom_eval()

#272 opened Jun 29, 2023 by xingyueye

inference with the saved model error: AttributeError: module 'torch.backends.cuda' has no attribute 'sdp_kernel'

#271 opened Jun 29, 2023 by LuciaIsFine

llama_inference 4bits error

#270 opened Jun 26, 2023 by gjm441

Proposed changes to reduce VRAM usage. Potentially quantize larger models on consumer hardware.

#269 opened Jun 25, 2023 by sigmareaver

SqueezeLLM support?

#264 opened Jun 15, 2023 by nikshepsvn

What is the right perplexity number?

#263 opened Jun 15, 2023 by JianbangZ

The detected CUDA version (12.1) mismatches the version that was used to compile PyTorch (11.7)

#262 opened Jun 15, 2023 by siddhsql

[Question] What is the expected discrepancy between simulated and actually computed values?

#261 opened Jun 13, 2023 by set-soft

Previous 1 2 3 Next

Previous Next

ProTip! Find all open issues with in progress development work with linked:pr.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly