
Implement '--keep-split' to quantize model into several shards #6688

Merged
6 commits merged on Apr 25, 2024
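This PR adds a `--keep-split` option to the `quantize` tool: when the input model is a GGUF file split into several shards, the quantized output is written with the same shard layout instead of being merged into a single file. The option is also exposed through the public C API via `llama_model_quantize_params`. The sketch below is not code from the PR; the file names are hypothetical, and the field and function names follow the public `llama.h` of that period and may differ in other versions.

```cpp
// Minimal sketch (not from this PR): quantize a sharded GGUF model while
// preserving its shard layout via the assumed keep_split flag.
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();

    llama_model_quantize_params params = llama_model_quantize_default_params();
    params.ftype      = LLAMA_FTYPE_MOSTLY_Q4_K_M; // target quantization type
    params.nthread    = 8;                         // worker threads for quantization
    params.keep_split = true;                      // keep the input's shard structure (added by this PR)

    // Hypothetical file names, for illustration only; the first shard is passed as input.
    const char * fname_inp = "model-00001-of-00003.gguf";
    const char * fname_out = "model-Q4_K_M.gguf";

    const int rc = llama_model_quantize(fname_inp, fname_out, &params);
    if (rc != 0) {
        fprintf(stderr, "quantization failed\n");
    }

    llama_backend_free();
    return rc == 0 ? 0 : 1;
}
```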

Commits on Apr 15, 2024

  1. 17519e1

Commits on Apr 18, 2024

  1. Add test script

     z5269887 committed Apr 18, 2024 (79bbf42)

Commits on Apr 22, 2024

  1. Update examples/quantize/quantize.cpp

     Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
     zj040045 and ggerganov committed Apr 22, 2024 (6d66e60)
  2. d6e453e
  3. Update llama_model_quantize_params

     z5269887 committed Apr 22, 2024 (141eb51)
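The "Update llama_model_quantize_params" commit extends the public quantization parameters. A rough sketch of the (assumed) resulting declaration in `llama.h` follows; only `keep_split` is the point of interest here, and the neighbouring fields are abbreviated and version-dependent.

```cpp
// Sketch of the public quantize params after this PR (field set abbreviated;
// exact members and their order are assumptions, check your llama.h).
typedef struct llama_model_quantize_params {
    int32_t          nthread;      // number of threads to use for quantizing
    enum llama_ftype ftype;        // quantize to this llama_ftype
    // ... other existing fields ...
    bool             keep_split;   // quantize to the same number of shards as the input
} llama_model_quantize_params;
```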

Commits on Apr 23, 2024

  1. Fix preci failures

     z5269887 committed Apr 23, 2024 (e0a3679)