[Dev] Convert the quant compress from numpy into tvm runtime #126

Merged (136 commits) on Aug 5, 2024

LeiWang1999
Contributor

With the stage-3 weight propagation introduced in PR #114, the weight transform can operate at the bit level when the weight is 1 or 2 bits wide. It is time to implement a TVM version of compress, so that we can apply the conversion to the unquantized weight and avoid the bit-level permutation.
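For context, the kind of NumPy-side compress step this PR moves into the TVM runtime can be sketched roughly as below. This is an illustrative sketch only, not the BitBLAS implementation: the function names `compress_lowbit` and `decompress_lowbit`, the little-endian packing order within each byte, and the `int8` storage type are all assumptions made for the example.

```python
import numpy as np


def compress_lowbit(weight: np.ndarray, bits: int) -> np.ndarray:
    """Pack unsigned `bits`-wide values along the last axis into int8 storage.

    Hypothetical helper; the packing order (lowest bits first) is an assumption.
    """
    assert 8 % bits == 0, "bits must divide 8 (1, 2, 4 or 8)"
    elems_per_byte = 8 // bits
    n, k = weight.shape
    assert k % elems_per_byte == 0, "last axis must be a multiple of elems_per_byte"
    # Keep only the low `bits` bits of every element.
    w = weight.astype(np.uint8) & np.uint8((1 << bits) - 1)
    # Group the elements that will share one byte.
    w = w.reshape(n, k // elems_per_byte, elems_per_byte)
    # Element i of each group is shifted left by i * bits.
    shifts = np.arange(elems_per_byte, dtype=np.uint8) * np.uint8(bits)
    packed = np.bitwise_or.reduce(w << shifts, axis=-1)
    return packed.astype(np.uint8).view(np.int8)


def decompress_lowbit(packed: np.ndarray, bits: int) -> np.ndarray:
    """Inverse of compress_lowbit, used here only to check the round trip."""
    elems_per_byte = 8 // bits
    p = packed.view(np.uint8)[..., None]
    shifts = np.arange(elems_per_byte, dtype=np.uint8) * np.uint8(bits)
    w = (p >> shifts) & np.uint8((1 << bits) - 1)
    return w.reshape(packed.shape[0], -1)
```

Packing `[[1, 2, 3, 4]]` at 4 bits under these assumptions yields the two bytes `0x21` and `0x43`. Running the equivalent packing inside TVM lets it be applied after the layout transform on the unquantized weight, which is the motivation stated above.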

This pull request includes several important changes to the bitblas/gpu/intrin/lop3.py file to enhance the decoding functions, as well as a minor update to the CI workflow configuration in .github/workflows/benchmark.yml and a submodule update in 3rdparty/tvm.

Enhancements to Decoding Functions:

  • New Decoding Functions: Added multiple new template functions for decoding various integer formats to float16 with scaling and offset capabilities. (bitblas/gpu/intrin/lop3.py)
  • Function Argument Handling: Introduced a helper function get_func_arguments to streamline the passing of arguments to external functions. (bitblas/gpu/intrin/lop3.py)
  • Offset Factor: Added offset_factor to buffer definitions to support the new decoding functions. (bitblas/gpu/intrin/lop3.py)
  • Function Calls: Updated function calls to use the new get_func_arguments helper for improved readability and maintainability. (bitblas/gpu/intrin/lop3.py)
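
As a reference for the arithmetic only, decoding packed 4-bit integers to float16 with a scale and an offset might look like the NumPy sketch below. The functions in lop3.py are GPU intrinsics, so this is not their implementation; the name `decode_u4_to_f16`, the nibble order, and the choice to subtract the offset before scaling are assumptions for illustration.

```python
import numpy as np


def decode_u4_to_f16(packed: np.ndarray, scale: float, zero: float = 0.0) -> np.ndarray:
    """Decode uint8 bytes holding two 4-bit values each into float16.

    Hypothetical reference: computes (q - zero) * scale, low nibble first.
    """
    lo = packed & 0xF          # first element of each byte (assumed order)
    hi = (packed >> 4) & 0xF   # second element of each byte
    # Interleave low and high nibbles back into the original element order.
    q = np.stack([lo, hi], axis=-1).reshape(packed.shape[0], -1)
    q = q.astype(np.float16)
    return (q - np.float16(zero)) * np.float16(scale)
```

For example, the bytes `0x21, 0x43` decode to the integers 1, 2, 3, 4; with `zero=8` and `scale=0.5` that gives -3.5, -3.0, -2.5, -2.0 in float16.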

CI Workflow Update:

  • Dependency Update: Changed depends-on to needs in the CI workflow configuration to improve dependency management. (.github/workflows/benchmark.yml)

Submodule Update:

  • Submodule Commit: Updated the submodule commit for 3rdparty/tvm to a new version. (3rdparty/tvm)

LeiWang1999 merged commit 906055d into microsoft:main on Aug 5, 2024
6 checks passed