
Bitnet 1.58 prework, POC, and staging #281

Open
2 of 6 tasks
CoffeeVampir3 opened this issue May 26, 2024 · 2 comments
CoffeeVampir3 commented May 26, 2024

Bitnet 1.58 Groundwork

After some talks with Saroufim and the CUDA MODE team working on BitNet, we've outlined a strategy for implementing the BitNet 1.58 method in torch. This issue lays the groundwork for 2-bit ternary tensor quantization and the BitNet linear work for BitNet 1.58.
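As a rough sketch of what the ternary quantization step can look like, here is a minimal absmean-style quantizer in the spirit of the BitNet 1.58 paper. The function names and the per-tensor scaling choice are illustrative assumptions, not the staging repo's actual API:

```python
import torch

def quantize_ternary_absmean(w: torch.Tensor, eps: float = 1e-5):
    """Quantize a float weight tensor to {-1, 0, +1} with a per-tensor scale.

    Absmean scheme: divide by the mean absolute value of the tensor,
    then round and clamp each entry into [-1, 1].
    """
    scale = w.abs().mean().clamp(min=eps)
    q = (w / scale).round().clamp(-1, 1)
    return q, scale

def dequantize_ternary(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximation of the original weights.
    return q * scale

w = torch.randn(4, 8)
q, s = quantize_ternary_absmean(w)
```

The resulting `q` holds only the values -1, 0, and +1, which is what makes the 2-bit packed representation possible downstream.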

I've set up a staging repo (Staging) with a number of items:

  • To-the-point minimal lib
  • Training notebook for creating a full model, up to the point where we quantize and pack
  • Cleaned up minimal training example for running as a script
  • Example of the compiled kernel

This covers the initial groundwork for getting working ternary networks into torch.

  • Example Quantization Method
  • POC layer quantization
  • Runnable example model with quantized layers (In progress Dtype and Runnable Model)
  • AO dtype
  • AO layer type (?) for bitnet linear
  • Runnable example model with full dtype + bitnet linear layer, shippable
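Once weights are ternary, the pack step in the roadmap above can be sketched roughly as follows. This is an illustrative assumption, not the staging repo's actual packing: each value takes 2 bits using the offset encoding -1 → 0, 0 → 1, +1 → 2 (3 unused), so four weights fit in one uint8:

```python
import torch

def pack_ternary_2bit(q: torch.Tensor) -> torch.Tensor:
    """Pack a flat {-1, 0, +1} integer tensor into one uint8 per 4 values."""
    assert q.numel() % 4 == 0, "pad to a multiple of 4 before packing"
    # Shift {-1, 0, +1} to {0, 1, 2} so each value fits in 2 unsigned bits.
    u = (q.to(torch.int16) + 1).to(torch.uint8).view(-1, 4)
    return u[:, 0] | (u[:, 1] << 2) | (u[:, 2] << 4) | (u[:, 3] << 6)

def unpack_ternary_2bit(packed: torch.Tensor) -> torch.Tensor:
    """Inverse of pack_ternary_2bit; returns a flat int8 ternary tensor."""
    shifts = torch.tensor([0, 2, 4, 6], dtype=torch.uint8)
    u = (packed.unsqueeze(-1) >> shifts) & 0x3
    return u.to(torch.int8).flatten() - 1
```

This halves storage relative to a naive int8 ternary tensor; a real AO dtype would additionally carry the scale and handle padding and layout.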

msaroufim commented May 26, 2024

Very cool! Thank you for writing up such a clear plan. We can start merging the bit-packing logic and the layer quantization, so feel free to send a PR whenever you're ready. This very much follows a playbook similar to the one @gau-nernst followed for fp6.

Related work

CoffeeVampir3 (Contributor, Author) commented

👍 #285. I've kept this separate from Andreas' commit for now; it encapsulates only the working bits.
