Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HuggingFace Integration & Pyproject #8

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

JacobLinCool
Copy link

This PR modernizes the project by transitioning from setup.py to pyproject.toml for configuration, improving compatibility with modern Python packaging standards.

The final0 model has also been published on the Hugging Face Hub as safe-models/beat-this-final0, stored in safetensors FP16 format since all tensor values are within range. This optimization reduces the model size by approximately 50%, bringing it down to ~40MB while maintaining performance. Additionally, the download speed is faster now, thanks to Hugging Face's optimized networking.

file2beats = File2Beats(checkpoint_path="safe-models/beat-this-final0", device="cpu", dbn=False)

Copilot generated summary:

This pull request includes significant updates to the beat_this project, focusing on enhancing model loading functionality, integrating with the Hugging Face Hub, and transitioning the build system to use pyproject.toml.

Model Loading Enhancements:

  • beat_this/inference.py: Modified the load_model function to handle cases where checkpoint_path is None and to use BeatThis.from_pretrained for loading pretrained models. [1] [2]

Hugging Face Hub Integration:

Build System Transition:

  • pyproject.toml: Introduced pyproject.toml for project configuration, specifying dependencies and build system requirements using hatchling.
  • setup.py: Removed setup.py in favor of using pyproject.toml for project configuration.

@JacobLinCool JacobLinCool changed the title Huggingface integration HuggingFace Integration & Pyproject Jan 25, 2025
@f0k
Copy link
Member

f0k commented Jan 26, 2025

Thank you for the PR!

+1 for moving away from setup.py. -1 for removing requirements.txt; the requirements in pyproject.toml are too strict now (e.g., beat_this runs fine with older versions of PyTorch than 2.3.1, and even faster in case your GPU is not supported by flash attention v2, but by v1). I haven't used hatchling before; do the defaults install the same set of files as the original setup.py? setup.py actually detected the launch_scripts directory as a package, which was unintended.

The huggingface integration is surprisingly little code, but comes at the disadvantage of requiring hubbingface_hub as a dependency. Also the way it is implemented, it will prevent instantiating the model from a local checkpoint file in case the file path includes a slash. Would it be possible to make huggingface optional, and define a hf: prefix for huggingface checkpoints?

Storing the models in FP16 would be an option for us as well, but I'd like to compare results when running on CPU.

@JacobLinCool
Copy link
Author

  • The requirements.txt file has been added back, and I've relaxed the version requirement in pyproject.toml.
  • I think Hatchling is working as expected. I created a new environment and installed the package from the Git repository. I can see beat_this in the site-packages but not launch_scripts.
  • The model resolver has been updated to use hf: to load models from Hugging Face.
  • For FP16, after my experiment, I noticed there is still a small difference between FP32 and FP16 weights. For parameters, the largest difference is about 0.0003. For outputs, the number of beats and downbeats remains the same, but the timing shows a maximum difference of 0.02 seconds. So, I think it's better to upload the same final0 model to Hugging Face. I re-uploaded the FP32 version.

It is still possible to push the FP16 version with file2beats.model.half().push_to_hub(".../..."). The weights will be stored in FP16 but will load into FP32 at runtime.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants