add: audiocraft spectrogram visualization enhancements #459
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
Thanks for contributing!
Guidelines
The examples repo is regularly tested against the ever-evolving ML stack. To facilitate our work, please adhere to the following guidelines:
is acceptable, but
is not. Avoid spaces and the
Before marking the PR as ready for review, please run your notebook one more time. Restart the Colab and run all. We will provide you with links to open the Colabs below.
The following colabs were changed:
@mratanusarkar
Thanks for the PR.
Can you please change the `'`s to `"`?
@soumik12345 let me know if any more changes are required.
Also FYI, the figure size and DPI could be increased like this:

    fig, ax = plt.subplots(figsize=(10, 6))  # Adjust the numbers to your preference
    ...
    plt.savefig(output_file, format='png', dpi=300, bbox_inches='tight', pad_inches=0)  # add increased dpi

but I didn't include them to save image file size, and the current images look good enough in the wandb tables.
Almost all the variables & parameters (15+) in [...]. The variable names are clear and follow DSP or audio engineering terms. Adding configs would make it more complex for most of the common use cases. But still, let me know your views.
Thanks again for your contribution!
* add: musicgen example
* update: musicgen notebook with basic spectrogram logging
* add: audiocraft spectrogram visualization enhancements (#459)
* add: dynamic range and better cmap
* add: frequency axis scaling based on mean spectrum threshold
* chore: changed string quotes to "
* update: MusicGen notebook with audio conditioning
* update: musicgen notebook
* update: musicgen notebook
* add: screenshot
* add: telemetry
* add: audiogen support
* update: audiocraft example to support multiband diffusion
* update: colab
* update: colab
* update: colab

---------

Co-authored-by: Atanu Sarkar <34891206+mratanusarkar@users.noreply.github.com>
Description:
Enhancing our spectrogram to better capture the spectral signature of generated audio, enabling easier identification of anomalies and a deeper understanding of the model's auditory output.
A spectrogram tuned to human auditory perception makes it easier to read the sound signature the way a listener would hear it and, in turn, to understand the model outputs.
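For readers who want a feel for the two ideas named in the commits (a limited dynamic range and a frequency axis scaled from a mean-spectrum threshold), here is a rough NumPy-only sketch. It is not the notebook's code: the function names `db_spectrogram` and `frequency_cutoff`, the 80 dB range, and the -60 dB threshold are all assumptions chosen for illustration.

```python
import numpy as np


def db_spectrogram(signal, sample_rate, n_fft=1024, hop=256, dynamic_range_db=80.0):
    """Hypothetical helper: STFT power in dB, clipped to a fixed dynamic range."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (frames, bins)
    spec_db = 10.0 * np.log10(np.maximum(power, 1e-12))
    spec_db -= spec_db.max()  # reference level: loudest bin sits at 0 dB
    spec_db = np.maximum(spec_db, -dynamic_range_db)  # clip everything quieter
    freqs_hz = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    return freqs_hz, spec_db.T  # (bins, frames)


def frequency_cutoff(freqs_hz, spec_db, threshold_db=-60.0):
    """Hypothetical helper: highest frequency whose time-averaged level exceeds the threshold."""
    mean_spectrum = spec_db.mean(axis=1)
    active = np.nonzero(mean_spectrum > threshold_db)[0]
    if active.size == 0:
        return float(freqs_hz[-1])  # nothing above threshold: keep the full axis
    return float(freqs_hz[active[-1]])


# Synthetic test tone (assumed input): 440 Hz plus a quieter 2 kHz partial.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 2000 * t)

freqs_hz, spec_db = db_spectrogram(signal, sample_rate)
cutoff = frequency_cutoff(freqs_hz, spec_db)
print(f"suggested upper frequency limit: {cutoff:.1f} Hz")
```

The cutoff can then be used as the upper y-axis limit when plotting, so the image is not dominated by empty high-frequency bands.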
Changes:
Impact:
Provides a clearer, more intuitive view of audio content, benefiting audio enthusiasts and general viewers.
To demonstrate, consider the following examples:
Future Scope:
If @wandb implements issue#6224, it would be cool to replace these spectrogram images with "WandB Interactive Spectrograms".