add: audiocraft spectrogram visualization enhancements #459

mratanusarkar · 2023-09-06T06:13:40Z

Description:

This PR enhances the spectrogram so that it better captures the spectral signature of the generated audio, making anomalies easier to spot and the model's audio output easier to understand.

A spectrogram tuned to human auditory perception shows the sound signature the way a listener would hear it, which in turn makes the model outputs easier to interpret.

Changes:

- Applied logarithmic scaling for better alignment with how sound is perceived.
- Dynamic range set to the 5th-95th percentiles for improved audio focus.
- Adjusted frequency range: lower limit set at 20 Hz; upper limit set dynamically based on audio energy concentration.
- Switched to the "magma" colormap for perceptual uniformity.
- Improved code structure for maintainability.
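
As a rough illustration of the scaling, percentile, and frequency-limit changes listed above, here is a dependency-free sketch. It is not the notebook's `get_spectrogram()`: the function names, the -60 dB threshold, and the choice to average dB values over time are all assumptions made for illustration.

```python
import math


def power_to_db(power, floor=1e-10):
    """Convert a linear power value to decibels, clamping tiny values."""
    return 10.0 * math.log10(max(power, floor))


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def color_limits(spec_db, low_q=5, high_q=95):
    """Color-scale limits taken from percentiles of all spectrogram values."""
    flat = [value for row in spec_db for value in row]
    return percentile(flat, low_q), percentile(flat, high_q)


def upper_frequency(freqs, spec_db, threshold_db=-60.0, f_min=20.0):
    """Highest frequency whose time-averaged level exceeds threshold_db.

    spec_db holds one row per frequency bin; threshold_db is a
    hypothetical cutoff, not a value taken from the PR.
    """
    top = f_min
    for freq, row in zip(freqs, spec_db):
        if freq >= f_min and sum(row) / len(row) > threshold_db:
            top = max(top, freq)
    return top
```

The two values from `color_limits` would typically be passed as `vmin` and `vmax` to the plotting call, and the result of `upper_frequency` used as the upper y-axis limit alongside the 20 Hz lower limit.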

Impact:

Provides a clearer, more intuitive view of audio content, benefiting audio enthusiasts and general viewers.

To demonstrate, consider the following examples:

In this (run), I can tell just by looking that the music will be pleasant, with distinct notes and patterns,
whereas these runs (run, run) are rhythmic, follow beat patterns, and are a little intense and loud,
and this (run) turned out to be noisy.

Future Scope:

If @wandb implements issue #6224, it would be cool to replace these spectrogram images with "WandB Interactive Spectrograms".

review-notebook-app · 2023-09-06T06:13:44Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

github-actions · 2023-09-06T10:29:10Z

Thanks for contributing to wandb/examples!
We appreciate your efforts in opening a PR for the examples repository. Our goal is to ensure a smooth and enjoyable experience for you 😎.

Guidelines

The examples repo is regularly tested against the ever-evolving ML stack. To facilitate our work, please adhere to the following guidelines:

Notebook naming: You can use a combination of snake_case and CamelCase for your notebook name. Avoid using spaces (replace them with _) and special characters (&%$?). For example:

Cool_Keras_integration_example_with_weights_and_biases.ipynb

is acceptable, but

Cool Keras Example with W&B.ipynb

is not. Avoid spaces and the & character. To refer to W&B, you can use: weights_and_biases or just wandb (it's our library, after all!)
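
The naming rule above can be approximated with a small check. This is only a sketch: the regular expression and both helper names are assumptions for illustration, not the validation actually used by the examples repo.

```python
import re

# Assumed rule: letters, digits, and underscores only, followed by ".ipynb".
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+\.ipynb")


def is_valid_notebook_name(name: str) -> bool:
    """True if the name has no spaces or special characters."""
    return _VALID_NAME.fullmatch(name) is not None


def suggest_notebook_name(name: str) -> str:
    """Rewrite a name so that it passes is_valid_notebook_name."""
    stem = name[: -len(".ipynb")] if name.endswith(".ipynb") else name
    stem = stem.replace("W&B", "wandb")          # refer to W&B as wandb
    stem = re.sub(r"\s+", "_", stem.strip())     # spaces become underscores
    stem = re.sub(r"[^A-Za-z0-9_]", "", stem)    # drop remaining special characters
    return stem + ".ipynb"
```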

Managing dependencies within the notebook: You may need to set up dependencies to ensure that your code works. Please avoid the following practices:
- Docker-related activities. If Docker installation is required, consider adding a full example with the corresponding Dockerfile to the wandb/examples/examples folder (where non-Colab examples reside).
- Mixing pip install with other code in the same cell. pip install is fine as the primary method to install packages, but when calling pip in a cell, avoid performing other tasks. We automatically filter these types of cells, and executing other actions might break the automatic testing of the notebooks. For example,
```
pip install -qU wandb transformers gpt4
```
is acceptable, but
```
pip install -qU wandb
import wandb
```
is not.
- Cloning a repository just to install a package from GitHub. It's acceptable 😎 to get the latest bleeding-edge libraries directly from GitHub, and you can install them like this:
```
!pip install -q git+https://github.com/huggingface/transformers
```
You don't need to clone, then cd into the repo and install it in editable mode.
- Avoid referencing specific Colab directories. Google Colab has a /content directory where everything resides. Avoid explicitly referencing this directory because we test our notebooks with pure Jupyter (without Colab). Instead, use relative paths to make the notebook reproducible.
The Jupyter notebook file .ipynb is nothing more than a JSON file with primarily two types of cells: markdown and code. There is also a bunch of other metadata specific to Google Colab. We have a set of tools to ensure proper notebook formatting. These tools can be found at wandb/nb_helpers.
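
Because a notebook is plain JSON, the pip-cell guideline above is easy to check by hand. The sketch below is an assumption-laden illustration and not the logic in wandb/nb_helpers: the function names and the prefixes treated as pip commands are hypothetical.

```python
import json


def is_pip_line(line: str) -> bool:
    """True for lines that look like a pip install command."""
    return line.strip().startswith(("!pip install", "%pip install", "pip install"))


def mixed_pip_cells(notebook_json: str) -> list:
    """Return indices of code cells that mix pip installs with other code."""
    notebook = json.loads(notebook_json)
    flagged = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        if isinstance(source, str):  # source may be one string or a list of lines
            source = source.splitlines()
        # Ignore blank lines and comments when classifying the cell.
        lines = [ln for ln in source if ln.strip() and not ln.strip().startswith("#")]
        has_pip = any(is_pip_line(ln) for ln in lines)
        has_other = any(not is_pip_line(ln) for ln in lines)
        if has_pip and has_other:
            flagged.append(index)
    return flagged
```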

Before merging, wait for a maintainer to clean and format the notebooks you're adding. You can tag @tcapelle.

Before marking the PR as ready for review, please run your notebook one more time. Restart the Colab and run all. We will provide you with links to open the Colabs below

The following colabs were changed:
- colabs/audiocraft/AudioCraft_MusicGen.ipynb

soumik12345

@mratanusarkar
Thanks for the PR.
Can you please change the 's to "?

mratanusarkar · 2023-09-06T10:57:09Z

> @mratanusarkar Thanks for the PR. Can you please change the 's to "?

@soumik12345 let me know if any more changes are required.

Also FYI,
The png image size and quality can be improved from matplotlib side with:

fig, ax = plt.subplots(figsize=(10, 6))  # Adjust the numbers to your preference
...
plt.savefig(output_file, format='png', dpi=300, bbox_inches='tight', pad_inches=0) # add increased dpi

I didn't include them, to keep the image file size down, and the current images look good enough in the wandb tables.
Users can change them if needed, for example for offline download of the spectrogram images.

soumik12345 · 2023-09-06T11:29:00Z

> The png image size and quality can be improved from matplotlib side with:

@mratanusarkar
Can you please add a config for that?

mratanusarkar · 2023-09-06T11:38:07Z

> The png image size and quality can be improved from matplotlib side with:
>
> @mratanusarkar Can you please add a config for that?

Almost all the variables and parameters (15+) in get_spectrogram() could be added as config.
But I feel that's unnecessary, given that the spectrogram ends up as just one column of data in the table.

The variable names are clear and follow standard DSP and audio engineering terminology,
so anyone interested can play around with the function.

Adding configs would make it more complex for most of the common use cases. But still, let me know your views.
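
For reference, a config limited to the image-export settings discussed above could stay quite small. The sketch below is hypothetical and not part of this PR: the class name, field names, and defaults are assumptions, and only the keyword arguments mirror the `plt.subplots` and `plt.savefig` calls quoted earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpectrogramImageConfig:
    """Hypothetical export settings for the spectrogram PNG."""

    fig_width_in: float = 10.0
    fig_height_in: float = 6.0
    dpi: int = 100  # matplotlib's default figure dpi is 100

    def subplots_kwargs(self) -> dict:
        """Keyword arguments for plt.subplots(...)."""
        return {"figsize": (self.fig_width_in, self.fig_height_in)}

    def savefig_kwargs(self) -> dict:
        """Keyword arguments for plt.savefig(output_file, ...)."""
        return {
            "format": "png",
            "dpi": self.dpi,
            "bbox_inches": "tight",
            "pad_inches": 0,
        }

    def nominal_pixels(self) -> tuple:
        """Canvas size in pixels before tight-bbox cropping."""
        return (
            round(self.fig_width_in * self.dpi),
            round(self.fig_height_in * self.dpi),
        )
```

With `cfg = SpectrogramImageConfig(dpi=300)`, the calls would become `plt.subplots(**cfg.subplots_kwargs())` and `plt.savefig(output_file, **cfg.savefig_kwargs())`, which keeps the trade-off between file size and image quality in one place.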

soumik12345 · 2023-09-06T11:43:26Z

Thanks again for your contribution!

* add: musicgen example
* update: musicgen notebook with basic spectrogram logging
* add: audiocraft spectrogram visualization enhancements (#459)
* add: dynamic range and better cmap
* add: frequency axis scaling based on mean spectrum threshold
* chore: changed string quotes to "
* update: MusicGen notebook with audio conditioning
* update: musicgen notebook
* update: musicgen notebook
* add: screenshot
* add: telemetry
* add: audiogen support
* update: audiocraft example to support multiband diffusion
* update: colab
* update: colab
* update: colab

---------

Co-authored-by: Atanu Sarkar <34891206+mratanusarkar@users.noreply.github.com>

mratanusarkar added 2 commits September 6, 2023 01:00

add: dynamic range and better cmap

2b7a220

add: frequency axis scaling based on mean spectrum threshold

4ceb43c

soumik12345 requested changes Sep 6, 2023


chore: changed string quotes to "

144b166

soumik12345 self-requested a review September 6, 2023 11:41

soumik12345 approved these changes Sep 6, 2023


soumik12345 merged commit 6661b0b into wandb:example/audiocraft Sep 6, 2023

mratanusarkar deleted the example/audiocraft-spectrogram branch September 6, 2023 11:51

mratanusarkar restored the example/audiocraft-spectrogram branch September 8, 2023 10:08

mratanusarkar deleted the example/audiocraft-spectrogram branch September 15, 2023 10:29
