
Inference with MIDI samples #1

Closed
ChenyuGAO-CS opened this issue May 1, 2024 · 2 comments

Comments

@ChenyuGAO-CS

Hi Jingwei,

Thanks for this impressive work. I'd like to test the model's orchestration and piano-cover generation ability on my own samples. How should I do this if I have a piano-only MIDI file and an orchestrated MIDI file? I would appreciate any help. The tutorial in ./inference.ipynb only shows examples of rearranging pieces from your pre-processed datasets (e.g., POP909 and Slakh2100), not custom MIDI files.

@zhaojw1998
Owner

zhaojw1998 commented May 2, 2024

Hi Chenyu,

You can use the following code block to load a MIDI file in our stipulated format. It will give you x_mix, the source piece to be rearranged. If you also have a reference piece y with the target arrangement form, you can load it with this block as well, or follow the tutorial notebook to sample y from Slakh2100 or POP909.

To use this code block, please check the latest version of utils/format_convert.py and dataset.py in our repo.

Hope this helps.

import pretty_midi as pyd
import numpy as np
import scipy.interpolate
from utils.format_convert import midi2matrix # check for latest version in our repo
from dataset import slakh_program_mapping, collate_fn_inference # check for latest version in our repo

PATH = 'path-to-your-MIDI-file.mid'
ACC = 4 # quantize notes at semiquaver (i.e., 16th note) level
DEVICE = 'cuda:0' # if you've got a GPU; use 'cpu' otherwise

midi = pyd.PrettyMIDI(PATH) # load MIDI object
beats = midi.get_beats()  # make sure that the MIDI file has an accurate tempo curve, or the beats will be misaligned
beats = np.append(beats, beats[-1] + (beats[-1] - beats[-2]))  # append one extra beat so the grid covers the final beat
quantize = scipy.interpolate.interp1d(np.array(range(0, len(beats))) * ACC, beats, kind='linear')
semiquaver = quantize(np.array(range(0, (len(beats) - 1) * ACC))) # this is the quantization timesteps
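
# Optional sanity check (my addition, not part of the original reply): if the tempo
# curve is inaccurate, the quantization grid will drift, so compare its span against
# the actual length of the piece before going further.
print(f'{len(semiquaver)} quantization steps covering {semiquaver[-1]:.1f}s of a {midi.get_end_time():.1f}s piece')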

tracks, programs = midi2matrix(midi, semiquaver) # convert MIDI object to Numpy matrices
# Here 'tracks' is of shape (#track, time, 128). Each non-zero cell (k, t, p) represents a note of pitch p in track k at position t.
# 'programs' is an integer vector of shape (#track,), which stores the MIDI program number of each track.

tracks = tracks[:, 0: 16*8, :] # take an 8-bar snippet (8 bars x 16 semiquavers per bar). Other lengths work too, but Q&A is not optimized for long-term structure
programs = slakh_program_mapping(programs) # convert to the 34 supported instrument classes in Slakh2100

# convert to the input format of Q&A model
(x_mix, x_instr, x_fp, x_ft), _, _ = collate_fn_inference(batch=[(tracks, programs, None, '_')], device=DEVICE)

# Now (hopefully) you can follow the rest of the code in the tutorial notebook
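
If you also have your own reference piece y (e.g., an orchestrated MIDI file), here is a minimal sketch of loading it with the same steps, assuming collate_fn_inference accepts it in the same format; the REF_PATH and y_* names are mine, not from the original reply.

REF_PATH = 'path-to-your-reference-MIDI-file.mid'  # hypothetical path to the reference piece

ref_midi = pyd.PrettyMIDI(REF_PATH)
ref_beats = ref_midi.get_beats()
ref_beats = np.append(ref_beats, ref_beats[-1] + (ref_beats[-1] - ref_beats[-2]))
ref_quantize = scipy.interpolate.interp1d(np.arange(len(ref_beats)) * ACC, ref_beats, kind='linear')
ref_semiquaver = ref_quantize(np.arange((len(ref_beats) - 1) * ACC))

y_tracks, y_programs = midi2matrix(ref_midi, ref_semiquaver)
y_tracks = y_tracks[:, 0: 16*8, :]                  # same 8-bar length as the source piece
y_programs = slakh_program_mapping(y_programs)

(y_mix, y_instr, y_fp, y_ft), _, _ = collate_fn_inference(batch=[(y_tracks, y_programs, None, '_')], device=DEVICE)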

@ChenyuGAO-CS
Author

Hi Jingwei,

Thanks for your quick response and kind help. That works!

zhaojw1998 pinned this issue May 3, 2024