- Dataset: JingjuMusicScoresCollection-v3
- Pianoroll representation with a limited pitch range
- Beat as the basic input unit: training samples of a fixed length are retrieved with a hop size of one beat
- A three-layer feedforward variational autoencoder (VAE) is used; a minimal architecture sketch follows this list
- Visualization and sonification of the output samples are implemented
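For reference, the block below is a minimal sketch of what such a three-layer feedforward VAE could look like. It only mirrors the constructor arguments (input_size, h1_dim, h2_dim, h3_dim, z_dim) and the decode() method used in the code further down; the notebook's actual class definition may differ in activations, output non-linearity, and other details.

import torch
import torch.nn as nn

class VAE(nn.Module):
    """Sketch of a three-layer feedforward VAE over flattened pianoroll segments."""
    def __init__(self, input_size, h1_dim, h2_dim, h3_dim, z_dim):
        super().__init__()
        # Encoder: input -> h1 -> h2 -> h3, then project to mean and log-variance
        self.encoder = nn.Sequential(
            nn.Linear(input_size, h1_dim), nn.ReLU(),
            nn.Linear(h1_dim, h2_dim), nn.ReLU(),
            nn.Linear(h2_dim, h3_dim), nn.ReLU(),
        )
        self.fc_mu = nn.Linear(h3_dim, z_dim)
        self.fc_logvar = nn.Linear(h3_dim, z_dim)
        # Decoder: z -> h3 -> h2 -> h1 -> input; sigmoid keeps pianoroll values in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(z_dim, h3_dim), nn.ReLU(),
            nn.Linear(h3_dim, h2_dim), nn.ReLU(),
            nn.Linear(h2_dim, h1_dim), nn.ReLU(),
            nn.Linear(h1_dim, input_size), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps with eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar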
Feel free to modify the training parameters: lowest_pitch, n_pitches, and expect_len. If you don't want to train the network, you can skip the training block; instead, uncomment the model-loading lines in the code block below and then run that block.
# Instantiate the VAE and its optimizer
model = VAE(input_size, h1_dim=h1_dim, h2_dim=h2_dim, h3_dim=h3_dim, z_dim=z_dim).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

# Uncomment to load a previously trained model instead of training from scratch
# checkpoint = torch.load("CMC_final_project/model/best_model_1000epoch_8_beat.pyt", map_location=device)
# model.load_state_dict(checkpoint['model_state_dict'])
# Evaluation: sample latent vectors from the prior and decode them into pianorolls
with torch.no_grad():
    wish_len = int(input("Enter the length of music you want to generate (a multiple of expect_len (4)): "))
    generate_len = int(wish_len / expect_len)
    z = torch.randn(generate_len, z_dim).to(device)  # sample from the standard normal prior
    out = model.decode(z).reshape(z.shape[0], fixed_len, n_pitches)
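The notebook's own visualization and sonification code lives elsewhere; as a self-contained illustration, the sketch below turns one generated pianoroll back into MIDI with pretty_midi. The helper name pianoroll_to_midi, the step duration, velocity, and threshold are illustrative assumptions, not the notebook's actual values.

import numpy as np
import pretty_midi

def pianoroll_to_midi(roll, lowest_pitch, step_dur=0.25, threshold=0.5):
    """Convert one (time_steps, n_pitches) pianoroll into a PrettyMIDI object.
    step_dur (seconds per time step) and threshold are illustrative values."""
    pm = pretty_midi.PrettyMIDI()
    inst = pretty_midi.Instrument(program=0)  # acoustic grand piano
    roll = np.asarray(roll)
    for t, frame in enumerate(roll):
        for p in np.nonzero(frame > threshold)[0]:
            inst.notes.append(pretty_midi.Note(
                velocity=90,
                pitch=int(lowest_pitch + p),
                start=t * step_dur,
                end=(t + 1) * step_dur))
    pm.instruments.append(inst)
    return pm

# Example: write the first generated sample to disk
pm = pianoroll_to_midi(out[0].cpu().numpy(), lowest_pitch)
pm.write("generated_sample.mid")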