Skip to content

Code for the paper Cluster and Separate: A GNN Approach to Voice and Staff Prediction for Score Engraving

License

Notifications You must be signed in to change notification settings

CPJKU/piano_svsep

Repository files navigation

Python Pytorch Paper Conference

Piano Staff and Voice Separation (Piano_SVSep)

Official implementation of the ISMIR 2024 paper "Cluster and Separate: A GNN Approach to Voice and Staff Prediction for Score Engraving, by Francesco Foscarin, Emmanouil Karystinaios, Eita Nakamura, and Gerhard Widmer".

Nominated as Best Paper at ISMIR 2024!

Read the full paper here.

This work approaches the problem of separating the notes from a quantised symbolic music piece (e.g., a MIDI file) into multiple voices and staves. This is a fundamental part of the larger task of music score engraving (or score type-setting), which aims to produce readable musical scores for human performers. We focus on piano music and support homophonic voices, i.e., voices that can contain chords, and cross-staff voices, which are notably difficult tasks that have often been overlooked in previous research. We propose an end-to-end system based on graph neural networks that clusters notes that belong to the same chord and connects them with edges if they are part of a voice. Our results show clear and consistent improvements over a previous approach on two datasets of different styles. To aid the qualitative analysis of our results, we support the export in symbolic music formats and provide a direct visualisation of our outputs graph over the musical score.

Installation

To install Piano_SVSep you first need to install the Pytorch version suitable for your system. You can find the instructions here.

Then clone the repo:

  git clone https://github.com/cpjku/piano_svsep.git

Finally navigate to the root of the project and install:

  cd piano_svsep
  pip install .

Usage

Prediction

Our system is supposed to be run in the final stage of a transcription or generation process, where a musical score is almost ready, but it is still missing voice and staff information.

The best way to simulate such a situation is to input a musical score (with potentially random voice and staff assignments). The system will then return a MEI version of the same musical score with newly produced voice and staff information.

For this, you can use the following command:

  python launch_scripts/predict.py --model_path path/to/model  --score_path path/to/input_score --save_path path/to/output_score

You can try it out on the example score we provide.

  python launch_scripts/predict.py --model_path pretrained_models/model.ckpt --score_path artifacts/test_score.musicxml --save_path artifacts/test_score_pred.mei

This uses the pretrained model we use for the evaluation in our paper.

Train your own model

Unfortunately, the jpop dataset we used in our paper is not public, therefore the results of the paper are not entirely reproducible. We still make all the code available, including the code to handle the jpop dataset in case you should obtain access to it.

We also changed the default parameters in our training script, to train only with the dcml dataset. Specifically, the train_datasets parameter is not set to dcml only, while in the paper we use both dcml and jpop datasets. is now set to dcml only. To train, you can use the following command:

python launch_scripts/train.py 

The default parameters (excluding the train_dataset parameter) are the ones we used in the paper, and can be inspected with

python launch_scripts/train.py --help

Visualization

We provide the MusGViz tool for visualization, to directly inspect the input, output, and ground truth graphs.

Visit the repo and follow the instructions there. You can load the example files in artifacts/ or produce new files.

Cite Us

@inproceedings{piano_SVSep_2024,
  title={Cluster and Separate: a GNN Approach to Voice and Staff Prediction for Score Engraving},
  author={Foscarin, Francesco and Karystinaios, Emmanouil and Nakamura, Eita and Widmer, Gerhard},
  booktitle={Proceedings of the International Society for Music Information Retrieval Conference (ISMIR)},
  year={2024}
}

Acknowledgments

This project receives funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme under grant agreement No 101019375 "Whither Music?".

About

Code for the paper Cluster and Separate: A GNN Approach to Voice and Staff Prediction for Score Engraving

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages