This repo contains code for the extended abstract, "BEAT-ALIGNED SPECTROGRAM-TO-SEQUENCE GENERATION OF RHYTHM-GAME CHARTS", accepted to the ISMIR 2023 LBD session.
You may see the demo page here, though this page is currently under construction as well. Alternatively, here is a folder with videos of generated charts across a wide variety of genres and difficulties.
We bundled all of the models used to generate the tables in the paper into one .zip file (~300MB). You can download it here. If you'd like to replicate the tables or generate your own charts, unzip it to `goct/ckpts`, as many scripts assume this layout.
```
pip install -U h5py scipy tqdm numpy soundfile librosa hydra-core wandb
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
```
`wandb` is only used for logging, so it's optional (though you'd have to remove the relevant code).
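If you'd rather not install wandb, the usual pattern is to guard the import; below is a minimal, hypothetical sketch (the actual logging call sites in this repo are structured differently):

```python
# Hypothetical sketch: one way to make wandb optional.
# The actual logging code in this repo is not structured like this.
try:
    import wandb
    WANDB_AVAILABLE = True
except ImportError:
    WANDB_AVAILABLE = False

def log_metrics(metrics: dict, step: int):
    """Log to wandb if it is installed; otherwise fall back to stdout."""
    if WANDB_AVAILABLE:
        wandb.log(metrics, step=step)
    else:
        print(f"step {step}: {metrics}")
```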
The `data` directory contains the lists of songs and charts used for training/validation/testing. An entry from a list may look like this:
"BIGFOLDER/1003217/nekodex - circles! (FAMoss) [easy!].osu.json": [
105,
421
],
This means it's from beatmapset number 1003217, with the file name `nekodex - circles! (FAMoss) [easy!].osu`.
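To illustrate, here is a minimal sketch of reading one of these lists and pulling an entry's key apart, assuming the keys keep the `BIGFOLDER/<beatmapset id>/<chart>.osu.json` shape shown above:

```python
import json
from pathlib import PurePosixPath

# Minimal sketch: read a split file and decompose the first few keys.
with open("data/train.json") as f:
    entries = json.load(f)

for key in list(entries)[:3]:
    p = PurePosixPath(key)                   # "BIGFOLDER/1003217/....osu.json"
    beatmapset_id = p.parts[1]               # "1003217"
    osu_name = p.name.removesuffix(".json")  # "nekodex - circles! ... .osu"
    print(beatmapset_id, osu_name, entries[key])
```

(`str.removesuffix` needs Python 3.9+.)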
Due to the copyright laws of my country, I most probably can't distribute the raw data directly. Moreover, the means to download the data either:

- require you to log in to osu.ppy.sh, the official website of osu!, or
- are "mirrors", run by individuals and rarely maintained for years, sometimes switching their APIs.

Therefore, it would be meaningless to provide a download script. Please somehow obtain the beatmaps yourself.
Some tips on the file format:

- `.osz` files are actually `.zip` files. Just rename them and you will be able to unzip them (or skip the rename entirely; see the sketch after this list). Even after unzipping, it is recommended to keep the contents together in the same folder.
- Likewise, `.osu` files are plaintext. There's even a wiki article for this online, which is fairly detailed.
- Fortunately, there are actually some parsers in this repo!
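As a quick illustration of the first tip, a sketch that unzips an `.osz` without even renaming it, keeping the contents together in one folder:

```python
import zipfile
from pathlib import Path

# Minimal sketch: treat an .osz as the zip archive it is and extract it
# into a sibling folder named after the archive.
def extract_osz(osz_path: str) -> Path:
    out_dir = Path(osz_path).with_suffix("")  # e.g. "foo.osz" -> "foo/"
    with zipfile.ZipFile(osz_path) as zf:
        zf.extractall(out_dir)
    return out_dir
```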
If you cannot go through the steps below, I can provide you with the .h5 files. However, I honestly cannot think of a stable way to transport three files that sum up to ~40GB, so you'll have to provide me with a good way to do so. (EDIT: on second thought, I am not entirely sure if this is even legal to do in my nation. I do not expect Korean laws to exhibit any sort of decency or leniency, not anymore. Do we even have "fair use"?)
Prerequisite. We strongly recommend that the data be organized like this:

```
(BIGFOLDER)
ㄴ136881
ㄴ153199
  ㄴaudio.mp3
  ㄴSHK - Couple Breaking (Sky_Demon) [MX].osu
  ㄴSHK - Couple Breaking (Sky_Demon) [NM].osu
  ㄴSHK - Couple Breaking (Sky_Demon) [Sakura's HD].osu
  ㄴ...
```
Remember which directory the folders are located in. We will call this `(BIGFOLDER)`. This is not to be confused with `OSUFOLDER`, which is literally the length-9 text "OSUFOLDER".
We strongly recommend that the steps below be taken in a non-Windows operating system.
- Wipe all data from the beatmapsets (`.osz`s) except for the `.osu` files referenced by `data/train.json`, `data/valid.json`, and `data/test.json`. Also keep the audio files referenced by these `.osu` files as background music. (A hypothetical sketch of this pruning step follows the folder listing below.)
- Navigate to `osu-to-ddc/osutoddc/converter` and run `python converter.py (BIGFOLDER) (BIGFOLDER)`.
- Navigate to `1_preindex_similarity_matrix`. Run `python cache_similar_beat_index.py (BIGFOLDER) && python cleanup.py (BIGFOLDER)`. This may take very, very long. Please return tomorrow.
- Navigate to the top directory and run `python replace_text.py BIGFOLDER (BIGFOLDER)`. Then, `cp data/* (BIGFOLDER)`. (These split files were primarily generated with `2_generate_dataset/generate_dataset_peripherals.py`, but they were later modified to filter out some beatmaps, to combat an upstream problem that arose much later into the process. As a result, the script does not yield the same outputs anymore.)
- Navigate to `2_generate_dataset` and run `python h5pyize_dataset.py (BIGFOLDER)/test.json`. Do the same for `valid.json` and `train.json`. This will take less time than the previous step, but will still take extremely long (~30 hours). The generated .h5 files will total ~40GB.
- Now the folder should look like this:
```
(BIGFOLDER)
ㄴ136881
ㄴ153199
  ㄴaudio.mp3
  ㄴSHK - Couple Breaking (Sky_Demon) [MX].osu
  ㄴSHK - Couple Breaking (Sky_Demon) [MX].osu.json
  ㄴSHK - Couple Breaking (Sky_Demon) [MX].osu.json.beat.json
  ㄴ...
...
ㄴtrain.h5
ㄴtrain.json
ㄴvalid.h5
ㄴvalid.json
ㄴtest.h5
ㄴtest.json
ㄴsummary.json
```
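For the first step (wiping unreferenced data), here is a hypothetical sketch of the idea. It assumes the split keys keep the shape shown earlier and that each `.osu` names its audio in an `AudioFilename:` line; it deletes files, so try it on a copy first:

```python
import json
import re
from pathlib import Path

# Hypothetical sketch of step 1: keep only the .osu files referenced by the
# three splits, plus the audio files those .osu files point at. DELETES DATA.
BIG = Path("/path/to/BIGFOLDER")  # assumption: your (BIGFOLDER)

keep = set()
for split in ("train", "valid", "test"):
    with open(f"data/{split}.json") as f:
        for key in json.load(f):
            # "BIGFOLDER/1003217/xxx.osu.json" -> (BIGFOLDER)/1003217/xxx.osu
            osu = BIG / Path(key).relative_to("BIGFOLDER").with_suffix("")
            keep.add(osu)
            m = re.search(r"^AudioFilename:\s*(.+)$",
                          osu.read_text(errors="ignore"), re.MULTILINE)
            if m:
                keep.add(osu.parent / m.group(1).strip())

for f in BIG.rglob("*"):
    if f.is_file() and f not in keep:
        f.unlink()
```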
Navigate to `0_ddc` and follow the instructions there.
Henceforth we will call the big folder for the beatmania data, the one containing dirs such as `json_filt`, `(SMALLFOLDER)`.
Clone my fork of ddc_onset, and run

```
python h5pyize_tree.py do (BIGFOLDER) (BIGFOLDER)/all_ddc.h5
python h5pyize_tree.py onset (BIGFOLDER) (BIGFOLDER)/all_ddc.h5 (BIGFOLDER)/all_onset.h5
```
This will also generate a ~40GB file.
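The internal layout of these files is whatever the h5pyize scripts wrote; to peek inside one without loading ~40GB into memory, a minimal sketch:

```python
import h5py

# Minimal sketch: walk an HDF5 file and print the shape/dtype of every
# dataset, without reading the underlying data into memory.
with h5py.File("all_ddc.h5", "r") as f:
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)
```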
Navigate to the top directory and run

```
python text_replacer.py OSUFOLDER (BIGFOLDER)
python text_replacer.py STEPMANIAFOLDER (SMALLFOLDER)
```
This script replaces every instance of a given text within the folder with another, for files with certain extensions. The files it modifies are mainly in the `conf/` folder.
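In case you need to adapt it, the idea is roughly the following; a hypothetical sketch, with the extension list and encoding being my assumptions rather than what `text_replacer.py` actually does:

```python
from pathlib import Path

# Hypothetical sketch of what a text replacer like this does: rewrite every
# occurrence of `old` with `new` in files with the given extensions.
def replace_in_tree(root: str, old: str, new: str, exts=(".yaml", ".json")):
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in exts:
            text = path.read_text(encoding="utf-8")
            if old in text:
                path.write_text(text.replace(old, new), encoding="utf-8")
```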
Now, run any script in `scripts/` from the top directory. `mel_timing.sh` is used to train models without action tokens; these models are later used to generate the numbers in Table 2. `mel.sh` is used to train models with action tokens; these models are used for all other experiments in the paper.

On my system, with an RTX 3060, one epoch of `mel.sh` takes two hours. With an RTX 3090, it should take about one hour per epoch.
Metrics in Table 2 can be generated using `metrics_timing_AR.py` and `ddc_eval.py`. Please see these files for notes on how to run the evaluation and which model to use.
Unzip `generated.zip` to `generated/` at the top directory of this repo, if you need this feature or would like to try replicating Table 3.
`metrics_cond_centered_AR.py` is used to generate notes with the models trained on osu!. For the commands and the numbers, please see the comments in `metrics_timing_AR.py`, which is used in a similar way. Moreover, `generated/millin_and_anmillin.ipynb` was used to generate the stats for Table 3. Also in the `generated` folder are the files output by `metrics_timing_AR.py` and `metrics_cond_centered_AR.py` for the beat-aligned and non-beat-aligned models.
The generated intermediate representation is output both to the console and to an `outputs` folder that will be created. Either route the output to a folder of your liking, or copy the logged output from the `outputs` folder, along with the ground truth, which is generated with `generate_ref.py`.
`gen_to_beatmap_all.ipynb` has the code, but I would not dare make this into a script, since the code is very unrefined. The scripts here read in the generated log files, parse them, and translate them into the `.osu` format with the help of preexisting .osu files; a hypothetical sketch of that last step is below.
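A hypothetical sketch of that translation step, assuming the parsed notes come out as `(x, y, time_ms)` triples and using the hit-circle line format from the `.osu` wiki; the notebook's actual logic is more involved:

```python
# Hypothetical sketch: splice generated notes into a preexisting .osu file
# by rewriting its [HitObjects] section. Each appended line follows
# "x,y,time,type,hitSound,hitSample"; type 1 with hitSound 0 is a plain
# hit circle per the .osu file format wiki.
def write_hitobjects(template_osu: str, out_osu: str, notes):
    with open(template_osu, encoding="utf-8-sig") as f:
        lines = f.read().splitlines()
    head = lines[: lines.index("[HitObjects]") + 1]  # keep metadata/timing
    body = [f"{x},{y},{t},1,0,0:0:0:0:" for (x, y, t) in notes]
    with open(out_osu, "w", encoding="utf-8") as f:
        f.write("\n".join(head + body) + "\n")
```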
I have not implemented making StepMania charts at the moment, nor have I implemented making osu!mania charts from scratch. Hopefully I will have the time to work on these in the future and release them to both communities.