This repository contains code and data related to the paper "Repetition Structure Inference with Formal Prototypes".
The code for extracting the source data is written in Python and can be found in `generate-data/`. The dependencies are listed there in `requirements.txt`. The inference code is written in Julia and can be found in `inference/`.
Dependencies are listed in a Julia project file, so you can install them by running Julia in that directory, entering package mode by hitting `]`, and running:

```
(Julia) pkg> activate .
(inference) pkg> instantiate
```
The inference code uses a solver, which by default is set to the commercial solver Gurobi. To run the code as-is, you can get an academic license from Gurobi. Alternatively, you can use a different solver supported by the JuMP framework, but that will require some changes to the code in `optimize_grammar.jl` (see the comments there) and will likely be slower.

The plot notebook requires IJulia, a Jupyter kernel for Julia.
You can run all of the following steps or only some of them, as the intermediate results are already stored in the repository:
- Extract the data from the source collection:
  ```
  $ cd generate-data
  $ python extract-essen.py ../data/essen.tsv
  ```
- Run the experiments:
  ```
  $ cd inference
  $ julia -L experiments.jl
  ```
  - compute minimal grammars for Essen melodies:
    ```
    julia> run_long_experiment(3)
    ```
    The above line runs on the 3 shortest tunes. If you want to run the full experiment as in the paper, use `298` instead of `3` (this will take several days). The code will save the ruleset for every piece in `data/melodies/rulesets/` and the inferred grammar in `data/melodies/grammars/`.
  - run Monte-Carlo minimization:
    ```
    julia> run_monte_carlo()
    ```
    This will take the same melodies as processed in the previous step (based on the saved results in `data/melodies/grammars/`), estimate a minimal grammar by sampling random grammars and picking the smallest one, and write the resulting grammar to `data/melodies/mc_grammars/`. By default, 10,000 grammars are sampled, but `run_monte_carlo()` takes a parameter to control this.
- Generate the plots:
  Run the notebook `inference/plots.ipynb`. The resulting plots are stored in `plots/`.
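The Monte-Carlo minimization above follows a simple generic pattern: draw random candidates and keep the smallest one seen. The repository's actual implementation is in Julia; the sketch below is a language-agnostic illustration in Python, where the sampler and size measure are hypothetical stand-ins for the grammar sampler and grammar size:

```python
import random

def monte_carlo_minimize(sample, size, n_samples=10_000, seed=0):
    """Draw n_samples random candidates and return the smallest one.

    `sample` and `size` are hypothetical stand-ins for the repository's
    grammar sampler and grammar size measure.
    """
    rng = random.Random(seed)
    best = sample(rng)
    for _ in range(n_samples - 1):
        candidate = sample(rng)
        if size(candidate) < size(best):
            best = candidate
    return best

# Toy usage: a "grammar" is a list of rules, its size the number of rules.
toy_sample = lambda rng: ["rule"] * rng.randint(1, 20)
smallest = monte_carlo_minimize(toy_sample, len, n_samples=1000)
```

With enough samples, the estimate approaches the true minimum, but unlike the exact optimization step there is no guarantee of reaching it.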
The code is not designed as a library; it is rather specialized to the experiments in the paper. If you would like to reuse some of the code, take a look at `run_examples()` and `run_long_experiment()` in `inference/experiments.jl` for some inspiration. If you are interested in packaging the functionality into a library, please get in touch!
The repo contains the following code and data:

- `generate-data/`: code for extracting the datasets (Python)
  - `extract-essen.py`: converting the Essen Folksong Collection to tsv
  - `extract-mtc.py`: converting the Meertens Tune Collection to tsv (not used in the paper)
  - `requirements.txt`: Python dependencies
- `inference/`: code for grammar inference (Julia)
  - `Manifest.toml`, `Project.toml`: Julia dependencies
  - `parser.jl`: parsing sequences into rulesets
  - `optimize_grammar.jl`: inferring minimal grammars from rulesets
  - `experiments.jl`: the experiments described in the paper
  - `plotting.jl`: helper code for plotting
  - `plots.ipynb`: notebook for generating the plots and analyses shown in the paper
- `data/`: data and results
  - `essen.tsv`: Essen Folksong Collection as note lists
  - `mtc.tsv`: Meertens Tune Collection melodies as note lists (not used)
  - `[grammar_]{xxy,xxyx,example1,example2}.json`: rulesets and grammars for example pieces
  - `melodies/`: computed rulesets and grammars for short tunes from the Essen corpus
    - `rulesets/`: computed rulesets
    - `grammars/`: optimal grammars
    - `mc_grammars/`: Monte-Carlo minimized grammars
- `plots/`: plots for the paper
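The note-list tsv files can be inspected from Python with the standard library alone. The sketch below only assumes a tab-separated file with a header row; the column names in the toy example are made up, so check the actual files for the real layout:

```python
import csv
import io

def read_note_list(fileobj):
    """Read a tab-separated note list with a header row into a list of dicts."""
    return list(csv.DictReader(fileobj, delimiter="\t"))

# Toy stand-in for data/essen.tsv (column names are hypothetical):
toy = io.StringIO("id\tonset\tpitch\ntune1\t0\t60\ntune1\t1\t62\n")
rows = read_note_list(toy)
print(rows[0])  # {'id': 'tune1', 'onset': '0', 'pitch': '60'}
```

For real use, open one of the tsv files instead of the `io.StringIO` stand-in, e.g. `with open("data/essen.tsv") as f: rows = read_note_list(f)`.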