Vectors of music entities.
This folder hosts graph embeddings for different concepts in music knowledge, from instruments to works to playlists.
The embeddings have been computed on top of the DOREMUS knowledge base, which contains data about over 148,000 works, 26,000 artists, and 5,000 concerts, as well as an interesting set of controlled vocabularies.
For each concept, we provide 3 files:
- `*.emb.u`: URI file, which contains the URIs of the involved resources;
- `*.emb.l`: Label file, with the label in English (if present) or in any available language;
- `*.emb`: Vector file, which contains the embeddings in a Gensim-compatible format.
For more complex concepts like artists and expressions, we also provide:
- `*.emb.h`: Header file, which contains the ordered list of the sub-features involved, each with its number of dimensions (e.g. in `artist.emb.h`, the first 2 dimensions refer to the period).
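As a minimal sketch of how the header information could be used, the snippet below turns an ordered list of (sub-feature, number of dimensions) pairs into index ranges over a vector. The function name `feature_slices` and the `"remaining"` entry are hypothetical; only the 2-dimensional `period` feature comes from the description above, and the on-disk layout of the header file is not assumed here.

```python
def feature_slices(header):
    """Map each sub-feature name to the slice of dimensions it occupies.

    `header` is an ordered list of (name, n_dimensions) pairs, in the same
    order as the sub-features are listed in the header file.
    """
    slices = {}
    start = 0
    for name, width in header:
        slices[name] = slice(start, start + width)
        start += width
    return slices


# Hypothetical header for a 14-dimensional artist vector: only "period"
# (2 dimensions) is documented; "remaining" is a placeholder name.
header = [("period", 2), ("remaining", 12)]
slices = feature_slices(header)

vector = [float(i) for i in range(14)]  # dummy values, not real embeddings
period_part = vector[slices["period"]]  # the first 2 dimensions
```

Keeping the ranges in a dictionary makes it possible to compare entities on a single sub-feature without hard-coding offsets.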
Each line of the URI, label, and vector files represents a single entity, aligned with the same line in the other files. Taking musical keys as an example, line 3 in the URI file identifies an entity whose label is at line 3 in the label file and whose embedding is at line 3 in the vector file.
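The line-by-line alignment can be sketched as follows. This is an illustration under assumptions, not the project's own loader: it assumes each line of the vector file holds only whitespace-separated floats, and the function name `align_entities` and the example URIs are hypothetical.

```python
def align_entities(uri_lines, label_lines, vector_lines):
    """Pair the n-th URI, label and vector into one record per entity."""
    uris = [line.strip() for line in uri_lines]
    labels = [line.strip() for line in label_lines]
    # Assumption: one vector per line, values separated by whitespace.
    vectors = [[float(x) for x in line.split()] for line in vector_lines]
    if not (len(uris) == len(labels) == len(vectors)):
        raise ValueError("inputs are not parallel: line counts differ")
    return list(zip(uris, labels, vectors))


# Dummy data standing in for the contents of the three files.
records = align_entities(
    ["http://example.org/key/1\n", "http://example.org/key/2\n"],
    ["C major\n", "A minor\n"],
    ["0.6 0.8\n", "1.0 0.0\n"],
)
uri, label, vector = records[1]  # second line of each file
```

With real data, the three arguments would be open file objects, for example `open("key.emb.u", encoding="utf-8")`, `open("key.emb.l", encoding="utf-8")` and `open("key.emb", encoding="utf-8")`, assuming the files follow the naming pattern described above.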
name | n. entities | dimensions | description | source |
---|---|---|---|---|
key | 30 | 100 | Musical keys (e.g. C major) | Key vocabulary |
genre | 530 | 80 | Musical genres (e.g. symphony) | 6 vocabularies: IAML, Redomi, Itema3, Musical Doc of Itema3, Diabolo, Rameau |
mop | 3,278 | 80 | Medium of performance (instruments, voices, ensembles) | 5 vocabularies: MIMO, IAML, Redomi, Itema3, Diabolo |
function | 96 | 100 | Artist function (e.g. composer, conductor) | Function vocabulary |
artist | 24,423 | 14 | Composers, performers, conductors, groups | SPARQL query to endpoint |
expression | 148,177 | 13 | Musical works (e.g. Moonlight Sonata, Bolero, Traviata) | SPARQL query to endpoint |
Details will be published soon :)
In the meantime, a brief preview:
- We use node2vec [1], as implemented in entity2vec;
- We apply an L2 normalisation.
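
For readers unfamiliar with L2 normalisation, here is a minimal sketch of the operation on a single vector. Whether the published embeddings are normalised over the whole vector or per sub-feature is not stated here, so this example simply rescales one whole vector to unit length; the function name `l2_normalise` is hypothetical.

```python
import math


def l2_normalise(vector):
    """Rescale a vector so that its Euclidean (L2) norm is 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector cannot be rescaled; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


unit = l2_normalise([3.0, 4.0])
length = math.sqrt(sum(x * x for x in unit))  # approximately 1.0
```

One practical consequence is that, for unit-length vectors, the dot product of two vectors equals their cosine similarity.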
[1] node2vec: Scalable Feature Learning for Networks. A. Grover, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2016.