Releases: bltlab/paranames
v2024.05.07.0
This is the second official release of the ParaNames corpus, containing parallel entity names in over 400 languages.
The main repository also contains source code and instructions for re-creating the corpus from a raw Wikidata JSON dump.
This release was generated from the following Wikidata JSON dump:
Date and number of bytes from Wikidata server:
latest-all.json.bz2 13-Mar-2024 11:18 87292717562
SHA256 hash:
d1de7ee6da99656be7bc72ef90bc0ef8 latest-all.json.bz2
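To confirm that a downloaded dump is the same file this release was built from, its size and digest can be compared against the values above. The following is a minimal sketch, not part of the ParaNames tooling: the helper names `file_digest` and `matches_release` are hypothetical. Note that the digest listed above is 32 hexadecimal characters long, which is the length of an MD5 digest rather than a SHA-256 digest (64 characters), so the sketch leaves the hash algorithm as a parameter instead of assuming one.

```python
import hashlib
import os

# Values copied from the release notes above.
EXPECTED_SIZE = 87292717562
EXPECTED_DIGEST = "d1de7ee6da99656be7bc72ef90bc0ef8"


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so a large dump never has to fit in memory."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def matches_release(path, algorithm,
                    expected_size=EXPECTED_SIZE,
                    expected_digest=EXPECTED_DIGEST):
    """Return True only if both the byte count and the digest match."""
    # Check the cheap property first; hashing tens of gigabytes is slow.
    if os.path.getsize(path) != expected_size:
        return False
    return file_digest(path, algorithm) == expected_digest.lower()
```

A call such as `matches_release("latest-all.json.bz2", "md5")` checks the size first and only hashes the file if the size matches. Which algorithm actually produced the published digest is an assumption the caller has to make.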
For information on caveats and how to use the resource, see the README.
v2021.03.04.1
This is the first complete release of the ParaNames corpus, containing parallel entity names in over 400 languages.
This release also contains source code and instructions for re-creating the corpus from a raw Wikidata JSON dump.
The specific Wikidata JSON dump this release was generated from is wikidata-20220110-all.json.
For notes about the data format and caveats, see the README for this version.