This is the second official release of the ParaNames corpus, containing parallel entity names in over 400 languages.
The main repository also contains source code and instructions for re-creating the corpus from a raw Wikidata JSON dump.
This release was generated from the following Wikidata JSON dump:
Date and number of bytes from Wikidata server:
latest-all.json.bz2 13-Mar-2024 11:18 87292717562
Hash (32 hexadecimal digits, the length of an MD5 digest rather than a SHA256 digest):
d1de7ee6da99656be7bc72ef90bc0ef8 latest-all.json.bz2
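To check that a locally downloaded dump matches the one used for this release, the byte count and hash above can be compared against the file. The following is a minimal sketch, not part of the ParaNames code base: the function names `file_digest` and `matches_release` are hypothetical, and the hash algorithm is an assumption left as a parameter, defaulting to MD5 because the value listed above has 32 hexadecimal digits.

```python
import hashlib
import os

# Values copied from the release notes above.
EXPECTED_SIZE = 87292717562
EXPECTED_DIGEST = "d1de7ee6da99656be7bc72ef90bc0ef8"


def file_digest(path, algorithm, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so it never has to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_release(path, algorithm="md5"):
    """Return True if both the file size and the digest match the values above.

    The default algorithm is an assumption based on the digest length;
    pass a different hashlib algorithm name if it turns out to be wrong.
    """
    # Cheap check first: a size mismatch avoids hashing tens of gigabytes.
    if os.path.getsize(path) != EXPECTED_SIZE:
        return False
    return file_digest(path, algorithm) == EXPECTED_DIGEST
```

Usage would look like `matches_release("latest-all.json.bz2")`, where the path is wherever the dump was saved. Hashing a file of this size takes a while, which is why the size is compared first.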
For information on caveats and how to use the resource, see the README.