CLDF Metadata: cldf-metadata.json
Sources: sources.bib.zip
Comprehensive reference information for the world's languages, especially the lesser known languages
property | value |
---|---|
dc:bibliographicCitation | Hammarström, Harald & Forkel, Robert & Haspelmath, Martin & Bank, Sebastian. 2024. Glottolog 5.1. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at https://glottolog.org) |
dc:conformsTo | CLDF StructureDataset |
dc:identifier | https://glottolog.org |
dc:license | https://creativecommons.org/licenses/by/4.0/ |
dcat:accessURL | https://github.com/glottolog/glottolog-cldf |
prov:wasDerivedFrom | |
prov:wasGeneratedBy | |
rdf:ID | glottolog |
rdf:type | http://www.w3.org/ns/dcat#Distribution |
Table values.csv
property | value |
---|---|
dc:conformsTo | CLDF ValueTable |
dc:extent | 134506 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Language_ID | string | References languages.csv::ID |
Parameter_ID | string | References parameters.csv::ID |
Value | string | |
Code_ID | string | References codes.csv::ID |
Comment | string | |
Source | list of string (separated by ; ) | References sources.bib::BibTeX-key |
codeReference | string | |
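
The foreign keys above mean that rows of values.csv are usually read together with languages.csv. A minimal sketch using only the Python standard library follows; the sample rows, IDs and BibTeX keys are made up for illustration and are not taken from the dataset, and the helper names are hypothetical.

```python
import csv
import io

# Hypothetical sample rows; IDs, values and BibTeX keys are made up.
LANGUAGES_CSV = """ID,Name,Level
abcd1234,Example Language,language
"""
VALUES_CSV = """ID,Language_ID,Parameter_ID,Value,Code_ID,Comment,Source
abcd1234-level,abcd1234,level,language,level-language,,
abcd1234-med,abcd1234,med,grammar,med-grammar,,key2001;key2005
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def values_with_language(values_text, languages_text):
    """Attach the language name to each value row and split Source."""
    languages = {row["ID"]: row for row in read_rows(languages_text)}
    joined = []
    for row in read_rows(values_text):
        record = dict(row)
        # Language_ID references languages.csv::ID
        record["Language_Name"] = languages[row["Language_ID"]]["Name"]
        # Source is a list-valued column serialized with ";" as separator
        record["Source"] = [s.strip() for s in row["Source"].split(";") if s.strip()]
        joined.append(record)
    return joined


rows = values_with_language(VALUES_CSV, LANGUAGES_CSV)
```

For the real files, the same functions could be fed the contents of values.csv and languages.csv; a CLDF-aware library would additionally read the column definitions from cldf-metadata.json.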
Table parameters.csv
This table lists parameters (or aspects) of languoids that Glottolog assigns values for, such as the languoid's position in the Glottolog classification or its descriptive status. Refer to the Description column in the table for details, and to the datatype column for information on how values for the parameter should be interpreted.
property | value |
---|---|
dc:conformsTo | CLDF ParameterTable |
dc:extent | 7 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Description | string | |
ColumnSpec | json | |
type | string Valid choices: categorical, sequential, other | Describes the domain of the parameter |
infoUrl | string | URL (relative to aboutUrl) of a web page with further information about the parameter |
datatype | json | CSVW datatype description for values for this parameter. I.e. content of the Value column of associated rows in ValueTable should be interpreted/parsed accordingly |
Source | list of string (separated by ; ) | Source describing the parameter in detail. References sources.bib::BibTeX-key |
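
To illustrate how the datatype column can drive the interpretation of Value, here is a minimal sketch. It assumes the datatype cell holds either a bare CSVW datatype name (such as "integer") or an object with a "base" key, and it covers only a few base types; the function name `parse_value` is hypothetical and the actual datatype descriptions in parameters.csv may need more handling than shown.

```python
import json


def parse_value(raw, datatype_json):
    """Convert a raw Value string using a CSVW datatype description.

    Only a handful of base types are handled; anything else is
    returned unchanged as a string.
    """
    datatype = json.loads(datatype_json) if datatype_json else "string"
    # CSVW allows a bare name ("integer") or an object with a "base" key.
    if isinstance(datatype, dict):
        base = datatype.get("base", "string")
        fmt = datatype.get("format")
    else:
        base, fmt = datatype, None

    if base == "integer":
        return int(raw)
    if base in ("decimal", "number", "float", "double"):
        return float(raw)
    if base == "boolean":
        # For booleans, a CSVW format has the shape "true-value|false-value".
        true_values = (fmt.split("|")[0],) if fmt else ("true", "1")
        return raw in true_values
    return raw  # fall back to the unparsed string
```

A row of values.csv would be parsed by looking up its Parameter_ID in parameters.csv and passing that parameter's datatype cell as `datatype_json`.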
Table codes.csv
property | value |
---|---|
dc:conformsTo | CLDF CodeTable |
dc:extent | 29 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Parameter_ID | string | The parameter or variable the code belongs to. References parameters.csv::ID |
Name | string | |
Description | string | |
numerical_value | integer | Integer value associated with a code. Implements ordering for ordered parameter domains. |
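
Since numerical_value implements the ordering for ordered parameter domains, the codes of one parameter can be sorted by it. The following is a minimal sketch; the parameter ID "status" and its codes are invented for illustration, and `ordered_codes` is a hypothetical helper.

```python
import csv
import io

# Hypothetical codes for a made-up ordered parameter "status".
CODES_CSV = """ID,Parameter_ID,Name,Description,numerical_value
status-high,status,high,,3
status-low,status,low,,1
status-mid,status,mid,,2
other-x,other,x,,
"""


def ordered_codes(codes_text, parameter_id):
    """Return the codes of one parameter sorted by numerical_value."""
    rows = csv.DictReader(io.StringIO(codes_text))
    selected = [
        row for row in rows
        # Skip codes of other parameters and codes without a numerical value.
        if row["Parameter_ID"] == parameter_id and row["numerical_value"]
    ]
    return sorted(selected, key=lambda row: int(row["numerical_value"]))


names = [row["Name"] for row in ordered_codes(CODES_CSV, "status")]
```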
Table languages.csv
This table lists all Glottolog languoids, i.e. families, languages and dialects which are nodes in the Glottolog classification - including "non-genealogical" trees as described at https://glottolog.org/glottolog/glottologinformation . Thus, assumptions about the properties of a languoid listed here should be made only after including associated information from ValueTable, in particular for languoid level and category. Locations (WGS 84 coordinates) for language groups, i.e. languoids of level "family", are computed as recursive centroids as described at https://pyglottolog.readthedocs.io/en/latest/homelands.html#pyglottolog.homelands.recursive_centroids while locations for dialects are simply inherited from the associated languoids of level "language" in most cases.
property | value |
---|---|
dc:conformsTo | CLDF LanguageTable |
dc:extent | 26953 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Macroarea | list of string (separated by ; ) | |
Latitude | decimal ≥ -90 ≤ 90 | |
Longitude | decimal ≥ -180 ≤ 180 | |
Glottocode | string Regex: [a-z0-9]{4}[1-9][0-9]{3} | |
ISO639P3code | string Regex: [a-z]{3} | |
Level | string Valid choices: language, dialect, family | Glottolog languoid level. |
Countries | list of string (separated by ; ) | ISO 3166-1 alpha-2 country codes for countries a language is spoken in. |
Family_ID | string | Glottocode of the top-level genetic unit the languoid belongs to. References languages.csv::ID |
Language_ID | string | Glottocode of the language-level languoid the languoid belongs to (in the case of dialects). References languages.csv::ID |
Closest_ISO369P3code | string | ISO 639-3 code of the languoid or an ancestor if the languoid is a dialect. See also #13 |
First_Year_Of_Documentation | integer | The first year that an extinct languoid was documented (in the sense that there is data that pertains to it). Positive numbers are years AD, negative numbers are years BC. |
Last_Year_Of_Documentation | integer | The last year that an extinct language was documented (in the sense that there is data that pertains to it). Positive numbers are years AD, negative numbers are years BC. |
Is_Isolate | boolean | Marks a language-level languoid as an isolate, i.e. as a language with no genetic relationship with other languages. |
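
The Glottocode pattern and the Level and Language_ID columns can be combined to check identifiers and to map a dialect to its language. A minimal sketch follows; the three rows and their IDs are invented, the helper names are hypothetical, and whether Language_ID is filled for rows of other levels is not assumed here.

```python
import re

# Pattern copied from the Glottocode column definition above.
GLOTTOCODE = re.compile(r"[a-z0-9]{4}[1-9][0-9]{3}")


def is_glottocode(text):
    """Check that the whole string matches the Glottocode pattern."""
    return GLOTTOCODE.fullmatch(text) is not None


def language_for(languoid_id, languages):
    """Return the ID of the language-level languoid for a language or dialect."""
    row = languages[languoid_id]
    if row["Level"] == "language":
        return languoid_id
    if row["Level"] == "dialect":
        # Language_ID references languages.csv::ID
        return row["Language_ID"] or None
    return None  # a family does not map to a single language


# Hypothetical rows keyed by ID; the IDs are made up.
LANGUAGES = {
    "fami1234": {"Level": "family", "Language_ID": ""},
    "lang1234": {"Level": "language", "Language_ID": ""},
    "dial1234": {"Level": "dialect", "Language_ID": "lang1234"},
}
```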
Table names.csv
Alternative names for Glottolog languoids from various sources.
property | value |
---|---|
dc:extent | 120134 |
Name/Property | Datatype | Description |
---|---|---|
ID | string | Primary key |
Language_ID | string | References languages.csv::ID |
Name | string | |
Provider | string | |
lang | string | |
Table trees.csv
property | value |
---|---|
dc:conformsTo | CLDF TreeTable |
dc:extent | 247 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | Name of tree as used in the tree file, i.e. the tree label in a Nexus file or the 1-based index of the tree in a Newick file |
Description | string | Describe the method that was used to create the tree, etc. |
Tree_Is_Rooted | boolean Valid choices: Yes, No | Whether the tree is rooted (Yes) or unrooted (No), or null if no information is available |
Tree_Type | string Valid choices: summary, sample | Whether the tree is a summary (or consensus) tree, i.e. can be analysed in isolation, or whether it is a sample, resulting from a method that creates multiple trees |
Tree_Branch_Length_Unit | string Valid choices: change, substitutions, years, centuries, millennia | The unit used to measure evolutionary time in phylogenetic trees. |
Media_ID | string | References a file containing a Newick representation of the tree, labeled with identifiers as described in the LanguageTable (the Media_Type column of this table should provide enough information to choose the appropriate tool to read the Newick). References media.csv::ID |
Source | list of string (separated by ; ) | References sources.bib::BibTeX-key |
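
As a rough illustration of how a tree row resolves to a tree string via Media_ID, the sketch below covers only the Newick case in which Name is a 1-based index. It assumes the referenced file sits inside a zip archive that is already available as bytes and holds one tree per line; the archive content, IDs and the function `read_newick` are made up, and trees labeled by name in a Nexus file would need a different lookup.

```python
import io
import zipfile


def read_newick(tree_row, media_by_id, archive_bytes):
    """Return the Newick string a tree row points to.

    Assumes Name is the 1-based index of the tree in a Newick file
    with one tree per line, stored at Path_In_Zip inside the archive.
    """
    media = media_by_id[tree_row["Media_ID"]]  # references media.csv::ID
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        text = archive.read(media["Path_In_Zip"]).decode("utf-8")
    trees = [line.strip() for line in text.splitlines() if line.strip()]
    return trees[int(tree_row["Name"]) - 1]


# Build a tiny in-memory archive with two made-up trees.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("trees.nwk", "(a,b)root1;\n((c,d)e,f)root2;\n")

# Hypothetical media and tree rows.
media_by_id = {"m1": {"Path_In_Zip": "trees.nwk"}}
tree_row = {"ID": "t2", "Name": "2", "Media_ID": "m1"}
newick = read_newick(tree_row, media_by_id, buffer.getvalue())
```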
Table media.csv
property | value |
---|---|
dc:conformsTo | CLDF MediaTable |
dc:extent | 1 |
Name/Property | Datatype | Description |
---|---|---|
ID | string Regex: [a-zA-Z0-9_\-]+ | Primary key |
Name | string | |
Description | string | |
Media_Type | string Regex: [^/]+/.+ | |
Download_URL | anyURI | |
Path_In_Zip | string | |