Added names of less-studied languages #4880
Thanks for the addition, @BenjaminGalliot.
Just a suggested fix below.
c130481 to 0b1128a
Added names of less studied languages (with their Glottolog codes) for existing datasets: Yongning Na (yong1288) and Japhug (japh1234).
0b1128a to 747376b
As pointed out in my comment below, currently we are using IANA language codes, not Glottolog codes.
Also note that in each dataset card, besides the `language` tag (validated against this file `languages.json`), users can use other tags to give further details about the language:
- `language_bcp47`: to list BCP47 language tags, plus any of the allowed suffixes (script, region, variant, ...)
- `language_details`: to give further details
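
To make the tag layout concrete, here is a minimal sketch of how a dataset card's metadata could be assembled for the two languages in this PR. Everything in it is illustrative: the `LANGUAGES` dict is a hypothetical two-entry excerpt standing in for `languages.json` (its code-to-name shape is an assumption), and `build_card_metadata` and `to_front_matter` are made-up helpers, not part of any library or of the actual validation used for dataset cards.

```python
# Hypothetical excerpt standing in for languages.json (assumed shape: code -> name).
LANGUAGES = {"nru": "Narua", "jya": "Japhug"}


def build_card_metadata(language, language_details=None, language_bcp47=None):
    """Build card metadata, rejecting `language` codes missing from LANGUAGES."""
    unknown = [code for code in language if code not in LANGUAGES]
    if unknown:
        raise ValueError(f"codes not in languages.json excerpt: {unknown}")
    metadata = {"language": list(language)}
    if language_bcp47:
        metadata["language_bcp47"] = list(language_bcp47)
    if language_details:
        metadata["language_details"] = language_details
    return metadata


def to_front_matter(metadata):
    """Render the metadata as a YAML front-matter block (lists and plain strings only)."""
    lines = ["---"]
    for key, value in metadata.items():
        if isinstance(value, list):
            lines.append(f"{key}:")
            lines.extend(f"- {item}" for item in value)
        else:
            lines.append(f'{key}: "{value}"')
    lines.append("---")
    return "\n".join(lines)


# Glottolog codes are not valid `language` values, so they go in `language_details`.
card = build_card_metadata(
    language=["nru", "jya"],
    language_details="Glottolog: yong1288 (Yongning Na), japh1234 (Japhug)",
)
print(to_front_matter(card))
```

Passing a Glottolog code such as `yong1288` as a `language` value raises `ValueError` in this sketch, which mirrors the point of the review comment: only codes present in `languages.json` belong in `language`, and anything else goes in the free-text tag.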
OK, I removed the Glottolog codes and added only the ISO 639-3 ones. For the moment, the Glottolog codes appear in the corpus card description, the language details, and the subcorpora names.
Thanks.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Added names of less-studied languages (nru – Narua and jya – Japhug) for existing datasets.