Encoded Common Chinese Dialectal Ideographs List #84
Comments
use for U+45A2 䖢 https://www.unicode.org/L2/L2016/16385r-uax45-add-ideo.pdf
use for U+2028E 𠊎 https://m.sohu.com/a/151136591_164554
use for U+2CCB1 𬲱 http://www.aminoacid-jirong.com/news/9z2996qq9j3j3f9.html
use for U+2CCC7 𬳇 http://baoji.mofcom.gov.cn/article/dxsw/201410/20141000762978.shtml
use for U+2EA3B 𮨻 https://www.sohu.com/a/205854785_531962
report for U+310F1 𱃪 http://news.sina.com.cn/s/2005-03-22/03115425279s.shtml
use for U+310F1 𱃱
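Because each entry above pairs a `U+XXXX` label with a literal character, the two can be checked against each other mechanically. The following is a minimal sketch, not part of the original comment; the function name `check_pairs` and the sample string are illustrative assumptions.

```python
import re

# Matches a label such as "U+45A2 䖢": a hex code point, one space, one character.
PAIR = re.compile(r"U\+([0-9A-F]{4,6}) (\S)")

def check_pairs(text):
    """Return (label, character, matches) for every 'U+XXXX ch' pair in text."""
    results = []
    for hex_code, ch in PAIR.findall(text):
        # ord() gives the real code point of the character that follows the label.
        results.append((f"U+{hex_code}", ch, ord(ch) == int(hex_code, 16)))
    return results

# Hypothetical input: two entries copied from the comment, URLs omitted.
sample = "use for U+45A2 䖢 use for U+2028E 𠊎"
for label, ch, ok in check_pairs(sample):
    status = "ok" if ok else f"mismatch: character is U+{ord(ch):04X}"
    print(label, ch, status)
```

Running such a check over the whole comment would flag any entry whose label and character disagree, for instance when one code point is listed against two different characters.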
Some common Chinese dialectal ideographs that are not in G0 or G1 have already been included in IICore or UnihanCore. Two rules in GB/T 22484-2016 on naming metro and bus stations, A.3.2 and A.3.9, are relevant to this list.
Characters currently in the encoding process are shown below.
The following characters need to be encoded in the future.
I provide unstable codes for the sub-languages or sub-dialects of the tag
This list collects encoded ideographs used to write common words in various Chinese dialects; each character should at least be common and familiar to local speakers. The characters in this list are outside IICore and UnihanCore, so the list can be treated as a supplement to both. For example, “唦” is not a G0 character, and Dada Band (达达乐队) always uses it in the Wuhan dialect word “板唦”, but it is already included in UnihanCore, where it is marked as HMT, so there is no need to include it here. Please click here to see the explanation by Dada Band.
Some people substitute an incorrect character for the correct one because the correct one is hard to input, even though they know how to write it. Some of these incorrect characters have become increasingly popular.
The dialect column is based on ISO 639-3:2007, GB/T 2260-2007, ISO 3166-1:2020, ISO 3166-2:CN, ISO 3166-2:TW, ISO 3166-2:HK, ISO 3166-2:MO, ISO 3166-2:JP, ISO 3166-2:KR, ISO 3166-2:KP, ISO 3166-2:VN, ISO 3166-2:SG and ISO 3166-2:MY. A cell may hold more than one value where applicable.
T4-2F6B
T4-4D64, KP1-782B
T4-362C, K3-3545, KP1-8820
T3-4556, KP1-8845
T4-5E5A, KP1-88C7
T3-476C, K2-4475, KP1-55B4
T3-4D6D, J14-737C, K2-515B, KP1-65ED
T5-4B4C, K2-6E2D, KP1-8864
T4-5A49, K2-7067, KP1-8CD1
T5-5D59
T4-6122
UTC-00041
TE-2F3E
UTC-00678
UTC-00074
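Source references such as `T4-2F6B` or `KP1-782B` consist of a source identifier and a code separated by a hyphen. The sketch below splits one reference into those parts. The mapping from prefix to Unihan property name is an assumption drawn from Unihan naming conventions, not something stated in this issue, and `parse_source_ref` is a hypothetical helper.

```python
# Assumed mapping from source prefix to the Unihan property that holds it.
PROPERTY_BY_PREFIX = {
    "T": "kIRG_TSource",
    "J": "kIRG_JSource",
    "K": "kIRG_KSource",
    "KP": "kIRG_KPSource",
    "UTC": "kIRG_USource",
}

def parse_source_ref(ref):
    """Split a reference like 'KP1-782B' into source, code and assumed property."""
    # Drop a trailing comma and normalize the non-breaking hyphen (U+2011).
    ref = ref.strip().rstrip(",").replace("\u2011", "-")
    source, _, code = ref.partition("-")
    # Try longer prefixes first so that 'KP1' is not mistaken for a 'K' source.
    for prefix in sorted(PROPERTY_BY_PREFIX, key=len, reverse=True):
        if source.startswith(prefix):
            return {"source": source, "code": code,
                    "property": PROPERTY_BY_PREFIX[prefix]}
    return {"source": source, "code": code, "property": None}

for ref in ["T4-4D64,", "KP1-782B", "J14-737C", "UTC-00041"]:
    print(parse_source_ref(ref))
```

Unknown prefixes are returned with `property` set to `None` rather than raising, so a full column of references can be processed in one pass and the unrecognized ones reviewed afterwards.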