Dictionary contributions (词库贡献) #666

Open
iDvel opened this issue Feb 5, 2024 · 212 comments
Labels
dict (dictionary-related)

Comments

@iDvel
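Owner

iDvel commented Feb 5, 2024

The dictionary has already been checked by scripts and extensively proofread by hand, but some oversights are inevitable.
If you find missing words, wrong pronunciations, wrong characters, or an unreasonable initial ordering, feel free to open a PR directly or leave a comment here.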

@iDvel iDvel added the dict (dictionary-related) label Feb 5, 2024
@iDvel iDvel mentioned this issue Feb 5, 2024
@iDvel iDvel pinned this issue Feb 5, 2024
iDvel added a commit that referenced this issue Feb 5, 2024
iDvel added a commit that referenced this issue Feb 10, 2024

@tao659
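
tao659 commented Feb 13, 2024

The forms mainly in use are 蔄山 and 苘山, and both are read "man shan". Some people now also write 𬜬山, but that is rare; it is mostly the first two. Official documents generally use 蔄山, while informal writing mostly uses 苘山.

Perhaps only locals read 苘山 as "man shan".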

@iDvel
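Owner Author

iDvel commented Feb 13, 2024

Let's just include both 「𬜬山」 and 「蔄山」.
「苘」 (qing) is probably a miswriting; Baidu Baike cannot be trusted at all.

Chinese character standardization is a real mess in cases like this.
Analogical simplification turned 「蔄」 (man) into 「𬜬」 (man). More than ten years later, dictionaries still list 「𬜬」, while locals, including the local government, still use 「蔄」.
The standard was issued but nobody uses it, and it was never revised to follow local usage, so the result is mixed usage that nobody bothers to sort out.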

iDvel added a commit that referenced this issue Feb 13, 2024
iDvel added a commit that referenced this issue Feb 14, 2024
@boomker
Contributor

boomker commented Feb 17, 2024

I have collected some entries with wrong pronunciations and put them in the attachment:
rime-ice_zhuyin-err.txt

iDvel pushed a commit that referenced this issue Feb 18, 2024

@iDvel
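Owner Author

iDvel commented Feb 20, 2024

唵嘛呢叭咪 唵嘛呢嘛呢叭咪吽

唵 has no "ong" reading, and Chinese has no character with the syllable "ong". In many TV dramas I have heard it read as "an". Alternatively, treat it as a loan sound and annotate it as "wong" or similar, or put it directly in the English dictionary or the mixed Chinese-English dictionary.

(Annotating it as "ong" means that when the dictionary is compiled as a pack, the missing syllable triggers an error and the entry is dropped.)

Let's annotate 「唵嘛呢叭咪吽」 with the dictionary readings: an ma ni ba mi hong http://www.jiaodui.com/bbs/read.php?tid=10782
For now there is also a simple workaround: type 「六字真言」 or 「六字大明咒」 and pick the result from the emoji candidates.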
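
The dropped-entry problem described in this comment can be caught before deployment with a small script. The following is only a minimal sketch, not Rime's actual implementation: it assumes entries are tab-separated `word<TAB>pinyin[<TAB>weight]` lines, and the names `parse_entry`, `collect_syllabary`, and `find_invalid_entries` are hypothetical helpers invented for this illustration.

```python
from typing import Iterable, List, Set, Tuple


def parse_entry(line: str) -> Tuple[str, List[str]]:
    """Split a 'word<TAB>pinyin[<TAB>weight]' line into (word, syllables)."""
    fields = line.rstrip("\n").split("\t")
    word = fields[0]
    syllables = fields[1].split() if len(fields) > 1 else []
    return word, syllables


def collect_syllabary(base_lines: Iterable[str]) -> Set[str]:
    """Gather every syllable that appears in the base dictionary (assumed format)."""
    syllabary: Set[str] = set()
    for line in base_lines:
        _, syllables = parse_entry(line)
        syllabary.update(syllables)
    return syllabary


def find_invalid_entries(
    pack_lines: Iterable[str], syllabary: Set[str]
) -> List[Tuple[str, str]]:
    """Return (word, syllable) pairs whose syllable is absent from the syllabary."""
    invalid: List[Tuple[str, str]] = []
    for line in pack_lines:
        word, syllables = parse_entry(line)
        for syllable in syllables:
            if syllable not in syllabary:
                invalid.append((word, syllable))
    return invalid
```

With a base list containing the single-character readings an, ma, ni, ba, mi, and hong, an entry annotated `ong ma ni ba mi hong` is reported with the syllable `ong`, while the `an ma ni ba mi hong` annotation passes.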

iDvel added a commit that referenced this issue Feb 20, 2024

@iDvel
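Owner Author

iDvel commented Feb 23, 2024

Let's just leave it as it is. I tried it: the size is the same, and the speed does not seem to differ much either.
I also moved many polyphonic characters whose readings share one meaning, such as 「熟」 and 「血」, into the tencent dictionary and let Rime annotate them automatically.
When I add words day to day I also put them into tencent, since no pronunciation has to be written, which is a bit more convenient.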

@gaboolic
Contributor

#703

The dictionary contains many instances of “犭更犬”. Should they be changed to “㹴犬”?

luckmoon pushed a commit to luckmoon/rime-ice that referenced this issue Feb 28, 2024
@tansongchen
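
While trying to use rime-ice (雾凇拼音) to develop other input schemas, I found that in some phrases the annotated reading of a character is not among that character's own standalone readings: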

dropping entry '陈寅恪' with invalid syllable: que
dropping entry '放饭流歠' with invalid syllable: chu
dropping entry '解州' with invalid syllable: hai
dropping entry '解州关帝庙' with invalid syllable: hai
dropping entry '解州镇' with invalid syllable: hai
dropping entry '亠部' with invalid syllable: jiong
dropping entry '擖哧' with invalid syllable: ka
dropping entry '肋脦' with invalid syllable: de
dropping entry '艋舺' with invalid syllable: jia
dropping entry '将进酒' with invalid syllable: qiang
dropping entry '青玉案' with invalid syllable: wan
dropping entry '青玉案元夕' with invalid syllable: wan
dropping entry '通什镇' with invalid syllable: za
dropping entry '菶菶萋萋' with invalid syllable: yong
dropping entry '鲗鱼涌' with invalid syllable: ze
dropping entry '槁项黄馘' with invalid syllable: xu
dropping entry '黄馘槁项' with invalid syllable: xu
dropping entry '尨眉皓发' with invalid syllable: rong
dropping entry '泥而不滓' with invalid syllable: nie

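Does this need to be fixed, i.e. should every reading used in a phrase be guaranteed to also appear among the single-character readings?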
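
The consistency rule proposed here (every syllable in a phrase must be a known reading of the corresponding character) could be checked along the following lines. This is a hedged sketch under assumptions: both inputs are taken to be tab-separated `text<TAB>pinyin` lines, phrases are assumed to align one syllable per character, and `build_char_readings` and `find_unlisted_readings` are hypothetical names, not part of rime-ice or librime.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_char_readings(char_lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each single character to the set of readings listed for it."""
    readings: Dict[str, Set[str]] = defaultdict(set)
    for line in char_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and len(fields[0]) == 1:
            readings[fields[0]].add(fields[1].strip())
    return readings


def find_unlisted_readings(
    phrase_lines: Iterable[str], readings: Dict[str, Set[str]]
) -> List[Tuple[str, str, str]]:
    """Return (phrase, character, syllable) for readings missing from the table."""
    problems: List[Tuple[str, str, str]] = []
    for line in phrase_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # no pinyin column, nothing to compare
        word, syllables = fields[0], fields[1].split()
        if len(word) != len(syllables):
            continue  # cannot align syllables with characters one to one
        for ch, syllable in zip(word, syllables):
            if syllable not in readings.get(ch, set()):
                problems.append((word, ch, syllable))
    return problems
```

For example, if the character table lists only `ke` for 恪, a phrase entry 陈寅恪 annotated `chen yin que` is reported as (陈寅恪, 恪, que). Whether to fix the phrase or add the reading to the character table remains an editorial decision.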

@tansongchen

These tencent entries have no corresponding pinyin, so they cannot be annotated:

E20240301 15:00:13.292201 232383 entry_collector.cc:135] Encode failure: '李到𬀪'.
E20240301 15:00:14.678122 232383 entry_collector.cc:135] Encode failure: '薄护尾𬶏'.
E20240301 15:00:14.679606 232383 entry_collector.cc:135] Encode failure: '薄身罗马诺𬶋'.
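
One plausible reading of these failures, assumed here rather than confirmed by the log, is that each word contains a character with no reading in the character table, so nothing can be derived for it automatically. The sketch below lists such characters for words that carry no pinyin; `find_unencodable` is a hypothetical helper, and `known_chars` stands for whatever set of characters the character table actually covers.

```python
from typing import Dict, Iterable, List, Set


def find_unencodable(words: Iterable[str], known_chars: Set[str]) -> Dict[str, List[str]]:
    """Map each word to the characters that have no entry in the character table."""
    missing: Dict[str, List[str]] = {}
    for word in words:
        unknown = [ch for ch in word if ch not in known_chars]
        if unknown:
            missing[word] = unknown
    return missing
```

Running it over the pinyin-less entries before deployment would show which characters need a reading added to the table, or which words need an explicit annotation.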

@Lion176

Lion176 commented Oct 18, 2024

OpenAI model vocabulary:
GPT-4o
GPT-4o mini
GPT-4o with canvas
o1-preview
o1-mini

iDvel added a commit that referenced this issue Nov 4, 2024
ansonhex pushed a commit to ansonhex/rime-ice that referenced this issue Nov 5, 2024
ansonhex pushed a commit to ansonhex/rime-ice that referenced this issue Nov 5, 2024
ansonhex pushed a commit to ansonhex/rime-ice that referenced this issue Nov 5, 2024
ansonhex pushed a commit to ansonhex/rime-ice that referenced this issue Nov 5, 2024
iDvel added a commit that referenced this issue Nov 7, 2024
iDvel added a commit that referenced this issue Nov 14, 2024

@Nullizer
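
Surprisingly, the default configuration cannot type 「姛」. I looked it up, and this character is not even in a CJK extension block...
Maybe the default 8105+ configuration is not comprehensive enough, and all twenty thousand non-extension "CJK Unified Ideographs" should be added?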

@lightumcc
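
Surprisingly, the default configuration cannot type 「姛」. I looked it up, and this character is not even in a CJK extension block... Maybe the default 8105+ configuration is not comprehensive enough, and all twenty thousand non-extension "CJK Unified Ideographs" should be added?

You can try uU女同 to type it through the character-composition mode.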

@iDvel
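Owner Author

iDvel commented Nov 27, 2024

Surprisingly, the default configuration cannot type 「姛」. I looked it up, and this character is not even in a CJK extension block... Maybe the default 8105+ configuration is not comprehensive enough, and all twenty thousand non-extension "CJK Unified Ideographs" should be added?

99.99% of everyday usage falls within the 8105 set. Characters that really are used now and then, or that are included in 《现汉》, can be added, but twenty thousand characters is unnecessary and would only add annoyance when paging through candidates to find a character.
I don't think a character like 「姛」 is needed (not sure, this is the first time I have seen it). If you regularly need rare characters, you can enable the large character table directly in rime_ice.dict.yaml, and for the occasional rare character outside the table you can use the uU composition input.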
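
To check which Unicode block a character belongs to, as was done by hand for 「姛」 earlier in this exchange, a few lines of Python are enough. This is an illustrative sketch only: `cjk_block` is a hypothetical helper, and it distinguishes just the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), Extension A (U+3400 to U+4DBF), and the supplementary planes (U+20000 and above, treated here as a single group).

```python
def cjk_block(ch: str) -> str:
    """Classify a single character by its rough CJK ideograph block."""
    cp = ord(ch)
    if 0x4E00 <= cp <= 0x9FFF:
        return "CJK Unified Ideographs"
    if 0x3400 <= cp <= 0x4DBF:
        return "CJK Extension A"
    if cp >= 0x20000:
        return "supplementary plane"
    return "other"
```

Calling `cjk_block("姛")` shows whether the character sits in the basic block, which is a separate question from whether it belongs to the 8105 set that the default configuration covers.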
