ToDo: Dictionary cleanup plan #115

Open
maboloshi opened this issue Mar 15, 2024 · 8 comments
Labels
Important: issues or pull requests to be handled with priority

Comments

@maboloshi
Owner

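Hello, collaborators. Besides adapting new strings, we also need to clean up obsolete entries at the same time.
The dictionary file is currently 1.2 MB.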

@maboloshi maboloshi pinned this issue Mar 15, 2024
@buiawpkgew1
Contributor

It would be great if obsolete entries could be cleaned up automatically.

@TC999
Contributor

TC999 commented Jun 7, 2024

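Regular expressions can be used to merge some duplicate entries (the snippets below are taken from the dictionary file).

Example 1:

  • Original entries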
"1 column": "1 列",
"2 columns": "2 列",
"3 columns": "3 列",
"4 columns": "4 列",
"5 columns": "5 列",
  • Merged into one entry
[/(\d+) columns?/, "$1 列"],

Example 2:

  • Original entries
"Upload files": "上传文件",
"Upload file": "上传文件", // only present under an Android UA
  • Merged into one entry
[/Upload files?/, "上传文件"],
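
For illustration, here is a minimal sketch of how rules like the two merged entries above could be applied. The names `regexpRules` and `translateByRegexp` are hypothetical and not taken from the project. The `^` and `$` anchors are also an assumption, added so that a rule only fires when the whole string matches.

```javascript
// Hypothetical rule list in the [pattern, replacement] shape used above.
// The ^ and $ anchors are an assumption, not part of the original entries.
const regexpRules = [
  [/^(\d+) columns?$/, "$1 列"],
  [/^Upload files?$/, "上传文件"],
];

// Return the translation from the first matching rule, or null if none match.
function translateByRegexp(text, rules) {
  for (const [pattern, replacement] of rules) {
    if (pattern.test(text)) {
      // "$1" in the replacement is filled with the first capture group.
      return text.replace(pattern, replacement);
    }
  }
  return null;
}
```

With these rules, "1 column" and "5 columns" are both covered by a single entry, and "Upload file" and "Upload files" share one translation.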

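Some problems

  • Some page titles are not translated, or the translation only flashes by briefly
  • The ProTips at the bottom of the issues, pull requests, and notifications pages change on every refresh (which makes local testing difficult)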

@maboloshi
Owner Author

maboloshi commented Jun 7, 2024

Regular expressions are relatively expensive in system resources, so prefer the static dictionary.
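
One way to read "static dictionary first" is sketched below. It assumes a `Map` for exact matches and a short regex list used only as a fallback. The names `staticDict`, `fallbackRules`, and `translate` are hypothetical and do not describe the project's actual code.

```javascript
// Hypothetical static dictionary: exact-match lookup, no pattern matching.
const staticDict = new Map([
  ["Upload files", "上传文件"],
  ["1 column", "1 列"],
]);

// Hypothetical fallback rules, consulted only when the static lookup misses.
const fallbackRules = [[/^(\d+) columns?$/, "$1 列"]];

function translate(text, dict, rules) {
  // Cheap path: a single hash lookup.
  const hit = dict.get(text);
  if (hit !== undefined) return hit;

  // Expensive path: try each regex in turn.
  for (const [pattern, replacement] of rules) {
    if (pattern.test(text)) return text.replace(pattern, replacement);
  }

  // No translation available: leave the text unchanged.
  return text;
}
```

The static lookup costs the same regardless of dictionary size, while the fallback cost grows with the number of regex rules, which is one reason to keep that list short.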

@maboloshi
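Owner Author

"ProTips" are not a priority for me at the moment; I generally do not translate them.
"Page titles" are currently semi-abandoned. The approach may change later, with their entries merged into each page's dictionary.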

@maboloshi
Owner Author

I have recently been planning to split the individual pull request page out of the repository/pulls page rules so that it uses its own repository/pull page.
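
As a rough illustration of that split, the sketch below tells the two page types apart by URL path. It assumes GitHub's usual `/owner/repo/pulls` (list) and `/owner/repo/pull/<number>` (single pull request) paths. `pageRules` and `detectPageType` are hypothetical names, not the project's real rule table.

```javascript
// Hypothetical mapping from page-type key to a pathname pattern.
const pageRules = [
  // A single pull request, e.g. /owner/repo/pull/115 (and its sub-tabs).
  ["repository/pull", /^\/[^/]+\/[^/]+\/pull\/\d+/],
  // The pull request list, e.g. /owner/repo/pulls
  ["repository/pulls", /^\/[^/]+\/[^/]+\/pulls\/?$/],
];

// Return the first page-type key whose pattern matches, or null.
function detectPageType(pathname) {
  for (const [type, pattern] of pageRules) {
    if (pattern.test(pathname)) return type;
  }
  return null;
}
```

Keeping the two keys separate would let each page load only its own dictionary instead of sharing one set of entries.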

@buiawpkgew1
Contributor

What if we used a crawler to fetch GitHub pages and compare them against the translations in the dictionary?
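
The comparison step of that idea could look roughly like the sketch below. It assumes the crawler has already collected the visible strings from the pages it visited, and the fetching itself is not shown. `findUnusedEntries` and the sample data are hypothetical.

```javascript
// Return dictionary keys that never appeared in the crawled page strings.
// dict: object mapping source text to translation.
// seenStrings: array of strings collected by the (assumed) crawler.
function findUnusedEntries(dict, seenStrings) {
  const seen = new Set(seenStrings);
  return Object.keys(dict).filter((key) => !seen.has(key));
}

// Hypothetical sample data for illustration only.
const sampleDict = {
  "Upload files": "上传文件",
  "Old label": "旧标签",
};
const crawled = ["Upload files", "New issue"];

// Candidates for manual review, not for automatic deletion.
const candidates = findUnusedEntries(sampleDict, crawled);
```

A key missing from the crawl is only a candidate for removal, since a crawler may not reach every page, login state, or user agent (for example, the Android-only string mentioned earlier in the thread).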

4 participants
@maboloshi @buiawpkgew1 @TC999 and others