Japanese text being identified as Kurmanji #7

Open
ftkurt opened this issue Jun 7, 2020 · 3 comments

Comments

@ftkurt
Contributor

ftkurt commented Jun 7, 2020

It's probably because of this line.

– Kurmanci 7.2974490546991655

See the following texts:

୨୧譲渡交換୨୧ ツイステ 色紙コレクション vol.1 vol.2 譲┊︎デューストレイケイト ジャミルオルトシルバー 求┊︎同異種リドル or 定価(+送料) 郵送 or 都内手渡し可能 ⿻ 各1BOX予約済みです。 ⿻…

東映HP更新✨ 来週はガルザとクランチュラがジャメンタルを研究🔍録りおろしナレーションたっぷりでお届けします! そしてHPで #キラトーーク 延長戦!? 魔進の声を演じるキャストのテンションMAX!なコメントを掲載しております✨ #キラ…

「DXヒューマギアプログライズキーセット」はご予約受付中!シェスタ、腹筋崩壊太郎、マモル、一貫ニギローのデータを宿したプログライズキーのセットです✨ 別売りのDXなりきりシリーズとも連動します。 URL…

@DanielJDufour
Owner

Hi, @ftkurt. Thank you for identifying this issue! This package doesn't support Japanese yet, but it's easy to add a language. Would you like to submit a pull request? The documentation on how to add a language is here: https://github.com/DanielJDufour/language-detector/blob/master/CONTRIBUTING.md

@ftkurt
Contributor Author

ftkurt commented Jun 7, 2020

I briefly looked at the Japanese character sets, and it seems Japanese is a bit different from other languages because it has multiple sets. I would therefore prefer that someone knowledgeable about Japanese do that. However, I am currently collecting Sorani and Kurmanji datasets, and I might be able to add more data for those two Kurdish dialects in the coming days. I think this will help make the package more reliable.
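To illustrate the "multiple sets" point, here is a rough sketch of how Japanese text could be recognized from Unicode ranges alone. It is not how this package works and not part of its API: the names `JAPANESE_BLOCKS`, `count_japanese_scripts`, and `looks_japanese` are made up for illustration, and the kana-based heuristic is only an assumption about what a first pass might look like.

```python
# Hypothetical helper, NOT part of language-detector's API.
# Unicode blocks commonly used in Japanese text.
JAPANESE_BLOCKS = {
    "hiragana": (0x3040, 0x309F),
    "katakana": (0x30A0, 0x30FF),
    "kanji": (0x4E00, 0x9FFF),  # CJK Unified Ideographs, shared with Chinese
}


def count_japanese_scripts(text):
    """Return a dict mapping each block name to the number of characters in it."""
    counts = {name: 0 for name in JAPANESE_BLOCKS}
    for char in text:
        code_point = ord(char)
        for name, (start, end) in JAPANESE_BLOCKS.items():
            if start <= code_point <= end:
                counts[name] += 1
                break
    return counts


def looks_japanese(text):
    """Assumed heuristic: any kana suggests Japanese.

    Kanji alone is not enough, because the same block is used for Chinese.
    """
    counts = count_japanese_scripts(text)
    return counts["hiragana"] + counts["katakana"] > 0


if __name__ == "__main__":
    sample = "各1BOX予約済みです。"
    print(count_japanese_scripts(sample))
    print(looks_japanese(sample))
```

A check like this would only separate Japanese from non-CJK languages such as Kurmanji. Text written entirely in kanji would still need real training data to tell it apart from Chinese.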

@DanielJDufour
Owner

That would be great! Thank you!
