Wrong separation of "ne" syllable #2

Closed
caseyryan opened this issue Sep 25, 2022 · 2 comments
Labels
needs-native-speaker-input Input from a native speaker is needed to find a feasible solution for this issue

Comments

@caseyryan

Hi!
Great solution! There is still one problem, though:
a phrase like "wohenhaonine?" is split into "wo hen hao nine",
leaving the final "ne" unseparated.
I believe the problem is in this regex:
.replace(new RegExp(`([${vowels}])([^${vowels}nr])`, 'gi'), '$1 $2')
Sorry, I can't make a pull request because I'm not a JavaScript coder, but I'm porting your code to Dart and found this problem.
My proposed solution is to add one more match group, like this:
.replace(new RegExp(`([${vowels}])(([^${vowels}nr])|(ne))`, 'gi'), '$1 $2')

I'm not sure whether it breaks anything else, but in my Dart tests it works fine.
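For readers who want to compare the two patterns, here is a minimal sketch that runs only this one replace step in isolation. The `vowels` value is a simplified assumption (toneless vowels only), since the library's real vowel string is not quoted in this thread, and `splitStep` is a hypothetical helper. The library's other steps, which handle "n" and "r", are not reproduced, so the output differs from the full "wo hen hao nine" result.

```javascript
// Simplified vowel set for toneless input. This is an assumption: the
// library's actual `vowels` string is not shown in this thread.
const vowels = 'aeiouü';

// The step quoted above: a vowel followed by anything except a vowel, "n", or "r".
const current = new RegExp(`([${vowels}])([^${vowels}nr])`, 'gi');

// The proposed variant: also treat "ne" after a vowel as the start of a syllable.
const proposed = new RegExp(`([${vowels}])(([^${vowels}nr])|(ne))`, 'gi');

// Hypothetical helper that applies only this single replace step.
function splitStep(input, pattern) {
  return input.replace(pattern, '$1 $2');
}

console.log(splitStep('wohenhaonine', current));  // "wo henhaonine"
console.log(splitStep('wohenhaonine', proposed)); // "wo henhaoni ne"
```

In this isolated step, only the proposed pattern inserts a space before the final "ne".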

Cheers!

@Connum
Owner

Connum commented Sep 26, 2022

Thanks for opening this issue! The thing is, as the script doesn't understand any meaning or context, "nin" "e" would be a possible split as well as "ni" "ne", so it just skips it. You'd have to separate the input using a single quote (there's actually already a similar test case, "Wǒhěnhǎoxièxiènǐ'ne"). I'm not sure if adding this special treatment for "ne" is a good idea or if there are actual use cases for "nin e". Input from a native speaker would be welcome!

Connum added the needs-native-speaker-input label Sep 26, 2022
@caseyryan
Author

caseyryan commented Sep 26, 2022

Yes, that's a good point. Unfortunately I'm not a native speaker, but I've just checked how Android's pinyin keyboard handles it: typing "nine" gave me "ni'ne". I also tried typing something like "ninema" (I think native speakers would laugh at me at this point), and the Android keyboard gave me "ni'ne'ma".
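The "ninema" case can be tried against the proposed pattern in the same isolated way. The sketch below again assumes a simplified `vowels` set and ignores the library's other steps. It suggests that because the match consumes the whole "ne", the "e" cannot start the next match, so "ma" stays attached. The `lookahead` pattern is a hypothetical alternative, not something proposed in this thread or confirmed to be in the library.

```javascript
// Simplified, assumed vowel set (toneless input only).
const vowels = 'aeiouü';

// Proposed pattern from this issue: the match consumes the whole "ne".
const consuming = new RegExp(`([${vowels}])(([^${vowels}nr])|(ne))`, 'gi');

// Hypothetical alternative: a lookahead checks what follows without consuming
// it, so the "e" of "ne" can still start the next match.
const lookahead = new RegExp(`([${vowels}])(?=[^${vowels}nr]|ne)`, 'gi');

console.log('ninema'.replace(consuming, '$1 $2')); // "ni nema"
console.log('ninema'.replace(lookahead, '$1 '));   // "ni ne ma"
```

Whether splitting on "ne" is linguistically safe remains the open question here; the sketch only illustrates how the regex behaves.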

Connum added a commit that referenced this issue Oct 28, 2023
Connum closed this as completed Oct 28, 2023