
Drop 五段化 potential rule #2047

Merged
merged 2 commits into from
Oct 16, 2024



Contributor

@enellis enellis commented Oct 15, 2024

Follow-up to #2038.

I encountered duplicate results for inputs like 宣せぬ or 宣せず, which stemmed from the following rule:

['せる', 'する', Type.IchidanVerb, Type.SpecialSuruVerb, [Reason.Irregular, Reason.Potential]],

After rereading this resource from JMdict, I realized that this rule is incorrect. Verbs undergoing 五段化 have either し得る or できる as their potential forms, not the す-Godan potential form (せる).

By dropping this incorrect rule, the issue is resolved, and we can also revert the previous commit that was intended to prevent invalid sequencing of potential forms.
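
To illustrate how the dropped rule led to duplicates, here is a minimal sketch of a suffix-rewriting deinflector. It is not the project's implementation: the `Rule` shape, the two negative rules, and the reason labels are assumptions made for illustration, and the word-type checks carried by the real rule (`Type.IchidanVerb`, `Type.SpecialSuruVerb`) are left out.

```typescript
// Hypothetical, simplified rule shape (the real rules also carry word types).
type Rule = { from: string; to: string; reason: string };
type Candidate = { word: string; reasons: string[] };

// Assumed negative rules, for illustration only.
const baseRules: Rule[] = [
  { from: 'せぬ', to: 'する', reason: 'negative' }, // 宣せぬ → 宣する directly
  { from: 'ぬ', to: 'る', reason: 'negative' }, // Ichidan-style: 食べぬ → 食べる
];

// The rule removed by this PR, reduced to its suffixes.
const droppedRule: Rule = { from: 'せる', to: 'する', reason: 'potential' };

// Repeatedly rewrite matching suffixes, collecting every candidate on the way.
function deinflect(input: string, rules: Rule[]): Candidate[] {
  const candidates: Candidate[] = [{ word: input, reasons: [] }];
  for (let i = 0; i < candidates.length; i++) {
    const { word, reasons } = candidates[i];
    if (reasons.length >= 4) continue; // depth guard against runaway chains
    for (const rule of rules) {
      if (word.length > rule.from.length && word.endsWith(rule.from)) {
        const stem = word.slice(0, word.length - rule.from.length);
        candidates.push({ word: stem + rule.to, reasons: [...reasons, rule.reason] });
      }
    }
  }
  return candidates;
}

// Count how many derivation paths end at the given dictionary form.
function countHits(input: string, target: string, rules: Rule[]): number {
  return deinflect(input, rules).filter((c) => c.word === target).length;
}

console.log(countHits('宣せぬ', '宣する', [...baseRules, droppedRule])); // 2
console.log(countHits('宣せぬ', '宣する', baseRules)); // 1
```

With the extra rule, 宣せぬ reaches 宣する twice: once directly, and once through the intermediate 宣せる. Without it, only the direct path remains.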

Verbs that undergo 五段化 use either できる or し得る as their
potential form, rather than the す-Godan verb potential form (せる).

The rule caused duplicate results for inputs like 宣せぬ / 宣せず.
…tentialOrPassive and Causative"

This is no longer necessary after the removal of the 五段化
potential rule.

Reverts commit f2d6dfc.
Contributor Author

enellis commented Oct 15, 2024

Sorry for catching this a little bit too late!

I plan to add a rule for masu-stem + 得る/える/うる as a potential form. Or do you think it would be better to have a separate reason like -eru?

@birtles birtles merged commit cc44423 into birchill:main Oct 16, 2024
2 checks passed
Member

birtles commented Oct 16, 2024

> Sorry for catching this a little bit too late!

Not at all. Thank you for catching this!

> I plan to add a rule for masu-stem + 得る/える/うる as a potential form. Or do you think it would be better to have a separate reason like -eru?

I think I lean towards more explicit rules: letting "potential" mean the potential form students learn about in classrooms/textbooks, and having a separate annotation for -eru/-uru, even if the conjugation text is mostly the same.

(Also, since JMdict already has two entries for あり得る, I guess we'll end up with triplicate results when looking up あり得る after adding this new rule? Maybe that's unavoidable?)

@enellis enellis deleted the fix-potential branch October 16, 2024 09:51
Contributor Author

enellis commented Oct 16, 2024

> (Also, since JMdict already has two entries for あり得る, I guess we'll end up with triplicate results when looking up あり得る after adding this new rule? Maybe that's unavoidable?)

Yeah, I'm a bit concerned about 見える as well, but I think as long as it's sorted correctly, it should be fine and not too confusing. What do you think?
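
One way to read "sorted correctly" is that exact dictionary matches come before entries reached through deinflection. The sketch below shows that ordering under assumed names: the `Result` shape and the `-uru` reason label are hypothetical, and this is not the project's actual sorting logic.

```typescript
// Hypothetical lookup result: the headword plus the deinflection steps used.
type Result = { headword: string; reasons: string[] };

// Put results that needed fewer deinflection steps first, so an exact
// dictionary match precedes the "explanatory" deinflected entry.
function sortResults(results: Result[]): Result[] {
  return [...results].sort((a, b) => a.reasons.length - b.reasons.length);
}

const sorted = sortResults([
  { headword: 'ある', reasons: ['-uru'] }, // reached via masu-stem + 得る
  { headword: 'あり得る', reasons: [] }, // exact dictionary entry
]);

console.log(sorted.map((r) => r.headword)); // [ 'あり得る', 'ある' ]
```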

Member

birtles commented Oct 17, 2024

> Yeah, I'm a bit concerned about 見える as well, but I think as long as it's sorted correctly, it should be fine and not too confusing. What do you think?

I don't suppose there's any way to explicitly detect and filter out those cases? Alternatively we could just add the +得る rule and not add the +える・+うる rules for now?

Contributor Author

enellis commented Oct 21, 2024

> I don't suppose there's any way to explicitly detect and filter out those cases? Alternatively we could just add the +得る rule and not add the +える・+うる rules for now?

While I believe it would be fairly simple to filter those out, I’m hesitant to do so because it feels somewhat arbitrary to me. I actually quite like it when entries like あり得る are "explained" through deinflection, as long as the "explanatory" entry is placed afterward. I can see directly that あり得る is a form of ある.
However, in cases like 見える, it could be misleading, since the える in 見える isn’t related to 得る.

Just adding 得る and not える and うる would be the way to go then, I think.
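
As a rough illustration of why the kanji-only suffix is the safer choice, the sketch below strips a masu-stem suffix and proposes dictionary-form candidates. The function name, the stem table, and the candidate generation are assumptions for this example, not the rule format used in the repository; a real lookup would still need to check each candidate against the dictionary.

```typescript
// Hypothetical mapping from Godan masu-stem endings to dictionary endings.
const godanStemToDict: Record<string, string> = {
  'い': 'う', 'き': 'く', 'ぎ': 'ぐ', 'し': 'す', 'ち': 'つ',
  'に': 'ぬ', 'び': 'ぶ', 'み': 'む', 'り': 'る',
};

// Propose dictionary-form candidates for "masu-stem + suffix".
function deinflectEru(word: string, suffixes: string[]): string[] {
  const candidates: string[] = [];
  for (const suffix of suffixes) {
    if (word.length <= suffix.length || !word.endsWith(suffix)) continue;
    const stem = word.slice(0, word.length - suffix.length);
    candidates.push(stem + 'る'); // Ichidan reading of the stem
    const dictEnding = godanStemToDict[stem.slice(-1)];
    if (dictEnding !== undefined) {
      candidates.push(stem.slice(0, -1) + dictEnding); // Godan reading
    }
  }
  return candidates;
}

// With only 得る, あり得る yields ある (plus ありる, which is not a real word
// and would be discarded by the dictionary check).
console.log(deinflectEru('あり得る', ['得る'])); // [ 'ありる', 'ある' ]

// 見える is untouched by the 得る-only rule...
console.log(deinflectEru('見える', ['得る'])); // []

// ...but a kana える rule would wrongly relate it to 見る.
console.log(deinflectEru('見える', ['得る', 'える'])); // [ '見る' ]
```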

Member

birtles commented Oct 22, 2024

> While I believe it would be fairly simple to filter those out, I’m hesitant to do so because it feels somewhat arbitrary to me. I actually quite like it when entries like あり得る are "explained" through deinflection, as long as the "explanatory" entry is placed afterward. I can see directly that あり得る is a form of ある. However, in cases like 見える, it could be misleading, since the える in 見える isn’t related to 得る.
>
> Just adding 得る and not える and うる would be the way to go then, I think.

Sounds good. We can reinvestigate enabling the える・うる patterns later if it proves useful.


2 participants