Yomichan shouldn't prioritize exact match over frequency. #1669
Comments
You seem to be having two different issues here:
I would argue that 1 is the correct behaviour, because how do we know the user doesn't want to see 歩き instead of 歩く? 歩き has the additional noun meaning, which could be correct for the context. Compare with Jisho, which also doesn't list 歩く at the top. And while this may be a contrived example, a learner should also be able to intuit that 歩き is a form of 歩く from both the raw text and the definition. 2 is probably the same issue as #105, and you can improve this by decreasing the priority of the names dictionary.
Yeah, I just moved jmnedict to a separate profile so I didn't have to flip through stacks of names when looking for a word.
I was thinking of an option in the settings to prioritize the deinflected form over the inflection. I think it makes sense because in J-J dictionaries, 90% of the time they will ask us to refer to the base (deinflected) form. Another way to deal with this is to place the deinflected form right below the exact match, also controlled by a setting of course, since I believe it's more of a user preference.
I believe this should be handled by the freq information. For instance, 歩き has a freq of 2 while 歩く has a freq of 601. This freq information is taken from the provided jmdict dictionary. In most instances I believe it makes more sense to show the de-inflected form, but it is true that sometimes the conjugated form is far more frequent than the unconjugated one, e.g. 物思い vs 物思う. On another note, where does this freq info come from? I can't seem to find it in the jmdict file itself.
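To make the difference concrete, here is a minimal sketch of the two orderings. It is not Yomichan's code: the entry shape and the `exactMatch` and `score` field names are made up for illustration, and only the two freq values (2 and 601) come from the example above.

```javascript
// Hypothetical entries; field names are assumptions, not Yomichan's data model.
// The scores are the two freq values quoted above.
const entries = [
    {term: '歩き', exactMatch: true, score: 2},
    {term: '歩く', exactMatch: false, score: 601},
];

// Exact match first, score only as a tie-breaker (the behaviour reported in this issue).
function byExactMatchThenScore(a, b) {
    if (a.exactMatch !== b.exactMatch) { return a.exactMatch ? -1 : 1; }
    return b.score - a.score;
}

// Score first, regardless of whether the entry matched the raw text (the proposed behaviour).
function byScore(a, b) {
    return b.score - a.score;
}

// Copy before sorting so the original array is left untouched.
const exactFirst = [...entries].sort(byExactMatchThenScore).map((e) => e.term);
const scoreFirst = [...entries].sort(byScore).map((e) => e.term);

console.log(exactFirst.join(' > ')); // 歩き > 歩く
console.log(scoreFirst.join(' > ')); // 歩く > 歩き
```

With a pure score ordering, 物思い would still come out ahead of 物思う whenever its score is higher, which is the case I would want to keep.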
I already have my name dictionary on the lowest priority compared to my other dicts. That is why I believe Yomichan displays direct matches higher than deconjugated matches. In this example, all the names are considered direct matches since the looked-up text is written in kana, while 食べる needs to be de-conjugated and would be considered an indirect match. At least, that is my understanding of the behaviour.
This information isn't stored in the dictionaries that Yomichan imports, and I'm not sure it would be safe in the general case to assume what is and isn't an inflection.
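As a rough sketch of that understanding (this is not Yomichan's actual sorting code; the field names, the placeholder name entries, and the priority values are all assumptions for illustration):

```javascript
// Hypothetical results for a kana-only lookup. deinflectionSteps === 0 means the
// entry matched the looked-up text directly; a higher dictionaryPriority is preferred.
const lookupResults = [
    {term: '食べる', deinflectionSteps: 1, dictionaryPriority: 1},
    {term: '(name entry 1)', deinflectionSteps: 0, dictionaryPriority: 0},
    {term: '(name entry 2)', deinflectionSteps: 0, dictionaryPriority: 0},
];

// Direct matches first; dictionary priority only breaks ties between
// entries that needed the same number of deinflection steps.
function compareDirectMatchFirst(a, b) {
    if (a.deinflectionSteps !== b.deinflectionSteps) {
        return a.deinflectionSteps - b.deinflectionSteps;
    }
    return b.dictionaryPriority - a.dictionaryPriority;
}

// Dictionary priority first; deinflection steps only break ties.
function comparePriorityFirst(a, b) {
    if (a.dictionaryPriority !== b.dictionaryPriority) {
        return b.dictionaryPriority - a.dictionaryPriority;
    }
    return a.deinflectionSteps - b.deinflectionSteps;
}

const directFirst = [...lookupResults].sort(compareDirectMatchFirst).map((e) => e.term);
const priorityFirst = [...lookupResults].sort(comparePriorityFirst).map((e) => e.term);

console.log(directFirst[0]);   // (name entry 1)
console.log(priorityFirst[0]); // 食べる
```

If the ordering works like the first comparator, lowering the names dictionary's priority cannot help, because the priority is never consulted once the names count as direct matches and 食べる does not.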
To clarify: by "freq" do you mean the score for a definition, the green frequency tags, or something else?
https://github.com/FooSoft/yomichan-import/blob/83e3e44f46e344bfe66d9c7181caa5b113f8fb2a/edict.go#L160
Yeah, I see what you mean now; this issue affects kana-only searches more so than kanji definitions. There is also some discussion in #1539 about updating how dictionary priority is handled internally, and this may fall into that category as well. For reference, the current code for sorting dictionary entries is in yomichan/ext/js/language/translator.js, lines 1186 to 1228 in e7d349c.
Here are some examples. I've excluded the more extreme ones, which would result in ridiculously long images: