Speller: words containing the dictionary separator are not handled properly #86

Closed
jaumeortola opened this issue Nov 29, 2016 · 3 comments

@jaumeortola
Contributor

We found this bug in LanguageTool using the British English dictionary. See: languagetool-org/languagetool#619

The dictionary has this structure:
<word form><separator><byte containing frequency information A..Z>
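To make the layout concrete, here is a small sketch. It assumes the separator is "_" (as the "eta_I" example below suggests) and that the frequency is a single letter A..Z. The FrequencyEntry class and its methods are made up for illustration and are not part of morfologik-stemming.

```java
/** Hypothetical helper illustrating the entry layout; not library code. */
class FrequencyEntry {
    final String wordForm;
    final char frequency; // assumed to be a single letter 'A'..'Z'

    FrequencyEntry(String wordForm, char frequency) {
        this.wordForm = wordForm;
        this.frequency = frequency;
    }

    /** Builds <word form><separator><frequency>. */
    static String encode(String wordForm, char separator, char frequency) {
        return wordForm + separator + frequency;
    }

    /** Splits an entry at its last separator, which must precede the final character. */
    static FrequencyEntry decode(String entry, char separator) {
        int at = entry.lastIndexOf(separator);
        if (at < 0 || at != entry.length() - 2) {
            throw new IllegalArgumentException("Not a <word><separator><frequency> entry: " + entry);
        }
        return new FrequencyEntry(entry.substring(0, at), entry.charAt(at + 1));
    }
}
```

Under these assumptions, a looked-up word such as "eta_I" has the same shape as the encoded entry for "eta" with frequency "I", which may be why such input is awkward for the speller.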

When a word like "eta_I" is looked up in the speller, the speller stops working for all subsequent words. I have written a test here.

The problem is clearly in the method isInDictionary(). Once containsSeparators is set to false, it is never reset to true, so isInDictionary() returns false for every subsequent word.
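The symptom can be reproduced with a toy model. This is only a sketch of the pattern (a flag kept as instance state and never restored), not the actual morfologik-stemming code: the StickyFlagSpeller class, the Set-backed word list, and the exact condition that flips the flag are all assumptions.

```java
import java.util.Set;

/** Toy model of the symptom described above; NOT the morfologik-stemming source. */
class StickyFlagSpeller {
    private final Set<String> words;   // stand-in for the real dictionary
    private final char separator;

    // Instance state that survives between lookups (assumed shape of the bug).
    private boolean containsSeparators = true;

    StickyFlagSpeller(Set<String> words, char separator) {
        this.words = words;
        this.separator = separator;
    }

    boolean isInDictionary(String word) {
        if (word.indexOf(separator) >= 0) {
            // Flipped once for a word such as "eta_I" and never set back to true.
            containsSeparators = false;
        }
        return containsSeparators && words.contains(word);
    }
}
```

With a word list containing "beta" and "_" as the separator, looking up "beta" succeeds at first, but after a single lookup of "eta_I" the same "beta" lookup fails, because the flag stays false.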

An obvious solution is to check whether the original word contains the separator and, if so, return false from isInDictionary() before searching at all, because such a word cannot be in the dictionary.
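A minimal sketch of that guard, under the assumption that the dictionary can be modelled as a plain set of word forms. GuardedSpeller is a hypothetical class for illustration, not the fix that was eventually committed.

```java
import java.util.Set;

/** Hypothetical illustration of the early-return guard; not the library's code. */
class GuardedSpeller {
    private final Set<String> words;   // stand-in for the real dictionary
    private final char separator;

    GuardedSpeller(Set<String> words, char separator) {
        this.words = words;
        this.separator = separator;
    }

    boolean isInDictionary(String word) {
        // A word form can never contain the separator, so skip the lookup entirely.
        if (word.indexOf(separator) >= 0) {
            return false;
        }
        return words.contains(word);
    }
}
```

Because the guard keeps no state between calls, a lookup of "eta_I" returns false without affecting later lookups of ordinary words.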

Anyway, I don't understand the logic of the containsSeparators variable, which should be reset to true somewhere. @milekpl

@jaumeortola
Contributor Author

The bug doesn't happen with other separator characters, like "+". So perhaps the issue is related to #85.

jaumeortola added a commit to jaumeortola/morfologik-stemming that referenced this issue Apr 23, 2017
@dweiss dweiss closed this as completed in 5eff208 Apr 24, 2017
@dweiss dweiss changed the title bug in speller: words containing the dictionary separator Speller: words containing the dictionary separator are not handled properly Apr 24, 2017
@dweiss
Member

dweiss commented Apr 24, 2017

Thanks Jaume! Is there anything else coming or do you want me to publish a point release?

@dweiss dweiss added this to the 2.1.3 milestone Apr 24, 2017
@dweiss dweiss added the bug label Apr 24, 2017
@dweiss
Member

dweiss commented Apr 24, 2017

Went ahead and released 2.1.3, all tests passed.
