Searching for "digital" finds "digit" #2484
Comments
Ah, this tweet: https://twitter.com/jonintweet/status/637223429175947264 Hmm. We worked a bit on search relevance in #1928. Maybe the comment from @tercer at #1928 (comment) should be copied to this issue since that was for Beta 15 (but still open in QA).
Easy to reproduce at https://dataverse.harvard.edu/dataverse/andrewleigh/?q=digital (running 5.11.1).
FWIW, there is phrase support: searching for "digital code" only returns the first hit above. However, stemming is enabled, so the various forms of each word (digital/digit, code/codes) are matched, even inside a phrase.
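To make the stemming behavior concrete, here is a minimal sketch of why "digital" and "digit" collide, and why a phrase query still matches word variants. It uses a toy suffix stripper with a made-up rule set (`toy_stem`, `SUFFIXES`, and `phrase_match` are hypothetical names for illustration), not the stemmer that the search index actually uses.

```python
# Toy illustration only: a hypothetical two-rule suffix stripper,
# NOT the stemmer configured in the real search index.
SUFFIXES = ("al", "s")


def toy_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text: str) -> list[str]:
    """Split on whitespace and stem every token."""
    return [toy_stem(token) for token in text.split()]


def phrase_match(phrase: str, text: str) -> bool:
    """True if the stemmed phrase occurs as consecutive stemmed tokens in text."""
    p, t = stems(phrase), stems(text)
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))
```

With these rules, `toy_stem("digital")` and `toy_stem("digit")` both reduce to the same token, so a query for one finds the other. `phrase_match("digital code", "Digit codes for replication")` is true because the comparison happens after stemming, while word order still matters.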
Yes, stemming. Exactly. What we should probably do is first write some tests that make assertions about how our search works. Then try to fix this issue and hopefully not cause other problems. 😄 If anyone wants to pick this up, here is where we keep the search tests: https://github.com/IQSS/dataverse/blob/v5.12/src/test/java/edu/harvard/iq/dataverse/api/SearchIT.java
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'. If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.
Pointed out by someone on Twitter:
"try search "digital" and you find files labelled "digit". add phrase support."
In production, I searched for digital and narrowed down to files. Digital did show up in the description of some files (on page 66, I was logged in) but not in the name of the file as far as I could see in my quick look. The first page of results (and many after) had Digit in the title but not Digital. We should look at the relevancy for searches; it would make sense that if the exact term showed up anywhere in the metadata of an object, that object would be one of the first results displayed.
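The expectation above (exact-term hits ranked ahead of stem-only hits) could be written down as a check before anyone changes the relevancy settings. The sketch below is only an assumption about how such a check might look: result items are modeled as plain dicts with `name` and `description` keys, and both function names are hypothetical. It does not call any real search endpoint.

```python
import re


def contains_exact(term: str, item: dict) -> bool:
    """True if the exact term appears as a whole word in name or description."""
    # The 'name' and 'description' keys are assumed for illustration.
    text = f"{item.get('name', '')} {item.get('description', '')}"
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def exact_matches_first(term: str, items: list[dict]) -> bool:
    """True if no stem-only hit is ranked above an exact-term hit."""
    flags = [contains_exact(term, item) for item in items]
    # All True values (exact hits) must come before any False value.
    return flags == sorted(flags, reverse=True)


# Made-up result lists: the first mimics the reported ordering,
# the second the ordering the issue asks for.
reported = [
    {"name": "Digit span.tab"},
    {"name": "survey.tab", "description": "Digital version of the survey"},
]
desired = list(reversed(reported))
```

Here `exact_matches_first("digital", reported)` is false and `exact_matches_first("digital", desired)` is true, so the same helper could back an assertion in a search test once real results are fed into it.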