Normalize the search term #1239
Conversation
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
Codecov Report

@@            Coverage Diff            @@
##             master    #1239   +/-   ##
==========================================
  Coverage     69.07%   69.08%
  Complexity     1638     1638
==========================================
  Files            32       32
  Lines          4016     4017     +1
==========================================
+ Hits           2774     2775     +1
  Misses         1242     1242

Continue to review full report at Codecov.
I tested this and it works as it should. The only issue is that highlighting the matching parts (with bold text) in the autocomplete box does not work when the input string is decomposed (composed on the left, decomposed on the right). I tried to fix this by adding a normalization call in Skosmos/resource/js/docready.js (line 804 in 30ce5d0), but it didn't help; I think it would require changes within typeahead.js, as it seems to read the search string directly from the text field, so there is no easy opportunity to normalize it. Anyway, this is no big deal; there are other similar problems with missing highlights (e.g. in the case of accent folding), and these should be pretty rare cases anyway. I think it's reasonable to place the normalization call in ConceptSearchParameters.getSearchTerm(), as that method also performs other types of search term normalization, such as stripping whitespace.
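As a rough illustration (a hypothetical sketch, not the actual Skosmos or typeahead.js code), the kind of normalization call discussed here could look like this in JavaScript, assuming the query could be intercepted before typeahead.js uses it for matching:

```javascript
// Hypothetical helper (not the real docready.js code): normalize a
// user-typed query to NFC so that decomposed input compares equal to
// the composed (NFC) strings usually found in RDF data.
function normalizeQuery(query) {
  // String.prototype.normalize('NFC') composes base characters with
  // their combining marks, e.g. 'a' + U+0308 becomes U+00E4 'ä'.
  return query.normalize('NFC');
}

// A decomposed "ä" pasted from another system now matches NFC data:
const decomposed = 'la\u0308nsi'; // "länsi" with NFD 'ä'
const composed = 'l\u00e4nsi';    // "länsi" with NFC 'ä'
console.log(normalizeQuery(decomposed) === composed); // true
```

The catch described above is that typeahead.js reads the raw text-field value itself, so there is no convenient hook to apply such a function to the string it uses for highlighting.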
I think #1182 also updates typeahead. I can try to test it with the latest version and comment there if we have a follow-up issue for that, @osma 👍 (I will see if I have some text with surrogate characters, or just manually edit a dataset and upload it to my test Fuseki.)
You don't need very special data for this, just text with non-ASCII characters (e.g. åäöéñ) that is stored as composed Unicode characters (NFC), which is the usual case for RDF data. The problem arises when the user enters a search string that contains decomposed (NFD) characters, typically copied and pasted from some other system (in our case an Aleph ILS). This comment in the original issue shows how to trigger it with Linux command line tools and the KANTO/FINAF data set in Finto.
Thanks for spotting this defect, and thanks for the valuable input. :) I'll make a new issue for the autocomplete highlighting bug and call this one an incremental improvement on the situation (and I'll link this discussion there).
Fixes #1184