Refactor language search autocomplete and add inner word matching #8160

sbwhitt · 2023-08-04T16:21:51Z

Refactors the existing autocomplete_languages util function to accept a limit parameter and to prefix-match on the inner words of language names. For example, '/languages/_autocomplete?q=greek' will match 'Ancient Greek', 'Modern Greek', and 'Greek', rather than just 'Greek' as it does now.
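
To illustrate the inner-word matching described above, here is a minimal sketch. The helper name starts_any_word and the sample names are assumptions made for this example; this is not the code in the PR.

```python
def starts_any_word(prefix: str, text: str) -> bool:
    """Return True if any whitespace-separated word in text starts with prefix.

    Hypothetical helper for illustration; comparison is case-insensitive.
    """
    prefix = prefix.lower()
    return any(word.startswith(prefix) for word in text.lower().split())


# Sample names chosen for illustration only.
names = ["Greek", "Ancient Greek", "Modern Greek", "Greenlandic"]
matches = [name for name in names if starts_any_word("greek", name)]
print(matches)
```

With matching only on the start of the full name, 'Ancient Greek' and 'Modern Greek' would be skipped; checking each word picks them up while still excluding 'Greenlandic'.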

Technical

Consolidates the existing match checks into a more generic matching function that accepts an optional translation id.

A limit parameter was also added, which controls how many results are added to the returned iterator. Previously, autocomplete_languages returned every match, and the result was then truncated to a default limit of 5 when called in autocomplete.py. This default limit is preserved, but it is now possible to request more than 5 language matches.
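
One way to picture the limit behaviour is to cap a lazy generator instead of truncating a full list afterwards. The sketch below assumes hypothetical names (iter_matches, first_matches, SAMPLE_NAMES) and is not the actual autocomplete_languages implementation.

```python
from collections.abc import Iterator
from itertools import islice

# Sample data for illustration only.
SAMPLE_NAMES = [
    "Greek", "Ancient Greek", "Modern Greek", "Greenlandic",
    "Guarani", "Gujarati", "German", "French",
]


def iter_matches(prefix: str, names: list[str]) -> Iterator[str]:
    """Lazily yield names in which some word starts with prefix."""
    prefix = prefix.lower()
    for name in names:
        if any(word.startswith(prefix) for word in name.lower().split()):
            yield name


def first_matches(prefix: str, names: list[str], limit: int = 5) -> Iterator[str]:
    """Yield at most `limit` matches; the default of 5 mirrors the PR text."""
    return islice(iter_matches(prefix, names), limit)


print(list(first_matches("greek", SAMPLE_NAMES, limit=2)))
```

Because the generator is consumed lazily, matching stops as soon as the limit is reached, and a caller can pass a larger limit to see more results.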

Additionally, a small change was made to the supporting get_languages function which adds an optional limit parameter that defaults to the previous hard-coded value of 1000. This should clarify that a limit is being used within that function without changing any existing behavior.

Testing

Go to /languages/_autocomplete and search with queries like ?q=greek&limit=10. Verify that every returned language name contains at least one word that starts with the given prefix (q=greek). Also verify that the limit param accurately controls how many languages are returned.
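
The manual check above could be scripted roughly as follows. The function names are hypothetical, and the sketch deliberately takes a plain list of returned names, since the response format of the endpoint is not described here.

```python
from urllib.parse import urlencode


def autocomplete_path(query: str, limit: int) -> str:
    """Build the request path used in the testing steps."""
    return "/languages/_autocomplete?" + urlencode({"q": query, "limit": limit})


def check_results(names: list[str], prefix: str, limit: int) -> bool:
    """Return True if there are at most `limit` names and each has a word starting with prefix.

    `names` is assumed to be extracted from the response by the caller.
    """
    prefix = prefix.lower()
    within_limit = len(names) <= limit
    all_match = all(
        any(word.startswith(prefix) for word in name.lower().split())
        for name in names
    )
    return within_limit and all_match


print(autocomplete_path("greek", 10))
```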

Screenshot

Stakeholders

@cdrini @tfmorris

openlibrary/plugins/upstream/utils.py

cdrini

General logic lgtm! A few code reorg suggestions.

openlibrary/plugins/upstream/utils.py

for more information, see https://pre-commit.ci

openlibrary/plugins/upstream/utils.py

cclauss

language is a dict...

openlibrary/plugins/upstream/utils.py

Co-authored-by: Christian Clauss <cclauss@me.com>

hornc · 2024-05-31T05:39:53Z

@cdrini I think this is good to merge now. I thought there was a problem with case matching, but I think that was just some form of response caching when I tested it locally. Testing this with the full language data in the staging env would be good.

cdrini

Ok lgtm! Team effort on this one; thank you both, @sbwhitt and @hornc! Thanks for your patience. I was having trouble determining the refactor needed here, but I think I got it.

I did a refactor to remove the _matches_lang_name method, since that was making this rather difficult for me to follow. The method was introducing a confusing bit of misdirection in the logic. I instead introduced a more general method, word_prefix_match, and a more specific method, get_names_to_try, for getting the various language names to try.

Appears to be working like a charm!
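
For readers following along, the split described in this comment might look something like the sketch below. Only the names word_prefix_match and get_names_to_try come from the comment; their signatures, their bodies, and the layout of the language dict are assumptions made for illustration and may differ from the merged code.

```python
from collections.abc import Iterator
from typing import Optional


def word_prefix_match(prefix: str, text: str) -> bool:
    """General check: does any word in text start with prefix (case-insensitive)?"""
    prefix = prefix.lower()
    return any(word.startswith(prefix) for word in text.lower().split())


def get_names_to_try(lang: dict, translation_id: Optional[str] = None) -> Iterator[str]:
    """Yield candidate names for a language.

    Assumed dict layout: {"name": ..., "translations": {id: name}}.
    """
    translations = lang.get("translations", {})
    if translation_id is not None and translation_id in translations:
        yield translations[translation_id]
    yield lang["name"]


def lang_matches(prefix: str, lang: dict, translation_id: Optional[str] = None) -> bool:
    """Hypothetical glue: a language matches if any candidate name matches."""
    return any(
        word_prefix_match(prefix, name)
        for name in get_names_to_try(lang, translation_id)
    )


# Sample record for illustration only.
example = {"name": "Modern Greek", "translations": {"de": "Neugriechisch"}}
print(lang_matches("greek", example))
```

Keeping the string check separate from the choice of names means the matcher knows nothing about languages or translations, which is the readability gain the comment describes.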

cclauss reviewed Aug 6, 2023

sbwhitt marked this pull request as ready for review August 8, 2023 16:33

sbwhitt changed the title ~~WIP: Refactor language search autocomplete and add inner word matching~~ Refactor language search autocomplete and add inner word matching Aug 8, 2023

mekarpeles assigned cdrini Aug 14, 2023

tfmorris reviewed Aug 24, 2023

tfmorris approved these changes Aug 24, 2023

cdrini added the Priority: 2 Important, as time permits. [managed] label Aug 28, 2023

cdrini requested changes Oct 3, 2023

sbwhitt and others added 5 commits October 3, 2023 14:52

refactor autocomplete and add inner word matching

a4d753c

add missing return statement

604de4e

[pre-commit.ci] auto fixes from pre-commit.com hooks

c167d13

for more information, see https://pre-commit.ci

update param name

20985ef

add suggested edits

604e926

sbwhitt force-pushed the feat/language-autocomplete branch from 44879a2 to 604e926 Compare October 3, 2023 18:52

sbwhitt commented Nov 27, 2023

cclauss suggested changes Nov 27, 2023

hornc and others added 4 commits May 31, 2024 16:30

language is a dict

34f002e

Co-authored-by: Christian Clauss <cclauss@me.com>

language is a dict

63509b4

Co-authored-by: Christian Clauss <cclauss@me.com>

fix typo in string quotes

0acca4b

fix. github IDE seems to encourage these typos

cae08a7

hornc mentioned this pull request Aug 12, 2024

Recent (non-MARC) imports are adding deprecated language codes (presumably via language name lookups, not just old codes in the import data) #9504

Open

cdrini force-pushed the feat/language-autocomplete branch 4 times, most recently from 381d58a to 84f4649 Compare August 19, 2024 15:00

Refactor autocomplete_languages to remove _matches_lang_name method

5d6b799

cdrini force-pushed the feat/language-autocomplete branch from 84f4649 to 5d6b799 Compare August 19, 2024 15:03

cdrini approved these changes Aug 19, 2024

cdrini merged commit e22c683 into internetarchive:master Aug 19, 2024
4 checks passed

cdrini mentioned this pull request Aug 19, 2024

Greek and Modern Greek are both included on edit form. #8145

Closed
