Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor language search autocomplete and add inner word matching #8160

Merged
merged 10 commits into from
Aug 19, 2024

Conversation

sbwhitt
Copy link
Collaborator

@sbwhitt sbwhitt commented Aug 4, 2023

Closes #8146

Refactor the current autocomplete_languages util function to include a limit parameter as well as add prefix matching of inner language name words. For example, '/languages/_autocomplete?q=greek' will match with 'Ancient Greek', 'Modern Greek', and 'Greek' rather than just 'Greek' as it does now.

Technical

Consolidates existing match checks into a more generic matching function that allows for optional translation id.

A limit parameter was also added which controls how many results will be added to the returned iterator. Previously autocomplete_languages would return the total number of matches then was truncated down to a default limit of 5 when called in autocomplete.py. This default limit is preserved, but now it is possible to request more than 5 language matches.

Additionally, a small change was made to the supporting get_languages function which adds an optional limit parameter that defaults to the previous hard-coded value of 1000. This should clarify that a limit is being used within that function without changing any existing behavior.

Testing

Go to /languages/_autocomplete and attempt to search with queries like ?q=greek&limit=10. Verify that at least one inner word of the returned language names match the prefix given (q=greek). Also verify that the limit param accurately controls how many languages are returned.

Screenshot

Stakeholders

@cdrini @tfmorris

@sbwhitt sbwhitt marked this pull request as ready for review August 8, 2023 16:33
@sbwhitt sbwhitt changed the title WIP: Refactor language search autocomplete and add inner word matching Refactor language search autocomplete and add inner word matching Aug 8, 2023
openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
@cdrini cdrini added the Priority: 2 Important, as time permits. [managed] label Aug 28, 2023
Copy link
Collaborator

@cdrini cdrini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General logic lgtm! A few code reorg suggestions.

openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
@sbwhitt sbwhitt force-pushed the feat/language-autocomplete branch from 44879a2 to 604e926 Compare October 3, 2023 18:52
Copy link
Contributor

@cclauss cclauss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

language is a dict...

openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
openlibrary/plugins/upstream/utils.py Outdated Show resolved Hide resolved
hornc and others added 4 commits May 31, 2024 16:30
Co-authored-by: Christian Clauss <cclauss@me.com>
Co-authored-by: Christian Clauss <cclauss@me.com>
@hornc
Copy link
Collaborator

hornc commented May 31, 2024

@cdrini I think this is good to merge now. I thought there was a problem with case matching, but I think that was just some form of response caching when I tested it locally. Testing this with the full language data in the staging env would be good.

@cdrini cdrini force-pushed the feat/language-autocomplete branch from 84f4649 to 5d6b799 Compare August 19, 2024 15:03
Copy link
Collaborator

@cdrini cdrini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok lgtm! Team effort on this one thank you both @sbwhitt and @hornc ! Thanks for your patience, I was having trouble determining the refactor needed here, but I think I got it.

I did a refactor to remove the _matches_lang_name method since that was making this rather difficult for me to follow. The method was introducing a confusing bit of misdirection in the logic. I instead introduced a more general method, word_prefix_match and a more specific method, get_names_to_try, for getting the various language names to try.

Appears to be working like a charm!

image

@cdrini cdrini merged commit e22c683 into internetarchive:master Aug 19, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: 2 Important, as time permits. [managed]
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expand language autocomplete to also prefix match inner words
5 participants