[HOLD for payment 2024-06-11] [$500] Auto complete emojis does not pop up if a letter with tilde in spanish #38549

Closed
1 of 6 tasks
m-natarajan opened this issue Mar 19, 2024 · 28 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor

Comments

@m-natarajan

m-natarajan commented Mar 19, 2024

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Version Number: 1.4.54-0
Reproducible in staging?: y
Reproducible in production?: y
If this was caught during regression testing, add the test name, ID and link from TestRail:
Email or phone of affected tester (no customers):
Logs: https://stackoverflow.com/c/expensify/questions/4856
Expensify/Expensify Issue URL:
Issue reported by: @iwiznia
Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1710785038489889

Action Performed:

  1. Go to staging.expensify.com and sign in
  2. Switch language to Spanish
  3. Open any chat
  4. Type any emoji name that contains a letter with a tilde (accent mark), such as :corazón:, in the composer, or search for that emoji in the emoji picker

Expected Result:

The emoji should autocomplete in the composer field, or appear in the emoji picker results.

Actual Result:

The composer does not autocomplete and the emoji picker shows no results. o and ó are not treated as the same letter.

Workaround:

unknown

Platforms:

Which of our officially supported platforms is this issue occurring on?

  • Android: Native
  • Android: mWeb Chrome
  • iOS: Native
  • iOS: mWeb Safari
  • MacOS: Chrome / Safari
  • MacOS: Desktop

Screenshots/Videos

Add any screenshot/video evidence
Screen Shot 2024-03-18 at 8 02 16 PM

Recording.2866.mp4


Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~01191fbf64d1245a9f
  • Upwork Job ID: 1771230550663888896
  • Last Price Increase: 2024-03-22
  • Automatic offers:
    • tienifr | Contributor | 0
Current Issue Owner: @johncschuster
@m-natarajan m-natarajan added Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels Mar 19, 2024

melvin-bot bot commented Mar 19, 2024

Triggered auto assignment to @johncschuster (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details.

@Krishna2323
Contributor

Krishna2323 commented Mar 19, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Auto complete emojis does not pop up if a letter with tilde in spanish

What is the root cause of that problem?

The EMOJI_SUGGESTIONS regex doesn't match accented characters.

App/src/CONST.ts

Line 1573 in 22bd9eb

EMOJI_SUGGESTIONS: /:[a-zA-Z0-9_+-]{1,40}$/,

What changes do you think we should make in order to solve the problem?

We need to modify the regex so that it also matches accented characters.

EMOJI_SUGGESTIONS: /:[\p{L}0-9_+-]{1,40}$/u,

We might want to do this only in specific places. In that case, we can create a new const for /:[\p{L}0-9_+-]{1,40}$/u.
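To illustrate the difference between the two patterns, here is a minimal sketch. The sample strings and variable names are made up for the example. It also shows one caveat worth checking: as far as I can tell, \p{L} does not cover combining marks, so a decomposed "ó" (o followed by U+0301) would only match after NFC normalization.

```typescript
// Pattern quoted above: only ASCII letters, digits, "_", "+" and "-".
const asciiPattern = /:[a-zA-Z0-9_+-]{1,40}$/;
// Proposed pattern: \p{L} matches any Unicode letter and needs the "u" flag.
const unicodePattern = /:[\p{L}0-9_+-]{1,40}$/u;

// "corazón" written with the precomposed character U+00F3.
const composed = 'Te quiero :coraz\u00f3n';
// The same word in decomposed form: "o" followed by the combining acute accent U+0301.
const decomposed = 'Te quiero :corazo\u0301n';

const asciiResult = asciiPattern.test(composed);
const unicodeResult = unicodePattern.test(composed);
// The combining mark is not a letter, so the decomposed form is rejected...
const decomposedResult = unicodePattern.test(decomposed);
// ...until the input is recomposed with NFC.
const recomposedResult = unicodePattern.test(decomposed.normalize('NFC'));

console.log({asciiResult, unicodeResult, decomposedResult, recomposedResult});
```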

Result

@johncschuster johncschuster added the External Added to denote the issue can be worked on by a contributor label Mar 22, 2024
@melvin-bot melvin-bot bot changed the title Auto complete emojis does not pop up if a letter with tilde in spanish [$500] Auto complete emojis does not pop up if a letter with tilde in spanish Mar 22, 2024

melvin-bot bot commented Mar 22, 2024

Job added to Upwork: https://www.upwork.com/jobs/~01191fbf64d1245a9f

@melvin-bot melvin-bot bot added Overdue Help Wanted Apply this label when an issue is open to proposals by contributors labels Mar 22, 2024

melvin-bot bot commented Mar 22, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @thesahindia (External)

@melvin-bot melvin-bot bot removed the Overdue label Mar 22, 2024
@tienifr
Contributor

tienifr commented Mar 22, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

  1. Does not auto complete in composer and no results found in emoji picker.
  2. Not treating o and ó as same

What is the root cause of that problem?

  1. Currently, when matching for emoji suggestions, we only match ASCII letters, not accented Latin characters, as can be seen here.

So when we type the ó with the Spanish accent, it's not recognized as emoji suggestion keyword.

  2. When we construct the Trie, we're not taking into account the normalized version of the keyword.

Let's look at an example keyword list here

The list contains corazón, so the Trie is built only for that exact word. If the user searches for corazon, it won't match this keyword and no result is shown. We should be able to search independently of accents (treat o and ó as the same), similar to Slack's behavior below
Screenshot 2024-03-23 at 1 36 17 AM

Similarly, if we search for corazón, results matching corazon will not show.

What changes do you think we should make in order to solve the problem?

  1. Update the regex here so that it matches accented letters. We can use \p{L}, which matches any Unicode letter (similar to what we did with the room name here), so the regex becomes:
EMOJI_SUGGESTIONS: /:[\p{L}0-9_+-]{1,40}$/u,

  2. a. We need to define a method that "normalizes" a string containing accented characters. We already have such an implementation here to normalize the room name, so we can extract it into a new method

const normalizeAccents = (text: string) => text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");

The Regex /[\u0300-\u036f]/g was taken from here, which is better than the [^0-9a-z] currently being used, since the [^0-9a-z] can unintentionally remove other valid non-accent symbols.
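The difference between the two replacement patterns can be seen in a small sketch. The helper name stripNonAlphanumeric and the sample inputs are hypothetical and only serve the comparison; normalizeAccents is the method proposed above.

```typescript
// Proposed helper: decompose with NFD, then drop only the combining diacritical marks.
const normalizeAccents = (text: string): string => text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

// Hypothetical helper mirroring the existing [^0-9a-z] approach: it drops everything
// that is not an ASCII letter or digit, including "_" and "+".
const stripNonAlphanumeric = (text: string): string => text.normalize('NFD').replace(/[^0-9a-z]/gi, '');

const accentOnly = normalizeAccents('coraz\u00f3n'); // accent removed, letters kept
const tildeOnly = normalizeAccents('ni\u00f1o'); // ñ becomes n
const keepsUnderscore = normalizeAccents('thumbs_up'); // unchanged
const losesUnderscore = stripNonAlphanumeric('thumbs_up'); // underscore removed
const losesPlus = stripNonAlphanumeric('+1'); // plus sign removed

console.log({accentOnly, tildeOnly, keepsUnderscore, losesUnderscore, losesPlus});
```

Emoji names commonly contain characters such as "_" and "+", which is why the narrower range looks like the safer choice here.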

b. Then in here, when adding the keyword to the Trie, if the keyword is accented, we should add the non-accented version to the Trie as well

const keywordWithoutAccents = normalizeAccents(keyword);
if (keywordWithoutAccents !== keyword) {
    trie.add(keywordWithoutAccents, {suggestions: [{code: item.code, types: item.types, name}]});
}
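As a rough, self-contained illustration of step b, the sketch below uses a simplified stand-in Trie (SketchTrie is hypothetical and is not the App's Trie class) and indexes a keyword under both its accented and its non-accented spelling.

```typescript
type EmojiSuggestion = {code: string; name: string};

class SketchTrieNode {
    children = new Map<string, SketchTrieNode>();
    suggestions: EmojiSuggestion[] = [];
}

// Simplified stand-in for the real Trie; only what the example needs.
class SketchTrie {
    private root = new SketchTrieNode();

    add(word: string, suggestion: EmojiSuggestion): void {
        let node = this.root;
        for (const char of word) {
            let child = node.children.get(char);
            if (!child) {
                child = new SketchTrieNode();
                node.children.set(char, child);
            }
            node = child;
        }
        node.suggestions.push(suggestion);
    }

    // Collects the suggestions of every stored word starting with `prefix`, deduplicated by emoji code.
    search(prefix: string): EmojiSuggestion[] {
        let node: SketchTrieNode | undefined = this.root;
        for (const char of prefix) {
            node = node.children.get(char);
            if (!node) {
                return [];
            }
        }
        const found = new Map<string, EmojiSuggestion>();
        const stack: SketchTrieNode[] = [node];
        while (stack.length > 0) {
            const current = stack.pop() as SketchTrieNode;
            current.suggestions.forEach((s) => found.set(s.code, s));
            current.children.forEach((child) => stack.push(child));
        }
        return Array.from(found.values());
    }
}

const normalizeAccents = (text: string): string => text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

const emojiTrie = new SketchTrie();
const heart: EmojiSuggestion = {code: '❤️', name: 'heart'};
const keyword = 'coraz\u00f3n';

emojiTrie.add(keyword, heart);
const keywordWithoutAccents = normalizeAccents(keyword);
if (keywordWithoutAccents !== keyword) {
    // Index the accent-free spelling as well, as proposed in step b.
    emojiTrie.add(keywordWithoutAccents, heart);
}

const accentedHits = emojiTrie.search('coraz\u00f3');
const plainHits = emojiTrie.search('corazo');
const sharedPrefixHits = emojiTrie.search('cora');
console.log(accentedHits.length, plainHits.length, sharedPrefixHits.length);
```

Deduplicating by emoji code is a choice made for this sketch: without it, a prefix shared by both spellings would return the same emoji twice.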

c. In here, when we try to extract the keyword location in order to highlight the matching part in the emoji suggestion item, we should normalize the accents as well, to make sure it matches properly

const prefixLocation = normalizeAccents(name).toLowerCase().search(Str.escapeForRegExp(normalizeAccents(prefixLowercase)));
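A minimal sketch of step c follows. findPrefixLocation is a hypothetical helper, and it uses indexOf instead of search with Str.escapeForRegExp so that the example has no external dependency. It assumes the emoji name is stored in precomposed (NFC) form, so that an index found in the accent-stripped string also points at the right place in the original name.

```typescript
const normalizeAccents = (text: string): string => text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

// Hypothetical helper: locate `prefix` inside `name`, ignoring accents and case.
function findPrefixLocation(name: string, prefix: string): number {
    return normalizeAccents(name).toLowerCase().indexOf(normalizeAccents(prefix).toLowerCase());
}

const emojiName = 'Coraz\u00f3n_roto';
const typedPrefix = 'corazon';

// Without normalization the accent-free prefix is not found.
const plainLocation = emojiName.toLowerCase().indexOf(typedPrefix);
// With normalization it is found at the start of the name.
const normalizedLocation = findPrefixLocation(emojiName, typedPrefix);
// Assumption: the name is precomposed, so the same offsets apply to the original string.
const highlighted = emojiName.slice(normalizedLocation, normalizedLocation + typedPrefix.length);

console.log({plainLocation, normalizedLocation, highlighted});
```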

Or instead of defining our own method of comparing strings with accents removed, we can use localeCompare (or Intl.Collator) which will do this for us (more details: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare).
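For reference, this is roughly what the localeCompare / Intl.Collator alternative looks like. The sketch assumes the 'es' locale and the sensitivity: 'base' option, under which strings that differ only in accents or case compare as equal.

```typescript
// Collator that ignores accent and case differences when comparing.
const baseCollator = new Intl.Collator('es', {sensitivity: 'base'});

const sameWithCollator = baseCollator.compare('corazon', 'coraz\u00f3n') === 0;
const sameWithLocaleCompare = 'corazon'.localeCompare('coraz\u00f3n', 'es', {sensitivity: 'base'}) === 0;
// Plain equality still treats the two spellings as different strings.
const sameWithStrictEquality = 'corazon' === 'coraz\u00f3n';
// Genuinely different words remain different.
const differentWords = baseCollator.compare('corazon', 'cabeza') !== 0;

console.log({sameWithCollator, sameWithLocaleCompare, sameWithStrictEquality, differentWords});
```

Both calls compare whole strings and return only an ordering number, not a position inside either string.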

What alternative solutions did you explore? (Optional)

Another approach to 2.b is to store only the non-accented version in the Trie (if the keyword is corazón, only corazon is stored). Then, when we search the Trie like here, we normalize the accents of the searched word first, so both corazon and corazón become corazon before the lookup.
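The alternative could be sketched as follows. For brevity a plain Map with a prefix scan stands in for the Trie, and addKeyword and findByPrefix are hypothetical names. The point is only that both the stored key and the query go through the same normalization.

```typescript
const normalizeAccents = (text: string): string => text.normalize('NFD').replace(/[\u0300-\u036f]/g, '');

// Stand-in for the Trie: normalized keyword -> emoji code.
const emojiIndex = new Map<string, string>();

// Store only the accent-free spelling of each keyword.
function addKeyword(keyword: string, code: string): void {
    emojiIndex.set(normalizeAccents(keyword), code);
}

// Normalize the query the same way before looking it up.
function findByPrefix(query: string): string[] {
    const normalizedQuery = normalizeAccents(query);
    const codes: string[] = [];
    emojiIndex.forEach((code, key) => {
        if (key.startsWith(normalizedQuery)) {
            codes.push(code);
        }
    });
    return codes;
}

const heartCode = '❤️';
addKeyword('coraz\u00f3n', heartCode);

const fromPlainQuery = findByPrefix('corazon');
const fromAccentedQuery = findByPrefix('coraz\u00f3n');
const fromAccentedPrefix = findByPrefix('coraz\u00f3');

console.log(fromPlainQuery, fromAccentedQuery, fromAccentedPrefix);
```

Compared with indexing both spellings, this keeps a single entry per keyword but requires every lookup path to normalize its input.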

Also some notes:

  • Besides the changes in 2.b, other code paths may need to normalize accents as well, such as when updating the Trie, but the change should be similar.
  • The check in 2.b for whether the keyword has accents can be improved slightly by using Regex.test() with a pattern for accented characters, instead of normalizing and then comparing with the original string.
  • The regexes are provided as pseudo-code to illustrate the approach. There are equivalent regexes for the same purpose, and we can consider using one of those alternatives.
  • There might be other places where we need to normalize first and then compare, similar to 2.c, such as the EmojiPicker search feature.

Result

Screenshot 2024-03-23 at 2 09 47 AM

@melvin-bot melvin-bot bot added the Overdue label Mar 25, 2024
@thesahindia
Member

@tienifr's proposal looks good to me!

🎀 👀 🎀 C+ reviewed


melvin-bot bot commented Mar 25, 2024

Triggered auto assignment to @marcaaron, see https://stackoverflow.com/c/expensify/questions/7972 for more details.

@iwiznia
Contributor

iwiznia commented Mar 25, 2024

Not sure why this was not added from my report:

should probably be matching using localeCompare or Intl.Collator to treat o and ó the same (and other accented letters in all languages that have them).

I think we should do that. We should not be implementing our own collator, especially since this needs to support internationalization to any language.

@tienifr
Contributor

tienifr commented Mar 25, 2024

Proposal updated to add the alternative use of localeCompare instead of our custom logic.

@melvin-bot melvin-bot bot added the Overdue label Apr 1, 2024
@johncschuster
Contributor

@thesahindia can you review @tienifr's updated proposal?

@melvin-bot melvin-bot bot removed the Overdue label Apr 1, 2024
@johncschuster
Contributor

Bumped in Slack!

@thesahindia
Member

@thesahindia can you review @tienifr's updated proposal?

Yes it looks good to me!

@melvin-bot melvin-bot bot removed the Help Wanted Apply this label when an issue is open to proposals by contributors label Apr 3, 2024

melvin-bot bot commented Apr 3, 2024

📣 @tienifr 🎉 An offer has been automatically sent to your Upwork account for the Contributor role 🎉 Thanks for contributing to the Expensify app!

Offer link
Upwork job
Please accept the offer and leave a comment on the Github issue letting us know when we can expect a PR to be ready for review 🧑‍💻
Keep in mind: Code of Conduct | Contributing 📖

@iwiznia iwiznia closed this as completed Apr 9, 2024
@iwiznia iwiznia reopened this Apr 9, 2024
@quinthar quinthar moved this from CRITICAL to HIGH in [#whatsnext] #vip-vsb Apr 11, 2024
@tienifr
Contributor

tienifr commented Apr 16, 2024

@iwiznia @thesahindia About the feasibility of using localeCompare:

  1. localeCompare only tells us how two whole strings compare (equal or not). That is the entire abstraction: Intl does not expose any method to normalize or strip the locale-specific characters
  2. The current implementation of Trie uses an exact match by key, which does not allow any custom compare/search function. We could implement that, but it would increase the complexity of the child lookup in the search operation from O(1) to O(n):

node = node.children[newWord[0]];

  3. In some places like getStyledTextArray we must find the exact location of the substring within the emoji name and, as 1️⃣ pointed out, we couldn't do it with localeCompare:

const prefixLocation = name.toLowerCase().search(Str.escapeForRegExp(prefixLowercase));
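The cost difference described in 2️⃣ can be pictured with a small sketch. LookupNode and both helper functions are hypothetical, and the real Trie node may be shaped differently. An exact key lookup is a single property access, whereas a collator-based lookup has to scan the node's children until one compares as equal.

```typescript
type LookupNode = {children: Record<string, LookupNode>};

const accentInsensitive = new Intl.Collator('es', {sensitivity: 'base'});

// Exact match by key, as in `node.children[newWord[0]]`: one property access.
function getChildExact(node: LookupNode, char: string): LookupNode | undefined {
    return node.children[char];
}

// Accent-insensitive match: every child key has to be compared in turn.
function getChildWithCollator(node: LookupNode, char: string): LookupNode | undefined {
    for (const key of Object.keys(node.children)) {
        if (accentInsensitive.compare(key, char) === 0) {
            return node.children[key];
        }
    }
    return undefined;
}

const leaf: LookupNode = {children: {}};
const parent: LookupNode = {children: {'\u00f3': leaf}};

const exactMiss = getChildExact(parent, 'o');
const exactHit = getChildExact(parent, '\u00f3');
const collatorHit = getChildWithCollator(parent, 'o');

console.log(exactMiss === undefined, exactHit === leaf, collatorHit === leaf);
```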

In conclusion, using localeCompare is not feasible considering the large tradeoffs in 2️⃣ and the requirement in 3️⃣. I only found this while implementing with localeCompare. Additionally, the custom normalize logic has already been used in App for a very long time without any problem so I think we can safely reuse it:

App/src/libs/ReportUtils.ts

Lines 1578 to 1581 in ed7029b

const alphaNumeric = workspaceName
.normalize('NFD')
.replace(/[^0-9a-z]/gi, '')
.toUpperCase();

@iwiznia iwiznia moved this from HIGH to MEDIUM in [#whatsnext] #vip-vsb Apr 16, 2024
@melvin-bot melvin-bot bot added Monthly KSv2 and removed Weekly KSv2 labels May 9, 2024

melvin-bot bot commented May 9, 2024

This issue has not been updated in over 15 days. @johncschuster, @marcaaron, @thesahindia, @tienifr eroding to Monthly issue.

P.S. Is everyone reading this sure this is really a near-term priority? Be brave: if you disagree, go ahead and close it out. If someone disagrees, they'll reopen it, and if they don't: one less thing to do!

@marcaaron marcaaron assigned iwiznia and unassigned marcaaron May 28, 2024
@iwiznia
Contributor

iwiznia commented May 30, 2024

Got it, thanks for the explanation! And sorry that I missed that message

@melvin-bot melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Monthly KSv2 labels Jun 4, 2024
@melvin-bot melvin-bot bot changed the title [$500] Auto complete emojis does not pop up if a letter with tilde in spanish [HOLD for payment 2024-06-11] [$500] Auto complete emojis does not pop up if a letter with tilde in spanish Jun 4, 2024
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label Jun 4, 2024

melvin-bot bot commented Jun 4, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".


melvin-bot bot commented Jun 4, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.78-5 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-06-11. 🎊

For reference, here are some details about the assignees on this issue:


melvin-bot bot commented Jun 4, 2024

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@thesahindia] The PR that introduced the bug has been identified. Link to the PR:
  • [@thesahindia] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:
  • [@thesahindia] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:
  • [@thesahindia] Determine if we should create a regression test for this bug.
  • [@thesahindia] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
  • [@johncschuster] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

@johncschuster
Contributor

@thesahindia, can you complete the BZ Checklist when you have a moment? Thank you!

@johncschuster
Contributor

Payment Summary:

Contributor: @tienifr - $500 - paid via Upwork
Contributor+: @thesahindia - $500 - paid via Manual request

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Jun 10, 2024
@thesahindia
Member

thesahindia commented Jun 11, 2024

This was an improvement. It was a case that we weren't handling so we shouldn't be treating it as a regression.

We don't need a test case here since it's very specific, but if we want to add one, here are the steps:

  1. Change language to Spanish
  2. Open emoji picker, search "corazón"
  3. Verify "❤️" appears as the first emoji
  4. Search for corazo (no tilde)
  5. Verify it shows the "❤️" emoji
  6. Type ":corazón" in the main composer
  7. Verify emoji suggestions show "❤️" as the first emoji suggestion

@johncschuster
Contributor

johncschuster commented Jun 11, 2024

Payment has been issued to @tienifr 🎉
@thesahindia, go ahead and submit your manual request 👍

What do you think of the QA test steps above, @iwiznia?

@iwiznia
Contributor

iwiznia commented Jun 11, 2024

They look good, but it's missing the good case: search for corazo (no tilde) and check it shows the ❤️ emoji

@thesahindia
Member

Updated!

@JmillsExpensify

$500 approved for @thesahindia

8 participants