Emoji using ZWJ are considered invalid #90

Open
DysphoricUnicorn opened this issue Oct 6, 2022 · 1 comment
Labels: bug (Something isn't working), enhancement (Improvements to existing features or smaller new features)
Milestone: 1.0.0 Release

Comments

DysphoricUnicorn commented Oct 6, 2022

The check for whether a character is printable currently treats the zero width joiner (ZWJ) as unprintable, even when it sits between two characters that it combines into a single new one.

As a result, many newer emoji are not supported.
Examples are pride flags, newer family combinations, and skin tone variations.
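
To illustrate the report, here is a minimal sketch. It assumes the library's printability check behaves like Python's built-in `str.isprintable()`, which the issue does not confirm. The rainbow flag sequence below is just one example of a ZWJ emoji.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Rainbow flag: white flag + variation selector-16 + ZWJ + rainbow
rainbow_flag = "\U0001F3F3\uFE0F\u200D\U0001F308"

# The ZWJ is a format character (category "Cf"), which Python
# does not count as printable.
print(unicodedata.category(ZWJ))   # Cf
print(ZWJ.isprintable())           # False

# The whole sequence fails the check only because of the ZWJ.
print(rainbow_flag.isprintable())                   # False
print(rainbow_flag.replace(ZWJ, "").isprintable())  # True
```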

binaryDiv added the bug and enhancement labels Oct 20, 2022
binaryDiv added this to the 1.0.0 Release milestone Oct 20, 2022
binaryDiv (Contributor) commented

Just some thoughts about this:

  • Simply allowing the ZWJ anywhere in a string could cause problems. Imagine a string with a ZWJ at the beginning or end, with no other character to connect to, which (I believe?) would be an invalid use of the ZWJ. So a more careful validation is definitely necessary.
  • The StringValidator should get an option to specify the charset. In many cases emoji (in general, not just those with a ZWJ) might not be desirable in a string, e.g. because it doesn't make sense to use emoji in the particular input field, or because the underlying database uses a charset that doesn't support emoji. Also, an ASCII-only StringValidator might often be needed.
  • For convenience, subclasses for StringValidators with preconfigured charsets might be useful (e.g. AsciiStringValidator?).
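
The ideas above could be sketched roughly as follows. This is not the library's API: `has_valid_zwj_placement`, `is_allowed_string` and the `ascii_only` flag are hypothetical names used for illustration only. The placement rule (a ZWJ must sit between two non-ZWJ characters) is a simplifying assumption, and it does not verify that the joined sequence is an actual emoji.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER


def has_valid_zwj_placement(text: str) -> bool:
    """Return True if every ZWJ sits between two non-ZWJ characters.

    Hypothetical helper: rejects a leading, trailing or doubled ZWJ,
    but does not check that the result is a real emoji sequence.
    """
    for index, char in enumerate(text):
        if char != ZWJ:
            continue
        if index == 0 or index == len(text) - 1:
            return False
        if text[index - 1] == ZWJ or text[index + 1] == ZWJ:
            return False
    return True


def is_allowed_string(text: str, *, ascii_only: bool = False) -> bool:
    """Hypothetical check with an optional ASCII-only mode."""
    if ascii_only:
        # ASCII mode: no emoji and no ZWJ at all.
        return text.isascii() and text.isprintable()
    if not has_valid_zwj_placement(text):
        return False
    # Well-placed ZWJs are ignored for the printability check.
    return text.replace(ZWJ, "").isprintable()


# Family emoji: man + ZWJ + woman + ZWJ + girl
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"
print(is_allowed_string(family))                   # True
print(is_allowed_string(family, ascii_only=True))  # False
print(is_allowed_string("\u200dabc"))              # False
```

A preconfigured subclass such as the suggested AsciiStringValidator would then amount to fixing the equivalent of `ascii_only=True`.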


2 participants