-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mention does not work with some languages (Hebrew) #4642
Comments
The culprit is this regexp https://github.com/ckeditor/ckeditor5-mention/blob/master/src/mentionui.js#L504 which only considers western languages. We need to come up with something smarter and more inclusive. |
So after quick research the RegExp in JavaScript does not support unicode grapheme matching: const regExp = xRegExp( createPattern( marker, 0 ) );
const re = new RegExp( '(^| )(\\@)([\\pL0-9]*?)$' );
// console.log( re.toString(), ':' );
console.log( 'RegExp: ', ' @foo', re.test( ' @foo' ) );
console.log( 'RegExp: ', ' @שנב', re.test( ' @שנב' ) );
const xre = xRegExp( '(^| )(\\@)([\\pL0-9]*?)$' );
// console.log( xre.toString(), ':' );
console.log( 'xRegExp: ', ' @foo', xre.test( ' @foo' ) );
console.log( 'xRegExp: ', ' @שנב', xre.test( ' @שנב' ) ); logs:
So the above solution has obvious plus: it allows to use simpler notations for regular expressions as in other languages. The drawbacks are as usual:
Technical details of XRegExp library: AFAICS it augments the RegExp object so if you do
which is pretty big. As an alternative I see that we could create similar regex but with unicode ranges: const re2 = new RegExp( '(^| )(\\@)([\\u05D0-\\uFB4Fa-zA-Z0-9]*?)$' );
console.log( 'RegExp2: ', ' @foo', re2.test( ' @foo' ) );
console.log( 'RegExp2: ', ' @שנב', re2.test( ' @שנב' ) ); which also works but would require to deeper dig into each supported language specifics (punctuations, etc):
|
Note: We should use the ES2018 RegExp as @mlewand pointed out here. |
Fix: Mentions should work when different UTF character classes are used in the feed configuration. Closes #38.
Not sure if this is RTL–related or IME–related but this should work anyway.
The text was updated successfully, but these errors were encountered: