new: Add bn (Bengali) language. #161

Merged
merged 7 commits into from
Oct 3, 2023

sudipshil9862
Contributor

No description provided.

Owner

@milesj left a comment

Good start, just a few minor things to fix.

Another thing I forgot to add to the docs, but you'll need to add .d.ts files for the data files, as seen here: https://github.com/milesj/emojibase/tree/master/packages/data/ko


msgctxt "EMOJI GROUP: 7|objects"
msgid "objects"
msgstr "objects"
Owner

@sudipshil9862 looks like you missed a few translations in this file, or was it intentional?

Contributor Author

It's a mistake. I'm changing this.


msgctxt "EMOJI SUB-GROUP: 33|animal-mammal"
msgid "mammals"
msgstr "mammals"
Owner

another


msgctxt "EMOJI SUB-GROUP: 71|light-video"
msgid "light, film & video"
msgstr ""
Owner

this is empty


msgctxt "EMOJI SUB-GROUP: 91|punctuation"
msgid "punctuation"
msgstr "punctuation"
Owner

another
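
The review comments above flag two kinds of entries in messages.po: a msgstr that is empty and a msgstr that just repeats the English msgid. A minimal sketch of a checker for both cases follows. It assumes single-line msgctxt/msgid/msgstr triples like the ones quoted in this thread; the function name `find_untranslated` and the last sample entry are made up for illustration, and this is not part of the emojibase tooling. An identical msgstr can occasionally be intentional, so treat the output as a list of entries to review, not as errors.

```python
import re

# Matches one single-line msgctxt / msgid / msgstr triple.
ENTRY_RE = re.compile(
    r'msgctxt "(?P<ctx>.*)"\nmsgid "(?P<msgid>.*)"\nmsgstr "(?P<msgstr>.*)"'
)

# First two entries are quoted from the review; the third is a made-up
# example of a translated entry.
SAMPLE_PO = '''
msgctxt "EMOJI GROUP: 7|objects"
msgid "objects"
msgstr "objects"

msgctxt "EMOJI SUB-GROUP: 71|light-video"
msgid "light, film & video"
msgstr ""

msgctxt "EXAMPLE: translated entry"
msgid "example"
msgstr "উদাহরণ"
'''


def find_untranslated(po_text):
    """Return (context, reason) pairs for entries that look untranslated."""
    problems = []
    for entry in ENTRY_RE.finditer(po_text):
        if entry["msgstr"] == "":
            problems.append((entry["ctx"], "empty"))
        elif entry["msgstr"] == entry["msgid"]:
            problems.append((entry["ctx"], "same as msgid"))
    return problems


for ctx, reason in find_untranslated(SAMPLE_PO):
    print(f"{ctx}: {reason}")
```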

"groups": [
{
"key": "smileys-emotion",
"message": "smileys & emotion",
Owner

these should be the translated messages

Contributor Author

@sudipshil9862 Sep 30, 2023

How do I create these .d.ts files in packages/data/bn?

Owner

just copy them

Contributor Author

Does data.json mean data.json.d.ts?
Or should I rename it?

Owner

You can copy the compact.json.d.ts, data.json.d.ts, and messages.json.d.ts files as-is.

data.json.d.ts contains the TypeScript types for the data.json file (which is published).
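
The copy step described above can be sketched as a small script. This is only an illustration: the helper name `copy_type_declarations` is hypothetical, and it assumes the Korean locale directory linked earlier in the thread (packages/data/ko) is used as the source, with packages/data/bn as the destination.

```python
import shutil
from pathlib import Path

# The three declaration files named in the reply above.
DECLARATION_FILES = (
    "compact.json.d.ts",
    "data.json.d.ts",
    "messages.json.d.ts",
)


def copy_type_declarations(src_dir, dest_dir):
    """Copy the .d.ts files unchanged from src_dir to dest_dir.

    Returns the list of file names that were copied.
    """
    src = Path(src_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in DECLARATION_FILES:
        shutil.copyfile(src / name, dest / name)
        copied.append(name)
    return copied


# Assumed usage, run from the repository root:
# copy_type_declarations("packages/data/ko", "packages/data/bn")
```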

@sudipshil9862
Contributor Author

sudipshil9862 commented Oct 1, 2023

  • added .d.ts files: compact.json.d.ts, data.json.d.ts, messages.json.d.ts
  • added some missing translations in messages.po

@sudipshil9862
Contributor Author

@milesj Anything else to change?

@milesj
Owner

milesj commented Oct 1, 2023

@sudipshil9862 Yeah, just need to fix the failing build and then it's good to go.

@milesj changed the title from "Added bn (Bengali)" to "new: Add bn (Bengali) language." on Oct 2, 2023
@milesj
Owner

milesj commented Oct 2, 2023

@sudipshil9862 For the shortcode test, we need to update this file: https://github.com/milesj/emojibase/blob/master/packages/regex/shortcode-native.js

And include the regex for Bengali.

@sudipshil9862
Contributor Author

Where is it failing now?
Should I remove the shortcodes folder inside /data/bn/?

@milesj
Owner

milesj commented Oct 2, 2023

@sudipshil9862 It's still failing on the same thing. It looks like Bengali may need multiple Unicode ranges, similar to how Russian works.

I'm not sure what the ranges would be, though; it will take some trial and error. I suggest running the tests locally: TEST_LOCALE=bn yarn run test shortcodes

@sudipshil9862
Contributor Author

sudipshil9862 commented Oct 3, 2023

in gedit, this line is showing : 1FA83": "বুমের‌্যাঙ"
in vi, this line is showing: 1FA83": "বুমের<200c>যাঙ"

"‌্" contains a ZERO WIDTH NON-JOINER (U+200C) and a VIRAMA (U+09CD, BENGALI SIGN VIRAMA)

To solve this problem, the regex needs to include the Unicode range \u2000-\u206F, which is the "General Punctuation" block of the Unicode Standard and contains U+200C.
See: https://symbl.cc/en/unicode/blocks/general-punctuation/
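
To see why a single Bengali range is not enough, here is a minimal sketch in Python. The two character classes are illustrative assumptions, not the actual pattern in shortcode-native.js; the label is the U+1FA83 string from this comment, written with escapes so the invisible U+200C is explicit.

```python
import re

# The label from the thread, spelled out code point by code point.
# U+200C (ZERO WIDTH NON-JOINER) sits just before U+09CD (BENGALI SIGN VIRAMA).
LABEL = "\u09ac\u09c1\u09ae\u09c7\u09b0\u200c\u09cd\u09af\u09be\u0999"

# Hypothetical character classes for illustration only.
BENGALI_ONLY = re.compile(r"^[\u0980-\u09FF]+$")
BENGALI_PLUS_PUNCTUATION = re.compile(r"^[\u0980-\u09FF\u2000-\u206F]+$")


def outside_bengali_block(text):
    """List the code points in text that fall outside U+0980..U+09FF."""
    return [f"U+{ord(ch):04X}" for ch in text if not 0x0980 <= ord(ch) <= 0x09FF]


print(outside_bengali_block(LABEL))                # the code points that break the match
print(bool(BENGALI_ONLY.match(LABEL)))             # Bengali block alone
print(bool(BENGALI_PLUS_PUNCTUATION.match(LABEL))) # with General Punctuation added
```

Adding the whole General Punctuation block is the approach taken in this thread. A narrower class that allows only U+200C would also match this particular label, but whether that is sufficient for every Bengali label has not been checked here.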

@sudipshil9862
Contributor Author

@milesj Please review my PR. All checks have passed.

@milesj
Owner

milesj commented Oct 3, 2023

@sudipshil9862 Awesome stuff. I'll publish a new version sometime in the next day.

@milesj merged commit 84ea37c into milesj:master on Oct 3, 2023
31 checks passed