Solve unique emoji IDs spanning different Unicode emoji versions #49

Open
eklem opened this issue Jul 14, 2023 · 4 comments
Labels: dependencies, enhancement


eklem commented Jul 14, 2023

A library for handling unique emoji IDs:
https://github.com/eklem/unicode-emojis-unique-id-json

@eklem eklem self-assigned this Jul 14, 2023
@eklem eklem added enhancement New feature or request dependencies Pull requests that update a dependency file labels Jul 14, 2023

eklem commented Jul 14, 2023

Split the text file on \n and process each line with a regex.

https://stackoverflow.com/questions/23331546/how-to-use-javascript-to-read-local-text-file-and-read-line-by-line
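
A minimal sketch of that approach in Node.js, assuming the file is small enough to read in one go. The names `splitLines` and `readLines` are hypothetical, and the file path is supplied by the caller:

```javascript
const fs = require('node:fs');

// Split raw text into lines; also tolerates Windows line endings (\r\n).
function splitLines(text) {
  return text.split(/\r?\n/);
}

// Hypothetical helper: read a UTF-8 text file and return its lines.
function readLines(path) {
  return splitLines(fs.readFileSync(path, 'utf8'));
}
```

Each returned line can then be tested against a regex on its own.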


eklem commented Jul 14, 2023

If a line does not match /^#.+/ (it is not a comment), process it, e.g.:

U+1F62E U+200D U+1F4A8 ; 13.1 # 😮‍💨 face exhaling

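For these non-comment lines:

  • Get everything before ; then split it on whitespace and add the code points to an array
    /.+(?=;)/
  • Version number with one decimal, between ; and #
    /(?<=; )\d+\.\d/
  • Emoji (\S+ rather than . so that a multi-code-point sequence matches as a whole)
    /(?<=#\s)\S+/
  • Text after the emoji
    /(?<=#\s.+\s)[\w\-’ \s:,]+/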

Use the latest Node.js version so that as many emojis as possible are extracted.
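
A rough sketch of the per-line parsing described above. It assumes the line format of the example (code points, `;`, version, `#`, emoji, name). The function name `parseLine` and the shape of the returned object are hypothetical. Two details differ from the regexes as first written and are assumptions: the dot in the version number is escaped, and the emoji is matched with `\S+` so that ZWJ sequences are kept whole.

```javascript
// Regexes for the four parts of a data line.
const CODE_POINTS = /.+(?=;)/;                 // everything before ;
const VERSION = /(?<=; )\d+\.\d/;              // e.g. 13.1
const EMOJI = /(?<=#\s)\S+/;                   // whole emoji sequence after "# "
const NAME = /(?<=#\s\S+\s)[\w\-’ \s:,]+/;     // text after the emoji

// Returns null for comment lines, blank lines, and lines that do not fit.
function parseLine(line) {
  if (/^#/.test(line) || line.trim() === '') return null;
  const codePoints = line.match(CODE_POINTS);
  const version = line.match(VERSION);
  const emoji = line.match(EMOJI);
  const name = line.match(NAME);
  if (!codePoints || !version || !emoji || !name) return null;
  return {
    codePoints: codePoints[0].trim().split(/\s+/),
    version: version[0],
    emoji: emoji[0],
    name: name[0].trim(),
  };
}
```

Returning null for anything that does not fit keeps the caller simple: map every line through `parseLine` and filter out the nulls.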


eklem commented Jul 14, 2023

For the first line, grab the Unicode emoji version number:
/(?<=v)\d+\.\d/

Then skip lines until one does not start with #, and apply the regexes above from there on.
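
A small sketch of those two steps, assuming the first line is a header containing something like `v15.0` (the exact header text is not given here, so that format is hypothetical, as are the names `readVersion` and `splitHeaderAndData`):

```javascript
// Grab the version number that follows a "v" in the header line.
function readVersion(firstLine) {
  const match = firstLine.match(/(?<=v)\d+\.\d/);
  return match ? match[0] : null;
}

// Read the version from line 0, skip the comment lines that follow,
// and return the remaining non-comment, non-blank lines as data.
function splitHeaderAndData(text) {
  const lines = text.split(/\r?\n/);
  const version = readVersion(lines[0]);
  let i = 1;
  while (i < lines.length && lines[i].startsWith('#')) i += 1;
  const data = lines
    .slice(i)
    .filter((line) => line.trim() !== '' && !line.startsWith('#'));
  return { version, data };
}
```

The data lines returned here are the ones the per-line regexes would then be applied to.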


eklem commented Dec 3, 2023

Can now use the unicode-emojis-unique-id-json library. Need to change the key from id to plaintext and create the codebook from that.
