Open Vietnamese Dictionary version 2023 #622

Open
19 tasks
rain1024 opened this issue Jan 3, 2023 · 2 comments


rain1024 commented Jan 3, 2023

👩‍💼 As an NLP engineer, you know that accurate and reliable language resources are essential for building effective natural language processing systems. 📚 The Vietnamese Dictionary project is a comprehensive and reliable resource for learners and users of the Vietnamese language that is specifically designed to meet the needs of NLP engineers. With over 10,000 words and phrases, this dictionary provides clear and accurate definitions and example sentences to help you build high-quality language models and applications. In addition to its core function as a reference tool, the Vietnamese Dictionary also includes helpful features such as pronunciation guides, synonyms, antonyms, and etymology.

🌍 As an open data project, the Vietnamese Dictionary is freely available to the public to use and reuse, typically with minimal restrictions. By making the dictionary available as open data, we hope to enable NLP engineers like you to access and use the data in a variety of ways, such as by incorporating it into your own projects or tools, or by building upon it to create new resources.

Planning

  1. Determine the language(s) and audience for the dictionary:
    • Research the language(s) and the needs and interests of the intended audience
    • Decide on the focus and scope of the dictionary (e.g. general usage, technical terms, slang, etc.)
  2. Gather a list of words to include in the dictionary:
    • Consult other dictionaries and language resources
    • Identify commonly used words and specialized terms
    • Consider the needs and interests of the intended audience
  3. Research and verify the definitions and translations for the words:
    • Consult with linguists or language experts
    • Use online resources such as language forums and online dictionaries
    • Verify definitions and translations with multiple sources
  4. Determine the format and structure for the dictionary entries:
    • Consult other dictionaries and language resources to see how they structure their entries
    • Decide which elements to include in the dictionary (e.g. pronunciation, part of speech, definition, example sentence, synonyms, antonyms, etymology)
    • Consider the needs and interests of the intended audience
  5. Input the words and their definitions or translations into the dictionary:
    • Use a text editor, spreadsheet program, or specialized dictionary software
    • Create columns or fields for different elements of the dictionary entry (e.g. word, pronunciation, part of speech, definition, example sentence)
  6. Review and edit the entries:
    • Check for accuracy and consistency with the chosen format and structure
    • Have multiple people review the entries to catch any mistakes or inconsistencies
  7. Test the dictionary:
    • Use the dictionary to look up words and check their definitions or translations
    • Have other people test the dictionary to see if it is easy to use and the definitions and translations are clear and accurate
  8. Publish the dictionary:
    • Choose a suitable format (e.g. printed book, online database, mobile app)
    • Consider layout, design, formatting, and any legal or copyright issues
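A minimal sketch of how steps 4 to 6 could look in code, assuming entries are stored as flat records. The `Entry` class, its field names, the `validate` and `to_csv` helpers, and the sample word are all hypothetical illustrations drawn from the elements listed in the plan, not the project's actual schema.

```python
import csv
import io
from dataclasses import asdict, dataclass, field, fields


# Hypothetical schema: field names follow the elements listed in steps 4 and 5.
@dataclass
class Entry:
    word: str
    part_of_speech: str
    definition: str
    pronunciation: str = ""
    example: str = ""
    synonyms: list[str] = field(default_factory=list)
    antonyms: list[str] = field(default_factory=list)
    etymology: str = ""


# Assumption: these three fields are treated as mandatory for every entry.
REQUIRED = ("word", "part_of_speech", "definition")


def validate(entry: Entry) -> list[str]:
    """Return the names of required fields that are empty (a step 6 check)."""
    return [name for name in REQUIRED if not getattr(entry, name).strip()]


def to_csv(entries: list[Entry]) -> str:
    """Serialize entries to CSV text, one column per field (step 5)."""
    columns = [f.name for f in fields(Entry)]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        # Lists are joined with ';' so each entry stays on a single row.
        row["synonyms"] = ";".join(entry.synonyms)
        row["antonyms"] = ";".join(entry.antonyms)
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = Entry(word="sách", part_of_speech="noun", definition="book")
    print(validate(sample))
    print(to_csv([sample]))
```

Keeping the schema in one place means the review in step 6 can be partly automated: any entry for which `validate` returns a non-empty list is flagged before a human reviewer sees it.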
@rain1024 rain1024 changed the title Open Vietnamese Dictionary version 2022 Open Vietnamese Dictionary version 2023 Jan 3, 2023
rain1024 added a commit that referenced this issue Jan 19, 2023
rain1024 added a commit that referenced this issue Jan 19, 2023

rain1024 commented Jan 19, 2023

Dictionary Features

  • Initialize Dictionary feature
  • CRUD Word
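The two features above could be sketched as a small in-memory store. This is only an illustration under assumptions: the `Dictionary` class and its method names are hypothetical and do not reflect the API committed to the repository.

```python
class Dictionary:
    """In-memory word store; a hypothetical stand-in for the planned feature."""

    def __init__(self, words=None):
        # "Initialize Dictionary": start empty or copy an existing mapping.
        self._entries = dict(words or {})

    def create(self, word, definition):
        """Add a new word; refuse to overwrite an existing one."""
        if word in self._entries:
            raise KeyError(f"{word!r} already exists")
        self._entries[word] = definition

    def read(self, word):
        """Return the definition, or None if the word is unknown."""
        return self._entries.get(word)

    def update(self, word, definition):
        """Replace the definition of an existing word."""
        if word not in self._entries:
            raise KeyError(f"{word!r} not found")
        self._entries[word] = definition

    def delete(self, word):
        """Remove a word; return True if it was present."""
        return self._entries.pop(word, None) is not None


if __name__ == "__main__":
    d = Dictionary()
    d.create("sách", "book")
    d.update("sách", "book; printed work")
    print(d.read("sách"))
    print(d.delete("sách"))
```

Separating `create` from `update` makes accidental overwrites an explicit error, which matters when several contributors edit the same word list.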

rain1024 added a commit that referenced this issue Jan 22, 2023
@rain1024 rain1024 mentioned this issue Feb 8, 2023
4 tasks
rain1024 added a commit that referenced this issue Feb 27, 2023
@rain1024

[Update 2023/02/23] I just published a list of Vietnamese words on Hugging Face: https://huggingface.co/datasets/undertheseanlp/UTS_Dictionary
