yhuag/NLP-preprocess-tools

NLP Preprocess Tools

This is a curated list of samples on NLP preprocessing. You are welcome to make a pull request to contribute!

Tools

Extraction

In Between Two Characters

Single Deletion Pairs

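  • Extract Word Pair : Extract the word pairs that meet the requirements below from a collection of word pairs (e.g. French, Italian, Portuguese, Spanish, Turkish). Each extracted pair must satisfy both requirements:

    1. Single deletion distance: one word can be obtained from the other by deleting exactly one character
    2. The deletion falls at the exact center of the longer word

    Examples. (bloccare, blocare) and (fellah, felah)
    Counterexamples. (vitamină, vitamin), where the deletion is at the end of the word, and (maxi, maksi), which is not a single deletion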
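
    The two requirements above can be sketched in Python as follows. This is a minimal illustration, not necessarily the code in this repository: the function names are hypothetical, and "exact center" is assumed to mean the middle index of an odd-length word or either of the two middle indices of an even-length word (measured on the longer word). When the deleted character sits in a run of identical letters (the "cc" in bloccare), any position in the run counts as the deletion site.

```python
def deletion_span(longer, shorter):
    """Return (first, last): the range of indices in `longer` whose removal
    yields `shorter`, or None if the words are not one deletion apart."""
    if len(longer) != len(shorter) + 1:
        return None
    # Advance past the common prefix.
    i = 0
    while i < len(shorter) and longer[i] == shorter[i]:
        i += 1
    # After dropping longer[i], the remainders must be identical.
    if longer[i + 1:] != shorter[i:]:
        return None
    # Deleting any character of a run of identical letters gives the same
    # result, so widen the span leftwards over that run.
    first = i
    while first > 0 and longer[first - 1] == longer[i]:
        first -= 1
    return first, i


def is_central_deletion_pair(a, b):
    """True if the pair differs by one deletion located at the center
    of the longer word (assumed definition of 'center', see above)."""
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    span = deletion_span(longer, shorter)
    if span is None:
        return False
    first, last = span
    n = len(longer)
    centers = {(n - 1) // 2, n // 2}  # one index if n is odd, two if even
    return any(first <= c <= last for c in centers)


pairs = [
    ("bloccare", "blocare"),
    ("fellah", "felah"),
    ("vitamină", "vitamin"),
    ("maxi", "maksi"),
]
print([p for p in pairs if is_central_deletion_pair(*p)])
# [('bloccare', 'blocare'), ('fellah', 'felah')]
```

    The check works on Python string indices, so words containing combining characters may need Unicode normalization first.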

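  • Extract Word Pair (Cognate) : Extract the word pairs that meet the requirements below from a collection of word pairs (e.g. French, Italian, Portuguese, Spanish). Each extracted pair must satisfy both requirements:

    1. The two words form a cognate pair
    2. Single deletion distance: one word can be obtained from the other by deleting exactly one character (at any position)

    Examples. (veni, venir) and (rosmarin, romarin)
    Counterexamples. (hanorac, anorak) and (msingur, singolo), neither of which differs by a single deletion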
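
    A hedged Python sketch of this filter is shown below. Cognate status cannot be derived from spelling alone, so the sketch assumes each input pair already carries a boolean cognate label; that input format, the function names, and the labels attached to the two counterexample pairs are assumptions made for illustration, not part of this repository.

```python
def is_single_deletion(a, b):
    """True if one word equals the other with exactly one character removed."""
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    if len(longer) != len(shorter) + 1:
        return False
    # Advance past the common prefix, then compare what is left after
    # skipping one character of the longer word.
    i = 0
    while i < len(shorter) and longer[i] == shorter[i]:
        i += 1
    return longer[i + 1:] == shorter[i:]


def extract_cognate_deletion_pairs(labeled_pairs):
    """Keep pairs labeled as cognates that are one deletion apart.

    `labeled_pairs` is an iterable of (word_a, word_b, is_cognate) tuples;
    this shape is a hypothetical input format.
    """
    return [
        (a, b)
        for a, b, is_cognate in labeled_pairs
        if is_cognate and is_single_deletion(a, b)
    ]


labeled = [
    ("veni", "venir", True),
    ("rosmarin", "romarin", True),
    ("hanorac", "anorak", True),     # label is illustrative only
    ("msingur", "singolo", False),   # label is illustrative only
]
print(extract_cognate_deletion_pairs(labeled))
# [('veni', 'venir'), ('rosmarin', 'romarin')]
```

    Unlike the previous tool, the deletion may occur anywhere in the word, which is why (veni, venir) qualifies even though the deleted letter is at the end.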
