Kurdish words corrector

Corrects typos and Unicode problems in Kurdish (Kurmanji) text by brute-forcing candidate spellings and comparing them with a dictionary.
The brute forcing has three depths and targets the most common typos, such as writing s instead of ş or e instead of ê.
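
The idea can be sketched in a few lines of Python. This is a minimal illustration, not the script's actual code: the substitution table, the function names and the tiny dictionary are assumptions made for the example (only the s/ş and e/ê pairs come from the description above), and the three depth levels are not modelled.

```python
from itertools import product

# Assumed substitution table: each plain letter maps to the letters it may stand for.
# Only s/ş and e/ê are named in the README; the other pairs are illustrative.
SUBSTITUTIONS = {"s": "sş", "e": "eê", "c": "cç", "u": "uû", "i": "iî"}


def candidates(word):
    """Yield every spelling obtained by optionally swapping each letter."""
    options = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combo in product(*options):
        yield "".join(combo)


def correct(word, dictionary):
    """Return the dictionary words reachable from `word` via the substitutions."""
    return sorted({c for c in candidates(word) if c in dictionary})


# Tiny stand-in for the real dictionary (hypothetical contents).
dictionary = {"reşo", "çû", "dûr"}

for typo in ["Reso", "cu", "dur"]:
    print(typo, "->", correct(typo, dictionary))
```

The number of candidates grows exponentially with the number of substitutable letters, which is presumably why deeper brute forcing takes longer.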

Incorrect sentence:

Reso cu dur.

Corrected sentence:

Reşo çû dûr.

Usage:

python kurdish-words-corrector.py -t "Reso cu dur." -o "results.txt"

If you do not pass an -o path, you can still read the results in YAML or JSON format.
The script saves a stats file in the same path, containing the JSON- or YAML-formatted results together with some statistics.

An example of the resulting JSON:

{
  "correct_words": [],
  "incorrect_words_with_possible_corrections": [
    {
      "word": "Reso",
      "message": "Is not in our database, and we found similar word/s",
      "status": 1,
      "possibilities": [
        "reşo"
      ]
    },
    {
      "word": "cu",
      "message": "Is not in our database, and we found similar word/s",
      "status": 1,
      "possibilities": [
        "çû"
      ]
    },
    {
      "word": "dur",
      "message": "Is not in our database, and we found similar word/s",
      "status": 1,
      "possibilities": [
        "dûr"
      ]
    }
  ],
  "incorrect_words_without_possible_corrections": [],
  "total_words": 3,
  "total_incorrect": 3,
  "total_incorrect_with_corrections": 3,
  "total_incorrect_without_corrections": 0,
  "incorrect_percentage": 100
}
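
A result in this shape can be post-processed with a short script. The sketch below is only an illustration: the helper names are hypothetical, the embedded JSON is a shortened copy of the example above, and restoring the leading capital letter is an assumption, since the suggestions are lower-case.

```python
import json


def suggestions(result):
    """Map each misspelled word to its list of possible corrections."""
    return {
        entry["word"]: entry["possibilities"]
        for entry in result["incorrect_words_with_possible_corrections"]
    }


def apply_first(text, result):
    """Replace each misspelled word with its first suggestion, keeping a leading capital."""
    table = suggestions(result)
    fixed = []
    for token in text.split():
        word = token.strip(".,!?")  # ignore surrounding punctuation when looking up
        if word in table and table[word]:
            new = table[word][0]
            if word[0].isupper():
                new = new[0].upper() + new[1:]
            token = token.replace(word, new)
        fixed.append(token)
    return " ".join(fixed)


# Shortened copy of the example result, embedded so the sketch runs on its own.
result = json.loads("""
{
  "incorrect_words_with_possible_corrections": [
    {"word": "Reso", "possibilities": ["reşo"]},
    {"word": "cu", "possibilities": ["çû"]},
    {"word": "dur", "possibilities": ["dûr"]}
  ],
  "total_words": 3,
  "total_incorrect": 3
}
""")

print(apply_first("Reso cu dur.", result))
```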

Arguments:

Argument	Description
-w or --word	A single word to correct
-t or --text	The text whose words should be corrected
-o or --output	The path of the output results file with the corrected text
-f or --file	The path of a file whose text should be corrected
-d or --depth	1, 2 or 3; higher values increase the level of brute forcing and also the processing time
-p or --parser	The format of the output file: yaml (default) or json
-wr or --workers	The number of workers (threads) to use; default is 100
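
For reference, the options above could be declared with argparse roughly as follows. This is a sketch under assumptions rather than the script's real parser: the function name is hypothetical, and the default depth of 1 is a guess, because the table does not state one.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="kurdish-words-corrector.py")
    parser.add_argument("-w", "--word", help="a single word to correct")
    parser.add_argument("-t", "--text", help="text whose words should be corrected")
    parser.add_argument("-o", "--output", help="path of the output results file")
    parser.add_argument("-f", "--file", help="path of a file to correct")
    # Default depth is an assumption; the README only lists the allowed values.
    parser.add_argument("-d", "--depth", type=int, choices=[1, 2, 3], default=1,
                        help="brute-force depth; higher is slower")
    parser.add_argument("-p", "--parser", choices=["yaml", "json"], default="yaml",
                        help="format of the output file")
    parser.add_argument("-wr", "--workers", type=int, default=100,
                        help="number of worker threads")
    return parser


# Parse the command line from the usage example instead of sys.argv.
args = build_parser().parse_args(["-t", "Reso cu dur.", "-o", "results.txt"])
print(args.text, args.output, args.parser, args.workers)
```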
