
We present a unique algorithm for Bangla that generates erroneous or misspelled Bangla words of the kind people often produce when typing Bangla on an English keyboard. The resulting Bangla error word dataset can be used for evaluating the performance of various existing Bangla spe…


habibsifat/Algorithm-for-Bengali-Error-Dataset-Generation


Algorithm-for-Bengali-Error-Dataset-Generation

We present a unique algorithm for Bengali that generates misspelled words of the kind people often produce when typing Bengali on an English QWERTY keyboard.

  1. "ErrorWordGenaratorTool.ipynb": An error word generation tool. Given a correct word as input, it prints a misspelled version of that word to the console.

  2. "Juktakkhor List.txt": A list of 170 common conjunct consonants (Juktakkhor) of the Bengali language.

  3. "Juktakkhor-handling-rules.pdf": 20 proposed rules for handling Juktakkhor while generating errors.

  4. "Phonetic Similar Cluster Replacement.txt": A proposed dictionary that maps each character to the characters that are phonetically similar to it.

  5. "Single Letter Mispressed Cluster Replacement.txt": A proposed dictionary that maps each character to its adjacent characters on the QWERTY keyboard layout, for replacement errors.

  6. "Single letter Mispressed cluster insertion.txt": A proposed dictionary of the characters that may be inserted by mistake because of adjacency on the QWERTY keyboard layout.
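
The replacement and insertion dictionaries listed above could be applied roughly as in the following Python sketch. It is an illustration only, not the code in the notebook: the two small tables are hypothetical excerpts written for this example (the contents and key format of the repository's .txt files are not reproduced here), the function names `replace_one` and `insert_one` are invented, and the Juktakkhor handling rules are not modelled.

```python
import random

# Hypothetical excerpt: a few neighbouring keys on a QWERTY layout (not exhaustive).
QWERTY_NEIGHBORS = {
    "a": ["q", "s", "z"],
    "m": ["n", "j", "k"],
    "r": ["e", "t", "f"],
}

# Hypothetical excerpt: Bengali letters that sound alike.
PHONETIC_SIMILAR = {
    "শ": ["স", "ষ"],
    "ন": ["ণ"],
    "ই": ["ঈ"],
}


def replace_one(word, table, rng):
    """Swap one character for a lookalike from `table`.

    Returns the word unchanged if none of its characters has an entry.
    """
    positions = [i for i, ch in enumerate(word) if ch in table]
    if not positions:
        return word
    i = rng.choice(positions)
    return word[:i] + rng.choice(table[word[i]]) + word[i + 1:]


def insert_one(word, table, rng):
    """Insert a neighbour of one character directly after that character.

    Returns the word unchanged if none of its characters has an entry.
    """
    positions = [i for i, ch in enumerate(word) if ch in table]
    if not positions:
        return word
    i = rng.choice(positions)
    return word[:i + 1] + rng.choice(table[word[i]]) + word[i + 1:]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same errors
    print(replace_one("শাসন", PHONETIC_SIMILAR, rng))  # phonetic replacement
    print(replace_one("amar", QWERTY_NEIGHBORS, rng))  # mispressed-key replacement
    print(insert_one("amar", QWERTY_NEIGHBORS, rng))   # mispressed-key insertion
```

Passing the dictionary as an argument lets one helper serve both the phonetic and the mispressed-key replacement cases. Insertion is kept separate because it lengthens the word instead of changing a character in place.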
