diyorbek/lotin-kirill

Lotin-Kirill

A high-accuracy transliterator for Uzbek words, converting from the Latin alphabet to Cyrillic and vice versa. Used at transliterator.uz.

Installation

NPM

npm install lotin-kirill --save

Yarn

yarn add lotin-kirill

UNPKG

<script src="https://unpkg.com/lotin-kirill/dist-umd/index.min.js"></script>

Usage

Initialize the engine:

import Transliterator from 'lotin-kirill';

const transliterator = new Transliterator();

When using the UNPKG distribution:

<script src="https://unpkg.com/lotin-kirill/dist-umd/index.min.js"></script>
<script>
  const Transliterator = lotinKirill.default;
  const transliterator = new Transliterator();
</script>

Single-word transliteration:

toLatin(word: string): string

toCyrillic(word: string): string

Example

const latinWord = transliterator.toLatin('мотивация');
console.log(latinWord); // -> 'motivatsiya'

const cyrillicWord = transliterator.toCyrillic("e'lon");
console.log(cyrillicWord); // -> 'эълон'

Text (multiple words) transliteration:

textToLatin(text: string): string

textToCyrillic(text: string): string

Example

const latinText = transliterator.textToLatin('Жуда узун кириллча текст.');
console.log(latinText); // -> 'Juda uzun kirillcha tekst.'

const cyrillicText = transliterator.textToCyrillic('Juda uzun lotincha tekst.');
console.log(cyrillicText); // -> 'Жуда узун лотинча текст.'
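To illustrate how the text-level functions relate to the word-level ones, here is a minimal sketch of a text function layered over a word function. It is an assumption for illustration only, not the library's implementation: `transliterateText` and the stand-in `shout` callback are hypothetical names, and the word pattern is a simplification.

```javascript
// Hypothetical helper, not part of lotin-kirill: applies a word-level
// function to every word in a text and leaves spaces, digits, and
// punctuation untouched.
function transliterateText(text, transliterateWord) {
  // A "word" is a letter followed by letters or apostrophes (as in "e'lon").
  // \p{L} matches any Unicode letter, Latin or Cyrillic.
  const wordPattern = /\p{L}[\p{L}'’]*/gu;
  return text.replace(wordPattern, (word) => transliterateWord(word));
}

// Stand-in word function for demonstration; a real caller would pass
// something like (word) => transliterator.toCyrillic(word).
const shout = (word) => word.toUpperCase();

console.log(transliterateText('Juda uzun, lotincha tekst.', shout));
// -> 'JUDA UZUN, LOTINCHA TEKST.'
```

Because only the matched words are passed to the callback, punctuation and spacing in the output stay exactly as they were in the input.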

Exceptional words

You can initialize the transliterator with a list of exceptional words:

import Transliterator from 'lotin-kirill';

const transliterator = new Transliterator([
  // [latinWord, cyrillicWord]
  ['oktabr', 'октябрь'],
  ['Google', 'Google'],
]);

const cyrillicWord = transliterator.toCyrillic('oktabr');
console.log(cyrillicWord); // -> 'октябрь' (not 'октабр')

One exceptional pair is enough for both Cyrillic and Latin transliterations.

// This also works
const latinWord = transliterator.toLatin('октябрь');
console.log(latinWord); // -> 'oktabr' (not 'oktyabr')

If both words in an exceptional pair are identical, the word is left untransliterated.

const cyrillicWord = transliterator.toCyrillic('Google');
console.log(cyrillicWord); // -> 'Google'
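One way to picture why a single pair serves both directions is an index built from the same list in two maps. The sketch below is a guess at the general idea, not lotin-kirill's internals; `buildExceptionalIndex` is a hypothetical name.

```javascript
// Hypothetical sketch, not lotin-kirill's internals: one list of
// [latinWord, cyrillicWord] pairs indexed in both directions.
function buildExceptionalIndex(pairs) {
  const toCyrillic = new Map();
  const toLatin = new Map();
  for (const [latin, cyrillic] of pairs) {
    toCyrillic.set(latin, cyrillic);
    toLatin.set(cyrillic, latin);
  }
  return { toCyrillic, toLatin };
}

const index = buildExceptionalIndex([
  ['oktabr', 'октябрь'],
  ['Google', 'Google'], // identical pair: the word maps to itself
]);

console.log(index.toCyrillic.get('oktabr')); // -> 'октябрь'
console.log(index.toLatin.get('октябрь')); // -> 'oktabr'
console.log(index.toCyrillic.get('Google')); // -> 'Google'
```

An identical pair needs no special handling in this model: looking the word up in either direction simply returns it unchanged.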

Exceptional words with suffixes are also detected.

// This also works
const latinWord = transliterator.toLatin('октябрда');
console.log(latinWord); // -> 'oktabrda' (not 'oktyabrda')

Variants of a word with prefixes are not detected and should be added as separate exceptionals!

You can extend the exceptionals list after initializing the transliterator:

transliterator.extendExceptionals([['nol', 'ноль']]);

Or purge all exceptionals that were added to the transliterator:

transliterator.purgeExceptionals();

Pure transliterator functions

There are also pure transliterator functions, which apply only the basic transliteration rules. These functions don't look up words in the exceptionals list.

cyrillicToLatin(word: string): string

latinToCyrillic(word: string): string

import Transliterator, { cyrillicToLatin, latinToCyrillic } from 'lotin-kirill';

const transliterator = new Transliterator([['oktabr', 'октябрь']]);

console.log(transliterator.toLatin('октябрь')); // -> 'oktabr'
console.log(cyrillicToLatin('октябрь')); // -> 'oktyabr'

console.log(transliterator.toCyrillic('oktabr')); // -> 'октябрь'
console.log(latinToCyrillic('oktabr')); // -> 'октабр'
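The difference between the two kinds of function can be sketched as "look the word up first, then fall back to the rules". The code below is a toy model under stated assumptions, not the library's code: `DEMO_RULES`, `demoCyrillicToLatin`, and `demoToLatin` are hypothetical names, and the character table is deliberately incomplete, covering only the letters in the demo words.

```javascript
// Deliberately tiny, incomplete character table for the demo words only.
// Real transliteration rules are context-sensitive and cover the whole alphabet.
const DEMO_RULES = new Map([
  ['а', 'a'], ['б', 'b'], ['в', 'v'], ['и', 'i'], ['к', 'k'], ['м', 'm'],
  ['о', 'o'], ['р', 'r'], ['т', 't'], ['ц', 'ts'], ['я', 'ya'], ['ь', ''],
]);

// "Pure" conversion: character rules only, no word list.
function demoCyrillicToLatin(word) {
  return Array.from(word, (ch) => (DEMO_RULES.has(ch) ? DEMO_RULES.get(ch) : ch)).join('');
}

// Exceptional-aware conversion: consult the word list first, then fall back.
function demoToLatin(word, exceptionals) {
  const hit = exceptionals.find(([, cyrillic]) => cyrillic === word);
  return hit ? hit[0] : demoCyrillicToLatin(word);
}

const exceptionals = [['oktabr', 'октябрь']];

console.log(demoCyrillicToLatin('октябрь')); // -> 'oktyabr'
console.log(demoToLatin('октябрь', exceptionals)); // -> 'oktabr'
console.log(demoToLatin('мотивация', exceptionals)); // -> 'motivatsiya'
```

Words that are not in the list take the rule-based path, so both functions agree on them; they differ only for listed exceptionals.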
