This is a simple module for those who need or would like to type in the writing systems of ancient Indo-European languages.
The project currently runs as a web app, pieoffice-site.
With npm:
npm install pieoffice
With yarn:
yarn add pieoffice
If you want to import only one of the transliteration schemes, say the one for the Avestan script, add the following to your .js file:
import { avestan } from "pieoffice";
After importing, the function avestan() will be available. For example:
console.log(avestan("mazdA")); // 𐬨𐬀𐬰𐬛𐬁
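A converter of this kind can be pictured as a lookup table applied with greedy longest-match. The sketch below is only an illustration under that assumption, not pieoffice's actual implementation: `avestanSubset` and `transliterate` are hypothetical names, and the table covers just a handful of the Avestan letters.

```javascript
// Hypothetical subset of an Avestan mapping (illustration only).
const avestanSubset = {
  a: "\u{10B00}",  // AVESTAN LETTER A
  A: "\u{10B01}",  // AVESTAN LETTER AA
  d: "\u{10B1B}",  // AVESTAN LETTER DE
  dh: "\u{10B1C}", // AVESTAN LETTER DHE
  m: "\u{10B28}",  // AVESTAN LETTER ME
  z: "\u{10B30}",  // AVESTAN LETTER ZE
  zh: "\u{10B32}", // AVESTAN LETTER ZHE
};

function transliterate(input, table) {
  // Try longer keys first so that "dh" wins over "d" followed by "h".
  const keys = Object.keys(table).sort((x, y) => y.length - x.length);
  let output = "";
  let i = 0;
  while (i < input.length) {
    const key = keys.find((k) => input.startsWith(k, i));
    if (key === undefined) {
      output += input[i]; // pass unknown characters through unchanged
      i += 1;
    } else {
      output += table[key];
      i += key.length;
    }
  }
  return output;
}

console.log(transliterate("mazdA", avestanSubset)); // 𐬨𐬀𐬰𐬛𐬁
```

Sorting the keys by length is what keeps digraphs such as dh or zh from being split into two single letters.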
There is also an allConverters array built to work with react-select components. An example from my implementation in pieoffice-site:
import Select from "react-select";
import { allConverters } from "pieoffice";
...
const LangSelect = () => {
  // `converter` is assumed to be declared in the code elided above.
  const handleChange = (e) => {
    converter = e.converter;
  };
  return (
    <Select
      placeholder={"Select a script"}
      onChange={handleChange}
      options={allConverters}
    />
  );
};
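The handler above reads a converter property from the selected option, so each entry of the array presumably carries the conversion function next to the value and label fields that react-select expects. A minimal sketch of that shape, without React, follows. The entries and their field values here are hypothetical stand-ins, not the real contents of allConverters.

```javascript
// Hypothetical option objects; only the `converter` field is implied by the handler above.
const exampleConverters = [
  { value: "upper", label: "Uppercase (demo)", converter: (s) => s.toUpperCase() },
  { value: "reverse", label: "Reverse (demo)", converter: (s) => [...s].reverse().join("") },
];

// Start with an identity converter until the user picks a script.
let converter = (s) => s;

// Same logic as the onChange handler: keep the function from the chosen option.
const handleChange = (option) => {
  converter = option.converter;
};

handleChange(exampleConverters[0]);
console.log(converter("mazda")); // MAZDA
```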
Follows the Harvard-Kyoto transliteration scheme for Sanskrit closely. Resonants are encoded as uppercase, accents as slashes. w' = ʷ; k', g' = ḱ, ǵ.
Transliteration scheme based on Beta Code, including all the major diacritics (breathings, acute, grave, perispomenon and diaeresis), breve and macron, koppa (including the archaic form) and some bits and pieces. Full support for Beta Code is still a work in progress.
Glyphs with known syllabic values should be written in lower-case, syllabically, and numbered when the index is 2 or higher. Glyphs with known logographic values should be written in upper-case. The only exception to this rule is the gendered logograms, which should be followed, without a space, by f or m. Glyphs with unknown value should be written with an asterisk followed by the number (2 or 3 digits).
This conversion scheme supports Aegean numbers and measurements.
Example:
po-ro EQUf 120 --> 𐀡𐀫 𐂄 𐄙𐄑
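The number in the example is written additively: one sign for the hundreds and one for the tens. The sketch below shows how a decimal number could be mapped onto the Unicode Aegean Numbers block, where the signs for 1 to 9, 10 to 90, 100 to 900 and so on sit in consecutive runs of nine starting at U+10107. The function name `aegeanNumber` is hypothetical and this is not necessarily how pieoffice does it.

```javascript
// U+10107 is AEGEAN NUMBER ONE; each decimal place occupies nine consecutive code points.
const AEGEAN_ONE = 0x10107;

function aegeanNumber(n) {
  if (!Number.isInteger(n) || n < 1 || n > 99999) {
    throw new RangeError("expected an integer between 1 and 99999");
  }
  const digits = String(n).split("").map(Number); // most significant digit first
  let output = "";
  digits.forEach((digit, index) => {
    const place = digits.length - 1 - index; // 0 = units, 1 = tens, 2 = hundreds, ...
    if (digit > 0) {
      output += String.fromCodePoint(AEGEAN_ONE + 9 * place + (digit - 1));
    }
  });
  return output;
}

console.log(aegeanNumber(120)); // 𐄙𐄑 (one hundred, twenty)
```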
a 𐠀 | e 𐠁 | i 𐠂 | o 𐠃 | u 𐠄 |
wa 𐠲 | we 𐠳 | wi 𐠴 | wo 𐠵 | |
za / ga 𐠼 | | | zo 𐠿 | |
ja 𐠅 | | | jo 𐠈 | |
ka 𐠊 | ke 𐠋 | ki 𐠌 | ko 𐠍 | ku 𐠎 |
la 𐠏 | le 𐠐 | li 𐠑 | lo 𐠒 | lu 𐠓 |
ma 𐠔 | me 𐠕 | mi 𐠖 | mo 𐠗 | mu 𐠘 |
na 𐠙 | ne 𐠚 | ni 𐠛 | no 𐠜 | nu 𐠝 |
pa 𐠞 | pe 𐠟 | pi 𐠠 | po 𐠡 | pu 𐠢 |
ra 𐠣 | re 𐠤 | ri 𐠥 | ro 𐠦 | ru 𐠧 |
sa 𐠨 | se 𐠩 | si 𐠪 | so 𐠫 | su 𐠬 |
ta 𐠭 | te 𐠮 | ti 𐠯 | to 𐠰 | tu 𐠱 |
ksa 𐠷 | kse 𐠸 | | | |
a ա | b բ | g գ | d դ | e ե |
z զ | ee է | e' ը | t' թ | |
zh ժ | i ի | l լ | x խ | c ծ |
k կ | h հ | j ձ | g. ղ | l. ղ |
ch. / c'h ճ | m մ | y յ | n ն | sh շ |
o ո | ch չ | p պ | jh ջ | r. ռ |
s ս | v վ | t տ | r ր | c' ց |
w ւ | p' փ | k' ք | o' օ | f ֆ |
u ีธึ | ew ึ | ? ี | . ึ | .' ี |
; ี | ;' ี | ! ี | `` ยซ | '' ยป |
For the particulars of the transliteration schemes, see the article on Wikipedia.
Use the Harvard-Kyoto encoding for both outputs. Udatta (only for IAST), anudatta and svarita are marked by /, =, and \ after the vowel (or vowel + M), but the script also converts text marked with udatta to Devanagari with anudatta and svarita notation (BETA).
a a 𐬀 | A ā 𐬁 | á å 𐬂 | Á ā̊ 𐬃 | ã ą 𐬄 | ãã ą̇ 𐬅 |
æ ə 𐬆 | Æ ə̄ 𐬇 | e e 𐬈 | E ē 𐬉 | o o 𐬊 | O ō 𐬋 |
i i 𐬌 | I ī 𐬍 | u u 𐬎 | U ū 𐬏 | k k 𐬐 | x x 𐬑 |
X x́ 𐬒 | xw xᵛ 𐬓 | g g 𐬔 | G ġ 𐬕 | gh γ 𐬖 | c č 𐬗 |
j ǰ 𐬘 | t t 𐬙 | th ϑ 𐬚 | d d 𐬛 | dh δ 𐬜 | T t̰ 𐬝 |
p p 𐬞 | f f 𐬟 | b b 𐬠 | B β 𐬡 | ng ŋ 𐬢 | ngH ŋ́ 𐬣 |
ngW ŋᵛ 𐬤 | n n 𐬥 | ñ ń 𐬦 | N ṇ 𐬧 | m m 𐬨 | M m̨ 𐬩 |
ẏ ẏ 𐬪 | y y 𐬫 | v v 𐬬 | r r 𐬭 | s s 𐬯 | z z 𐬰 |
sh š 𐬱 | zh ž 𐬲 | shy š́ 𐬳 | S ṣ̌ 𐬴 | h h 𐬵 |
If you find it troublesome to type æ on your keyboard, try Alt Gr + a; otherwise, use ê.
a 𐎠 | i 𐎡 | u 𐎢 | k 𐎣 | ku 𐎤 | x 𐎧 | xi 𐎧 |
xu 𐎧 | g 𐎥 | gu 𐎦 | c 𐎨 | ç 𐏂 | j 𐎩 | ji 𐎪 |
t 𐎫 | ti 𐎫 | tu 𐎬 | th 𐎰 | d 𐎭 | di 𐎮 | du 𐎯 |
p 𐎱 | f 𐎳 | b 𐎲 | n 𐎴 | ni 𐎴 | nu 𐎵 | m 𐎶 |
mi 𐎷 | mu 𐎸 | y 𐎹 | v 𐎺 | vi 𐎻 | r 𐎼 | ri 𐎽 |
l 𐎾 | s 𐎿 | z 𐏀 | š 𐏁 | sh 𐏁 | h 𐏃 | |
ahuramazda1 𐏈 | ahuramazda2 𐏉 | ahuramazda3 𐏊 | ||||
xshayathia 𐏋 | dahyaus1 𐏌 | dahyaus2 𐏍 | ||||
baga 𐏎 | bumis 𐏏 |
a 𐌰 | b 𐌱 | g 𐌲 | d 𐌳 | e 𐌴 | q 𐌵 | z 𐌶 |
h 𐌷 | th 𐌸 | i 𐌹 | k 𐌺 | l 𐌻 | m 𐌼 | n 𐌽 |
j 𐌾 | u 𐌿 | p 𐍀 | q' 𐍁 | r 𐍂 | s 𐍃 | t 𐍄 |
w 𐍅 | f 𐍆 | x 𐍇 | hw 𐍈 | o 𐍉 | z' 𐍊 |
I tried to keep the system as flexible as possible, allowing both diacritics (zá) and numerical typing (za2).
So far it only covers the signs used in Van den Hout's textbook, and there are many font issues, since the fonts employ workarounds to cover the Unicode chart's shortcomings.
If you use either the HPM or the Ullikummi font, the output should display properly, even if it does not in the browser.
I strongly recommend checking the file at src/converters if you cannot figure out how to type a value. I am manually including the HZL numbers for future-proofing.
Please feel free to report any inconsistencies.
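The diacritic and numbered spellings mentioned above are two notations for the same sign index: in the usual cuneiform transliteration convention, index 2 is written with an acute accent and index 3 with a grave accent, while higher indices stay numeric. The sketch below shows one way the numbered form could be normalized to the diacritic form. The function name `indexToDiacritic` is hypothetical, placing the accent on the first vowel is an assumption of this sketch, and none of it is taken from pieoffice's source.

```javascript
const ACUTE = "\u0301"; // combining acute accent, used for index 2
const GRAVE = "\u0300"; // combining grave accent, used for index 3

function indexToDiacritic(sign) {
  // Only values ending in a single index digit 2 or 3 are rewritten.
  const match = /^(\D+)([23])$/.exec(sign);
  if (!match) return sign;
  const [, base, index] = match;
  const mark = index === "2" ? ACUTE : GRAVE;
  // Assumption: the accent goes on the first vowel of the value.
  const accented = base.replace(/[aeiuAEIU]/, (vowel) => vowel + mark);
  if (accented === base) return sign; // no vowel found, leave the input alone
  return accented.normalize("NFC"); // compose the vowel and accent into one character
}

console.log(indexToDiacritic("za2")); // zá
console.log(indexToDiacritic("wi5")); // wi5 (left as typed)
```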
The rules for 10 ๐, 100 ๐, and 1000 ๐ are currently unavailable. If necessary, use the forms DECEM, CENTUM and MILLE for them.
Example:
input: [ UM-MA 'd-UTU]-SI 'm-mur-si-li LUGAL-GAL LUGAL KUR ha-at-ti UR-SAG [ DUMU 'm-su-up-]pรญ-lu-li-u-ma LUGAL-GAL UR-SAG ku-it-ma-an-za-kรกn ANA GIS-GU-ZA ABI-IA na-[wi5] e-es-ha-at nu-mu a-ra-as-zรฉ-na-as KUR-KUR-MES Lร-KรR hu-u-ma-an-te-es ku-u-ru-ri-ia-ah-he-er nu-za ABU-IA ku-wa-pรญ DINGIR-LIM-is Dร-at 'm-ar-nu-an-da-as-ma-za-kรกn SES-IA ANA GIS-GU-ZA ABI-SU e-sa-at EGIR-an-ma-as ir-ma-li-ia-at-ta-at-pรกt ma-ah-ha-an-ma KUR-KUR-MES Lร-KรR 'm-ar-nu-an-da-an SES-IA ir-ma-an is-ta-ma-as-ser nu KUR-KUR-MES Lร-KรR ku-u-ru-ri-ia-ah-hi-is-ke-u-an da-a-[er] output: [ ๐๐ ๐ญ๐]๐ ๐น๐ฏ๐ ๐ท ๐๐ฒ ๐ ๐ณ ๐ฉ๐๐พ ๐จ๐ [ ๐ ๐น๐๐]๐๐ป๐ท๐๐ ๐๐ฒ ๐จ๐ ๐ช๐๐ ๐ญ๐๐ท ๐๐พ ๐๐๐ ๐๐๐ ๐พ[๐พ] ๐๐๐ฉ๐ ๐ก๐ฌ ๐๐๐ธ๐ข๐พ๐ธ ๐ณ๐ณ๐ฉ ๐ฝ๐ฝ ๐ท๐๐ ๐ญ๐ผ๐ ๐ช๐๐๐๐ ๐ด๐ญ๐ ๐ก๐ ๐๐๐ ๐ช๐ฟ๐ ๐ญ๐ ๐ ๐๐ ๐น๐ ๐ก๐ญ๐๐ธ๐ ๐๐ท ๐๐ ๐๐พ ๐๐๐ ๐๐๐ ๐๐๐ ๐๐ญ๐ ๐ธ ๐ ๐ ๐ท๐ ๐๐ซ๐๐ ๐ ๐ด๐ฉ๐ญ๐ ๐ณ๐ณ๐ฉ ๐ฝ๐ฝ ๐น๐ ๐ก๐ญ๐๐ญ ๐๐ ๐ ๐ ๐ญ ๐ ๐ซ๐ ๐ธ๐ ๐ก ๐ณ๐ณ๐ฉ ๐ฝ๐ฝ ๐ช๐๐๐๐ ๐ด๐ญ๐ ๐ ๐๐ญ ๐๐[๐ ]
Glyphs with known syllabic values should be written in lower-case, syllabically and with the proper diacritic or numbered if +4. Glyphs with known logographic values should be written in upper-case. Variants of known glyphs should be followed by one or more dots (.), generally the undotted variant is the more frequent one. Glyphs with unknown value should be written with an asterisk followed by the number (3 digits, including the 0).
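As a rough illustration of these input rules, the sketch below sorts a single token into one of the categories just described. The function name `classifyGlyph` and the category labels are hypothetical, and the patterns are a simplification that is not taken from pieoffice's source.

```javascript
function classifyGlyph(token) {
  // Trailing dots mark a variant of a known glyph; count and strip them.
  const dots = /\.+$/.exec(token);
  const variantLevel = dots ? dots[0].length : 0;
  const core = token.slice(0, token.length - variantLevel);

  let kind;
  if (/^\*\d{3}$/.test(core)) {
    kind = "unknown"; // asterisk followed by a three-digit number
  } else if (/^[A-Z][A-Z0-9.]*$/.test(core)) {
    kind = "logogram"; // upper-case, possibly compound such as MAGNUS.REX
  } else if (/^\p{Ll}+\d*$/u.test(core)) {
    kind = "syllabogram"; // lower-case, optionally numbered
  } else {
    kind = "other";
  }
  return { core, kind, variantLevel };
}

console.log(classifyGlyph("MAGNUS.REX").kind); // logogram
console.log(classifyGlyph("*273").kind);       // unknown
```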
Example:
"MAGNUS.REX MAGNUS-TONITRUS MAGNUS.REX HEROS ka-ra-ka-mi-sร REGIO REX || X-pa-VIR-ti-sa MAGNUS.REX HEROS INFANS-nรญ-mu-za || wa-tu-tรก-a CORNU-ra-ti REGIO LIS arha.-SPHINX || *273"\ ๐ ๐๐ข ๐ ๐ ๐ข๐ท๐ง๐ป๐ถ ๐ ๐ || X๐ธ๐ ๐ฃ๐ ๐ ๐ ๐ฐ๐ต๐พ๐ช || ๐ฌ๐ข๐๐ท ๐๐ฑ๐ฃ ๐ ๐ ๐น๐ || ๐ด
a 𐤠 | b,p 𐤡 | g 𐤢 | d 𐤣 | e 𐤤 | v,w 𐤥 | i 𐤦 |
y 𐤧 | k 𐤨 | l 𐤩 | m 𐤪 | n 𐤫 | o 𐤬 | r 𐤭 |
S,ś 𐤮 | t 𐤯 | u 𐤰 | f 𐤱 | q 𐤲 | s,sh 𐤳 | T 𐤴 |
ã 𐤵 | A 𐤵 | ẽ 𐤶 | E 𐤶 | L 𐤷 | N 𐤸 | c 𐤹 |
. 𐤿 |
a ๐ | b ๐ | g ๐ | d ๐ | i ๐ | w ๐ |
z ๐ | h ๐ | th ๐ | j ๐ | y ๐ | k ๐ |
l ๐ | m ๐ | n ๐ | u ๐ | p ๐ | k ๐ |
r ๐ | s ๐ | t ๐ | e ๐ | รฃ ๐ | แบฝ ๐ |
M ๐ | N ๐ | T ๐ | q ๐ | B ๐ | x ๐ |
a ๐ | b ๐ก | d ๐ข | l ๐ฃ | y ๐ค | y2 ๐ |
r ๐ฅ | L ๐ฆ | L2 ๐ | A2 ๐ง | q ๐จ | b ๐ฉ |
m ๐ช | o ๐ซ | D2 ๐ฌ | t ๐ญ | sh ๐ฎ | sh2 ๐ฏ |
s ๐ฐ | 18 ๐ฑ | u ๐ฒ | N ๐ณ | c ๐ด | n ๐ต |
T2 ๐ถ | p ๐ท | 's,ล ๐ธ | i ๐น | e ๐บ | รฝ,'y ๐ป |
k ๐ผ | k2 ๐ฝ | dh ๐พ | w ๐ฟ | G ๐ | G2 ๐ |
z2 ๐ | z ๐ | ng ๐ | j ๐ | 39 ๐ | T ๐ |
y3 ๐ | r2 ๐ | mb ๐ | mb2 ๐ | mb3 ๐ | mb4 ๐ |
e2 ๐ |
b แ | l แ | w แ | s แ |
n แ | j แ | h แ | d แ |
t แ | k แ | kw แ | c แ |
cw แ | m แ | g แ | gw แ |
S แ | r แ | a แ | o แ |
u แ | e แ | i แ | ,ear, แ |
,or, แ | ,uilleann, แ | ,ifin, แ | ,eam, แ |
,peith, แ | > แ | < แ |
a ๐ | b ๐ | g,k ๐ | d ๐ | e ๐ | v ๐ | z ๐ |
h ๐ | i ๐ | l ๐ | m ๐ | n ๐ | p ๐ | ล ๐ |
r ๐ | s ๐ | t ๐ | u ๐ | f ๐ | รบ ๐ | รญ ๐ |
a โฐ | b โฐ | v โฐ | g โฐ | d โฐ | e โฐ |
zh โฐ | dz โฐ | z โฐ | ii โฐ | iy โฐ | i โฐ |
j โฐ | k โฐ | l โฐ | m โฐ | n โฐ | o โฐ |
p โฑ | r โฐ | s โฐ | t โฐ | u โฐ | f โฐ |
x โฐ | oo โฐ | w โฐ | sht โฐ | ts โฐ | ch โฐ |
sh โฐ | '' โฐ | 'i โฐ | ' โฐ | ya โฐก | yo โฐฆ |
yu โฐฃ | แบฝ โฐค | e~ โฐค | yแบฝ โฐง | ye~ โฐง | รต โฐจ |
o~ โฐจ | yรต โฐฉ | yo~ โฐฉ | th โฐช | v โฐซ |
- Thanks to Alex for pointing out some glaring errors in Greek and Avestan that had gone unnoticed.
- Thanks to Thiago for noticing that the old Vedic IAST was a mix of IAST and ISO.