Home
Welcome to the hangul project wiki!
- background info on pronouncing the Korean hangul alphabet
- implementation of core transliteration engine
- applications for embedding the engine
- Hangul Unicode code sample in Perl
- Wikipedia Hangul Unicode guide
- Korean government official romanization standard
The Hangul writing system is widely praised for its elegance: it is consistent, compact, and easy to learn.
Yet most Westerners first encounter the Korean language not in Hangul letters ("서울") but in romanized form ("Seoul"). Reading from that form, they tend to mangle the pronunciation, to the dismay of native Koreans.
Why this discrepancy?
The problem lies with the romanization system for Hangul, which is fundamentally based on the McCune-Reischauer (MR) system, developed in 1937.
Although MR was replaced in 2000 by the Revised Romanization (RR) system, the changes mostly focused on consonants, e.g.:
한글 MR => RR
====================
부산 Pusan => Busan
대구 Taegu => Daegu
The MR -> RR change did not sufficiently address the vowels.
While vowels written in Hangul are consistent, unique, and comprehensible, they become more difficult in romanization for a number of reasons:
- Differences between Hangul vowels are subtle to untrained Western ears (의, 외, 왜, etc.)
- Hangul vowels are consistent, but romanized vowels less so (서 is sometimes "seo", other times "suh")
- Hangul makes syllable alignment clear-cut, but romanization does not ("Yeouido" is mispronounced as "ye-o-u-i-do" instead of "yeo-ui-do")
For the above reasons, the new romanization standard rests on the following features:
- each vowel's romanization is optimized for consistent, accurate, and intuitive pronunciation by a contemporary Western speaker (아: "a" => "ah")
- if two different vowels have the same pronunciation, they may share the same romanization. We sacrifice uniqueness to gain simplicity
- syllable alignment will be explicitly denoted by hyphenation ("seoul" => "suh-ool")
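The hyphenation rule above can be sketched in a few lines of Java. This is a minimal illustration, not the project's actual engine: the class name `SyllableHyphenator`, the method `hyphenate`, and the `romanizeSyllable` callback are all hypothetical. The only hard fact it relies on is that precomposed Hangul syllables occupy the Unicode range U+AC00 to U+D7A3, so each syllable is exactly one `char`.

```java
import java.util.function.Function;

// Hypothetical helper: joins consecutive Hangul syllables with hyphens
// and passes every other character through unchanged.
final class SyllableHyphenator {

    // Precomposed Hangul syllables live in U+AC00..U+D7A3.
    static boolean isHangulSyllable(char c) {
        return c >= 0xAC00 && c <= 0xD7A3;
    }

    // romanizeSyllable is an assumed callback that turns one Hangul
    // syllable into its romanized form (e.g. 서 -> "suh").
    static String hyphenate(String text, Function<Character, String> romanizeSyllable) {
        StringBuilder out = new StringBuilder();
        boolean previousWasHangul = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isHangulSyllable(c)) {
                // A hyphen marks the boundary between adjacent syllables.
                if (previousWasHangul) {
                    out.append('-');
                }
                out.append(romanizeSyllable.apply(c));
                previousWasHangul = true;
            } else {
                // Non-Hangul text (spaces, digits, Latin letters) is copied as-is.
                out.append(c);
                previousWasHangul = false;
            }
        }
        return out.toString();
    }
}
```

With a callback that maps 서 to "suh" and 울 to "ool", calling `hyphenate` on "서울 2002" yields "suh-ool 2002": the syllable boundary is explicit and the non-Hangul text is untouched.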
- set up a Tomcat server on your local machine
- see here for instructions on IDE integration
- build war via 'mvn package'
- copy 'hangul.war' to the 'webapps' directory of Tomcat
- go to [http://localhost:8080/hangul/hangul.jsp]
- type and submit text into the form (can span multiple lines and mix hangul with non-hangul)
- repository: Git / GitHub
- language: Java
- build: Maven
- main: com.dragoncrane.hangul.Prototype
- arg1: path to hangul input - must be on classpath (eg "/data/input.txt")
- arg2: absolute path to romanized output (eg "C:\dave\output.txt")
- execution: run "mvn test" on command line (uses exec-maven-plugin)
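The two arguments above imply a simple read-convert-write flow. The sketch below shows one way that flow could be wired. It is not the actual `com.dragoncrane.hangul.Prototype` class: the class name `RomanizeFileSketch` and the `convert` helper are hypothetical, and the romanizer is passed in as a function because the real engine's API is not shown here. The `main` method uses an identity function as a placeholder.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the prototype's main class.
final class RomanizeFileSketch {

    // Reads UTF-8 lines from `in`, applies `romanizer` to each line,
    // and writes the results as UTF-8 to `out`.
    static void convert(InputStream in, Path out, UnaryOperator<String> romanizer)
            throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
             Writer writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(romanizer.apply(line));
                writer.write(System.lineSeparator());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: <classpath input> <absolute output path>");
            return;
        }
        // arg1 is resolved against the classpath, e.g. "/data/input.txt".
        try (InputStream in = RomanizeFileSketch.class.getResourceAsStream(args[0])) {
            if (in == null) {
                throw new IOException("not found on classpath: " + args[0]);
            }
            // Placeholder: the identity function stands in for the real engine.
            convert(in, Paths.get(args[1]), UnaryOperator.identity());
        }
    }
}
```

Reading and writing explicitly as UTF-8 matters here, since the platform default encoding on some machines would corrupt the Hangul input.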
Basic Vowels
한글 RR New
=== === ===
아 a ah
어 eo uh
오 o oh
우 u oo
으 eu u # pronounce very briefly
이 i ee
Basic Y-vowels
한글 RR New
=== === ===
야 ya yah
여 yeo yuh
요 yo yoh
유 yu yoo
Compound Vowels
한글 RR New
=== === ===
애 ae eh
에 e ay
외 oe weh
위 wi wee
Compound Y-Vowels
한글 RR New
=== === ===
얘 yae yeh
예 ye yay
Diphthongs/Triphthongs
한글 RR New
=== === ===
와 wa wah
왜 wae weh
워 wo wuh
웨 we way
의 ui uee
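The "New" column of the tables above can be applied mechanically, because Unicode encodes every precomposed Hangul syllable as 0xAC00 + (initial × 21 + medial) × 28 + final. The sketch below extracts the medial (vowel) index from a syllable and looks it up in an array that holds the "New" romanizations in Unicode vowel order. The class name `VowelRomanizer` and the method `vowelOf` are hypothetical, and consonants are deliberately left out, since this page only defines the vowel mapping.

```java
// Hypothetical lookup of the "New" vowel romanization for one Hangul syllable.
final class VowelRomanizer {

    private static final int SYLLABLE_BASE = 0xAC00; // first precomposed syllable (가)
    private static final int SYLLABLE_LAST = 0xD7A3; // last precomposed syllable (힣)
    private static final int MEDIAL_COUNT = 21;      // number of vowels
    private static final int FINAL_COUNT = 28;       // number of finals, including "none"

    // "New" column from the tables, in Unicode medial order:
    // ㅏ ㅐ ㅑ ㅒ ㅓ ㅔ ㅕ ㅖ ㅗ ㅘ ㅙ ㅚ ㅛ ㅜ ㅝ ㅞ ㅟ ㅠ ㅡ ㅢ ㅣ
    private static final String[] NEW_VOWELS = {
        "ah", "eh", "yah", "yeh", "uh", "ay", "yuh", "yay", "oh", "wah", "weh",
        "weh", "yoh", "oo", "wuh", "way", "wee", "yoo", "u", "uee", "ee"
    };

    static boolean isHangulSyllable(char c) {
        return c >= SYLLABLE_BASE && c <= SYLLABLE_LAST;
    }

    // Returns the romanized vowel of a precomposed Hangul syllable.
    static String vowelOf(char syllable) {
        if (!isHangulSyllable(syllable)) {
            throw new IllegalArgumentException("not a Hangul syllable: " + syllable);
        }
        int offset = syllable - SYLLABLE_BASE;
        // Dropping the final (offset / 28) leaves initial * 21 + medial.
        int medial = (offset / FINAL_COUNT) % MEDIAL_COUNT;
        return NEW_VOWELS[medial];
    }
}
```

Note that 외 and 왜 both map to "weh", which reflects the design choice of trading uniqueness for simplicity.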
The underlying romanization standard is implemented as a simple engine that takes in Hangul and returns romanized text.
This engine can be plugged into a variety of contexts:
K-pop is steadily growing in popularity abroad. Moreover, due to streaming services, musicians are disproportionately dependent on overseas sales for revenue (e.g. Psy's "Gangnam Style" earned over $10M in the US, but less than $100,000 in the Korean market).
Many of these fans would like to sing along to their favorite K-pop songs but do not wish to invest the time in learning Hangul. An improved romanization system would make the lyrics more accessible, which could increase sales volume.
A tourist in Korea who is trying to read a street sign or building name, or who is dining at a Korean restaurant and unsure how to pronounce a dish, could snap a photo, run the image through an OCR text converter, and pass the resulting Hangul text to our engine, which would return a pronunciation guide.
Free Hangul OCR Readers:
One of the most popular websites for learning Korean is Talk To Me In Korean.
It presents Korean in an easygoing, casual manner. Moreover, it is not affiliated with any government agency or university, so it would probably be flexible and open-minded about an alternative way of romanizing Hangul if it leads to better pronunciation.
If the romanized output is fed into a text-to-speech reader, students gain a new means of generating audio for listening practice from text. Given that TTMIK is entirely free, this would enhance the competitiveness of its service relative to the universities, which tend to sell expensive audio CDs along with their textbooks for listening practice.
In 2002, the conversion from MR to RR caused tremendous confusion during the World Cup, as train tickets using the new spelling "Busan" disagreed with old maps reading "Pusan."
Since then, tourism in Korea has steadily grown, and at this point landmark names like "Seoul", "Gangnam", and "Seorak" are more or less set in stone.
Updating the maps yet again, from "Seoul" to "suh-ool" and "Gangnam" to "gahng-nahm", is out of the question.
Rather, the new romanization standard could be used as a parenthetical aside alongside the official spelling to aid pronunciation.