Learning to Pronounce Chinese Without a Pronunciation Dictionary
Christopher Chu, Scot Fang, Kevin Knight
Phonology, Morphology and Word Segmentation (Long Paper)
Abstract:
We demonstrate a program that learns to pronounce Chinese text in Mandarin, without a pronunciation dictionary. From non-parallel streams of Chinese characters and Chinese pinyin syllables, it establishes a many-to-many mapping between characters and pronunciations. Using unsupervised methods, the program effectively deciphers writing into speech. Its token-level character-to-syllable accuracy is 89%, which significantly exceeds the 22% accuracy of prior work.
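To make the reported metric concrete, here is a minimal sketch of how token-level character-to-syllable accuracy could be computed, assuming the predicted and gold pinyin sequences are aligned one syllable per character. The function name `token_accuracy` and the example data are hypothetical and are not taken from the paper; the authors' exact evaluation procedure may differ.

```python
def token_accuracy(predicted, gold):
    """Return the fraction of character tokens whose predicted syllable matches gold.

    Assumes both lists hold one pinyin syllable per character token, in order.
    """
    if len(predicted) != len(gold):
        raise ValueError("sequences must be aligned one syllable per character")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical example: the character 行 is read "hang2" in 银行 and "xing2" in 行人,
# so a system that always outputs "xing2" gets one of the four tokens wrong.
gold = ["yin2", "hang2", "xing2", "ren2"]
predicted = ["yin2", "xing2", "xing2", "ren2"]
print(token_accuracy(predicted, gold))  # 0.75
```

Because the score is counted per token rather than per character type, frequent characters weigh more heavily than rare ones, and a character with several readings is judged separately at each occurrence.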
Connected Papers in EMNLP2020
Similar Papers
Improving Bilingual Lexicon Induction for Low Frequency Words
Jiaji Huang, Xingyu Cai, Kenneth Church

Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding
Samson Tan, Shafiq Joty, Lav Varshney, Min-Yen Kan
