Latent Geographical Factors for Analyzing the Evolution of Dialects in Contact

Yugo Murawaki

Linguistic Theories, Cognitive Modeling and Psycholinguistics (Long Paper)

Gather-1C: Nov 17 (02:00-04:00 UTC)

Abstract: Analyzing the evolution of dialects remains a challenging problem because contact phenomena hinder the application of the standard tree model. Previous statistical approaches to this problem resort to admixture analysis, where each dialect is seen as a mixture of latent ancestral populations. However, such ancestral populations are hardly interpretable in the context of the tree model. In this paper, we propose a probabilistic generative model that represents latent factors as geographical distributions. We argue that the proposed model has higher affinity with the tree model because a tree can alternatively be represented as a set of geographical distributions. Experiments involving synthetic and real data suggest that the proposed method is both quantitatively and qualitatively superior to the admixture model.

Similar Papers

Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments
Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Lyle Ungar
When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
Changlong Yu, Jialong Han, Peifeng Wang, Yangqiu Song, Hongming Zhang, Wilfred Ng, Shuming Shi