A Bilingual Generative Transformer for Semantic Sentence Embedding
John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick
Semantics: Sentence-level Semantics, Textual Inference and Other areas Long Paper
You can open the pre-recorded video in a separate window.
Abstract:
Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model's posterior to predict sentence embeddings for monolingual data at test time. Second, we use high-capacity transformers as both data generating distributions and inference networks -- contrasting with most past work on sentence embeddings. In experiments, our approach substantially outperforms the state-of-the-art on a standard suite of unsupervised semantic similarity evaluations. Further, we demonstrate that our approach yields the largest gains on more difficult subsets of these evaluations where simple word overlap is not a good indicator of similarity.
NOTE: Video may display a random order of authors.
Correct author list is at the top of this page.
Connected Papers in EMNLP2020
Similar Papers
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
Nils Reimers, Iryna Gurevych,

LNMap: Departures from Isomorphic Assumption in Bilingual Lexicon Induction Through Non-Linear Mapping in Latent Space
Tasnim Mohiuddin, M Saiful Bari, Shafiq Joty,

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models
Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros,

A Simple Approach to Learning Unsupervised Multilingual Embeddings
Pratik Jawanpuria, Mayank Meghwanshi, Bamdev Mishra,
