Seq2Edits: Sequence Transduction Using Span-level Edit Operations

Felix Stahlberg, Shankar Kumar

Machine Learning for NLP Long Paper

Gather-3D: Nov 17 (18:00-20:00 UTC)


Abstract: We propose Seq2Edits, an open-vocabulary approach to sequence editing for natural language processing (NLP) tasks with a high degree of overlap between input and output texts. In this approach, each sequence-to-sequence transduction is represented as a sequence of edit operations, where each operation either replaces an entire source span with target tokens or keeps it unchanged. We evaluate our method on five NLP tasks (text normalization, sentence fusion, sentence splitting & rephrasing, text simplification, and grammatical error correction) and report competitive results across the board. For grammatical error correction, our method speeds up inference by up to 5.2x compared to full sequence models because inference time depends on the number of edits rather than the number of target tokens. For text normalization, sentence fusion, and grammatical error correction, our approach improves explainability by associating each edit operation with a human-readable tag.
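To make the span-level representation in the abstract concrete, here is a minimal Python sketch of how a (source, target) token pair could be decomposed into such edits. This is an illustration, not the authors' pipeline: it uses difflib to find spans, and the SELF/REPLACE tags and helper names are hypothetical stand-ins for the human-readable tags the paper associates with each edit.

```python
import difflib

def extract_span_edits(source, target):
    """Return a list of (tag, source_span, replacement_tokens) edits.

    Each edit either keeps an entire source span unchanged (SELF) or
    replaces it with target tokens (REPLACE), mirroring the two edit
    types described in the abstract.
    """
    edits = []
    matcher = difflib.SequenceMatcher(a=source, b=target, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            # One operation keeps the whole matching source span.
            edits.append(("SELF", (i1, i2), source[i1:i2]))
        else:
            # 'replace', 'delete', and 'insert' all become span
            # replacements; a deletion replaces the span with nothing.
            edits.append(("REPLACE", (i1, i2), target[j1:j2]))
    return edits

def apply_span_edits(source, edits):
    """Reconstruct the target by concatenating replacement tokens."""
    return [tok for _, _, replacement in edits for tok in replacement]

if __name__ == "__main__":
    src = "the cat sat on mat .".split()
    tgt = "the cat sat on the mat .".split()
    edits = extract_span_edits(src, tgt)
    for tag, span, repl in edits:
        print(tag, span, repl)
    assert apply_span_edits(src, edits) == tgt
```

Running this on the example prints one SELF edit covering the first four tokens, one REPLACE inserting "the", and a final SELF edit, which also illustrates the speed argument: the number of operations scales with the number of edits rather than the length of the target.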


Similar Papers

COD3S: Diverse Generation with Discrete Semantic Signatures
Nathaniel Weir, João Sedoc, Benjamin Van Durme
SLM: Learning a Discourse Language Representation with Sentence Unshuffling
Haejun Lee, Drew A. Hudson, Kangwook Lee, Christopher D. Manning
Improving AMR Parsing with Sequence-to-Sequence Pre-training
Dongqin Xu, Junhui Li, Muhua Zhu, Min Zhang, Guodong Zhou