Learning a Cost-Effective Annotation Policy for Question Answering
Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun
Question Answering Long Paper
Abstract:
State-of-the-art question answering (QA) relies upon large amounts of training data for which labeling is time consuming and thus expensive. For this reason, customizing QA systems is challenging. As a remedy, we propose a novel framework for annotating QA datasets that entails learning a cost-effective annotation policy and a semi-supervised annotation scheme. The latter reduces the human effort: it leverages the underlying QA system to suggest potential candidate annotations. Human annotators then simply provide binary feedback on these candidates. Our system is designed such that past annotations continuously improve the future performance and thus overall annotation cost. To the best of our knowledge, this is the first paper to address the problem of annotating questions with minimal annotation cost. We compare our framework against traditional manual annotations in an extensive set of experiments. We find that our approach can reduce up to 21.1% of the annotation cost.
Connected Papers in EMNLP2020
Similar Papers
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Shachar Rosenman, Alon Jacovi, Yoav Goldberg,

New Protocols and Negative Results for Textual Entailment Data Collection
Samuel R. Bowman, Jennimaria Palomaki, Livio Baldini Soares, Emily Pitler,

Context-Aware Answer Extraction in Question Answering
Yeon Seonwoo, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh,
