Pragmatic Issue-Sensitive Image Captioning

Allen Nie, Reuben Cohn-Gordon, Christopher Potts

CoNLL 2020 Workshop Paper

You can open the pre-recorded video in a separate window.

Abstract: Image captioning systems need to produce texts that are not only true but also relevant in that they are properly aligned with the current issues. For instance, in a newspaper article about a sports event, a caption that not only identifies the player in a picture but also comments on their ethnicity could create unwanted reader reactions. To address this, we propose Issue-Sensitive Image Captioning (ISIC). In ISIC, the captioner is given a target image and an issue, which is a set of images partitioned in a way that specifies what information is relevant. For the sports article, we could construct a partition that places images into equivalence classes based on player position. To model this task, we use an extension of the Rational Speech Acts model. Our extension is built on top of state-of-the-art pretrained neural image captioners and explicitly uses image partitions to control caption generation. In both automatic and human evaluations, we show that these models generate captions that are descriptive and issue-sensitive. Finally, we show how ISIC can complement and enrich the related task of Visual Question Answering.
NOTE: Video may display a random order of authors. Correct author list is at the top of this page.