T1: Interpreting Predictions of NLP Models

Eric Wallace, Matt Gardner, Sameer Singh

Live Session 1: Nov 19 (15:00-19:30 UTC)
Abstract: Although neural NLP models are highly expressive and empirically successful, they also systematically fail in counterintuitive ways and are opaque in their decision-making process. This tutorial will provide a background on interpretation techniques, i.e., methods for explaining the predictions of NLP models. We will first situate example-specific interpretations in the context of other ways to understand models (e.g., probing, dataset analyses). Next, we will present a thorough study of example-specific interpretations, including saliency maps, input perturbations (e.g., LIME, input reduction), adversarial attacks, and influence functions. Alongside these descriptions, we will walk through source code that creates and visualizes interpretations for a diverse set of NLP tasks. Finally, we will discuss open problems in the field, e.g., evaluating, extending, and improving interpretation methods.
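
To give a flavor of the kind of code walkthrough the tutorial describes, below is a minimal, hypothetical sketch of a gradient-times-input saliency map, one of the interpretation techniques listed in the abstract. The tiny classifier and toy vocabulary are illustrative assumptions, not the tutorial's actual source code.

```python
# Sketch: gradient-times-input saliency for a toy text classifier (assumption,
# not the tutorial's code). Scores indicate how much each token's embedding
# contributes to the predicted class.
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab = ["the", "movie", "was", "great", "terrible"]  # toy vocabulary (assumption)
word_to_id = {w: i for i, w in enumerate(vocab)}

class TinyClassifier(nn.Module):
    def __init__(self, vocab_size=5, dim=16, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, num_classes)

    def forward(self, embeddings):
        # Mean-pool the token embeddings, then classify.
        return self.out(embeddings.mean(dim=1))

model = TinyClassifier()
tokens = ["the", "movie", "was", "great"]
ids = torch.tensor([[word_to_id[t] for t in tokens]])

# Detach the embeddings into a leaf tensor so gradients flow back to them.
embeddings = model.embed(ids).detach().requires_grad_(True)
logits = model(embeddings)
predicted_class = logits.argmax(dim=-1).item()

# Backpropagate the predicted-class score to the input embeddings.
logits[0, predicted_class].backward()

# Gradient-times-input saliency, summed over the embedding dimension.
saliency = (embeddings.grad * embeddings).sum(dim=-1).squeeze(0)
for token, score in zip(tokens, saliency.tolist()):
    print(f"{token}: {score:+.4f}")
```

In practice the same recipe applies to a real NLP model: take gradients of the prediction score with respect to the input embeddings and aggregate them per token to visualize which words most influenced the prediction.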

Time Event Hosts
Nov 19 (15:00-16:30 UTC) Part 1 Eric Wallace, Matt Gardner and Sameer Singh
Nov 19 (16:30-17:00 UTC) Q&A 1
Nov 19 (17:00-17:30 UTC) BREAK
Nov 19 (17:30-19:00 UTC) Part 2
Nov 19 (19:00-19:30 UTC) Q&A 2
Information about the virtual format of this tutorial: This tutorial has slides that you can view at any time (it does not have a prerecorded talk). It will be conducted entirely live on Zoom and will be livestreamed on this page. It has a chat window that you can use to have discussions with the tutorial instructors and other attendees at any time during the conference.