SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

Angela Fan, Goran Glavaš, Shafiq Joty, Nafise Sadat Moosavi, Vered Shwartz, Alex Wang and Thomas Wolf

Live Session 1: Nov 20 (08:00-10:00 UTC)
Live Session 2: Nov 20 (14:00-20:00 UTC)
Live Session 3: Nov 20 (23:00-02:00 UTC)
Promoting computationally efficient NLP and justified model complexity: encouraging conceptual novelty in NLP rather than simply "more compute".

Time (UTC) | Event | Hosts
Nov 20 (08:00-09:00 UTC)

QA Session 1:
Zoom link 1

Large Product Key Memory for Pre-trained Language Models
Exploring the Boundaries of Low-Resource BERT Distillation
Keyphrase Generation with GANs in Low-Resources Scenarios
Domain Adversarial Fine-Tuning as an Effective Regularizer

Shafiq Joty
Nov 20 (09:00-10:00 UTC)

QA Session 2:
Zoom link 1

Don’t Read Too Much Into It: Adaptive Computation for Open-Domain Question Answering
A comparison between CNNs and WFAs for Sequence Classification
Counterfactual Augmentation for Training Next Response Selection
Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior

Shafiq Joty
Nov 20 (14:00-15:00 UTC)

Invited Talk
Zoom link 1
Mona Diab: Non-parameterized sentence encoders: an efficient alternative

Goran Glavaš
Nov 20 (15:00-16:00 UTC)

QA Session 3-A:
Zoom link 1

Knowing Right from Wrong: Should We Use More Complex Models for Automatic Short-Answer Scoring in Bahasa Indonesia?
Learning Informative Representations of Biomedical Relations with Latent Variable Models
End to End Binarized Neural Networks for Text Classification
Predictive Model Selection for Transfer Learning in Sequence Labeling Tasks

Thomas Wolf
Nov 20 (15:00-16:00 UTC)

QA Session 3-B:
Zoom link 2

PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding
Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation

Marzieh Fadaee
Nov 20 (16:00-17:00 UTC)

Invited Speakers QA
Zoom link 1

Heng Ji: Efficient Language Acquisition through Multimedia Curriculum Learning
Graham Neubig: More is Less? Non-parametric Language Models and Efficiency
Alexander Rush: Advances in Sequence Knowledge Distillation
Armand Joulin: Making Pre-trained Models More Sustainable

Goran Glavaš
Nov 20 (17:00-18:00 UTC)

Panel Discussion
Zoom link 1

Panelists: Kyunghyun Cho, Yejin Choi, Mona Diab, Yoav Goldberg, Iryna Gurevych, Heng Ji, Graham Neubig, Alexander Rush, Armand Joulin, Luke Zettlemoyer

Vered Shwartz & Angela Fan
Nov 20 (18:00-19:00 UTC)

QA Session 4:
Zoom link 1

Early Exiting BERT for Efficient Document Ranking
Quasi-Multitask Learning: an Efficient Surrogate for Obtaining Model Ensembles
Identifying Spurious Correlations for Robust Text Classification
Improving QA Generalization by Concurrent Modeling of Multiple Biases

Nafise Sadat Moosavi
Nov 20 (19:00-20:00 UTC)

QA Session 5:
Zoom link 1

Load What You Need: Smaller Versions of Multilingual BERT
Towards Accurate and Reliable Energy Measurement of NLP Models
Do We Need to Create Big Datasets to Learn a Task?

Nafise Sadat Moosavi
Nov 20 (23:00-00:00 UTC)

QA Session 6-A:
Zoom link 1

Rank and run-time aware compression of NLP Applications
Incremental Neural Coreference Resolution in Constant Memory
Efficient Estimation of Influence of a Training Instance
Efficient Inference For Neural Machine Translation

Alex Wang
Nov 20 (23:00-00:00 UTC)

QA Session 6-B:
Zoom link 2

Guiding Attention for Self-Supervised Learning with Transformers
Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization
SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion

Ameet Deshpande
Nov 21 (00:00-01:00 UTC)

QA Session 7:
Zoom link 1

P-SIF: Document Embeddings using Partition Averaging
Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm
Analysis of Resource-efficient Predictive Models for Natural Language Processing
A Little Bit Is Worse Than None: Ranking with Limited Training Data

Vered Shwartz
Nov 21 (01:00-02:00 UTC)

QA Session 8:
Zoom link 1

Doped Structured Matrices for Extreme Compression of LSTM Models
A Two-stage Model for Slot Filling in Low-resource Settings: Domain-agnostic Non-slot Reduction and Pretrained Contextual Embeddings
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
OptSLA: an Optimization-Based Approach for Sequential Label Aggregation

Alex Wang

Pre-recorded Plenary Talks

Invited Talk: Mona Diab
Invited Talk: Heng Ji
Invited Talk: Graham Neubig
Invited Talk: Alexander Rush
Invited Talk: Armand Joulin