SustaiNLP: Workshop on Simple and Efficient Natural Language Processing

Angela Fan, Goran Glavaš, Shafiq Joty, Nafise Sadat Moosavi, Vered Shwartz, Alex Wang and Thomas Wolf

Live Session 1: Nov 20 (08:00-10:00 UTC)
Live Session 2: Nov 20 (14:00-20:00 UTC)
Live Session 3: Nov 20 (23:00-02:00 UTC)
Promoting computationally efficient NLP and justified model complexity: encouraging conceptual novelty in NLP rather than simply "more compute".

Time (UTC) | Event | Hosts
Nov 20 (08:00-09:00 UTC)

QA Session 1:
Zoom link 1

Large Product Key Memory for Pre-trained Language Models
Exploring the Boundaries of Low-Resource BERT Distillation
Keyphrase Generation with GANs in Low-Resources Scenarios
Domain Adversarial Fine-Tuning as an Effective Regularizer

Shafiq Joty
Nov 20 (09:00-10:00 UTC)

QA Session 2:
Zoom link 1

Don’t Read Too Much Into It: Adaptive Computation for Open-Domain Question Answering
A comparison between CNNs and WFAs for Sequence Classification
Counterfactual Augmentation for Training Next Response Selection
Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior

Shafiq Joty
Nov 20 (14:00-15:00 UTC)

Invited Talk
Zoom link 1
Mona Diab: Non-parameterized sentence encoders: an efficient alternative

Goran Glavaš
Nov 20 (15:00-16:00 UTC)

QA Session 3-A:
Zoom link 1

Knowing Right from Wrong: Should We Use More Complex Models for Automatic Short-Answer Scoring in Bahasa Indonesia?
Learning Informative Representations of Biomedical Relations with Latent Variable Models
End to End Binarized Neural Networks for Text Classification
Predictive Model Selection for Transfer Learning in Sequence Labeling Tasks

Thomas Wolf
Nov 20 (15:00-16:00 UTC)

QA Session 3-B:
Zoom link 2

PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding
Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation

Marzieh Fadaee
Nov 20 (16:00-17:00 UTC)

Invited Speakers QA
Zoom link 1

Heng Ji: Efficient Language Acquisition through Multimedia Curriculum Learning
Graham Neubig: More is Less? Non-parametric Language Models and Efficiency
Alexander Rush: Advances in Sequence Knowledge Distillation
Armand Joulin: Making Pre-trained Models More Sustainable

Goran Glavaš
Nov 20 (17:00-18:00 UTC)

Panel Discussion
Zoom link 1

Panelists: Kyunghyun Cho, Yejin Choi, Mona Diab, Yoav Goldberg, Iryna Gurevych, Heng Ji, Graham Neubig, Alexander Rush, Armand Joulin, Luke Zettlemoyer

Vered Shwartz & Angela Fan
Nov 20 (18:00-19:00 UTC)

QA Session 4:
Zoom link 1

Early Exiting BERT for Efficient Document Ranking
Quasi-Multitask Learning: an Efficient Surrogate for Obtaining Model Ensembles
Identifying Spurious Correlations for Robust Text Classification
Improving QA Generalization by Concurrent Modeling of Multiple Biases

Nafise Sadat Moosavi
Nov 20 (19:00-20:00 UTC)

QA Session 5:
Zoom link 1

Load What You Need: Smaller Versions of Multilingual BERT
Towards Accurate and Reliable Energy Measurement of NLP Models
Do We Need to Create Big Datasets to Learn a Task?

Nafise Sadat Moosavi
Nov 20 (23:00-00:00 UTC)

QA Session 6-A:
Zoom link 1

Rank and run-time aware compression of NLP Applications
Incremental Neural Coreference Resolution in Constant Memory
Efficient Estimation of Influence of a Training Instance
Efficient Inference For Neural Machine Translation

Alex Wang
Nov 20 (23:00-00:00 UTC)

QA Session 6-B:
Zoom link 2

Guiding Attention for Self-Supervised Learning with Transformers
Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization
SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion

Ameet Deshpande
Nov 21 (00:00-01:00 UTC)

QA Session 7:
Zoom link 1

P-SIF: Document Embeddings using Partition Averaging
Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm
Analysis of Resource-efficient Predictive Models for Natural Language Processing
A Little Bit Is Worse Than None: Ranking with Limited Training Data

Vered Shwartz
Nov 21 (01:00-02:00 UTC)

QA Session 8:
Zoom link 1

Doped Structured Matrices for Extreme Compression of LSTM Models
A Two-stage Model for Slot Filling in Low-resource Settings: Domain-agnostic Non-slot Reduction and Pretrained Contextual Embeddings
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
OptSLA: an Optimization-Based Approach for Sequential Label Aggregation

Alex Wang

Pre-recorded Plenary Talks

Invited Talk: Mona Diab
Invited Talk: Heng Ji
Invited Talk: Graham Neubig
Invited Talk: Alexander Rush
Invited Talk: Armand Joulin