PatchBERT: Just-in-Time, Out-of-Vocabulary Patching

Sangwhan Moon, Naoaki Okazaki

Machine Learning for NLP Short Paper

Gather-5A: Nov 18 (18:00-20:00 UTC)


Abstract: Large-scale pre-trained language models have shown groundbreaking performance improvements for transfer learning in natural language processing. In our paper, we study a pre-trained multilingual BERT model, analyze the out-of-vocabulary (OOV) rate on downstream tasks, and show how it introduces information loss and, as a side effect, obstructs the potential of the underlying model. We then propose multiple mitigation approaches and demonstrate that, when combined with fine-tuning, they improve performance with the same parameter count.
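
The analysis described above hinges on measuring how often a pre-trained tokenizer maps downstream-task text to unknown tokens. Below is a minimal sketch of such a measurement, assuming the Hugging Face transformers tokenizer API; the model name and corpus are illustrative placeholders, not taken from the paper.

```python
# Sketch: estimate the OOV rate of a pre-trained multilingual BERT
# tokenizer on a downstream-task corpus. Assumes the Hugging Face
# `transformers` library; the corpus below is a placeholder.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

corpus = [
    "An example downstream-task sentence.",
    "Text with rare symbols or scripts may map to the unknown token.",
]

total_tokens = 0
unk_tokens = 0
for sentence in corpus:
    tokens = tokenizer.tokenize(sentence)  # subword tokenization
    total_tokens += len(tokens)
    unk_tokens += sum(1 for t in tokens if t == tokenizer.unk_token)

oov_rate = unk_tokens / max(total_tokens, 1)
print(f"OOV rate: {oov_rate:.2%}")
```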

Similar Papers

Cross-Thought for Sentence Encoder Pre-training
Shuohang Wang, Yuwei Fang, Siqi Sun, Zhe Gan, Yu Cheng, Jingjing Liu, Jing Jiang
On the Sparsity of Neural Machine Translation Models
Yong Wang, Longyue Wang, Victor Li, Zhaopeng Tu
On the importance of pre-training data volume for compact language models
Vincent Micheli, Martin d'Hoffschmidt, François Fleuret