Consistent Unsupervised Estimators for Anchored PCFGs

Alexander Clark, Nathanaël Fijalkow

Syntax: Tagging, Chunking, and Parsing (TACL Paper)

Zoom-13D: Nov 18 (16:00-17:00 UTC)

Abstract: Learning probabilistic context-free grammars (PCFGs) from strings has been a classic problem in computational linguistics since Horning (1969). Here we present an algorithm based on distributional learning that is a consistent estimator for a large class of PCFGs that satisfy certain natural conditions, including being anchored (Stratos et al., 2016). We proceed via a reparameterisation of (top-down) PCFGs, which we call a bottom-up weighted context-free grammar. We show that if the grammar is anchored and satisfies additional restrictions on its ambiguity, then the parameters can be directly related to distributional properties of the anchoring strings. We show the asymptotic correctness of a naive estimator and present simulations on synthetic data showing that algorithms based on this approach have good finite-sample behaviour.
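
To make the anchoring condition concrete, here is a minimal Python sketch. It is not the authors' estimator. It assumes one common reading of "anchored": every nonterminal has at least one terminal that no other nonterminal emits. The grammar TOY_PCFG and the helpers find_anchors, is_anchored, and sample_string are hypothetical names introduced only for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (illustration only).
# Each nonterminal maps to a list of (right-hand side, probability) pairs.
# A 1-tuple right-hand side is a lexical rule; a 2-tuple is a binary rule.
TOY_PCFG = {
    "S": [(("A", "B"), 0.7), (("S", "S"), 0.2), (("s",), 0.1)],
    "A": [(("a",), 0.8), (("A", "A"), 0.2)],
    "B": [(("b",), 0.6), (("c",), 0.4)],
}


def find_anchors(grammar):
    """Map each nonterminal to the terminals that only it emits."""
    emitters = defaultdict(set)
    for lhs, rules in grammar.items():
        for rhs, _prob in rules:
            if len(rhs) == 1:  # lexical rule
                emitters[rhs[0]].add(lhs)
    anchors = {nt: [] for nt in grammar}
    for terminal, lhs_set in emitters.items():
        if len(lhs_set) == 1:  # emitted by exactly one nonterminal
            (owner,) = lhs_set
            anchors[owner].append(terminal)
    return {nt: sorted(ts) for nt, ts in anchors.items()}


def is_anchored(grammar):
    """Assumed definition: every nonterminal has at least one anchor."""
    return all(find_anchors(grammar).values())


def sample_string(grammar, symbol, rng):
    """Sample a terminal string top-down, starting from `symbol`."""
    if symbol not in grammar:  # terminal symbol
        return [symbol]
    rules = grammar[symbol]
    rhs = rng.choices([r for r, _ in rules], weights=[p for _, p in rules])[0]
    out = []
    for child in rhs:
        out.extend(sample_string(grammar, child, rng))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    print(find_anchors(TOY_PCFG))
    print(is_anchored(TOY_PCFG))
    print(" ".join(sample_string(TOY_PCFG, "S", rng)))
```

Strings sampled this way are the kind of synthetic data a string-only learner would see. The anchors are the terminals whose distributional statistics the abstract says can be related to the grammar's parameters.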

Connected Papers in EMNLP2020

Similar Papers

Unsupervised Parsing via Constituency Tests
Steven Cao, Nikita Kitaev, Dan Klein
Visually Grounded Compound PCFGs
Yanpeng Zhao, Ivan Titov
Intrinsic Probing through Dimension Selection
Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell