Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product

Tiangang Zhu, Yue Wang, Haoran Li, Youzheng Wu, Xiaodong He, Bowen Zhou

Information Extraction Long Paper

Zoom-5A: Nov 17 (08:00-09:00 UTC)

Abstract: Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendation, and product retrieval. In the real world, however, the attribute values of a product are usually incomplete and vary over time, which greatly hinders practical applications. In this paper, we propose a multimodal method that jointly predicts product attributes and extracts values from textual product descriptions with the help of product images. We argue that product attributes and values are highly correlated; e.g., it is easier to extract the values when the product attributes are given. Thus, we jointly model the attribute prediction and value extraction tasks from multiple aspects, focusing on the interactions between attributes and values. Moreover, product images have distinct effects on our tasks for different product attributes and values, so we selectively draw useful visual information from product images to enhance our model. We annotate a multimodal product attribute value dataset that contains 87,194 instances. Experimental results on this dataset demonstrate that explicitly modeling the relationship between attributes and values helps our method establish the correspondence between them, and that selectively utilizing visual product information is necessary for the task. Our code and dataset are available at https://github.com/jd-aig/JAVE.
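
The idea of selectively drawing on visual information can be illustrated with a gate that decides how much of the image features to mix into the text features. The following is only a minimal sketch under stated assumptions, not the authors' model: the function name `gated_fusion`, the single scalar gate, and the hand-set weights are all hypothetical, and the actual architecture should be taken from the paper and the linked repository.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, image_vec, w_text, w_image, bias):
    """Hypothetical gated fusion of a text vector and an image vector.

    A scalar gate in (0, 1) is computed from both inputs, and the image
    features are added to the text features scaled by that gate.
    Assumption: one gate per instance; a real model would learn the
    weights and could use a separate gate per attribute or per token.
    """
    if not (len(text_vec) == len(image_vec) == len(w_text) == len(w_image)):
        raise ValueError("all vectors must have the same length")

    # Linear score over both modalities, then squash into (0, 1).
    score = bias
    for t, v, wt, wv in zip(text_vec, image_vec, w_text, w_image):
        score += wt * t + wv * v
    gate = sigmoid(score)

    # Gate near 0 keeps the text features almost unchanged;
    # gate near 1 adds the image features almost in full.
    fused = [t + gate * v for t, v in zip(text_vec, image_vec)]
    return gate, fused
```

With a strongly negative bias the gate closes and the fused vector stays close to the text features, which is the behaviour one would want for attributes where the image carries little useful signal.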