Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LaMP-Cap: Personalized Figure Caption Generation With Multimodal Figure Profiles

Created by
  • Haebom

Authors

Ho Yin 'Sam' Ng, Ting-Yao Hsu, Aashish Anantha Ramakrishnan, Branislav Kveton, Nedim Lipka, Franck Dernoncourt, Dongwon Lee, Tong Yu, Sungchul Kim, Ryan A. Rossi, Ting-Hao 'Kenneth' Huang

Outline

This paper observes that, despite advances in captioning models, AI-generated figure captions still need to be revised because they do not match the author's writing style or the conventions of the field. To address this, the authors introduce LaMP-Cap, a dataset for personalized figure caption generation with multimodal figure profiles. For each figure, LaMP-Cap provides a profile drawn from the other figures in the same document, consisting of their images, captions, and figure-mention paragraphs, to characterize the figure's context. Experiments show that profile information helps generate captions closer to the author-written ones, and that the images in a profile are more informative than its figure-mention paragraphs.
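To make the profile structure concrete, here is a minimal Python sketch of how one record might be represented and turned into the text part of a captioning prompt. It is not the dataset's actual schema or the authors' code: the class names, field names, `build_prompt` helper, and prompt wording are all hypothetical, and the presence of a target image path is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProfileFigure:
    """Another figure from the same document (hypothetical field names)."""
    image_path: str
    caption: str
    mention_paragraphs: List[str] = field(default_factory=list)


@dataclass
class TargetFigure:
    """The figure to caption, with its profile (assumed layout, not the real schema)."""
    image_path: str
    mention_paragraphs: List[str]
    profile: List[ProfileFigure] = field(default_factory=list)


def build_prompt(target: TargetFigure, use_profile: bool = True) -> str:
    """Assemble the text part of a prompt; images would be attached separately."""
    lines = []
    if use_profile:
        # Profile captions and mentions carry the author's style and context.
        for i, fig in enumerate(target.profile, start=1):
            lines.append(f"Profile figure {i} caption: {fig.caption}")
            for para in fig.mention_paragraphs:
                lines.append(f"Profile figure {i} mention: {para}")
    for para in target.mention_paragraphs:
        lines.append(f"Target figure mention: {para}")
    lines.append("Write a caption for the target figure in the same style.")
    return "\n".join(lines)
```

Calling `build_prompt(target, use_profile=False)` gives a non-personalized baseline prompt, which is the kind of comparison the reported experiments imply.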

Takeaways, Limitations

Takeaways:
Demonstrates that multimodal (image and text) profiles are effective for generating personalized figure captions.
Provides the LaMP-Cap dataset as a resource for research on personalized figure caption generation.
Shows that the images in a profile are more informative for caption generation than its figure-mention paragraphs.
Limitations:
Further research is needed on the size and diversity of the LaMP-Cap dataset.
Further work is needed on extending the approach to other types of multimodal profiles (e.g., tables, charts).
Further analysis is needed to determine whether there is bias toward certain fields or styles.