Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

InfoCausalQA: Can Models Perform Non-explicit Causal Reasoning Based on Infographic?

Created by
  • Haebom

Author

Keummin Ka, Junhyeong Park, Jaehyun Jeon, Youngjae Yu

Outline

This paper proposes InfoCausalQA, a new benchmark for evaluating the causal inference capabilities of visual language models (VLMs). InfoCausalQA evaluates causal inference grounded in infographics, which combine structured visual data with textual information, and consists of two tasks: quantitative causal inference and semantic causal inference. Using GPT-4, the authors generated 1,482 multiple-choice question-answer pairs from 494 infographic-text pairs collected from four publicly available sources. These pairs were manually reviewed to ensure that answers could not be derived from superficial clues alone. Experimental results show that existing VLMs exhibit limited capabilities in both quantitative and semantic causal inference, falling significantly short of human performance. This highlights the need to improve causal inference over infographic-based information.
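The evaluation described above reduces to scoring multiple-choice answers against gold labels. As a minimal illustrative sketch (the function name and data below are invented, not from the paper), accuracy on such a benchmark can be computed like this:

```python
# Hypothetical sketch: score a model's multiple-choice predictions
# against gold answers, as in InfoCausalQA-style evaluation.

def score(predictions, gold):
    """Return the fraction of questions answered correctly."""
    if len(predictions) != len(gold):
        raise ValueError("prediction/gold length mismatch")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Toy example with invented data (not benchmark results):
preds = ["A", "C", "B", "D"]
answers = ["A", "B", "B", "D"]
print(score(preds, answers))  # 0.75
```

A real evaluation would additionally parse each VLM's free-form output into a choice letter before scoring, which is where much of the practical difficulty lies.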

Takeaways, Limitations

Takeaways:
Presents InfoCausalQA, a new benchmark for evaluating infographic-based causal inference.
Clearly exposes the limitations of existing VLMs' causal inference capabilities, especially in semantic causal inference.
Suggests research directions for improving the causal inference capabilities of multimodal AI systems.
Limitations:
The InfoCausalQA benchmark is relatively small (494 infographics and 1,482 questions).
Question generation relies on GPT-4, so GPT-4's limitations and biases may affect the question set.
Quality control depends on a human manual review process, which may require additional review to ensure objectivity.