Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Tailored Teaching with Balanced Difficulty: Elevating Reasoning in Multimodal Chain-of-Thought via Prompt Curriculum

Created by
  • Haebom

Author

Xinglong Yang, Quan Feng, Zhongying Pan, Xiang Chen, Yu Tian, Wentong Li, Shuofei Qiao, Yuxia Geng, Xingyu Zhao, Sheng-Jun Huang

Outline

To improve the effectiveness of Multimodal Chain-of-Thought (MCoT) prompting, this paper addresses the limitations of randomly or manually selected in-context examples, arguing that performance instability arises because such selection ignores the model's specific knowledge distribution and the intrinsic complexity of the task. The authors propose a new framework inspired by the pedagogical principle of "tailored teaching with balanced difficulty," which reframes prompt selection as a prompt curriculum design problem: constructing a set of demonstration examples aligned with the model's current ability. The framework uses a difficulty-balanced sampling strategy that integrates two signals: prediction discrepancy (an active-learning signal capturing the model's perceived difficulty) and intrinsic sample complexity (measuring the inherent difficulty of each question-image pair). Experiments with multiple MLLMs across five benchmarks show consistent performance gains and reduced performance variance compared with random example selection.
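The two-signal idea can be illustrated with a small sketch. The code below is a minimal, hypothetical illustration of difficulty-balanced sampling, not the paper's actual method: `prediction_discrepancy()` and `intrinsic_complexity()` are assumed stand-ins for the two signals, and the weighting and binning scheme is an assumption for illustration only.

```python
import numpy as np

def difficulty_balanced_sample(candidates, model, k=8, alpha=0.5, seed=0):
    """Pick k demonstration examples spanning a balanced range of difficulty.

    Sketch only: prediction_discrepancy() and intrinsic_complexity() are
    hypothetical stand-ins for the paper's two signals (the model's perceived
    difficulty and the inherent complexity of each question-image pair).
    """
    rng = np.random.default_rng(seed)

    # Score every candidate with the two signals and normalize to [0, 1].
    disc = np.array([prediction_discrepancy(model, c) for c in candidates], dtype=float)
    comp = np.array([intrinsic_complexity(c) for c in candidates], dtype=float)
    disc = (disc - disc.min()) / (disc.max() - disc.min() + 1e-8)
    comp = (comp - comp.min()) / (comp.max() - comp.min() + 1e-8)

    # Combine the model-perceived and intrinsic difficulty into one score.
    score = alpha * disc + (1.0 - alpha) * comp

    # Split the difficulty-ordered pool into k bins and draw one example from
    # each, so the prompt mixes easy, medium, and hard demonstrations.
    order = np.argsort(score)
    bins = np.array_split(order, k)
    picked = [int(rng.choice(b)) for b in bins if len(b) > 0]
    return [candidates[i] for i in picked]
```

In this sketch, `alpha` trades off the two signals and the per-bin draw enforces the "balanced difficulty" property; the paper's actual selection criterion may differ.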

Takeaways, Limitations

Takeaways:
Improves MCoT prompting performance by proposing a prompt curriculum design method that accounts for both the model's perceived difficulty and the inherent difficulty of each problem.
Opens a new direction in prompt engineering by consistently improving performance across various MLLMs and overcoming the instability of random example selection.
The difficulty-balanced sampling strategy, which combines an active-learning signal with intrinsic sample complexity, is applicable to a range of MLLMs.
Limitations:
Lack of details about the specific algorithm implementation and computational complexity.
Further research is needed to determine whether the framework transfers to other domains and tasks.
In-depth analysis of how to optimize the curriculum for model-specific characteristics is lacking.