Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Boosting Chart-to-Code Generation in MLLM via Dual Preference-Guided Refinement

Created by
  • Haebom

Author

Zhihan Zhang, Yixin Cao, Lizi Liao

Outline

This paper addresses chart-to-code generation: converting chart images into executable plotting scripts. The task is inherently underconstrained, requiring a multimodal large language model (MLLM) to perform fine-grained visual parsing, accurate code synthesis, and robust cross-modal reasoning. Many distinct code implementations can render the same chart, and evaluation must consider both code correctness and visual fidelity across multiple dimensions, which makes it difficult to learn accurate, generalizable mappings with standard supervised fine-tuning. To address this, the paper proposes a dual preference-guided refinement framework that combines a feedback-driven dual-modality reward mechanism with iterative preference learning. The approach introduces a structured variant generation strategy and a visual reward model to efficiently produce high-quality, aspect-aware preference pairs, improving the scalability of preference collection and making supervision more targeted. These preferences are then used in an offline reinforcement learning setting to optimize the model for multidimensional fidelity. Experiments show that the framework substantially improves general-purpose, open-source MLLMs, producing plotting code that rivals specialized chart-centric models and even some proprietary systems. The code and dataset are publicly available at https://github.com/Zhihan72/Chart2Code.
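The aspect-aware preference pairs described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the candidate scripts, per-aspect scores (which the paper would obtain from its dual-modality reward mechanism, e.g. code execution checks plus a visual reward model), the `margin` threshold, and the `build_preference_pairs` helper are all assumptions for illustration.

```python
# Hypothetical sketch: building aspect-aware (chosen, rejected) preference
# pairs from per-aspect reward scores of candidate plotting scripts.
from itertools import combinations

# Each candidate plotting script carries per-aspect scores, assumed to come
# from a reward mechanism (e.g. execution correctness + visual similarity).
candidates = [
    {"code": "plt.bar(x, y)",            "scores": {"data": 0.9, "style": 0.4}},
    {"code": "plt.bar(x, y, color='r')", "scores": {"data": 0.9, "style": 0.8}},
    {"code": "plt.plot(x, y)",           "scores": {"data": 0.3, "style": 0.5}},
]

def build_preference_pairs(cands, margin=0.2):
    """Form a (chosen, rejected) pair per aspect whenever the score gap
    exceeds a margin, so each pair supervises one fidelity dimension."""
    pairs = []
    for a, b in combinations(cands, 2):
        for aspect in a["scores"]:
            gap = a["scores"][aspect] - b["scores"][aspect]
            if abs(gap) >= margin:
                chosen, rejected = (a, b) if gap > 0 else (b, a)
                pairs.append({"aspect": aspect,
                              "chosen": chosen["code"],
                              "rejected": rejected["code"]})
    return pairs

pairs = build_preference_pairs(candidates)
# Pairs like these could then feed an offline preference-learning
# objective (e.g. DPO) to optimize the MLLM per fidelity dimension.
```

The per-aspect margin keeps supervision targeted: a pair is emitted only when two candidates clearly differ along a single dimension, rather than mixing data-fidelity and stylistic differences in one comparison.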

Takeaways, Limitations

Takeaways:
The dual preference-guided refinement framework significantly improves the chart-to-code generation performance of general-purpose, open-source MLLMs.
We present a strategy to efficiently generate high-quality aspect-aware preference pairs, thereby increasing the scalability of preference collection.
We present an offline reinforcement learning setup that optimizes models to improve multidimensional fidelity.
The quality of the generated code rivals that of specialized chart-centric models and some proprietary systems.
We have made our code and datasets publicly available to enhance the reproducibility of our research.
Limitations:
The performance of the proposed framework may depend on the MLLM and dataset used.
Generalization to complex or atypical chart types requires further study.
Developing and improving evaluation metrics that consider both visual fidelity and code correctness may be necessary.
Support for different types of plotting libraries may need to be extended.