Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

CAD2DMD-SET: Synthetic Generation Tool of Digital Measurement Device CAD Model Datasets for fine-tuning Large Vision-Language Models

Created by
  • Haebom

Author

João Valente, Atabak Dehban, Rodrigo Ventura

Outline

This paper proposes CAD2DMD-SET, a synthetic data generation tool that addresses a practical weakness of Large Vision-Language Models (LVLMs): they struggle with the seemingly simple task of reading values from digital measurement devices (DMDs). CAD2DMD-SET combines 3D CAD models, advanced rendering, and high-fidelity image composition to generate diverse synthetic DMD datasets with VQA labels. The paper also introduces DMDBench, a validation set for evaluating models under real-world constraints. Evaluations of three state-of-the-art LVLMs show substantial gains after fine-tuning on data generated with CAD2DMD-SET, with InternVL improving by 200%. The authors plan to release CAD2DMD-SET as open source.
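To make the "200%" figure concrete: if it is read as a relative improvement over the baseline, it corresponds to a tripling of the score rather than a doubling. The sketch below assumes that reading; the scores are hypothetical values chosen only for illustration and are not taken from the paper, and `relative_improvement` is an illustrative helper name.

```python
def relative_improvement(baseline: float, improved: float) -> float:
    """Return the relative change from baseline to improved, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical scores on a 0-1 scale (NOT results from the paper).
baseline_score = 0.25
finetuned_score = 0.75

gain = relative_improvement(baseline_score, finetuned_score)
print(f"relative improvement: {gain:.0f}%")
```

Under this reading, a model that starts from a low baseline can show a large relative gain while its absolute score remains modest, so the percentage is best interpreted alongside the absolute numbers reported in the paper.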

Takeaways, Limitations

Takeaways:
CAD2DMD-SET, a synthetic data generation tool, improves the ability of LVLMs to read DMD values.
The generated datasets and the evaluation account for real-world challenges (noise, occlusion, extreme viewpoints, motion blur).
Fine-tuning substantially improves the performance of state-of-the-art LVLMs (a 200% improvement for InternVL).
The planned open-source release would make the tool available to the research community.
Limitations:
CAD2DMD-SET is not yet open source at the time of publication.
DMDBench (1,000 images) may be relatively small.
Further research is needed on generalization performance across different DMD types and environments.