
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Revisiting Reliability in the Reasoning-based Pose Estimation Benchmark

Created by
  • Haebom

Author

Junsu Kim, Naeun Kim, Jaeho Lee, Incheol Park, Dongyoon Han, Seungryul Baek

Outline

This paper examines reproducibility and quality issues in the Reasoning-based Pose Estimation (RPE) benchmark, which is widely used as a standard for evaluating pose-aware multimodal large language models (MLLMs). The authors point out that the benchmark uses image indices that differ from those of the original 3DPW dataset, so obtaining accurate ground-truth (GT) annotations requires a manual matching process. They also analyze quality limitations of the benchmark, including duplicated images, scenario imbalance, overly simple poses, and ambiguous text descriptions. To address these issues, they refine the GT annotations and release them as open source to support consistent quantitative evaluation and further MLLM development.
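The paper's actual matching procedure is not described here, but the following minimal Python sketch illustrates the kind of index-alignment problem involved: tracing each benchmark image back to its source 3DPW frame so that GT pose annotations can be attached. The directory layout, the content-hash matching strategy, and all function names are assumptions for illustration only; hash matching would work only if the benchmark images are byte-identical copies of the original frames.

```python
# Hypothetical sketch of the index-matching problem: the RPE benchmark's
# image indices do not align with 3DPW frame indices, so each benchmark
# image must be traced back to its source frame before GT annotations can
# be attached. Paths and the hash-based strategy are assumptions, not the
# authors' actual procedure.
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """Content hash of an image file, used as a frame fingerprint."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def build_frame_index(dataset_root: Path) -> dict[str, Path]:
    """Map content hash -> original 3DPW frame path."""
    return {file_digest(p): p for p in dataset_root.rglob("*.jpg")}

def match_benchmark_images(bench_dir: Path,
                           frame_index: dict[str, Path]) -> dict[str, Path]:
    """Resolve each benchmark image to its source frame.

    Images with no byte-identical match (e.g., re-encoded copies) are
    flagged for the manual inspection step the paper describes.
    """
    matches = {}
    for img in sorted(bench_dir.glob("*.jpg")):
        source = frame_index.get(file_digest(img))
        if source is None:
            print(f"unmatched (needs manual inspection): {img.name}")
        else:
            matches[img.name] = source
    return matches

if __name__ == "__main__":
    index = build_frame_index(Path("3DPW/imageFiles"))   # assumed layout
    matched = match_benchmark_images(Path("rpe_benchmark"), index)
    print(f"matched {len(matched)} benchmark images to 3DPW frames")
```

Any images the hash lookup fails to resolve would fall back to exactly the manual matching the paper highlights as a reproducibility burden.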

Takeaways, Limitations

Takeaways:
Resolves the reproducibility issues of the RPE benchmark by providing accurate GT annotations, enabling fair and consistent quantitative evaluation.
Improves research reproducibility and transparency by open-sourcing the corrected GT annotations.
Contributes to the development of future pose-aware multimodal reasoning models.
Limitations:
Fundamental quality issues of the RPE benchmark remain, such as duplicated images, scenario imbalance, overly simple poses, and ambiguous text descriptions.
The corrected GT annotations may reflect the subjective judgment of the research team and are not a perfect solution.
The benchmark's underlying design problems are identified but not resolved.