This page curates AI-related papers published worldwide. All content is summarized by Google Gemini, and the page is operated on a non-profit basis. Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.
A Survey of Multi-sensor Fusion Perception for Embodied AI: Background, Methods, Challenges and Prospects
Created by
Haebom
Authors
Shulan Ruan, Rongwei Wang, Xuchen Shen, Huijie Liu, Baihui Xiao, Jun Shi, Kun Zhang, Zhenya Huang, Yu Liu, Enhong Chen, You He
Outline
This paper presents a comprehensive survey of multi-sensor fusion perception (MSFP) for embodied AI. Noting that existing surveys are biased toward specific tasks or research areas and therefore offer limited value to researchers in other fields, it organizes MSFP research systematically from a task-agnostic perspective. The survey reviews MSFP methods from several technical angles, including multi-modal fusion, multi-view fusion, time-series fusion, and multi-modal LLM-based fusion, and suggests directions for future research. In particular, it covers multi-modal fusion methods that leverage large language models (LLMs), which have recently attracted attention.
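To illustrate the kind of method the survey categorizes, below is a minimal, hypothetical sketch of feature-level multi-modal fusion: per-modality features (e.g., camera and LiDAR) are projected into a shared space, concatenated, and fused by a small MLP. This sketch is not taken from the paper; the module name SimpleFusion, the feature dimensions, and the concatenation-based design are illustrative assumptions.

```python
# Hypothetical sketch of feature-level multi-modal fusion (not from the paper).
import torch
import torch.nn as nn

class SimpleFusion(nn.Module):
    def __init__(self, cam_dim=256, lidar_dim=128, fused_dim=256):
        super().__init__()
        self.cam_proj = nn.Linear(cam_dim, fused_dim)      # project camera features
        self.lidar_proj = nn.Linear(lidar_dim, fused_dim)  # project LiDAR features
        self.fuse = nn.Sequential(                         # fuse concatenated features
            nn.Linear(2 * fused_dim, fused_dim),
            nn.ReLU(),
        )

    def forward(self, cam_feat, lidar_feat):
        # Concatenate the projected per-modality features and map them
        # into a shared representation for a downstream task head.
        z = torch.cat([self.cam_proj(cam_feat), self.lidar_proj(lidar_feat)], dim=-1)
        return self.fuse(z)

# Usage: a batch of 4 samples with pre-extracted per-modality feature vectors.
fused = SimpleFusion()(torch.randn(4, 256), torch.randn(4, 128))
print(fused.shape)  # torch.Size([4, 256])
```

Real MSFP systems covered by the survey use far richer strategies (e.g., attention-based or BEV-level fusion, temporal aggregation, LLM-conditioned fusion); the concatenate-and-project scheme above is only the simplest instance of feature-level fusion.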
Takeaways and Limitations
• Takeaways:
◦ It overcomes the limitations of existing surveys by comprehensively presenting MSFP methods from multiple task-agnostic perspectives.
◦ It reflects the latest trends, covering multi-modal, multi-view, and time-series fusion as well as LLM-based multi-modal fusion methods.
◦ It gives researchers in the MSFP field insight into important advances and provides guidance for future research.
• Limitations:
◦ As noted in the paper, research in several of the covered directions is still at an early stage, and further work is needed.
◦ As a survey, it favors a broad overview over in-depth analysis of specific methodologies.