Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Traj-MLLM: Can Multimodal Large Language Models Reform Trajectory Data Mining?

Created by
  • Haebom

Authors

Shuo Liu, Di Yao, Yan Lin, Gao Cong, Jingping Bi

Outline

This paper addresses the challenge of building a general model that can analyze human trajectories across diverse regions and tasks. Existing approaches are either trained for specific regions or suited to only a small number of tasks. To close this gap, the paper proposes Traj-MLLM, a framework built on a multimodal large language model (MLLM). Traj-MLLM integrates multi-view contexts to transform raw trajectory data into image-text sequences, then uses the MLLM's reasoning capabilities to analyze them. The paper also proposes a prompt optimization technique that generates data-invariant prompts for task adaptation. In the reported experiments, Traj-MLLM outperforms the best-performing existing models by 48.05%, 15.52%, 51.52%, and 1.83% on travel time prediction, mobility prediction, anomaly detection, and transportation mode identification, respectively. Traj-MLLM requires neither training data nor fine-tuning of the MLLM backbone.
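To make the idea of turning raw path (trajectory) data into text more concrete, here is a minimal sketch of the text half of such a conversion. It is not the paper's pipeline: the record layout `(latitude, longitude, timestamp)`, the function names `haversine_m` and `describe_segments`, the segment size, and the wording of each description are all assumptions made for illustration. The image views and the multi-view contexts mentioned in the summary are not modeled.

```python
import math
from typing import List, Tuple

# Hypothetical record layout: (latitude_deg, longitude_deg, unix_time_s)
Point = Tuple[float, float, float]


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance in metres between two GPS points."""
    r = 6_371_000.0  # mean Earth radius in metres
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def describe_segments(traj: List[Point], size: int = 3) -> List[str]:
    """Split a trajectory into segments of up to `size` points and describe each.

    Consecutive segments share one endpoint so no leg of the trip is dropped.
    """
    if size < 2:
        raise ValueError("size must be at least 2")
    lines = []
    for i in range(0, len(traj) - 1, size - 1):
        seg = traj[i:i + size]
        dist = sum(haversine_m(p, q) for p, q in zip(seg, seg[1:]))
        dt = seg[-1][2] - seg[0][2]
        speed_kmh = dist / dt * 3.6 if dt > 0 else 0.0
        lines.append(
            f"Segment {len(lines) + 1}: {len(seg)} points, "
            f"{dist:.0f} m in {dt:.0f} s, mean speed {speed_kmh:.1f} km/h"
        )
    return lines
```

In a full system, descriptions like these would presumably be paired with rendered map images of the same segments before being passed to the MLLM; how Traj-MLLM actually does this is not specified in this summary.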

Takeaways, Limitations

Takeaways:
Presents a human trajectory analysis model that generalizes across regions and tasks by using a multimodal large language model (MLLM).
Outperforms the best existing models on travel time prediction, mobility prediction, anomaly detection, and transportation mode identification.
Leverages the MLLM's inference capabilities to improve performance without separate training data or model fine-tuning.
Improves task adaptability by generating data-invariant prompts through prompt optimization.
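As a rough illustration of what a data-invariant prompt could look like, the sketch below keeps the task instruction fixed and free of dataset-specific values, and fills in per-sample content separately. This is an assumption about the term's meaning, not the paper's prompt optimization procedure: the instructions are hand-written here rather than optimized, and `TASK_INSTRUCTIONS` and `build_prompt` are hypothetical names.

```python
from typing import List

# Hypothetical, hand-written task instructions. They contain no city names,
# coordinates, or other dataset-specific values, which is the sense of
# "data-invariant" assumed in this sketch.
TASK_INSTRUCTIONS = {
    "travel_time": "Estimate the total travel time of the trip in seconds.",
    "anomaly": "Decide whether the trip is anomalous. Answer 'yes' or 'no'.",
}


def build_prompt(task: str, segment_texts: List[str]) -> str:
    """Combine a fixed task instruction with per-sample segment descriptions."""
    instruction = TASK_INSTRUCTIONS[task]
    body = "\n".join(f"- {text}" for text in segment_texts)
    return f"{instruction}\nTrajectory segments:\n{body}"
```

Because the instruction does not change between datasets, the same prompt could in principle be reused for a new city or region, with only the segment descriptions swapped out.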
Limitations:
Traj-MLLM depends on the performance of the underlying MLLM, so the MLLM's limitations may carry over to Traj-MLLM.
Because the method relies heavily on prompt engineering, designing optimal prompts can be challenging.
Generalization performance on other types of trajectory data needs further validation.
The size and computational cost of the MLLM used can be high.