Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Spatiotemporal Forecasting as Planning: A Model-Based Reinforcement Learning Approach with Generative World Models

Created by
  • Haebom

Authors

Hao Wu, Yuan Gao, Xingjian Shi, Shuaipeng Li, Fan Xu, Fan Zhang, Zhihong Zhu, Weiyan Wang, Xiao Luo, Kun Wang, Xian Wu, Xiaomeng Huang

Outline

To address two challenges in physical spatiotemporal forecasting, its inherently probabilistic nature and its non-differentiable evaluation metrics, this paper proposes Spatiotemporal Forecasting as Planning (SFP), a paradigm based on model-based reinforcement learning. SFP builds a generative world model that simulates diverse, high-quality future states, enabling "imagination-based" simulation of the environment. Within this framework, the base forecasting model acts as an agent guided by a beam search-based planning algorithm, which uses non-differentiable domain metrics as reward signals to explore high-reward future sequences. The high-reward candidates it identifies then serve as pseudo-labels for continuously optimizing the agent's policy through iterative self-learning. According to the abstract, this significantly reduces prediction error and yields strong performance on critical domain metrics such as extreme event detection.
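The planning loop described above can be illustrated with a small sketch. This is not the authors' implementation: the world model here is a hypothetical toy sampler (`toy_world_model`), the reward is a critical-success-index style score for threshold exceedance chosen as one example of a non-differentiable metric, and the sketch assumes ground-truth targets are available to score candidates at training time. All function names and parameters are illustrative.

```python
import random
from typing import Callable, List, Sequence, Tuple

State = List[float]


def toy_world_model(state: State, rng: random.Random, n_samples: int) -> List[State]:
    """Hypothetical stand-in for a generative world model.

    Samples several noisy candidate next states from the current state.
    """
    return [[0.9 * x + rng.gauss(0.0, 0.5) for x in state] for _ in range(n_samples)]


def extreme_event_reward(
    pred: Sequence[State], target: Sequence[State], threshold: float = 1.0
) -> float:
    """Critical-success-index style score: hits / (hits + misses + false alarms).

    The thresholding makes this non-differentiable with respect to `pred`.
    Returning 1.0 when no event is observed or predicted is a convention
    assumed for this sketch.
    """
    hits = misses = false_alarms = 0
    for p_state, t_state in zip(pred, target):
        for p, t in zip(p_state, t_state):
            p_ext, t_ext = p > threshold, t > threshold
            if p_ext and t_ext:
                hits += 1
            elif t_ext:
                misses += 1
            elif p_ext:
                false_alarms += 1
    denom = hits + misses + false_alarms
    return hits / denom if denom else 1.0


def beam_search_plan(
    initial: State,
    target: Sequence[State],
    horizon: int,
    beam_width: int,
    n_samples: int,
    reward_fn: Callable[[Sequence[State], Sequence[State]], float],
    seed: int = 0,
) -> Tuple[List[State], float]:
    """Keep the `beam_width` highest-reward partial sequences at each step."""
    rng = random.Random(seed)
    beams: List[Tuple[List[State], State]] = [([], initial)]  # (sequence, last state)
    for step in range(horizon):
        candidates: List[Tuple[float, List[State]]] = []
        for seq, last in beams:
            for nxt in toy_world_model(last, rng, n_samples):
                new_seq = seq + [nxt]
                # Score the partial rollout against the matching target prefix.
                score = reward_fn(new_seq, target[: step + 1])
                candidates.append((score, new_seq))
        # Sort by score only, so tied candidates are never compared as lists.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = [(seq, seq[-1]) for _, seq in candidates[:beam_width]]
    best_seq = beams[0][0]
    return best_seq, reward_fn(best_seq, target)
```

Calling `beam_search_plan([1.5, 0.2], target, horizon=3, beam_width=2, n_samples=4, reward_fn=extreme_event_reward)` with a three-step `target` returns the highest-reward imagined sequence and its score. In a self-learning loop of the kind the summary describes, that sequence would presumably be used as a pseudo-label for an ordinary supervised update of the forecasting model; that step is omitted here.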

Takeaways, Limitations

Takeaways:
Presents a new paradigm, based on model-based reinforcement learning, for forecasting problems with probabilistic outcomes and non-differentiable metrics.
Implements "imagination-based" environmental simulation through a generative world model.
Explores high-reward sequences using a beam search-based planning algorithm with non-differentiable domain metrics as rewards.
Optimizes the agent's policy through iterative self-learning.
Demonstrates strong performance on key domain metrics, including extreme event detection.
Limitations:
The paper does not state specific limitations (this summary is based on the abstract alone).