Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Solving nonconvex Hamilton--Jacobi--Isaacs equations with PINN-based policy iteration

Created by
  • Haebom

Author

Hee Jun Yang, Min Jung Kim, Yeoneung Kim

Outline

In this paper, we propose a novel method for solving high-dimensional nonconvex Hamilton-Jacobi-Isaacs (HJI) equations. The method uses a mesh-free policy iteration framework that combines classical dynamic programming with physics-informed neural networks (PINNs). It applies to HJI equations arising in stochastic differential games and robust control, and it alternates between solving a second-order linear partial differential equation under fixed feedback policies and updating the controls through point-wise min-max optimization using automatic differentiation. We prove that the value function iterates converge locally uniformly to the unique viscosity solution of the HJI equation under standard Lipschitz and uniform ellipticity conditions. We establish equi-Lipschitz regularity of the iterates without requiring convexity of the Hamiltonian, thereby ensuring provably stable and convergent results. We demonstrate the accuracy and scalability of the method through numerical experiments on a stochastic path-planning game with two-dimensional moving obstacles and on pursuit-evasion differential games with five- and ten-dimensional anisotropic noise. The proposed method outperforms a direct PINN solver, producing smoother value functions and lower residuals.
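The alternation described above (freeze the feedback policies, solve the resulting linear PDE with a PINN, then update the policies by a point-wise min-max on the Hamiltonian) can be illustrated on a toy problem. The sketch below is hypothetical and not the authors' code: it uses a made-up 1-D HJI-style residual with a quadratic Hamiltonian (so the min-max has a closed form) and PyTorch autodiff, just to show the structure of the outer policy loop and inner PINN solve.

```python
# Hypothetical sketch of PINN-based policy iteration on a 1-D toy HJI-style
# equation: 0.5*sigma^2 V_xx + (a + b) V_x + x^2 + a^2 - b^2 = 0,
# where a minimizes and b maximizes the Hamiltonian point-wise.
import torch

torch.manual_seed(0)

# Small MLP approximating the value function V(x).
net = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

sigma = 0.5  # diffusion coefficient (uniform ellipticity: sigma > 0)

def pde_residual(x, a, b):
    # Residual of the *linear* PDE obtained by freezing policies a, b.
    x = x.clone().requires_grad_(True)
    V = net(x)
    Vx = torch.autograd.grad(V.sum(), x, create_graph=True)[0]
    Vxx = torch.autograd.grad(Vx.sum(), x, create_graph=True)[0]
    return 0.5 * sigma**2 * Vxx + (a + b) * Vx + x**2 + a**2 - b**2

for it in range(3):                     # outer policy-iteration loop
    x = torch.rand(256, 1) * 2 - 1      # collocation points in [-1, 1]
    xg = x.clone().requires_grad_(True)
    Vx = torch.autograd.grad(net(xg).sum(), xg)[0].detach()
    # Point-wise min-max policy update (closed form for this quadratic toy):
    a = -0.5 * Vx                       # argmin_a { a*Vx + a^2 }
    b = 0.5 * Vx                        # argmax_b { b*Vx - b^2 }
    for _ in range(200):                # inner PINN solve of the frozen PDE
        opt.zero_grad()
        loss = pde_residual(x, a, b).pow(2).mean()
        loss.backward()
        opt.step()

print(float(loss))
```

In the paper's general setting the min-max has no closed form, so the policy update is itself a small point-wise optimization carried out with automatic differentiation; boundary/terminal conditions would also enter the loss, which this sketch omits.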

Takeaways, Limitations

Takeaways:
Enables efficient and accurate solution of high-dimensional nonconvex HJI equations.
Presents a new approach combining PINNs with policy iteration, together with its theoretical foundation.
Suggests applicability to fields such as robotics, finance, and multi-agent reinforcement learning.
Shows superior performance and scalability on high-dimensional problems compared to existing methods such as finite difference schemes.
Limitations:
Relies on standard assumptions such as the Lipschitz condition and uniform ellipticity.
The convergence proof is limited to locally uniform convergence.
Experiments cover specific problems; generalization to broader problem classes remains to be demonstrated.