Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens

Created by
  • Haebom

Authors

Yiming Zhong, Yumeng Liu, Chuyang Xiao, Zemin Yang, Youzhuo Wang, Yufei Zhu, Ye Shi, Yujing Sun, Xinge Zhu, Yuexin Ma

Outline

Learning effective visuomotor policies for robot manipulation is challenging because it requires generating accurate motions while maintaining computational efficiency. In this paper, we observe that representing motion in the frequency domain captures its structure more effectively: low-frequency components reflect global movement patterns, while high-frequency components encode fine details. Furthermore, robot manipulation tasks of varying complexity require different levels of modeling precision across these frequency bands. Inspired by this, we propose a novel visuomotor policy learning paradigm that incrementally models hierarchical frequency components. To further improve precision, we introduce continuous latent representations that maintain smoothness and continuity in the motion space. Extensive experiments on various 2D and 3D robot manipulation benchmarks show that the proposed approach outperforms existing methods in both accuracy and efficiency, indicating the potential of a frequency-domain autoregressive framework with continuous tokens for generalized robot manipulation.
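The low-frequency versus high-frequency split described above can be illustrated with a small sketch. It assumes an orthonormal DCT-II as the frequency transform and a synthetic 2D action trajectory. The abstract does not name the transform, and the helper names (`dct_matrix`, `lowpass_reconstruct`, `demo_trajectory`, `reconstruction_errors`) are hypothetical, so this is an illustration of the general idea rather than the authors' implementation.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis of size n x n; row k is the k-th frequency."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    basis[0] /= np.sqrt(2.0)  # rescale the constant row so the basis is orthonormal
    return basis


def lowpass_reconstruct(actions: np.ndarray, num_freqs: int) -> np.ndarray:
    """Keep only the first num_freqs frequency components along the time axis.

    actions has shape (T, D): T time steps, D action dimensions.
    """
    basis = dct_matrix(actions.shape[0])
    coeffs = basis @ actions       # frequency coefficients, shape (T, D)
    coeffs[num_freqs:] = 0.0       # drop the higher-frequency bands
    return basis.T @ coeffs        # inverse transform (transpose of orthonormal basis)


def demo_trajectory(horizon: int = 16) -> np.ndarray:
    """Synthetic 2D trajectory (assumption): a smooth reach plus a small fast wiggle."""
    t = np.linspace(0.0, 1.0, horizon)
    x = t + 0.05 * np.sin(2 * np.pi * 6 * t)  # global motion plus fine detail
    y = t ** 2                                # smooth global motion only
    return np.stack([x, y], axis=1)


def reconstruction_errors(actions: np.ndarray) -> list:
    """RMS reconstruction error when keeping 1, 2, ..., T frequency components."""
    return [
        float(np.sqrt(np.mean((actions - lowpass_reconstruct(actions, m)) ** 2)))
        for m in range(1, actions.shape[0] + 1)
    ]


if __name__ == "__main__":
    for m, err in enumerate(reconstruction_errors(demo_trajectory()), start=1):
        print(f"{m:2d} components: RMS error {err:.6f}")
```

Because the basis is orthonormal, the error can only shrink as more components are kept, and it reaches numerical zero once all of them are included. This mirrors the coarse-to-fine intuition in the abstract: a few low-frequency components recover the overall motion, and the higher bands add the fine detail.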

Takeaways, Limitations

Takeaways:
Presents a novel methodology for modeling robot manipulation behavior using frequency-domain representations.
Improves both accuracy and efficiency by incrementally modeling hierarchical frequency components.
Introduces continuous latent representations to ensure smoothness and continuity in the motion space.
Demonstrates superior performance over existing methods on various 2D and 3D robot manipulation benchmarks.
Limitations:
No specific limitations are mentioned, and they cannot be assessed from the abstract alone.