Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation

Written by
  • Haebom

Authors

Yiming Wu, Huan Wang, Zhenghao Chen, Jianxin Pang, Dong Xu

Outline

This paper proposes LightDP, a novel framework designed to accelerate diffusion policies for real-time deployment on mobile devices, where limited compute resources make such deployment difficult. LightDP addresses the computational bottleneck through two key strategies: compressing the denoising network and reducing the number of sampling steps. The authors first analyze the computational cost of existing diffusion policy architectures and identify the denoising network as a primary contributor to latency. To overcome the performance degradation associated with existing pruning methods, they introduce an integrated pruning and retraining pipeline that explicitly optimizes the model's ability to recover after pruning. They further combine pruning with consistency distillation to reduce the number of sampling steps while maintaining action prediction accuracy. Experiments on standard benchmarks (PushT, Robomimic, CALVIN, and LIBERO) show that LightDP achieves real-time action prediction on mobile devices with competitive performance, a significant step toward practical deployment of diffusion-based policies in resource-constrained environments. Real-world experiments further show that LightDP performs comparably to state-of-the-art diffusion policies.
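To see why both strategies matter, consider a rough latency model, offered here only as an illustrative sketch: each predicted action costs one observation encoding plus one denoiser call per sampling step. The function name `estimate_latency_ms` and all numbers below are hypothetical and are not taken from the paper.

```python
def estimate_latency_ms(encoder_ms, denoiser_ms_per_step, num_steps):
    """Rough per-action latency: one encoder pass plus one denoiser call per step."""
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    return encoder_ms + denoiser_ms_per_step * num_steps


# Hypothetical timings chosen only for illustration; they are not from the paper.
baseline = estimate_latency_ms(encoder_ms=5.0, denoiser_ms_per_step=20.0, num_steps=100)

# A compressed denoiser lowers the cost of every step.
pruned = estimate_latency_ms(encoder_ms=5.0, denoiser_ms_per_step=10.0, num_steps=100)

# Distilling to fewer sampling steps lowers the number of denoiser calls.
pruned_few_steps = estimate_latency_ms(encoder_ms=5.0, denoiser_ms_per_step=10.0, num_steps=4)

print(baseline, pruned, pruned_few_steps)
```

Under these assumed numbers, the denoiser term dominates the total, which is why shrinking the per-step cost and the step count together has a multiplicative effect.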

Takeaways, Limitations

Takeaways:
LightDP is a framework that enables real-time deployment of diffusion policies on mobile devices.
It addresses computational bottlenecks through network compression and fewer sampling steps.
An integrated pruning and retraining pipeline mitigates post-pruning performance degradation.
Consistency distillation reduces the number of sampling steps while maintaining action prediction accuracy.
It achieves real-time action prediction with competitive performance on diverse benchmarks (PushT, Robomimic, CALVIN, LIBERO).
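As a generic illustration of what pruning a layer can mean, the sketch below removes the rows of a weight matrix with the smallest L1 norm. This is a common textbook form of structured magnitude pruning and is an assumption made for illustration; it is not the paper's pruning and retraining pipeline, and the helper `prune_rows_by_l1` and the toy weights are hypothetical.

```python
def prune_rows_by_l1(weights, keep_ratio):
    """Keep the rows (e.g. output units) with the largest L1 norm.

    Returns the pruned matrix and the indices of the rows that were kept.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    num_keep = max(1, int(len(weights) * keep_ratio))
    norms = [sum(abs(w) for w in row) for row in weights]
    # Rank rows from largest to smallest norm, then restore the original order.
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:num_keep])
    return [weights[i] for i in kept], kept


# Toy 4x2 weight matrix; values are made up for illustration.
toy_weights = [
    [0.9, -0.8],
    [0.01, 0.02],
    [-0.5, 0.4],
    [0.03, -0.01],
]
pruned_weights, kept_indices = prune_rows_by_l1(toy_weights, keep_ratio=0.5)
print(kept_indices)
```

In practice a pruned network usually needs retraining to recover accuracy, which is the problem the integrated pipeline described above targets.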
Limitations:
LightDP's performance improvements may depend on specific datasets and architectures.
Further research is needed on the generalization performance of the proposed pruning and retraining strategy.
Additional experiments are needed to consider the diversity of robot manipulation tasks in real-world environments.
Energy consumption is not analyzed.