Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Ego-Foresight: Self-supervised Learning of Agent-Aware Representations for Improved RL

Created by
  • Haebom

Author

Manuel Serra Nunes, Atabak Dehban, Yiannis Demiris, José Santos-Victor

Outline

This paper presents Ego-Foresight, a method inspired by how humans predict their own movement, to address the sample-efficiency problem of deep reinforcement learning (RL). To reduce the large amounts of training data conventional RL requires, the approach disentangles the agent from its environment. Unlike prior work, however, it learns this agent-environment separation from the agent's own movements, without any supervised signal. Ego-Foresight improves the agent's perception through self-supervised visuomotor prediction, learning to forecast the agent's motion in both simulated and real-world robot data. Integrated with model-free RL algorithms, it yields improved sample efficiency and task performance.
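The core idea, self-supervised visuomotor prediction, can be illustrated with a toy sketch. This is not the paper's architecture (which operates on images); it is a hypothetical linear stand-in showing the training signal: predict the agent's next state from its current state and its own action, with the prediction error serving as the self-supervised loss. All names and the toy dynamics below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 8, 2

# Hypothetical ground-truth dynamics, used only to generate toy data:
# contractive drift plus an action-dependent motion term.
B_true = rng.normal(size=(STATE_DIM, ACTION_DIM)) * 0.1

def step(s, a):
    # The agent's next state depends on its own action -- this is the
    # "ego" motion the predictor must learn, with no external labels.
    return 0.9 * s + B_true @ a

# Collect (state, action, next_state) triples from random exploration.
states, actions, next_states = [], [], []
s = rng.normal(size=STATE_DIM)
for _ in range(500):
    a = rng.normal(size=ACTION_DIM)
    s_next = step(s, a)
    states.append(s); actions.append(a); next_states.append(s_next)
    s = s_next

X = np.hstack([np.array(states), np.array(actions)])  # predictor input
Y = np.array(next_states)                             # prediction target

# Fit a linear visuomotor predictor by least squares. In the actual
# method this role is played by a neural network trained with SGD on
# the same kind of prediction error.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
mse = float(np.mean((X @ W - Y) ** 2))
print(f"self-supervised prediction MSE: {mse:.2e}")
```

Because the toy data is noiseless and linear, the predictor recovers the dynamics almost exactly; in the full method, this prediction objective is what shapes agent-aware representations that are then shared with a model-free RL learner.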

Takeaways, Limitations

Takeaways:
We demonstrate that reinforcement learning's sample efficiency can be improved by enhancing the agent's perceptual ability through self-supervised learning.
We present a novel approach to improve the performance of RL algorithms by mimicking human motion prediction capabilities.
We have verified its effectiveness not only in simulation environments but also in actual robot data, increasing its applicability in practice.
Limitations:
The generalizability of the proposed method still needs to be validated across a wider range of environments and tasks.
Currently, it has been applied to model-free RL algorithms, but integration and performance comparison studies with model-based RL algorithms are needed.
The scale of the real-world robot experiments may be limited; evaluation on more diverse and complex tasks is needed.