Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Rational Inverse Reasoning

Created by
  • Haebom

Authors

Ben Zandonati, Tomás Lozano-Pérez, Leslie Pack Kaelbling

Outline

This paper observes that humans can generalize from a single example, while robots struggle to generalize, and argues that the gap comes from robots' inability to recover the underlying explanation (a latent program) behind intelligent behavior. To address this, the authors propose Rational Inverse Reasoning (RIR), a framework that infers latent programs through a hierarchical generative model of behavior. RIR frames few-shot imitation as Bayesian program induction: a vision-language model iteratively proposes structured symbolic task hypotheses, and a planner-based inference procedure scores each hypothesis by the likelihood of the observed demonstrations under it. This process yields a posterior distribution over concise, feasible programs. RIR is evaluated on a set of continuous manipulation tasks that test one-shot and few-shot generalization across variations in object pose, count, geometry, and arrangement. The results show that RIR can infer the intended task structure and generalize to new settings from as little as a single demonstration, outperforming state-of-the-art vision-language model baselines.
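The propose-and-score loop described above can be illustrated with a minimal sketch. This is not the authors' implementation: the `Hypothesis` class, the length-based prior, the Gaussian likelihood, and the fixed list of candidates (standing in for vision-language model proposals and planner rollouts) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical stand-in for a symbolic task program."""
    name: str
    program_length: int  # number of symbolic steps in the program
    predicted: float     # toy stand-in for what a planner would produce


def log_prior(h: Hypothesis, length_penalty: float = 1.0) -> float:
    # Assumed prior: shorter (more concise) programs are more probable.
    return -length_penalty * h.program_length


def log_likelihood(h: Hypothesis, demo: float, sigma: float = 0.5) -> float:
    # Assumed likelihood: Gaussian noise around the planner's prediction
    # (unnormalized; the constant cancels in the posterior).
    return -0.5 * ((demo - h.predicted) / sigma) ** 2


def posterior(hypotheses: Sequence[Hypothesis],
              demos: Sequence[float]) -> Dict[str, float]:
    # Bayes rule in log space: log prior plus summed log likelihoods.
    scores = [
        log_prior(h) + sum(log_likelihood(h, d) for d in demos)
        for h in hypotheses
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {h.name: w / total for h, w in zip(hypotheses, weights)}


# In RIR the candidates would come from a vision-language model;
# here they are a fixed, made-up list.
candidates = [
    Hypothesis("short_and_consistent", program_length=2, predicted=1.0),
    Hypothesis("long_and_consistent", program_length=5, predicted=1.0),
    Hypothesis("short_but_inconsistent", program_length=2, predicted=3.0),
]
post = posterior(candidates, demos=[1.0])
best = max(post, key=post.get)
print(best, post)
```

With a single toy demonstration, the hypothesis that is both short and consistent with the observation receives the most posterior mass, which mirrors the preference for concise, feasible programs described in the summary.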

Takeaways, Limitations

Takeaways:
The RIR framework suggests a way to improve robots' few-shot learning ability.
Combining a vision-language model with a planner enables more efficient imitation learning.
Even a single demonstration can support generalization to new settings, which contributes to the development of generalizable robot control systems.
Limitations:
The evaluation is currently limited to continuous manipulation tasks; verification in more diverse task domains is still needed.
RIR relies on a planner to score hypotheses, so limitations of the planning algorithm can constrain RIR's performance.
Generalization to complex tasks or tasks involving multi-object interactions requires further study.