In this paper, we present “Moving Out,” a new benchmark for human-AI collaboration in physically constrained environments. Moving Out covers diverse collaboration modes shaped by physical properties and constraints, such as moving heavy objects together or maneuvering objects around corners. We design two tasks and collect human-human interaction data to evaluate how well models adapt to diverse human behaviors and unpredictable physical properties. To address the challenges of physical environments, we propose a novel method, Behavior Augmentation, Simulation, and Selection (BASS), to enhance agent diversity and agents’ understanding of behavioral outcomes. Experimental results show that BASS outperforms state-of-the-art models in both AI-AI and human-AI collaboration.