In this paper, we propose X-Sim, a novel framework for learning robot manipulation policies by mimicking human motion. Unlike existing methods, which struggle with the embodiment gap between humans and robots, X-Sim reconstructs realistic simulation environments from RGBD images and trains reinforcement learning (RL) policies using dense reward signals derived from object motion. The learned policies are distilled into an image-conditioned diffusion policy using synthetic data rendered under varied viewpoints and illuminations, and online domain adaptation aligns real and simulated observations for real-world deployment. We demonstrate that our approach outperforms existing methods by an average of 30% across five manipulation tasks without any teleoperation data, reduces data collection time by a factor of 10, and generalizes to new camera viewpoints and test-time variations.
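To make the object-centric reward idea concrete, the following is a minimal sketch of what a dense reward based on object motion might look like. It is an illustrative assumption, not the paper's actual implementation: the function name, pose representation (position plus unit quaternion), and weights `w_pos`/`w_rot` are all hypothetical.

```python
import numpy as np

def object_motion_reward(obj_pose, goal_pose, w_pos=1.0, w_rot=0.1):
    """Hypothetical dense reward: negative distance of the *object*
    (not the robot end-effector) from its goal pose.

    obj_pose, goal_pose: (position (3,), unit quaternion (4,)) tuples.
    Returns 0 at the goal and increasingly negative values farther away,
    giving the RL agent a dense, object-centric learning signal.
    """
    pos, quat = obj_pose
    goal_pos, goal_quat = goal_pose
    # Translational error: Euclidean distance between object and goal positions.
    pos_dist = np.linalg.norm(np.asarray(pos) - np.asarray(goal_pos))
    # Rotational error: angle between orientations; abs() handles the
    # fact that q and -q represent the same rotation.
    dot = abs(float(np.dot(quat, goal_quat)))
    rot_dist = 2.0 * np.arccos(np.clip(dot, 0.0, 1.0))
    return -(w_pos * pos_dist + w_rot * rot_dist)
```

In this sketch the reward is shaped at every timestep by the object's pose error, rather than only at task completion, which is what makes the signal dense.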