In robotic manipulation with Vision-Language Models (VLMs), existing methods typically compress instructions and observations into intermediate representations, discarding fine-grained spatial and action details in the process. To address this issue, we propose AntiGrounding, a novel framework that directly lifts candidate actions into the VLM representation space, renders the corresponding trajectories from multiple viewpoints, and performs instruction-conditioned decision making via structured visual question answering. This enables zero-shot synthesis of optimal closed-loop robot trajectories for new tasks. In addition, we introduce an offline policy refinement module that leverages past experience to improve long-term performance. Experiments in both simulation and the real world demonstrate that the proposed method outperforms existing approaches across a variety of robotic manipulation tasks.
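To make the abstract's pipeline concrete, the sketch below illustrates one plausible reading of a single closed-loop decision step: candidate trajectories are rendered from several viewpoints, and a VLM is queried with a structured VQA prompt to pick the best candidate. This is only a minimal illustration under stated assumptions, not the authors' implementation; all names here (`CandidateAction`, `render_views`, `build_vqa_prompt`, `select_action`, the `vlm_answer` callable) are hypothetical placeholders.

```python
"""Minimal sketch of an AntiGrounding-style decision step (hypothetical API, not the paper's code)."""
from dataclasses import dataclass
from typing import Callable, List, Sequence

import numpy as np


@dataclass
class CandidateAction:
    """A candidate end-effector trajectory: a (T, 3) array of waypoints plus a label for the VQA prompt."""
    waypoints: np.ndarray
    label: str


def render_views(action: CandidateAction, viewpoints: Sequence[str]) -> List[np.ndarray]:
    """Placeholder renderer: overlay the candidate trajectory on RGB images from each viewpoint.

    A real system would project the waypoints into each camera frame and draw them
    on the current observation; here we return blank images of the expected shape.
    """
    return [np.zeros((224, 224, 3), dtype=np.uint8) for _ in viewpoints]


def build_vqa_prompt(instruction: str, candidates: Sequence[CandidateAction]) -> str:
    """Structured VQA query: ask which rendered candidate best satisfies the instruction."""
    options = ", ".join(c.label for c in candidates)
    return (
        f"Task instruction: {instruction}\n"
        "Each image shows one candidate trajectory rendered from several viewpoints.\n"
        f"Answer with the single best candidate label from: {options}."
    )


def select_action(
    instruction: str,
    candidates: Sequence[CandidateAction],
    viewpoints: Sequence[str],
    vlm_answer: Callable[[str, List[np.ndarray]], str],
) -> CandidateAction:
    """One closed-loop decision step: render every candidate, query the VLM, return the chosen action."""
    images: List[np.ndarray] = []
    for cand in candidates:
        images.extend(render_views(cand, viewpoints))
    answer = vlm_answer(build_vqa_prompt(instruction, candidates), images)
    by_label = {c.label: c for c in candidates}
    # Fall back to the first candidate if the VLM's answer is not a known label.
    return by_label.get(answer.strip(), candidates[0])


if __name__ == "__main__":
    cands = [CandidateAction(np.random.rand(10, 3), label=f"A{i}") for i in range(3)]
    # Dummy VLM stand-in that always answers "A1"; a real VLM client would go here.
    chosen = select_action("pick up the red mug", cands, ["front", "top"], lambda prompt, imgs: "A1")
    print("chosen candidate:", chosen.label)
```

In a full system this step would be repeated at each control cycle to obtain closed-loop behavior, with the offline refinement module (not sketched here) adjusting how candidates are proposed based on logged past experience.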