Nvidia, UC Berkeley and Stanford released a framework showing that state-of-the-art AI models failed at robot control when not given human-designed building blocks, and demonstrated that agentic scaffolding techniques, including targeted test-time compute scaling, substantially closed the performance gap.