B-MoCA is a new benchmark for evaluating mobile device control agents. Built on the Android operating system, it defines 131 common tasks and measures generalization by randomizing the device configuration, such as the user interface layout and language settings. The benchmark covers several kinds of agents: agents that use large language models (LLMs) or multi-modal LLMs, and agents trained by imitation learning on expert demonstrations. The results show that these agents handle simple tasks well but perform poorly on complex tasks, which points to important areas for future research. The source code is publicly available.
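
The idea of testing generalization by randomizing device configurations can be illustrated with a minimal sketch. This is not the benchmark's actual code: the option pools, the `DeviceConfig` fields, and the function names below are hypothetical, chosen only to show how seeded sampling could produce reproducible sets of device setups and how a success rate over episodes might be computed.

```python
import random
from dataclasses import dataclass
from typing import List

# Hypothetical option pools; the benchmark's real settings may differ.
LANGUAGES = ["en", "ko", "fr", "es"]
ICON_LAYOUTS = ["grid_4x5", "grid_5x6", "shuffled"]
FONT_SCALES = [0.85, 1.0, 1.3]


@dataclass(frozen=True)
class DeviceConfig:
    """One randomized device setup (illustrative fields only)."""
    language: str
    icon_layout: str
    font_scale: float


def sample_configs(n: int, seed: int) -> List[DeviceConfig]:
    """Draw n device configurations reproducibly from a seeded RNG."""
    rng = random.Random(seed)
    return [
        DeviceConfig(
            language=rng.choice(LANGUAGES),
            icon_layout=rng.choice(ICON_LAYOUTS),
            font_scale=rng.choice(FONT_SCALES),
        )
        for _ in range(n)
    ]


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of episodes in which the agent completed its task."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0
```

Under these assumptions, one set of configurations (say, `sample_configs(10, seed=0)`) could be used for training or prompting, and a separately seeded set for evaluation, so that the reported success rate reflects device setups the agent has not seen.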