This paper evaluates the feasibility of powering digital assistants capable of executing complex tasks with large language models (LLMs). Such an assistant relies on pre-trained programming knowledge to carry out multi-step goals by composing objects and functions defined in an assistant library into a task execution program. To this end, we develop the ASPERA framework, which consists of an assistant library simulation and a human-assisted LLM data generation engine. The engine enables developers to guide the generation of high-quality tasks consisting of complex user queries, simulation states, and corresponding validation programs, thereby addressing challenges of data availability and evaluation robustness. Alongside the framework, we release Asper-Bench, an evaluation dataset of 250 challenging tasks generated with ASPERA, which shows that program generation grounded in a user-defined assistant library poses a significantly greater challenge to LLMs than dependency-free code generation.
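As an illustration of what such a task execution program might look like, the sketch below composes hypothetical assistant-library functions for the query "schedule a meeting with Alex tomorrow at 10". The names (`find_contact`, `Event`, `add_event`, `send_invite`) and the stub implementations are assumptions for exposition only, not the actual ASPERA library API.

```python
# Hypothetical sketch: stand-in stubs for an assistant library, not the ASPERA API.
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Contact:
    name: str
    email: str


@dataclass
class Event:
    title: str
    start: datetime
    attendees: list = field(default_factory=list)


def find_contact(name: str) -> Contact:
    """Stub lookup in a simulated contact book."""
    return Contact(name=name, email=f"{name.lower()}@example.com")


def add_event(event: Event) -> Event:
    """Stub that would persist the event to the simulated calendar."""
    return event


def send_invite(event: Event, contact: Contact) -> None:
    """Stub that would notify the attendee within the simulation."""
    print(f"Invited {contact.email} to '{event.title}' at {event.start}")


# A hypothetical program the assistant might generate for
# "schedule a meeting with Alex tomorrow at 10":
alex = find_contact("Alex")
start = datetime.now().replace(hour=10, minute=0, second=0, microsecond=0) + timedelta(days=1)
meeting = add_event(Event(title="Meeting with Alex", start=start, attendees=[alex]))
send_invite(meeting, alex)
```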