This paper presents a reinforcement learning (RL)-based framework for efficiently developing complex model transformation (MT) sequences in model-based engineering. Complex MT sequences are required for a variety of problems, including model synchronization, automatic model recovery, and design space exploration; however, developing them manually is challenging and error-prone. We propose an approach and technical framework that enable an RL agent to find optimal MT sequences guided by user advice, even when that advice is uncertain. We map user-defined MTs to RL primitives and execute them as RL programs to search for optimal MT sequences. Experimental results demonstrate that user advice, even under uncertainty, significantly improves RL performance, contributing to more efficient development of complex MT sequences. This study advances RL-based human-in-the-loop engineering methodology by addressing the tradeoff between the certainty and the timing of user advice.
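To make the core idea concrete, the following is a minimal, hypothetical sketch of one way MTs could be mapped to RL primitives: transformations become discrete actions in a tabular Q-learning loop, and uncertain user advice (a suggested action with a confidence value) biases exploration. This is an illustrative assumption, not the paper's actual framework; all names (`MT_ACTIONS`, `ADVICE`, `q_learn`) and the toy integer "model" are invented for this sketch.

```python
"""Sketch: model transformations as RL actions, with uncertain user advice.
This is an assumption-laden illustration, not the paper's implementation."""
import random
from collections import defaultdict

# Toy "model": an integer to be transformed from 0 into the goal value 5.
GOAL = 5
# Each MT is a callable that rewrites the model (here: simple arithmetic).
MT_ACTIONS = [lambda m: m + 1, lambda m: m + 2, lambda m: max(m - 1, 0)]

# Uncertain user advice: state -> (suggested action index, confidence in [0, 1]).
ADVICE = {0: (1, 0.9), 2: (0, 0.6), 3: (1, 0.4)}

def choose_action(q, state, epsilon=0.2):
    """Epsilon-greedy selection; follow advice with probability = its confidence."""
    if state in ADVICE:
        action, confidence = ADVICE[state]
        if random.random() < confidence:
            return action
    if random.random() < epsilon:
        return random.randrange(len(MT_ACTIONS))
    return max(range(len(MT_ACTIONS)), key=lambda a: q[(state, a)])

def q_learn(episodes=500, alpha=0.5, gamma=0.95):
    """Tabular Q-learning over MT sequences, with a step penalty so that
    shorter transformation sequences earn higher return."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        for _ in range(20):  # cap the MT sequence length per episode
            action = choose_action(q, state)
            next_state = MT_ACTIONS[action](state)
            reward = 10.0 if next_state == GOAL else -1.0
            best_next = max(q[(next_state, a)] for a in range(len(MT_ACTIONS)))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if state == GOAL:
                break
    return q

if __name__ == "__main__":
    q = q_learn()
    # Greedy rollout: the learned MT sequence as a list of action indices.
    state, sequence = 0, []
    while state != GOAL and len(sequence) < 10:
        action = max(range(len(MT_ACTIONS)), key=lambda a: q[(state, a)])
        sequence.append(action)
        state = MT_ACTIONS[action](state)
    print("Learned MT sequence (action indices):", sequence)
```

In this sketch, advice confidence directly controls how often the agent defers to the user, which is one plausible way to model the certainty/timing tradeoff the abstract mentions: high-confidence advice dominates early exploration, while low-confidence advice leaves room for the agent's own policy.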