This paper addresses the gap between reinforcement learning-based game AI, which focuses on improving skill, and evolutionary algorithm-based methods, which generate diverse play styles but suffer from poor performance. We present Mixed Proximal Policy Optimization (MPPO), a method that improves the skill of existing low-performing agents while preserving their distinctive play styles. MPPO integrates the loss objectives for online and offline samples and introduces implicit constraints that approximate the demo agent's policy by adjusting the empirical distribution of the training samples. Experimental results on environments of various scales demonstrate that MPPO achieves skill levels comparable to or better than purely online algorithms while preserving the play styles of the demo agent. Consequently, MPPO offers an effective way to generate highly skilled yet diverse game agents, contributing to more immersive gaming experiences.
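The core idea stated above, combining on-policy and demonstration-derived samples in a single PPO-style objective, can be illustrated with a minimal sketch. The function names, the fixed `demo_weight`, and the batch layout below are illustrative assumptions, not the paper's actual formulation; they only show how reweighting the demo samples in the empirical mixture could act as an implicit constraint toward the demo agent's policy.

```python
# Minimal sketch (assumed, not the authors' implementation) of a mixed
# PPO-style policy loss over online and demo (offline) samples.
import torch

def ppo_clip_term(log_prob_new, log_prob_old, advantage, clip_eps=0.2):
    """Standard clipped PPO surrogate for one batch of samples."""
    ratio = torch.exp(log_prob_new - log_prob_old)
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    return -torch.min(unclipped, clipped).mean()

def mixed_ppo_loss(online_batch, demo_batch, demo_weight=0.5, clip_eps=0.2):
    """Combine surrogate losses of on-policy and demo-replay samples.

    Each batch is a tuple (log_prob_new, log_prob_old, advantage).
    Adjusting demo_weight (i.e., the demo samples' share of the empirical
    distribution) implicitly pulls the learned policy toward the demo
    agent's behavior while the online term improves its skill.
    """
    online_loss = ppo_clip_term(*online_batch, clip_eps=clip_eps)
    demo_loss = ppo_clip_term(*demo_batch, clip_eps=clip_eps)
    return online_loss + demo_weight * demo_loss
```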