This paper presents a study on improving the efficiency of Goodness of Pronunciation (GOP), a pronunciation quality metric used in computer-assisted pronunciation training (CAPT) systems. Conventional GOP methods rely on forced alignment, which is vulnerable to labeling and segmentation errors caused by acoustic variation. Alignment-free methods have been proposed, but they are computationally expensive, and their performance degrades as phoneme sequences grow longer and the phoneme inventory grows larger. We therefore propose a substitution-aware alignment-free GOP that restricts the phoneme substitutions considered to those suggested by phoneme clusters and common learner errors. We evaluate the proposed method on two L2 English speech datasets, My Pronunciation Coach (MPC) and SpeechOcean762, and show that it outperforms existing methods.
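
To make the idea of restricting substitutions concrete, the following is a minimal Python sketch and not the formulation used in this paper. It assumes a hypothetical cluster table (`PHONEME_CLUSTERS`), a hypothetical learner-error table (`LEARNER_ERRORS`), and per-phoneme log-likelihoods that some alignment-free scorer has already produced; all names and values are illustrative assumptions. The score is a simple log-ratio normalized over the restricted candidate set rather than the full phoneme inventory.

```python
import math

# Hypothetical phoneme clusters (illustrative values, not taken from the paper).
PHONEME_CLUSTERS = [
    {"IY", "IH"},
    {"S", "Z", "SH"},
    {"T", "D"},
]

# Hypothetical common learner substitutions: canonical -> likely replacements.
LEARNER_ERRORS = {
    "TH": {"S", "T"},
    "IH": {"IY"},
}


def substitution_candidates(canonical):
    """Return the restricted set of phonemes allowed to replace `canonical`."""
    candidates = set()
    for cluster in PHONEME_CLUSTERS:
        if canonical in cluster:
            candidates |= cluster
    candidates |= LEARNER_ERRORS.get(canonical, set())
    candidates.discard(canonical)  # a phoneme is not its own substitute
    return candidates


def restricted_log_ratio(canonical, log_likelihoods):
    """Log-ratio of the canonical phoneme against itself plus its restricted substitutes.

    `log_likelihoods` maps each phoneme to an (assumed, precomputed)
    log-likelihood of the utterance with that phoneme at the target position.
    Phonemes outside the restricted set are ignored.
    """
    allowed = {canonical} | substitution_candidates(canonical)
    scores = [log_likelihoods[p] for p in allowed if p in log_likelihoods]
    # Numerically stable log-sum-exp over the restricted candidates.
    peak = max(scores)
    log_denominator = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_likelihoods[canonical] - log_denominator
```

In this sketch the denominator sums over only a handful of candidates per phoneme, so its cost no longer scales with the size of the full phoneme list, which is the kind of saving that restricting substitutions is meant to provide.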