This paper addresses the problem of evaluation data contamination that can arise during distillation, a process commonly used to improve the reasoning capability of large language models. Specifically, considering that distillation data is typically only partially accessible, we define the task of distillation data detection and propose Token Probability Deviation (TBD), a method that leverages patterns in token generation probabilities. TBD builds on the observation that a distilled model tends to generate certain tokens with high probability for questions seen during training but with comparatively low probability for unseen questions, and it detects distillation data by quantifying this deviation in token probabilities. Experimental results show that TBD achieves an AUC of 0.918 and a TPR of 0.470 at an FPR of 1% on the S1 dataset.
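The abstract does not specify how the deviation is computed; the sketch below illustrates one plausible reading of the idea, scoring a question by the spread of per-token probabilities in a distilled model's greedy answer. The function name `token_probability_deviation`, the use of Hugging Face `transformers`, the standard-deviation scoring rule, and the example checkpoint are assumptions for illustration, not the paper's exact formulation.

```python
# Minimal sketch (not the paper's official implementation): score a question
# by the spread of per-token probabilities in the model's greedy answer.
# The scoring rule and all names below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def token_probability_deviation(model, tokenizer, question, max_new_tokens=128):
    """Return a deviation score over the model's greedy answer to `question`.

    Intuition from the abstract: a distilled model assigns uniformly high
    probabilities to tokens for questions it was trained on, so a lower
    deviation would suggest the question may belong to the distillation data.
    """
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,                 # greedy decoding
            output_scores=True,              # keep per-step logits
            return_dict_in_generate=True,
        )

    # Tokens the model actually generated (prompt stripped off).
    gen_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]

    # Probability the model assigned to each token it generated.
    probs = []
    for step, logits in enumerate(out.scores):
        step_probs = torch.softmax(logits[0], dim=-1)
        probs.append(step_probs[gen_tokens[step]].item())

    # One plausible deviation measure: standard deviation of the
    # per-token probabilities across the generated answer.
    return float(torch.tensor(probs).std())


if __name__ == "__main__":
    name = "Qwen/Qwen2.5-1.5B-Instruct"  # placeholder checkpoint; any causal LM works
    tok = AutoTokenizer.from_pretrained(name)
    lm = AutoModelForCausalLM.from_pretrained(name)
    print(token_probability_deviation(lm, tok, "What is 17 * 24?"))
```

Given such per-question scores for known member and non-member questions, sweeping a decision threshold yields the ROC curve from which metrics such as AUC and TPR at a fixed FPR (e.g., 1%) are computed.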