Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Evaluating Logit-Based GOP Scores for Mispronunciation Detection

Created by
  • Haebom

Authors

Aditya Kamlesh Parikh, Cristian Tejedor-Garcia, Catia Cucchiarini, Helmer Strik

Outline

In this paper, we evaluate logit-based GOP (Goodness of Pronunciation) scores against the conventional GOP computed from softmax probabilities for pronunciation assessment. We conduct experiments on two L2 English corpora, one of Dutch speakers and one of Mandarin speakers, and evaluate both classification performance and correlation with human rater scores. The results show that the logit-based methods outperform the probability-based GOP in classification performance, although the effect varies with the characteristics of the dataset. The maximum logit GOP matches human perception best. A hybrid method that combines several GOP scores can balance probability and logit features, and our results suggest that hybrid GOP methods incorporating uncertainty modeling and phoneme-specific weighting can improve pronunciation assessment.
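To make the distinction between the two score families concrete, here is a minimal sketch. It is not the paper's implementation: the function names, the toy logits, and the exact formulas (mean log posterior over a segment's frames for the probability-based score, maximum raw target logit for the logit-based score) are assumptions chosen for illustration, since this summary does not give the paper's definitions.

```python
import math

def log_softmax(frame_logits):
    """Numerically stable log-softmax over one frame's phoneme logits."""
    m = max(frame_logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in frame_logits))
    return [x - log_sum for x in frame_logits]

def prob_gop(segment_logits, target):
    """Assumed probability-based GOP: mean log posterior of the target
    phoneme over all frames aligned to the segment."""
    log_posts = [log_softmax(frame)[target] for frame in segment_logits]
    return sum(log_posts) / len(log_posts)

def max_logit_gop(segment_logits, target):
    """Assumed maximum logit GOP: highest raw (pre-softmax) logit of the
    target phoneme across the segment's frames."""
    return max(frame[target] for frame in segment_logits)

# Hypothetical toy segment: 3 frames, 3 phoneme classes, target index 0.
segment = [[2.0, 0.5, -1.0],
           [3.0, 0.0, -0.5],
           [1.0, 1.5, 0.0]]

print("probability-based GOP:", prob_gop(segment, 0))
print("maximum logit GOP:", max_logit_gop(segment, 0))
```

In this sketch, adding a constant to every logit leaves the softmax-based score unchanged but shifts the logit-based score. That is one way to see that raw logits carry information which softmax normalization discards.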

Takeaways, Limitations

Takeaways:
We show that logit-based GOP scores outperform probability-based GOP scores in mispronunciation detection (classification).
The maximum logit GOP shows the highest correlation with human ratings.
We suggest that a hybrid GOP method (combining probabilistic and logit features) can contribute to improving pronunciation assessment performance.
We suggest that uncertainty modeling and phoneme-specific weighting are important factors in improving pronunciation assessment.
Limitations:
The effectiveness of logit-based methods varies depending on the characteristics of the dataset.
The datasets used are limited to L2 English learners whose native languages are Dutch and Mandarin, which may limit generalizability to learners with other native languages and backgrounds.
Further research is needed on the optimal design and generalization performance of the hybrid GOP method.