Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MultiGen: Child-Friendly Multilingual Speech Generator with LLMs

Created by
  • Haebom

Authors

Xiaoxue Gao, Huayun Zhang, Nancy F. Chen

Outline

This paper addresses high-quality, child-friendly speech generation across diverse languages and cultural backgrounds, including low-resource languages. Generative speech models are useful in practical applications such as language learning for children, and the authors aim to leverage that potential. To this end, they propose MultiGen, a multilingual speech generation model that uses an LLM architecture for speech generation tailored to low-resource languages. MultiGen is intended to facilitate children's communication with AI systems in culturally appropriate contexts and covers three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil. Experimental results, including objective metrics and subjective evaluations, show that MultiGen outperforms baseline methods.

Takeaways, Limitations

Takeaways:
Presents a novel approach to child-friendly multilingual speech generation for low-resource languages.
Contributes to addressing speech generation for low-resource languages by using an LLM architecture.
Shows the possibility of child-friendly interaction with AI systems that takes cultural context into account.
Validates the model's performance against baseline methods through objective and subjective evaluations.
Limitations:
Extensibility to languages beyond the three low-resource languages used in the paper remains to be examined.
Further research is needed on the objectivity and generalizability of the criteria and assessment methods used to judge child-friendliness.
Detailed information about the size and quality of MultiGen's training data is lacking.
No long-term usability testing results with actual child users are reported.