Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MemoryTalker: Personalized Speech-Driven 3D Facial Animation via Audio-Guided Stylization

Created by
  • Haebom

Authors

Hyung-Kyu Kim, Sangmin Lee, Hak-Gu Kim

Outline

This paper proposes MemoryTalker, which synthesizes realistic and accurate 3D facial motion that reflects a speaker's speaking style using only audio input. Previous methods have limited practical use because they require prior information, such as the speaker's class label or additional 3D face meshes, to reflect speaking style adequately. MemoryTalker is trained in two stages: Memorizing, which stores and retrieves common motions in a motion memory; and Animating, which synthesizes personalized facial motions using the motion memory stylized by speaking-style features extracted from the audio. In the second stage, the model learns which types of facial motion should be emphasized for a given piece of audio. As a result, MemoryTalker can generate reliable personalized facial animations without additional prior information. Quantitative and qualitative evaluations and user studies show that the model is effective and improves on existing state-of-the-art methods for personalized facial animation.
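The two stages can be pictured with a small key-value memory. The sketch below is only an illustration, not the authors' implementation: the function names (`retrieve_motion`, `stylize_memory`), the dot-product addressing, and the per-slot gains standing in for the audio-derived style signal are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_motion(query, keys, values):
    """Read one motion feature from the memory (hypothetical stage-1 read).

    query:  audio feature vector
    keys:   one key vector per memory slot
    values: one motion feature vector per memory slot
    """
    # Address the slots by similarity between the audio query and each key.
    weights = softmax([dot(query, key) for key in keys])
    dim = len(values[0])
    # The retrieved motion is the weighted sum of the slot values.
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def stylize_memory(values, gains):
    """Scale each slot by a gain (hypothetical stage-2 stylization).

    gains: one non-negative number per slot; in this sketch it stands in
    for a style signal predicted from the audio, which is an assumption.
    """
    return [[g * x for x in slot] for slot, g in zip(values, gains)]
```

With two slots, a query that addresses both equally, and gains of `[2.0, 1.0]`, the read from the stylized memory emphasizes the first slot's motion relative to the plain read, which is the kind of per-audio emphasis the second stage is described as learning.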

Takeaways, Limitations

Takeaways:
Generates personalized 3D facial animations from audio input alone.
Removes the dependence on prior information that limits existing methods.
Produces realistic and accurate animations that reflect the speaker's speech patterns.
Improves practicality across a range of applications.
Limitations:
The paper does not explicitly discuss its limitations; additional experiments or analyses would be needed to identify them.
The model may overfit to certain kinds of audio.
Generalization across different languages and speaking styles still needs to be evaluated.