Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

FaceEditTalker: Controllable Talking Head Generation with Facial Attribute Editing

Created by
  • Haebom

Authors

Guanwen Feng, Zhiyuan Ma, Yunan Li, Jiahao Yang, Junwei Jing, Qiguang Miao

Outline

This paper presents FaceEditTalker, a framework that integrates facial attribute editing into audio-driven talking head generation. Unlike previous work that focuses on lip synchronization and emotional expression, FaceEditTalker flexibly adjusts visual attributes such as hairstyle, accessories, and fine facial details, broadening its applicability to personalized digital avatars, online educational content, and brand-specific digital customer service. The framework consists of two modules: an image feature space editing module that extracts semantic and detail features and controls their attributes, and an audio-driven video generation module that fuses the edited features with audio-guided facial landmarks to drive a diffusion-based generator. Experiments show that FaceEditTalker matches or surpasses existing methods in lip synchronization accuracy, video quality, and attribute controllability.
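The two-stage pipeline described above can be sketched roughly as follows. This is a minimal illustrative stub, not the authors' code: all function names, tensor shapes, and implementations are assumptions made only to show how the edited identity features and the audio-guided landmarks would flow into a frame generator.

```python
import numpy as np

def edit_image_features(image, attribute_edits):
    """Image feature space editing module (stub): extract a toy
    'semantic' feature vector from a reference face and shift it along
    attribute directions (e.g. hairstyle, accessories)."""
    features = image.mean(axis=(0, 1))  # (3,) toy feature vector
    for direction, strength in attribute_edits:
        features = features + strength * direction
    return features

def audio_to_landmarks(audio, num_frames=4):
    """Stub audio encoder: map a waveform to per-frame facial
    landmark displacements (68 landmarks, x/y)."""
    rng = np.random.default_rng(abs(int(audio.sum() * 1e6)) % 2**32)
    return rng.normal(scale=0.01, size=(num_frames, 68, 2))

def diffusion_generator(features, landmarks):
    """Stub stand-in for the diffusion-based generator: fuse the edited
    identity features with audio-guided landmarks, one row per frame."""
    num_frames = landmarks.shape[0]
    frames = np.tile(features, (num_frames, 1))          # (frames, feat_dim)
    return frames + landmarks.mean(axis=(1, 2))[:, None]  # add audio drive

# Toy inputs
image = np.random.rand(64, 64, 3)          # reference face
audio = np.random.rand(16000)              # 1 s of 16 kHz audio
edit = (np.ones(3), 0.5)                   # attribute direction + strength

features = edit_image_features(image, [edit])
landmarks = audio_to_landmarks(audio)
video = diffusion_generator(features, landmarks)
print(video.shape)  # (4, 3): num_frames x feature_dim
```

The key structural point the sketch mirrors is the decoupling: attribute editing happens once in feature space, while the audio path only supplies per-frame motion, so identity edits persist across all generated frames.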

Takeaways, Limitations

Takeaways:
  • Integrating facial attribute editing into audio-driven talking head generation enables user customization and expansion into new application areas.
  • Combining the image feature space editing module with the audio-driven video generation module achieves temporal consistency, visual fidelity, and identity preservation simultaneously.
  • Strong potential for applications such as digital avatars, online education, and customer service.
  • Experiments verify improved performance over existing methods.
Limitations:
  • The paper does not explicitly discuss its limitations or future research directions.
  • A more detailed description of the datasets and evaluation metrics used is needed.
  • Further research is needed on performance and stability in real-world deployments.