Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation

Created by
  • Haebom

Authors

Shuyuan Tu, Zhen Xing, Xintong Han, Zhi-Qi Cheng, Qi Dai, Chong Luo, Zuxuan Wu, Yu-Gang Jiang

Outline

StableAnimator++ is an identity (ID)-preserving video diffusion framework for human image animation. It targets a weakness of existing diffusion-based animation models, which struggle to keep ID consistent, especially when the body size or position in the reference image differs significantly from that in the driving video. The framework generates high-quality videos from a reference image and a pose sequence without post-processing, and the authors present it as the first such model with learnable pose alignment. For pose alignment, learnable layers guided by Singular Value Decomposition (SVD) predict the similarity transformation matrix between the reference image and the driving poses, and this matrix is used to align the driving poses with the reference image. The model then computes image and face embeddings and uses them to refine the face embeddings. To further preserve ID, it introduces a distribution-aware ID adapter that counteracts interference caused by the temporal layers and preserves ID through distribution alignment. At inference time, a Hamilton-Jacobi-Bellman (HJB)-based face optimization is integrated into the denoising process to improve facial fidelity. Benchmark experiments demonstrate its effectiveness both qualitatively and quantitatively.
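To make the idea of a similarity transformation between driving poses and a reference pose concrete, here is a minimal sketch of the classical closed-form estimate (the Umeyama method), which also relies on SVD. It is not the paper's learnable module: StableAnimator++ predicts the matrix with learnable layers, whereas this sketch solves for it directly. It assumes that matched 2D keypoints are available for both poses, and the function names `estimate_similarity_transform` and `apply_transform` are hypothetical.

```python
import numpy as np


def estimate_similarity_transform(src, dst):
    """Least-squares similarity transform (scale, rotation, translation).

    src, dst: (N, 2) arrays of matched 2D keypoints, N >= 2, not all identical.
    Returns a 3x3 homogeneous matrix M such that dst ~= apply_transform(M, src).
    Classical closed-form solution (Umeyama), NOT the paper's learned predictor.
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    n = src.shape[0]

    # Center both point sets.
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst

    # Cross-covariance between target and source, then its SVD.
    cov = dst_c.T @ src_c / n
    U, S, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.ones(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        d[-1] = -1.0

    R = U @ np.diag(d) @ Vt
    var_src = (src_c ** 2).sum() / n
    scale = (S * d).sum() / var_src
    t = mu_dst - scale * (R @ mu_src)

    M = np.eye(3)
    M[:2, :2] = scale * R
    M[:2, 2] = t
    return M


def apply_transform(M, pts):
    """Apply a 3x3 homogeneous similarity matrix to (N, 2) points."""
    pts = np.asarray(pts, dtype=np.float64)
    return pts @ M[:2, :2].T + M[:2, 2]
```

In this sketch, `src` would hold the keypoints of a driving pose and `dst` the corresponding keypoints detected in the reference image; applying the returned matrix to every frame's keypoints rescales and repositions the driving skeleton to match the reference body.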

Takeaways, Limitations

Takeaways:
Helps solve the ID consistency problem of existing diffusion models.
Mitigates misalignment between reference images and driving videos through learnable pose alignment.
Generates high-quality videos without post-processing.
Improves facial fidelity through HJB-based face optimization.
Presents an effective ID preservation strategy that combines SVD with a distribution-aware ID adapter.
Limitations:
The paper does not mention specific limitations or constraints.
More detailed experimental results and a comparative analysis of performance under varied conditions are needed.
Further research is needed on how well the SVD- and HJB-based methods generalize and whether they apply to other animation methods.