Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Reangle-A-Video: 4D Video Generation as Video-to-Video Translation

Created by
  • Haebom

Authors

Hyeonho Jeong, Suhyeon Lee, Jong Chul Ye

Overview

Reangle-A-Video is a unified framework for generating synchronized multi-view videos from a single input video. Unlike mainstream approaches that train multi-view video diffusion models on large-scale 4D datasets, the method reframes multi-view video generation as video-to-video translation, leveraging publicly available image and video diffusion priors. It operates in two stages. First, an image-to-video diffusion transformer is synchronously fine-tuned in a self-supervised manner to distill view-invariant motion from a set of warped videos. Second, the first frame of the input video is warped and inpainted into different camera viewpoints, with inference-time cross-view consistency guidance based on DUSt3R, to produce multi-view-consistent starting images. Extensive experiments on static view transport and dynamic camera control show that Reangle-A-Video outperforms existing methods, offering a new solution for multi-view video generation. The authors state that code and data will be made public.
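
To make the idea of warping a frame into a different camera viewpoint more concrete, here is a minimal sketch of depth-based forward warping with a pinhole camera model. It is not the authors' implementation. The function name `warp_to_new_view`, the assumption that a per-pixel depth map is already available, and the assumption that both cameras share the same intrinsics `K` are all illustrative choices made for this example. Pixels that receive no value in the target view are reported in a mask; these are the holes that a separate inpainting step would need to fill.

```python
import numpy as np

def warp_to_new_view(image, depth, K, R, t):
    """Forward-warp an image into a target camera using per-pixel depth.

    Hypothetical helper for illustration only.
    image: (H, W, C) array; depth: (H, W) positive depths in the source camera;
    K: (3, 3) intrinsics, assumed shared by both cameras;
    R, t: rotation (3, 3) and translation (3,) mapping source-camera
    coordinates to target-camera coordinates.
    Returns the warped image and a boolean mask of pixels that received a value.
    """
    H, W = depth.shape
    # Pixel grid in homogeneous coordinates, shape (3, H*W), row-major order.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(H * W)], axis=0)

    # Unproject to 3D points in the source camera, then move to the target camera.
    pts_src = (np.linalg.inv(K) @ pix) * depth.ravel()
    pts_tgt = R @ pts_src + t.reshape(3, 1)

    # Project into the target image plane and round to the nearest pixel.
    z = pts_tgt[2]
    in_front = z > 1e-6
    proj = K @ pts_tgt
    z_safe = np.where(in_front, z, 1.0)
    x = np.rint(proj[0] / z_safe).astype(int)
    y = np.rint(proj[1] / z_safe).astype(int)
    ok = in_front & (x >= 0) & (x < W) & (y >= 0) & (y < H)

    # Simple z-buffer: write far points first so nearer points overwrite them.
    idx = np.flatnonzero(ok)
    idx = idx[np.argsort(-z[idx], kind="stable")]
    warped = np.zeros_like(image)
    filled = np.zeros((H, W), dtype=bool)
    colors = image.reshape(-1, image.shape[-1])
    warped[y[idx], x[idx]] = colors[idx]
    filled[y[idx], x[idx]] = True
    return warped, filled
```

With an identity rotation and zero translation the function returns the input unchanged and a fully filled mask. Moving the camera sideways shifts the content and leaves an unfilled strip at the image border, which is the region an inpainting model would be asked to complete.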

Takeaways and Limitations

Takeaways:
Presents a new, efficient method for generating multi-view videos from a single input video.
Reduces dependence on large-scale 4D datasets by reusing publicly available image and video diffusion priors.
Outperforms existing methods in the reported experiments.
The authors plan to release code and data, which should support reproducibility and follow-up work.
Limitations:
Depends on external models such as DUSt3R.
Generalization across diverse scenarios needs further study.
Because the method relies on self-supervised learning, performance may degrade in some situations.