Gen-3 Video Keyframing (Prototype)

Status
2024.12
Summary
Gen-3 announced a keyframing update. It is a waypoint-based workflow that creates and arranges images and videos on an infinitely wide canvas and connects them organically to complete a single sequence. The approach is reminiscent of the canvas workflows of ComfyUI, Ideogram's Canvas, Recraft, Supercraft, and Florafauna. This intuitive 'content pipeline on canvas' method will likely become one of the most important work processes in the future, making the creation and editing of images and videos far more convenient.
Category
  1. Gen-3
Tag
  1. AI Video
  2. Updates
Dates
2024/12/03
Created by
  • mintbear

Gen-3 Video Keyframing

Mintbear 2024.12.04
Runway teases an update to its 'Gen-3 Video Keyframing' feature.
Video Creation and Editing on a White Canvas!

Key Frame? Key Framing?

Runway calls this new feature [KeyFraming]. Why is it called KeyFraming?
Originally, a key frame is a starting or ending point that defines a movement on the timeline of a film or animation. By setting reference points for where a character's or object's movement begins and ends on the video timeline, the animation's motion starts or stops at those key frames. Key frames also serve as important reference points when cutting video/audio sources or inserting new cuts during editing.
And key framing refers to the process of automatically calculating and connecting every frame of animation between those key frames. In other words, by setting the start and end frames of a character's movement or position and generating the remaining changes automatically, complex animation and video editing work can be handled far more efficiently.
So key framing is an important tool that combines precise motion control with automation to make complex tasks manageable. I think Gen-3 Key-Framing is aiming to provide that same convenience.
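To make the in-betweening idea concrete, here is a minimal Python sketch of my own (not Runway's code): two key frames are authored, and every frame between them is computed automatically by linear interpolation. Real tools use easing curves rather than straight lines, but the principle of authoring only the endpoints is the same.

```python
def interpolate_keyframes(start_frame, start_value, end_frame, end_value):
    """Yield (frame, value) pairs linearly interpolated between two key frames."""
    span = end_frame - start_frame  # assumes end_frame > start_frame
    for frame in range(start_frame, end_frame + 1):
        t = (frame - start_frame) / span  # 0.0 at the start key, 1.0 at the end key
        yield frame, start_value + t * (end_value - start_value)

# Example: an object moves from x=0 at frame 0 to x=100 at frame 24;
# the 23 in-between positions are generated automatically.
for frame, x in interpolate_keyframes(0, 0.0, 24, 100.0):
    print(f"frame {frame:2d}: x = {x:.1f}")
```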
In fact, Gen-3 already offers a feature called KeyFrames, which generates the preceding or following video from a reference image placed at the starting (FirstFrame) or ending (LastFrame) point. The similarly named new feature is essentially an extension of it.
[Existing Gen-3 KeyFrames Function]

So what is Gen-3 Key Framing?

This feature creates and arranges images and videos on an infinitely wide canvas, then organically connects them to complete a single sequence. It is a kind of waypoint-based workflow that treats each piece of content as a node to be connected and referenced. Every image and video becomes a node and an important keyframe.
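To picture the waypoint structure, here is a hypothetical Python sketch; the names (ImageNode, VideoEdge, CanvasGraph) are my own illustration, not Runway's API.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ImageNode:
    node_id: str
    prompt: str                    # text prompt used to generate the image
    source: Optional[str] = None   # path/URL of the generated image, if any

@dataclass
class VideoEdge:
    first_frame: ImageNode         # starting keyframe
    last_frame: ImageNode          # ending keyframe
    duration_s: float = 5.0

@dataclass
class CanvasGraph:
    nodes: dict = field(default_factory=dict)   # node_id -> ImageNode
    edges: list = field(default_factory=list)   # VideoEdge instances

    def add_node(self, node: ImageNode) -> ImageNode:
        self.nodes[node.node_id] = node
        return node

    def connect(self, a: str, b: str, duration_s: float = 5.0) -> VideoEdge:
        """Create a video edge transitioning from image a to image b."""
        edge = VideoEdge(self.nodes[a], self.nodes[b], duration_s)
        self.edges.append(edge)
        return edge

# Usage: two image nodes become the first and last frame of one video edge.
g = CanvasGraph()
g.add_node(ImageNode("A", "a fox in a snowy forest"))
g.add_node(ImageNode("B", "the same fox leaping over a stream"))
clip = g.connect("A", "B")
```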
The AI generation and editing process produces a huge number of B-cut images and videos, and storing, finding, comparing, and versioning all of these resources is genuinely cumbersome. A canvas workflow is therefore very intuitive and effective for comparing a large number of generated assets at a glance and finding ways to improve them.
So I think this intuitive 'content pipeline on canvas' method will become the main work process in the future. You can already see this kind of canvas-based organic creation in tools like ComfyUI's pipeline, Ideogram's Canvas, Recraft, Supercraft, Florafauna, and others.
Although this is still a teaser feature and not available for immediate use, it certainly gives us a glimpse into future workflows.

How to generate video in Gen-3 KeyFraming (Preview)

The official post (see the Reference section below) discloses a working example.
To summarize, you can picture the following [Gen-3 video generation scenario] on a white canvas.
[Gen-3 Video Generation Scenario]
Step 01. Create images on a blank canvas and place them freely.
Step 02. Create a video by connecting images.
Step 03. Create variations of an image.
Step 04. Restyle reference images with image-to-image.
Step 05. Create a new video by capturing a midpoint of a video as a keyframe.
Step 06. Connect paths between multiple videos and output them as a single sequence.
Step 07. Expand content freely on an infinite canvas.

Step 01. Creating Node: IMAGE on canvas

Create images on a blank canvas and position them freely.

Step 02. Connecting: IMAGE + IMAGE = VIDEO

Create a video by linking images together.

Step 03. Image Serendipity: Variation Upgrade

By creating variations of an image, you can obtain similar yet more diverse images with different styles and compositions.

Step 04. Re-Styling: Image-to-Image

Image-to-image (I2I) functionality is supported, allowing you to restyle existing reference images with text prompts.
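Runway has not published an API for this prototype, so as an analogy only, here is the same restyling idea in the open-source diffusers library; the model id and file names are placeholders.

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

# Any img2img-compatible checkpoint works; this model id is only an example.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("reference.png")  # the composition to preserve
result = pipe(
    prompt="the same scene as a watercolor painting",
    image=init_image,
    strength=0.5,  # lower strength keeps more of the original composition
).images[0]
result.save("restyled.png")
```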

Step 05. Non-linear: Branch Video

You can create a new video by capturing the midpoint of an existing video as a [Keyframe]. This opens up a new non-linear branch.
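The canvas handles this for you, but the underlying operation is simple. A minimal sketch with OpenCV, assuming the clip has already been rendered to a file:

```python
import cv2

def capture_midpoint(video_path: str, out_path: str) -> str:
    """Save the middle frame of a video as an image, for use as a new keyframe."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.set(cv2.CAP_PROP_POS_FRAMES, total // 2)  # seek to the midpoint frame
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise IOError(f"could not read the midpoint frame of {video_path}")
    cv2.imwrite(out_path, frame)
    return out_path

# capture_midpoint("clip_ab.mp4", "branch_keyframe.png")
```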

Step 06. Sequencer: VIDEO by Path

Connect paths between multiple video keyframes and order them to output a single sequence video.
Thanks to the intuitive interface, you can even build new video structures, perhaps even loop videos!
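Conceptually, the sequencer flattens a chosen path through the graph into one ordered clip list. A hypothetical sketch with my own naming, not Runway's:

```python
def sequence_path(edges: dict, path: list) -> list:
    """Return the ordered clips along a node path.

    `edges` maps (first_frame_id, last_frame_id) -> a video clip reference.
    """
    clips = []
    for a, b in zip(path, path[1:]):  # walk consecutive node pairs
        if (a, b) not in edges:
            raise KeyError(f"no video edge {a} -> {b}")
        clips.append(edges[(a, b)])
    return clips

edges = {("A", "B"): "clip_ab.mp4",
         ("B", "C"): "clip_bc.mp4",
         ("B", "D"): "clip_bd.mp4"}

# Node B has two outgoing branches; the chosen path picks one of them.
print(sequence_path(edges, ["A", "B", "D"]))  # ['clip_ab.mp4', 'clip_bd.mp4']

# A path that revisits a node (e.g. ["A", "B", "A"]) would produce a loop,
# provided the reverse edge ("B", "A") exists.
```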

Step 07. Open Workspace:

Apart from the graph structure of the connecting pipelines, the canvas imposes no constraints: all space is freely and infinitely expandable.
You can spatially categorize and arrange your work based on project objectives and image criteria, visualize multiple versions of references for easy comparison, and sketch out the overall timeline.

Finish

Among today's useful video generation tools, the only one that supports this canvas structure is Florafauna.ai, which resembles ComfyUI: it generates images through a pipeline on a canvas and ultimately leads to video generation. It was the most impressive tool for me.
Recently, Luma Dream Machine introduced an interactive Board, which opened up conversion and referencing of images and videos very freely, but its structure cannot quite be called a complete canvas.
Although Hailuo and Kling are ahead in consistency and stable quality, we still look forward to Gen-3's new challenge, backed by its various functions and technical solidity.
Thank you
Fn. mintbear 🍀🧸

Reference

See below for the official Runway documentation and posts.

Creativity as Search: Mapping Latent Space (2024.12.02, RunwayML)

Post about Video KeyFraming (2024.12.03, RunwayML on X)

Today we share an early video keyframing prototype that treats creative exploration as a search through the space of all potential artistic possibilities, allowing us to explore this vast space with both precise control and serendipitous nonlinear discovery.

Graph Structure: A Window into the Latent Space

The graph structure is the basis of the prototype. Images are represented as nodes, which act as waypoints in the model's latent space. Nodes can be connected to other nodes to create edges; an edge is a video transition from a first frame to a last frame through latent space and time.

Balance of control and chance

Precise control helps narrow the vast space of possibilities, but variation and unpredictability can lead to "happy accidents": possibilities that would never have been considered under precise control alone. To strike this balance, we give users two ways to manipulate an image "relationally," allowing unpredictability along a consistent dimension.

Users can transform a selected image via "Image to Image," which changes the style via a text prompt while preserving the original composition, whereas "Transform Image" changes the composition while maintaining the original style.

Nonlinear search support

Creative exploration rarely follows a straight line. Graph structures naturally encourage exploration by allowing users to branch off at various points, creating new forks of possible alternatives. As more exploration occurs, the graph naturally grows, tracing different paths of experimentation.

This allows users to construct non-linear timelines. We provide a sequencer that exports a non-linear timeline as a video with a linear timeline, similar to a "choose your own adventure" experience.

Open workspace

Beyond the graph structure, we do not impose any organizational constraints on the workspace. Users have complete freedom to organize nodes and edges, cluster related explorations according to their process needs, or isolate unique creative experiments.

Thank you. If you found this helpful, please leave a like or comment below.
2024 Mint Bear.
👍